ORIGINAL RESEARCH

Global versus local QSPR models for persistent organic pollutants: balancing between predictivity and economy

Tomasz Puzyn, Agnieszka Gajewicz, Aleksandra Rybacka, Maciej Haranczyk

Received: 7 January 2011 / Accepted: 12 February 2011 / Published online: 9 March 2011
The Author(s) 2011. This article is published with open access at Springerlink.com

Abstract

Experimentally determined data on the key physicochemical parameters for halogenated congeners of persistent organic pollutants (POPs) are available only for a limited number of compounds. In the absence of experimental data, a range of computational methods can be applied to characterize those species. One of the techniques widely used in this context is the quantitative structure–property relationship (QSPR) approach. There are two ways to develop QSPR models: using a more complex global model or fitting a simple local model that covers a specific class of chemically related compounds. The essence of this study was to investigate whether local models have significantly better explanatory and predictive ability than global models with wider applicability domains. Based on the obtained results, we concluded that whenever global models fulfill all quality recommendations by the OECD, they should be applied in practice as the more efficient option, instead of the more time-consuming procedure of modeling particular groups of POPs one by one. In contrast, local models are applicable for solving specific problems (i.e., related to only one group of POPs) when high-quality experimental data are available for a sufficient number of training and validation compounds.
Keywords: Global models, Local models, QSPR, Persistent organic pollutants

Introduction

The occurrence of polyhalogenated persistent organic pollutants (POPs), such as Cl/Br-substituted benzenes (CBz/BBz), biphenyls (PCBs/PBBs), diphenyl ethers (PCDEs/PBDEs), dibenzofurans (PCDFs/PBDFs), dibenzo-p-dioxins (PCDDs/PBDDs), and naphthalenes (PCNs/PBNs), in air, water, soil, and sediments has been identified as a serious environmental threat [1]. Large amounts of POPs come from various anthropogenic sources, including intentionally synthesized liquids utilized in transformers and capacitors, plasticisers, and flame retardants, as well as thermal recycling of waste, domestic heating, etc. Substantial volumes of these compounds are released as a result of giant fires, such as the recent fire associated with the oil spill at the Deepwater Horizon platform in the Gulf of Mexico [2]. Regardless of their source, exposure to POPs can cause a vast range of acute and chronic health effects, including mutagenic, carcinogenic, and metabolic ones. In addition, as persistent and lipophilic substances, POPs can be bioaccumulated in the body and biomagnified in natural ecosystems [3].

Hence, there is an urgent need to determine the physicochemical properties required to perform a comprehensive risk assessment for all POPs. Unfortunately, the number of all possible congeners (similar compounds that are based on the same carbon skeleton but differ in the number of chlorine/bromine atoms and in the substitution pattern) is extensive.
In total, there are 1436 structurally different congeners of polychlorinated and polybrominated benzenes, biphenyls, dibenzo-p-dioxins, dibenzofurans, diphenyl ethers, and naphthalenes (Fig. 1). The number of possible mixed chloro- and bromo-substituted congeners is at least one order of magnitude larger [4]. For such a large number of compounds, empirical measurement of the physicochemical properties is impossible, due to the high costs and time limitations of the analytical procedures. Therefore, alternative methods for the physicochemical characterization of POPs are required.

A very promising group of such methods is the quantitative structure–property relationship (QSPR) approach. QSPR is based on the assumption that each physicochemical property in a group of compounds can be expressed as a mathematical function of their chemical structure, represented by a set of so-called molecular descriptors. Thus, based on experimental data available only for some representatives of the group, it is possible to interpolate the missing data for the remaining compounds from the calculated molecular descriptors and a suitable mathematical model [5–7]. Two possible QSPR modeling strategies have been described in the literature: local and global models. Local models are restricted to one specific class of chemically related compounds (e.g., PCBs), whereas global models are developed for a large number of structurally similar groups of compounds (e.g., PCBs, PCNs, PCDDs, PCDFs, etc.). It is widely accepted that local models have better predictive ability than global models [8]. However, global models seem to be very attractive from an economic point of view, because such a modeling strategy saves resources by predicting new data for a larger number of compounds at a time. The argument against global modeling is that this strategy may lead to mechanistic oversimplifications and/or higher errors in the predicted data [9].

Therefore, there are two fundamental questions related to the topic. First: how significant are the differences in the results obtained using local and global QSPRs? Second, consequently: is the reduction of the model's domain (to only one group of POPs) really necessary to improve the predictive power of a QSPR model? Our study aimed to answer both questions.

Fig. 1 Chemical structures of the parent molecules of benzenes, biphenyls, dibenzo-p-dioxins, dibenzofurans, diphenyl ethers, and naphthalenes used to construct chlorine-substituted congeners

Electronic supplementary material: The online version of this article (doi:10.1007/s11224-011-9764-5) contains supplementary material, which is available to authorized users.

T. Puzyn (corresponding author), A. Gajewicz, A. Rybacka: Laboratory of Environmental Chemometrics, Faculty of Chemistry, University of Gdansk, Sobieskiego 18, 80-952 Gdansk, Poland. E-mail: t.puzyn@qsar.eu.org

M. Haranczyk: Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Mail Stop 50F-1650, Berkeley, CA 94720, USA

Struct Chem (2011) 22:873–884. DOI 10.1007/s11224-011-9764-5
Materials and methods

Global and local QSPRs

To find the answers, we initially selected one physicochemical property and one congeneric group of POPs, namely water solubility at 25 °C and polychlorinated naphthalenes (PCNs). Then, we performed a detailed comparison between the predictions of the local and global QSPRs for this group. Solubility was selected because it is an important property in estimating both environmental transport and toxicokinetics after entering the body [3]. The group of PCNs (containing 75 congeners) was selected for the case study, since the parent molecule (naphthalene) is structurally the simplest polycyclic aromatic hydrocarbon. Moreover, polychlorinated naphthalenes were, historically, the first POPs ever intentionally synthesized (between the 1910s and 1980s) [10]. The global model was developed jointly for PCNs and 11 other groups of halogenated POPs, namely CBzs, BBzs, PCBs, PBBs, PCDDs, PBDDs, PCDFs, PBDFs, PCDEs, PBDEs, and PBNs (1,436 compounds in total).

We hypothesized that water solubility values obtained from a local QSPR model should not differ substantially from those predicted with a global QSPR model for POPs, due to the similarity of the carbon skeletons, the level of halogenation, and the substitution patterns of the studied compounds. To verify whether the hypothesis and conclusions can be extended to other physicochemical properties and groups of POPs, we additionally performed a cross comparison between a few local and global QSPRs collected from the literature.
Development of the global QSPR model

Development of a high-quality QSPR model with good predictive ability requires reliable experimental data, on the one hand, and appropriate molecular descriptors, on the other. The procedure we followed when constructing the global model included five steps:

Step 1: Experimental data collection and splitting the compounds for which data are available into a training set (T) and a validation set (V)

The crucial condition that must be met to obtain a plausible QSPR model is the homogeneity and high quality of the experimental data, because the quality of the data significantly influences the modeling results. One cannot expect the data predicted with the model to be better than the original data used to develop it. In practice, this means that the experimental data should be obtained in a systematic way, according to the same standardized protocol [11]. This minimizes the risk of obtaining highly uncertain, extrapolated results from the QSPR modeling.

For the purpose of developing a global QSPR model that quantitatively describes the relationship between the molecular structure of the halogenated POPs and water solubility (log S), we collected experimental data on water solubility originally determined at 25 °C. The solubility values for polychlorinated biphenyls (PCBs) were taken from [12, 13], for polychlorinated dibenzo-p-dioxins (PCDDs) from [14], for polychlorinated dibenzofurans (PCDFs) from [14], for polychlorinated/polybrominated diphenyl ethers (PCDEs/PBDEs) from [15, 16], for polychlorinated naphthalenes (PCNs) from [17], and for polychlorinated benzenes (CBz) from [18]. Experimental data were available for 121 halogenated congeners of POPs in total. The logarithmic values of the solubility varied between −2.58 and −10.83 [mol/dm³] (for more details, please refer to the electronic Supplementary material).

Next, the 121 congeners were sorted in order of decreasing water solubility. Then, every fourth compound was moved to the so-called validation set (an additional set for further external validation of the model), while the remaining compounds formed the training set (for developing the model). The application of this "three-to-one" splitting algorithm ensured that both the training and validation sets contained compounds evenly distributed within the range of water solubility [19]. The splitting procedure led to a training set and a validation set consisting of 91 (75%) and 30 (25%) compounds, respectively.
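The "three-to-one" split described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `three_to_one_split` and the example values are hypothetical, and it assumes that "every fourth compound" means positions 4, 8, 12, ... in the ranking by decreasing log S.

```python
def three_to_one_split(log_s):
    """Return (training, validation) index lists for a 'three-to-one' split."""
    # Rank compounds from the most to the least soluble (decreasing log S).
    order = sorted(range(len(log_s)), key=lambda i: log_s[i], reverse=True)
    # Assumption: every fourth compound in the ranking is held out.
    validation = order[3::4]
    held_out = set(validation)
    training = [i for i in order if i not in held_out]
    return training, validation


# Hypothetical log S values, for illustration only (not data from the study).
example = [-2.6, -3.1, -4.0, -4.4, -5.2, -6.0, -6.7, -7.5]
train_idx, val_idx = three_to_one_split(example)
```

Applied to 121 ranked compounds, this scheme yields 91 training and 30 validation compounds, which matches the set sizes reported in the text.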
Step 2: Calculating molecular descriptors

Simultaneously, we combinatorially generated the molecular structures of all chloro- and bromo-substituted congeners (1436 compounds) with the ConGENER [20] software package, which is based on our earlier work on the characterization of combinatorially generated libraries of tautomers [21]. We used those structures as inputs for quantum-mechanical calculations, which included two stages: (i) optimization of the molecular geometry with respect to the energy gradient and (ii) calculation of the descriptors based on the optimized geometry. The calculations were performed at the semi-empirical level of theory with the PM6 method [22] in the MOPAC 2009 software package [23].

We calculated the following 26 molecular descriptors: the number of atoms in the molecule (nAT), the number of halogen (chlorine or bromine) substituents (nX), the molecular weight (MW), the standard heat of formation (HOF), the electronic energy (EE), the core–core repulsion energy (Core), the total energy (TE), the total energy of the corresponding cation (TE+), the standard heat of formation in a solution represented by the Conductor-like Screening Model, COSMO (HOFc), the total energy in a solution represented by COSMO (TEc), the vertical ionization potential (IP), the energy of the highest occupied molecular orbital (HOMO), the energy of the lowest unoccupied molecular orbital (LUMO), the X vector of the dipole moment (Dx), the Y vector of the dipole moment (Dy), the Z vector of the dipole moment (Dz), the total dipole moment (Dtot), the solvent accessible surface (SAS), the molecular volume (MV), the lowest negative Mulliken partial charge on the molecule (Q−), the highest positive partial charge on the molecule (Q+), the average polarizability derived from the heat of formation (Ahof), the average polarizability derived from the dipole moment (Ad), Mulliken's electronegativity (EN), Parr and Pople's absolute hardness (Hard), and the Schüürmann MO shift alpha (Shift).
Step 3: Calibrating and internal validation of the QSPR model

Having both high-quality experimental data and molecular descriptors, we developed the QSPR model following the golden standards and recommendations of the Organization for Economic Cooperation and Development (OECD) [24]. According to the five OECD recommendations, an ideal QSPR model should be associated with: (i) a defined endpoint; (ii) an unambiguous algorithm; (iii) a defined applicability domain; (iv) appropriate measures of goodness-of-fit, robustness, and predictivity; and (v) a mechanistic interpretation, if possible.

We employed Partial Least Squares regression combined with a genetic algorithm (GA-PLS) as the chemometric method of modeling. PLS is based on a linear transition from a large number of original descriptors to a small number of new orthogonal variables, so-called "latent vectors" (LVs), which are linear combinations of the original descriptors [25]. In order to select the optimal combination of molecular descriptors to be used in the final QSPR model, we employed Holland's genetic algorithm (GA) [26]. The algorithm minimizes the prediction error by searching for the optimal combination of descriptors. The name "genetic" comes from the fact that this mathematical procedure uses the rules of the Darwinian theory of evolution. In this case, however, the rules are applied to "populations" and "generations" of mathematical solutions (i.e., combinations of the descriptors), not to populations and generations of living organisms. The algorithm is controlled by a set of steering parameters. In our study, we specified the following: population size: 124; percentage of initial terms: 40%; maximum number of generations: 100; percentage of convergence: 50%; mutation rate: 0.005; cross-over: double; number of repetitions: 7. GA-PLS calculations were performed with MATLAB 7.6 [27] and PLS Toolbox 5.2 [28].
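To make the role of the steering parameters more concrete, the following is a minimal sketch of a genetic algorithm for descriptor selection. It is not the PLS Toolbox implementation used in the study. The function `select_descriptors`, the elitist "keep the better half" selection, and the user-supplied `fitness` callable (which in practice would return something like a cross-validated PLS error for a given descriptor subset) are all assumptions made for illustration. The defaults mirror the population size, initial-term percentage, generation limit, and mutation rate quoted above; the convergence criterion and the seven repetitions are left out for brevity.

```python
import random


def select_descriptors(n_desc, fitness, pop_size=124, init_frac=0.40,
                       generations=100, mutation_rate=0.005, seed=0):
    """Search for a boolean descriptor mask that minimizes `fitness`.

    `fitness` is a hypothetical callable: it takes a list of booleans
    (True = descriptor included) and returns an error to be minimized.
    """
    rng = random.Random(seed)
    # Initial population: each descriptor is switched on with probability init_frac.
    pop = [[rng.random() < init_frac for _ in range(n_desc)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness)                  # best (lowest error) first
        history.append(fitness(pop[0]))
        parents = pop[:pop_size // 2]          # assumption: elitist truncation
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Double (two-point) cross-over: swap the segment between i and j.
            i, j = sorted(rng.sample(range(n_desc + 1), 2))
            child = a[:i] + b[i:j] + a[j:]
            # Mutation: flip each bit with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = parents + children
    best = min(pop, key=fitness)
    return best, history + [fitness(best)]
```

Because the better half of each generation is carried over unchanged, the best error recorded in `history` can never increase from one generation to the next. That property belongs to this sketch and is not a claim about the original software.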
An integral part of QSPR modeling is to appropriately describe the borders of the optimum prediction space of the model. This space, the so-called applicability domain (AD), is defined by the nature of the compounds included in the training set. We verified the applicability domain by means of the Williams plot, which is a plot of the leverage values versus the cross-validated standardized residuals [29, 30]. The leverage value $h_i$ for every ith compound is calculated as follows [31] (Eq. 1):

$$h_i = x_i^{T} \left( X^{T} X \right)^{-1} x_i \tag{1}$$

where $x_i$ is the vector of descriptors calculated for the considered ith compound and $X$ is the matrix of descriptors calculated for the whole training set. A value of $h_i$ greater than the critical one ($h^{*}$) means that the structure of a compound differs significantly from the training set and, in consequence, the compound falls outside the optimum prediction space of the model [32]. The warning value $h^{*}$ is calculated according to the formula (Eq. 2):

$$h^{*} = \frac{3(p+1)}{n} \tag{2}$$

where $p$ is the number of variables used in the model and $n$ is the number of training compounds. However, the fact that $h_i > h^{*}$ does not always indicate that the ith training compound is an outlier. It has been shown that training compounds with high leverages and small residuals (differences between the observed and predicted values) stabilize the model and make it more precise. Such points are called "good leverages." Only compounds with high leverages and residuals beyond ±3 standard deviation units (so-called "bad leverages") destabilize the model [33].
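As an illustration of Eqs. 1 and 2, the sketch below computes leverages and the warning value in plain Python and assigns a point to a region of the Williams plot. The function names are hypothetical, the small linear solver merely stands in for a linear-algebra library, and the label "response outlier" (low leverage, large residual) is a common term that does not appear in the text above. How the descriptors were scaled before computing leverages is not specified here, so the input matrix is taken as given.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix
    for c in range(n):
        # Partial pivoting: move the largest entry of column c onto the diagonal.
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def leverages(x_train, x_query=None):
    """Eq. 1: h = x^T (X^T X)^-1 x for each row x of x_query."""
    x_query = x_train if x_query is None else x_query
    p = len(x_train[0])
    xtx = [[sum(row[i] * row[j] for row in x_train) for j in range(p)]
           for i in range(p)]
    # (X^T X)^-1 x is obtained by solving (X^T X) z = x, then h = x . z
    return [sum(xi * zi for xi, zi in zip(x, solve(xtx, x))) for x in x_query]


def warning_leverage(p, n):
    """Eq. 2: h* = 3 (p + 1) / n."""
    return 3 * (p + 1) / n


def williams_region(h, std_residual, h_star, limit=3.0):
    """Classify a point of the Williams plot."""
    high_leverage = h > h_star
    large_residual = abs(std_residual) > limit
    if high_leverage and large_residual:
        return "bad leverage"
    if high_leverage:
        return "good leverage"
    return "response outlier" if large_residual else "inside domain"
```

Passing validation or new compounds as `x_query` gives their leverages relative to the training matrix, which is how a compound outside the training set would be checked against $h^{*}$.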
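In order to prove the robustness of the model and reduce the probability of overfitting, we performed an internal validation [29, 34]. For this purpose, we employed the leave-one-out cross-validation (CV-LOO) algorithm, in which the same compounds were used alternately for training and validation [30]. Goodness-of-fit (i.e., how well the model fits the data) was measured by the determination coefficient in the training set ($R^{2}$) and the root mean square error of calibration ($RMSE_{C}$) (Eqs. 3 and 4), whereas the robustness was expressed quantitatively by the CV-LOO determination coefficient ($Q^{2}_{CV}$), the absolute average relative deviation (AARD), and the root mean square error of cross-validation ($RMSE_{CV}$) (Eqs. 5–7) [30].

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} \left( y_i^{obs} - y_i^{pred} \right)^{2}}{\sum_{i=1}^{n} \left( y_i^{obs} - \bar{y}^{obs} \right)^{2}} \tag{3}$$

$$RMSE_{C} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i^{obs} - y_i^{pred} \right)^{2}}{n}} \tag{4}$$

$$Q^{2}_{CV} = 1 - \frac{\sum_{i=1}^{n} \left( y_i^{obs} - y_i^{predcv} \right)^{2}}{\sum_{i=1}^{n} \left( y_i^{obs} - \bar{y}^{obs} \right)^{2}} \tag{5}$$

$$AARD = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i^{obs} - y_i^{pred}}{y_i^{obs}} \right| \tag{6}$$

$$RMSE_{CV} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i^{obs} - y_i^{predcv} \right)^{2}}{n}} \tag{7}$$

where $y_i^{obs}$ is the experimental (observed) value of the property for the ith compound, $y_i^{pred}$ the predicted value for the ith compound, $y_i^{predcv}$ the predicted value for the temporarily excluded (cross-validated) ith compound, $\bar{y}^{obs}$ the mean experimental value of the property in the training set, and $n$ the number of compounds in the training set.

Step 4: External validation of the developed QSPR model

To confirm the model's predictive power, we carried out an external validation based on the compounds that were not previously engaged in the model's optimization and/or calibration [30]. We utilized the external validation coefficient ($Q^{2}_{Ext}$) and the root mean square error of prediction ($RMSE_{P}$) (Eqs. 8 and 9) as measures of the external predictivity.

$$Q^{2}_{Ext} = 1 - \frac{\sum_{j=1}^{k} \left( y_j^{obs} - y_j^{pred} \right)^{2}}{\sum_{j=1}^{k} \left( y_j^{obs} - \bar{y}^{obs} \right)^{2}} \tag{8}$$

$$RMSE_{P} = \sqrt{\frac{\sum_{j=1}^{k} \left( y_j^{obs} - y_j^{pred} \right)^{2}}{k}} \tag{9}$$

where $y_j^{obs}$ is the experimental (observed) value of the property for the jth compound, $y_j^{pred}$ the predicted value for the jth compound, $\bar{y}^{obs}$ the mean experimental value of the property in the validation set, and $k$ the number of compounds in the validation set.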
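The measures in Eqs. 3–9 share two functional forms, so they can be sketched with a handful of helpers. The code below is an illustrative sketch with hypothetical function names. `determination` covers Eqs. 3, 5, and 8 depending on which predictions and which compound set are passed in, `rmse` covers Eqs. 4, 7, and 9, and `loo_predictions` expects a user-supplied `fit` callable (an assumption; in the study the fitted model was GA-PLS) that returns a prediction function.

```python
import math


def _ss_res(y_obs, y_pred):
    """Sum of squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))


def determination(y_obs, y_pred):
    """1 - SS_res / SS_tot, with the mean taken over the y_obs passed in."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - _ss_res(y_obs, y_pred) / ss_tot


def rmse(y_obs, y_pred):
    """Root mean square error (calibration, cross-validation, or prediction)."""
    return math.sqrt(_ss_res(y_obs, y_pred) / len(y_obs))


def aard(y_obs, y_pred):
    """Eq. 6: absolute average relative deviation, in percent."""
    return 100.0 / len(y_obs) * sum(abs((o - p) / o)
                                    for o, p in zip(y_obs, y_pred))


def loo_predictions(xs, ys, fit):
    """Leave-one-out: refit without compound i, then predict compound i.

    xs and ys are lists; fit(xs, ys) must return a callable model(x).
    """
    preds = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(model(xs[i]))
    return preds
```

For example, $Q^{2}_{CV}$ would be obtained as `determination(y_train, loo_predictions(x_train, y_train, fit))`, and $Q^{2}_{Ext}$ as `determination(y_val, y_val_pred)`, where the mean in the denominator is then taken over the validation set, as in Eq. 8.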
Step 5: Applying the model to predict the endpoint values for new compounds

When the QSPR model fulfills all the validation criteria, it can be applied to predict the property (i.e., water solubility) of those new compounds for which experimental data are not available.

Methodology of comparing local and global QSPR models

Particular local and global models were compared with each other taking into account two aspects: economy and quality. The number of training compounds and the applicability domain of the model represented the economic aspect, whereas the measures of goodness-of-fit, robustness, and predictivity represented the qualitative aspect. In addition, we employed Student's t test to verify whether the average residuals from the predictions with local and global QSPR models differ significantly (p < 0.05).
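The text does not state which variant of Student's t test was used. The sketch below assumes a paired design, because both models predict the same compounds, and computes only the t statistic and the degrees of freedom. The function name `paired_t` and the residual values are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t(residuals_a, residuals_b):
    """Paired t statistic for two residual series on the same compounds.

    Assumption: the compounds are matched one-to-one between the two models.
    Returns (t, degrees_of_freedom).
    """
    diffs = [a - b for a, b in zip(residuals_a, residuals_b)]
    # t = mean difference / standard error of the mean difference
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical absolute residuals (log units) for four compounds, illustration only.
global_res = [0.9, 0.7, 0.8, 0.6]
local_res = [0.3, 0.2, 0.4, 0.1]
t_value, dof = paired_t(global_res, local_res)
```

The two-sided p value then follows from the t distribution with `dof` degrees of freedom. For instance, with 14 degrees of freedom the tabulated critical value at p = 0.05 is about 2.145.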
Results and discussion

Comparing global and local QSPR models of water solubility

As mentioned, we first performed a comparison between two QSPR models of water solubility (log S) developed by our group. The first model was developed within this study, whereas the second QSPR was taken from one of our previous contributions.

Global QSPR model of water solubility

By applying the five-step QSPR procedure, including the GA-PLS method, we obtained a statistically significant (p < 0.05) global model capable of successfully predicting the values of log S for 1436 halogenated POPs. The model utilized three latent vectors (LVs) explaining together 95% (57% + 17% + 21%) of the total variance in the molecular descriptors and 93% (90% + 2% + 1%) of the variance in the modeled endpoint (log S). Although the GA-PLS method uses orthogonal latent vectors for regression, it is also possible to derive "quasi-regression" coefficients for the original descriptors (Eq. 10), keeping in mind that these coefficients cannot be interpreted individually, because they are not independent [25].
$$\log S = -0.287\,nAT - 0.293\,nX + 0.191\,LUMO - 0.320\,SAS + 0.085\,Q + 0.126\,Shift \tag{10}$$

The global QSPR was characterized by satisfactory goodness-of-fit, robustness, and external predictive performance (the statistical measures are summarized in Table 1). A visual correlation between the experimental and predicted values of log S is presented in Fig. 2a.
The model can be intuitively interpreted according to the physicochemical theory of dissolution. The theory divides the whole process into six stages, namely: (i) breaking up solute–solute intermolecular bonds; (ii) breaking up solvent–solvent intermolecular bonds; (iii) formation of a cavity in the solvent phase large enough to accommodate the solute molecule; (iv) vaporization of the solute into the cavity; (v) forming solute–solvent intermolecular bonds; and (vi) reforming solvent–solvent bonds with solvent restructuring. Thus, since the formation of a cavity appropriate for highly halogenated, large molecules requires more energy, the solubility of larger congeners is lower than that of less halogenated, smaller congeners. This factor is represented in the model equation (Eq. 10) by three descriptors, SAS, nAT, and nX, which have a negative contribution to the solubility (i.e., the solubility increases when the solvent accessible surface, the number of atoms, and the number of halogen substituents decrease). Similarly, the descriptors related to electrostatic interactions (e.g., forming hydrogen bonds) between the solvent and solute and to chemical reactivity, namely LUMO, Q, and Shift, contribute positively to the solubility, because the process of forming solute–solvent intermolecular bonds facilitates dissolution.
Local QSPR model of water solubility

The local model, originally calibrated only for the group of 75 polychlorinated naphthalenes, was adapted from our previous paper [35]. It was based on eight theoretical molecular descriptors, calculated exclusively from the chemical structures at the Density Functional Theory (DFT) level with the B3LYP functional and the 6-311G(d,p) basis set. A combination of those eight descriptors formed one latent vector, which was then used as the independent variable to construct a one-variable GA-PLS model. The model explained 93% of the structural variance (variance in the descriptors) and 96% of the variance in log S. This one-variable model can be alternatively expressed in the quasi-regression form (Eq. 11):

$$\log S = \pm 0.109\,nCl_{p1} \pm 0.123\,HOMO \pm 0.131\,Hard \pm 0.129\,E_{t} \pm 0.131\,SAS_{w} \pm 0.132\,SAV_{w} \pm 0.131\,DE_{w} \pm 0.129\,TNE_{w} \tag{11}$$

(the signs of the individual terms are not legible in this copy of the text and are shown as ±; the signed coefficients are given in the original paper [35])

where nClp1 is the number of chlorine atoms in the first aromatic ring, HOMO the energy of the highest occupied molecular orbital, Hard the molecular hardness, Et the total energy of the molecule, SASw the solvent accessible molecular surface area in water, SAVw the solvent accessible molecular volume in water, DEw the dispersion energy in water, and TNEw the total non-electrostatic energy of solvation.
High values of R², Q²CV, and Q²Ext, as well as low values of the root mean square errors RMSEC, RMSECV, and RMSEP (Table 1), confirmed that the model was well fitted, robust, and had good predictive ability. The existence of a strong linear correlation between the observed and predicted values of log S was demonstrated graphically (Fig. 2b). Details on the development of the local QSPR can be found in the original paper [35]. It should be mentioned, however, that the interpretation of the descriptors used is very similar to that for the global model. According to our previous contribution [35], the descriptors refer to the cavitation process (SASw and Et) as well as to the dispersive (DEw and TNEw) and electrostatic (nClp1, HOMO, and Hard) interactions.

Table 1 Comparison of statistical parameters between local and global GA-PLS models of log S

| Feature | Measure | Local QSPR model | Global QSPR model |
|---|---|---|---|
| Goodness-of-fit | R² | 0.96 | 0.93 |
| | RMSEC | 0.28 | 0.46 |
| Robustness | Q²CV | 0.96 | 0.93 |
| | RMSECV | 0.35 | 0.48 |
| Predictivity | Q²Ext | 0.96 | 0.95 |
| | RMSEP | 0.20 | 0.39 |

Fig. 2 The experimentally determined values of log S versus the values of log S predicted by global (a) and local (b) QSPR models
Results of the comparison

A comparison of two QSPR models usually starts with an evaluation of their statistical characteristics. Without doubt, the measures of goodness-of-fit, robustness, and predictivity (Table 1) favor the local QSPR. Higher determination coefficients (R², Q²CV, and Q²Ext) and up to two times lower values of the root mean square errors for both the training and the validation sets, in comparison to the global model, showed that the local model was more accurate and performed better in capturing the relationship between the structure and water solubility of POPs. This conclusion is also supported by the analysis of two plots (Fig. 3) presenting residuals calculated for chloronaphthalenes based on the predictions with the global and local QSPRs. Note that the residuals were calculated only for the 15 PCNs for which experimentally determined data on water solubility were available. In the case of the local model, which covered a narrow calibration domain (consisting of very similar chloronaphthalene congeners only), the prediction errors were considerably lower than those of the global model with a wider domain (all POPs). By employing Student's t test, we confirmed that the average residuals (for 15 PCNs) for the two models differed significantly (t = 4.40, p = 0.0006).

Therefore, from the qualitative point of view, application of the local model should be recommended as being more accurate and precise. However, the performance of the evaluated global model for POPs was still fairly good in comparison with other, more general QSPRs. For instance, Delaney [36] put together statistics of 10 recently published QSPR models of water solubility, calibrated on training sets containing between 150 and 2874 compounds. Then, the models' predictivity was tested on the same 21 compounds having a common chemical structure. The author found that the standard errors of prediction for those 21 chemicals varied between 0.55 and 0.91 logarithmic units. Given that the highest residual observed for our "worse" global model for POPs was about one logarithmic unit, it may be concluded that our global model predicts water solubility up to three times better than the general models reviewed by Delaney.
From the economical point of view, an optimal QSPR model should be characterized by two features: (i) it should be based on as small a number of training/validation compounds as possible, without the necessity of performing extensive experimental work, and, simultaneously, (ii) it should enable predictions within as wide an applicability domain as possible. The minimal number of compounds required for developing a QSPR model is defined by the ratio between the number of training compounds and the number of descriptors. According to the criterion proposed by Topliss and Costello [37], this ratio should be at least 5:1. The local model, which utilized one variable (latent vector), was calibrated on 10 training compounds, whereas the global model, which utilized three latent vectors, was calibrated on 91 training compounds. Thus, both studied models met the criterion, since the ratios were 10:1 for the local model and about 30:1 for the global model.
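The ratio check above is simple arithmetic and can be written out as follows. The helper name `meets_topliss_costello` is hypothetical, and the sketch counts latent vectors as the model variables, as the text does.

```python
def meets_topliss_costello(n_training, n_variables, min_ratio=5.0):
    """Return (compounds per variable, whether the 5:1 criterion is met)."""
    ratio = n_training / n_variables
    return ratio, ratio >= min_ratio


# Set sizes and numbers of latent vectors as reported in the text.
local_ratio, local_ok = meets_topliss_costello(10, 1)
global_ratio, global_ok = meets_topliss_costello(91, 3)
```

For the global model the exact value is 91/3 ≈ 30.3, which corresponds to the ratio of about 30:1 quoted above.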
Fig. 3 Residual values (in log units) calculated for 15 chloronaphthalene congeners based on the predictions with global (a) and local (b) QSPR models

There are no formal requirements related to the number of validation compounds, but different authors give some recommendations based on their experience. For instance, Gramatica [30] recommends having at least five compounds to perform an appropriate external validation. Both models fulfilled this recommendation, since the number of validation compounds was 5 for the local and 30 for the global QSPR. However, according to our experience, when the validation set is small (about 10 compounds or fewer), the results of external validation can be less reliable, because in such a case the validation statistics (Q²Ext and RMSEP) strongly depend on the splitting algorithm. Indeed, they can change significantly when one validation compound is replaced with another [38]. Therefore, the external validation of our global model of log S seems to be more reliable than the external validation of the local one.
Applicability domains of both models were verified by using Williams plots (Fig. 4). The global model was calibrated and validated on congeners of CBzs (10 training and 2 validation compounds), PCDEs (25 training and 6 validation compounds), PBDEs (6 training and 3 validation compounds), PCBs (24 training and 10 validation compounds), PCDDs (11 training and 4 validation compounds), PCDFs (6 training and 2 validation compounds), and PCNs (9 training and 2 validation compounds). Water solubility of all validation compounds was predicted with residuals within the critical threshold values (0 ± 3 standard deviations). This means that the model can be successfully applied for predicting the values of log S for all seven groups of POPs listed above. Interestingly, three compounds from the training set (Fig. 4a) had leverage values higher than the critical one (h* = 0.14). These compounds are perchlorinated benzene (CBz-12), perchlorinated naphthalene (PCN-75), and perchlorinated biphenyl (PCB-209). At the same time, however, their residuals were low. This suggests that the model is well stabilized by the existence of so-called "good leverage points." In addition, the model is probably capable of making reliable predictions for compounds that do not differ substantially from the training set but are formally situated outside of the applicability domain. The last conclusion, however, should be confirmed by additional testing with a validation set of compounds that have high leverage values. In a similar way, low residuals and leverage values for all 10 training and 5 validation compounds (Fig. 4b) confirmed that the local model can be applied to make satisfactory predictions of water solubility within the group of chloronaphthalenes.
The last aspect that should be taken into account when comparing both models is the selection of molecular descriptors employed in each case. It may seem surprising that we compared two models utilizing descriptors calculated at different levels of theory (i.e., the global model was developed based on molecular descriptors from semiempirical PM6 calculations, whereas the local model used DFT descriptors). However, we previously demonstrated [39] that possible differences in the numerical values of molecular descriptors for POPs calculated with both methods can be neglected. We showed that QSPR models employing descriptors calculated at the level of novel semiempirical methods (PM6 and RM1) were of similar accuracy to models utilizing descriptors from DFT (B3LYP functional with the 6-311G(d,p) basis set). This level of accuracy was out of reach for models employing earlier semiempirical methods (e.g., PM3 and AM1).

Moreover, it may be unclear why, when putting together both model equations (Eqs. 10 and 11) for the same property (log S), the selected descriptors are different (e.g., LUMO in Eq. 10 and HOMO in Eq. 11). To clarify these apparent contradictions, one needs to refer to the theory of dissolution (described in section "Global QSPR model of water solubility") and to keep in mind the following three important issues.

First, the quantum-mechanical descriptors that we used are intercorrelated. Thus, they form groups of descriptors related to the same "global" property (latent vectors) and, because of that, have very similar meaning. In consequence, one descriptor from a particular group (latent vector) can be replaced with another one from the same group without changing the global interpretation of the model. For instance, in a group of chlorinated congeners, both the total energy and the solvent accessible surface area mainly depend on the number of chlorine substituents present in molecules with the same carbon skeleton. Therefore, in this context, both descriptors have very similar meaning. For that reason, we decided to use the PLS method of modeling instead of the much simpler and more intuitively interpretable multiple linear regression (MLR).

Second, molecular descriptors for both local and global models were selected with the use of the genetic algorithm. The algorithm is, in fact, an automatic probability-based procedure, blind to the mechanistic interpretation. In effect, when the algorithm has a choice between two strongly correlated descriptors related to the same "global" property (see above), it might select the first or the second descriptor only by chance.

Third, a local model developed for only one congeneric group (i.e., polychlorinated naphthalenes) is much more sensitive to the number of substituents (chlorine atoms) and the substitution pattern than a global model calibrated for more groups, in which the main differences between particular compounds are related to their carbon skeletons (i.e., the number of aromatic rings, presence of heteroatoms, etc.). Hence, one should not expect exactly the same model equations for the global and local models compared in our study.

Fig. 4 Williams plot describing applicability domains of global (a) and local (b) QSPR models. Dotted lines represent the residual threshold (0 ± 3 standard deviation units) and the critical leverage value (h*), respectively
In the context of the dissolution mechanism, three structural features ("global" properties) of POP congeners seem to be very important: (i) the size of the parent molecule (carbon skeleton), (ii) the type and the number of substituents present in the molecule, and (iii) the substitution pattern. The first "global" property is obviously related to the cavitation process. We observed that solubility decreases with increasing size of the molecule. The type and the number of substituents are, of course, also strongly related to the size and, consequently, to the cavitation stage. Generally, molecules based on the same skeleton and substituted with the same number of bromine atoms are less soluble than their chlorinated analogues, due to the larger radius of bromine substituents in comparison to chlorine atoms. Similarly, solubility decreases with an increasing number of halogen substituents (e.g., monochloronaphthalenes are more soluble than dichloronaphthalenes). The descriptors related to the factors influencing the cavitation process, namely the size of a molecule and the type and number of substituents, are nAT, nX, and SAS in Eq. 10, as well as Et, SASw, SAVw, DEw, and TNEw in Eq. 11.
The substitution pattern is the main factor determining differences in solubility between congeners containing the same number of substituents of the same type. Differences in the distribution of the substituents (over the same carbon skeleton) determine differences in the polarity of particular congeners. For example, 1,2,3,4-tetrachloronaphthalene is more polar than 2,3,6,7-tetrachloronaphthalene. Consequently, electrostatic interactions with water as a solvent are stronger in the first case, and the first congener in this pair is more soluble. Interestingly, as we demonstrated in many previous contributions [35, 40–42], descriptors such as HOMO and LUMO depend strongly on the substitution pattern. Thus, in this study they should not be interpreted as describing redox properties of the molecules (according to the well-known Koopmans' theorem), but rather their substitution patterns. Other descriptors related to the substitution pattern are Q and Shift in Eq. 10, as well as nClp1 and Hard in Eq. 11. Therefore, the mechanistic interpretations of the global and local QSPR models are very similar.
In summary, from the economical point of view, both models are acceptable, since they require a relatively small number of experimental data. In fact, both are based on data taken from the literature, so no extensive empirical work was necessary. However, the use of the global QSPR would be more profitable, because it enables predictions for those groups of POPs for which the number of experimental data is insufficient to develop appropriate local models. For example, experimentally determined data on water solubility are available for only eight congeners of PCDFs, which is evidently too few for calibrating and validating a local model. Moreover, the time and, in consequence, the costs of obtaining the predicted values of log S can be significantly reduced by employing the global modeling scheme.

Comparing other global and local QSPR models

In addition, to extend the investigation to other physicochemical properties, we performed similar pairwise comparisons for other, previously published QSPR models. We used two of our previous global models developed for predicting the n-octanol/water partition coefficient (log KOW) [40] and the subcooled liquid vapor pressure (log PL) [42], respectively, for a group of 1436 POPs, including chloro- and bromo-analogues of dibenzo-p-dioxins, dibenzofurans, biphenyls, naphthalenes, diphenyl ethers, and benzenes. The models were compared with two corresponding local models published by other groups. The first one was designed to predict log KOW for 209 PCBs [43], whereas the second one predicts the values of log PL for 210 congeners of PCDDs and PCDFs [44]. Interestingly, the conclusions from both comparisons (based on predicting log KOW and log PL) are even more optimistic than those for log S. The statistical measures of goodness-of-fit and robustness were very similar within each pair of corresponding global and local models (Table 2).
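
For readers who wish to reproduce this kind of comparison, the measures named in Table 2 can be sketched as follows. This is a minimal illustration that assumes the commonly used chemometric definitions (R² as one minus the ratio of residual to total sum of squares, RMSE as the root mean square error, and an external Q² referenced to the training-set mean); the exact formulas used for the published models may differ. The function names and all numerical values are hypothetical and are not data from this study.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean square error (RMSEC, RMSECV or RMSEP, depending on the data set)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))

def q2_external(observed_ext, predicted_ext, training_mean):
    """External Q2; the total sum of squares is taken around the training-set mean
    (an assumed convention, other definitions exist)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed_ext, predicted_ext))
    ss_tot = sum((o - training_mean) ** 2 for o in observed_ext)
    return 1.0 - ss_res / ss_tot

# Hypothetical log-scale property values, for illustration only.
y_train_obs = [-6.0, -7.0, -8.0, -9.0]
y_train_pred = [-6.1, -6.9, -8.2, -8.8]
y_ext_obs = [-6.5, -8.5]
y_ext_pred = [-6.4, -8.7]

training_mean = sum(y_train_obs) / len(y_train_obs)
print("R2    =", round(r_squared(y_train_obs, y_train_pred), 3))
print("RMSEC =", round(rmse(y_train_obs, y_train_pred), 3))
print("Q2Ext =", round(q2_external(y_ext_obs, y_ext_pred, training_mean), 3))
print("RMSEP =", round(rmse(y_ext_obs, y_ext_pred), 3))
```

Computing the same functions for a local and a global model on the same compounds gives directly comparable pairs of values of the kind listed in Table 2.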
Moreover, the differences between the experimentally measured values and the values predicted by both methods of modeling (i.e., local and global) were not statistically significant (p > 0.05) (Table 3), which was consistent with our assumption. Considering that (i) both global models have been developed for a much wider applicability domain (covering about 85% more compounds) and (ii) they practically did not differ from their local counterparts in quality, we concluded that employing global QSPRs would be much more efficient than developing particular local ones.
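
The residual comparison summarized in Table 3 can be illustrated with a short sketch. It assumes a paired form of the Student t test, applied to the residuals of the two models for the same compounds; the text does not state whether a paired or an unpaired test was used, so this choice is an assumption. The function names and the residual values are hypothetical, and the p value is obtained by numerical integration of the t density so that only the standard library is needed.

```python
import math

def paired_t_statistic(res_a, res_b):
    """Paired Student t statistic and degrees of freedom for two residual lists
    obtained for the same compounds (assumes the differences are not all equal)."""
    diffs = [a - b for a, b in zip(res_a, res_b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n), n - 1

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p value via Simpson integration of the density over [0, |t|].
    The number of steps must be even."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)

# Hypothetical residuals of a local and a global model (illustration only).
res_local = [0.10, 0.30, 0.20, 0.40, 0.15]
res_global = [0.12, 0.25, 0.28, 0.35, 0.20]

t_stat, df = paired_t_statistic(res_local, res_global)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

A p value above 0.05, as reported for both log PL and log KOW in Table 3, means that the null hypothesis of equal residuals for the local and global models cannot be rejected at that level.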

Conclusions

We have verified the efficiency of two modeling strategies. The first one assumes the reduction of the model's domain and the development of a QSPR based on a small number of structurally similar compounds (local QSPR). According to the second one, the model is calibrated with a wider and more structurally diversified training set (global QSPR), even if this leads to a small decrease in the model's predictivity. Based on the obtained results, we recommend that whenever global models fulfill all quality criteria proposed by the Organisation for Economic Co-operation and Development (OECD), they should be applied in practice without the necessity of developing a series of local QSPRs.

Such a recommendation is reasonable for three reasons. First, the global models allow for simultaneous predictions of physicochemical properties for as many as hundreds of compounds. This feature is very important from the economic point of view, considering that the number of new chemicals synthesized and/or identified in the environmental compartments is growing exponentially. Second, the global modeling approach may be the only possibility of modeling when the number of chemicals from one specific class of chemically related compounds is insufficient to calibrate and appropriately validate a local QSPR model. Third, as demonstrated, the performance (predictive ability) of global models is not always worse than that of local ones.

Table 2 Statistical parameters of local and global models of log PL and log KOW

Model     Feature           Measure   Local QSPR model   Global QSPR model
log PL    Goodness-of-fit   R²        0.99(a)            0.97
                            RMSEC     n/a                0.21
          Robustness        Q²CV      0.96(a)            0.97
                            RMSECV    n/a                0.22
          Predictivity      Q²Ext     n/a                0.97
                            RMSEP     n/a                0.22
log KOW   Goodness-of-fit   R²        0.91(b)            0.92
                            RMSEC     0.23(b)            0.32
          Robustness        Q²CV      0.91(b)            0.92
                            RMSECV    0.23(b)            0.32
          Predictivity      Q²Ext     n/a                0.92
                            RMSEP     n/a                0.30

(a) Taken from [43]; (b) taken from [44]; n/a: no data provided in the original paper

Table 3 Comparison between the residuals derived from the predictions of log PL and log KOW with local and global GA-PLS models by the Student t test

Model     Statistics
log PL    Test statistic   2.01
          p value          0.051
log KOW   Test statistic   1.81
          p value          0.072

Acknowledgments

The authors thank the editors for rapidly considering our submission and the anonymous reviewer for valuable comments, which helped to improve the scientific quality of this contribution. T.P. thanks the Foundation for Polish Science for granting him a fellowship and a research grant within the HOMING Program supported by the Norwegian Financial Mechanism and the EEA Financial Mechanism in Poland. This work was supported by the Polish Ministry of Science and Higher Education, Grant No. DS/8430-4-0171-11. This research was supported in part (to M.H.) by the U.S. Department of Energy under contract DE-AC02-05CH11231.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

1. Yang G, Zhang X, Wang Z, Liu H, Ju X (2006) Estimation of the aqueous solubility (-lgSw) of all polychlorinated dibenzo-furans (PCDF) and polychlorinated dibenzo-p-dioxins (PCDD) congeners by density functional theory. J Mol Struct Theochem 766:25–33
2. Rotkin-Ellman M, Navarro KM, Solomon GM (2010) Gulf oil spill air quality monitoring: lessons learned to improve emergency response. Environ Sci Technol 44:8365–8366
3. UNEP (2001) Stockholm convention on persistent organic pollutants. United Nations Environment Programme, Geneva
4. Haranczyk M, Puzyn T, Ng EG (2010) On enumeration of congeners of common persistent organic pollutants. Environ Pollut 9:2786–2789
5. Schultz TW, Cronin MTD, Walker JD, Aptula AO (2003) Quantitative structure–activity relationships (QSARs) in toxicology: a historical perspective. J Mol Struct Theochem 622:1–22
6. Castro EA, Toropova AP, Toropov AA, Mukhamedjanova DV (2005) QSPR modeling of Gibbs free energy of organic compounds by weighting of nearest neighboring codes. Struct Chem 16:305–324
7. Golmohammadi H, Dashtbozorgi Z (2010) Quantitative structure–property relationship studies of gas-to-wet butyl acetate partition coefficient of some organic compounds using genetic algorithm and artificial neural network. Struct Chem 21:1241–1252
8. Öberg T, Liu T (2008) Global and local PLS regression models to predict vapor pressure. QSAR Comb Sci 27:273–279
9. Lei B, Ma Y, Li J, Liu X, Yo X, Gramatica P (2010) Prediction of the adsorption capability onto activated carbon of a large data set of chemicals by local lazy regression method. Atmos Environ 44:2954–2960
10. Hayward D (1998) Identification of bioaccumulating polychlorinated naphthalenes and their toxicological significance. Environ Res 76:1–18
11. Dearden JC, Cronin MTD, Kaiser KLE (2009) How not to develop a quantitative structure–activity or structure–property relationship (QSAR/QSPR). SAR QSAR Environ Res 20:241–266
12. Dunnivant FM, Elzerman AW (1988) Aqueous solubility and Henry's law constant for PCB congeners for evaluation of quantitative structure-property relationships (QSPRs). Chemosphere 17:525–541
13. Miller MM, Ghodbane S, Wasik SP, Tewari YB, Martire DE (1984) Aqueous solubilities, octanol/water partition coefficients, and entropies of melting of chlorinated benzenes and biphenyls. J Chem Eng Data 29:184–190
14. Govers HAJ, Krop HB (1998) Partition constants of some chlorinated dibenzofurans, and dibenzo-p-dioxins. Chemosphere 37:2139–2152
15. Ruelle P, Kesselring UW (1997) Aqueous solubility prediction of environmentally important chemicals from the mobile order thermodynamics. Chemosphere 34:275–298
16. Tittlemier SA, Halldorson T, Stern GA, Tomy GT (2002) Vapor pressures, aqueous solubilities, and Henry's law constants of some brominated flame retardants. Environ Toxicol Chem 21:1804–1810
17. Opperhuizen A, Velde EW, Gobas FAP, Liem DAK, Steen JMD (1985) Relationship between bioconcentration in fish and steric factors of hydrophobic chemicals. Chemosphere 14:1871–1896
18. Doucette WJ, Andren AW (1988) Aqueous solubility of selected biphenyl, furan, and dioxin congeners. Chemosphere 17:243–252
19. Hewitt M, Cronin MTD, Madden JC, Rowe PH, Johnson C, Obi A, Enoch SJ (2007) Consensus QSAR models: do the benefits outweigh the complexity? J Chem Inf Model 47:1460–1468
20. Haranczyk M, Puzyn T, Sadowski P (2008) ConGENER - a tool for modeling of the congeneric sets of environmental pollutants. QSAR Comb Sci 27:826–833
21. Haranczyk M, Gutowski M (2007) Quantum mechanical energy-based screening of combinatorially generated library of tautomers. TauTGen: a tautomer generator program. J Chem Inf Model 47:686–694
22. Stewart JJP (2007) Optimization of parameters for semiempirical methods V: modification of NDDO approximations and application to 70 elements. J Mol Model 13:1173–1213
23. Stewart JJP (2009) In: Chemistry SC (ed) MOPAC2009 http://openmopac.net/MOPAC2009.html. Accessed 14 April 2010
24. OECD (2004) OECD principles for the validation, for regulatory purposes, of (quantitative) structure–activity relationship models. In: 37th joint meeting of the chemicals committee and working party on chemicals, pesticides and biotechnology. Organisation for Economic Co-Operation and Development, Paris
25. Wold S, Sjöström M, Eriksson L (2001) PLS-regression: a basic tool of chemometrics. Chemometr Intell Lab Syst 58:109–130
26. Holland J (1992) Adaptation in natural and artificial systems. MIT Press, Michigan, MI
27. MATLAB (2008) MATLAB 7.6.0.324. Mathworks
28. PLS_Toolbox (2009) PLS_Toolbox 5.2. Eigenvector Research Inc., Wenatchee, WA
29. Tropsha A, Gramatica P, Gombar VK (2003) The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. QSAR Comb Sci 22:69–77
30. Gramatica P (2007) Principles of QSAR models validation: internal and external. QSAR Comb Sci 26:694–701
31. Atkinson AC (1985) Plots, transformations, and regression. An introduction to graphical methods of diagnostic regression analysis. Oxford Statistical Science Series, Oxford
32. Netzeva TI, Worth AP, Aldenberg T, Benigni R, Cronin MTD, Gramatica P, Jaworska J, Kahn S, Klopman G, Marchant CA, Myatt G, Nikolova-Jeliazkova N, Patlewicz GY, Perkins R, Roberts DW, Schultz TW, Stanton DT, van de Sandt JJM, Tong W, Veith G, Yang C (2005) Current status of methods for defining the applicability domain of (quantitative) structure–activity relationships. The report and recommendations of ECVAM workshop 52. Altern Lab Anim 33:155–173
33. Jaworska J, Nikolova-Jeliazkova N, Aldenberg T (2005) QSAR applicability domain estimation by projection of the training set in descriptor space: a review. Altern Lab Anim 33:445–459
34. OECD (2007) Guidance document on the validation of (quantitative) structure–activity relationships (QSAR) models. Organisation for Economic Co-Operation and Development, Paris
35. Puzyn T, Mostrag A, Falandysz J, Kholod Y, Leszczynski J (2009) Predicting water solubility of congeners: chloronaphthalenes - a case study. J Hazard Mater 170:1014–1022
36. Delaney JS (2005) Predicting aqueous solubility from structure. Drug Discov Today 10:289–295
37. Topliss JG, Costello RJ (1972) Chance correlations in structure–activity studies using multiple regression analysis. J Med Chem 15:1066–1068
38. Puzyn T, Mostrag-Szlichtyng A, Gajewicz A, Skrzynski M, Worth AP (2011) Investigating the influence of data splitting on the predictive ability of QSAR/QSPR models. Struct Chem
39. Puzyn T, Suzuki N, Haranczyk M, Rak J (2008) Calculation of quantum-mechanical descriptors for QSPR at the DFT level: is it necessary? J Chem Inf Model 48:1174–1180
40. Puzyn T, Suzuki N, Haranczyk M (2008) How do the partitioning properties of polyhalogenated POPs change when chlorine is replaced with bromine? Environ Sci Technol 42:5189–5195
41. Puzyn T, Mostrag A, Suzuki N, Falandysz J (2008) QSPR-based estimation of the atmospheric persistence for chloronaphthalene congeners. Atmos Environ 42:6627–6636
42. Gajewicz A, Haranczyk M, Puzyn T (2010) Predicting logarithmic values of the subcooled liquid vapor pressure of halogenated persistent organic pollutants with QSPR: how different are chlorinated and brominated congeners? Atmos Environ 44:1428–1436
43. Padmanabhan J, Parthasarathi R, Subramanian V, Chattaraj PK (2006) QSPR models for polychlorinated biphenyls: n-octanol/water partition coefficient. Bioorg Med Chem 14:1021–1028
44. Yang P, Chen J, Chen S, Yuan X, Scharmm KW, Kettrup A (2003) QSPR models for physicochemical properties of polychlorinated diphenyl ethers. Sci Total Environ 305:65–76

Struct Chem (2011) 22:873–884