Combining Prototype Selection with Local Boosting

Christos K. Aridas, Sotiris B. Kotsiantis, and Michael N. Vrahatis

Computational Intelligence Laboratory (CILab), Department of Mathematics, University of Patras, 26110 Patras, Greece
char@upatras.gr, {sotos,vrahatis}@math.upatras.gr

Abstract.
Real-life classification problems require an investigation of relationships between features in heterogeneous data sets, where different predictive models can be more appropriate for different regions of the data set. A solution to this problem is the application of the local boosting of weak classifiers ensemble method. A main drawback of this approach is the time that is required at the prediction of an unseen instance, as well as the decrease of the classification accuracy in the presence of noise in the local regions. In this research work, an improved version of the local boosting of weak classifiers, which incorporates prototype selection, is presented. Experimental results on several benchmark real-world data sets show that the proposed method significantly outperforms the local boosting of weak classifiers in terms of predictive accuracy and the time that is needed to build a local model and classify a test instance.
Keywords: Local boosting · Weak learning · Prototype selection · Pattern classification

1 Introduction

In machine learning, instance-based (or memory-based) learners classify an unseen object by comparing it to a database of pre-classified objects.
The fundamental assumption is that similar instances will share similar class labels. Machine learning models' assumptions do not necessarily hold globally. Local learning [1] methods come to solve this problem. The latter allow one to extend learning algorithms that are designed for simple models to the case of complex data, for which the models' assumptions are valid only locally. The most common case is the assumption of linear separability, which is usually not fulfilled globally in classification problems. Despite this, any supervised learning algorithm that is able to find only a linear separation can be used inside a local learning process, producing a model that is able to capture complex non-linear class boundaries.
A technique of boosting local weak classifiers, based on a training set that has been reduced through prototype selection [11], is proposed. Boosting algorithms are well known to be susceptible to noise [2].
In the case of local boosting, the algorithm should manage a reasonable amount of noise and be at least as good as boosting, if not better.
For the experiments, we used two variants of Decision Trees [21] as weak learning models: one-level Decision Trees, which are known as Decision Stumps [12], and two-level Decision Trees. An extensive comparison over several data sets was performed and the results show that the proposed method outperforms simple and local boosting in terms of classification accuracy.
In the next Section, specifically in Subsect. 2.1, the localized experts are discussed, while boosting approaches are described in Subsect. 2.2. In Sect. 3 the proposed method is presented. Furthermore, in Sect. 4 the results of the experiments on several UCI data sets, after being compared with standard boosting and local boosting, are portrayed and discussed. Finally, Sect. 5 concludes the paper and suggests further directions in current research.
2 Background Material

For completeness purposes, local weighted learning, prototype selection methods, as well as boosting classifier techniques are briefly described in the following subsections.

2.1 Local Weighted Learning and Prototype Selection

Supervised learning algorithms are considered global if they use all available training instances in order to build a single predictive model that will be applied to any unseen test instance.
On the other hand, a method is considered local if only the nearest training instances around the testing instance contribute to the class probabilities. When the size of the training data set is small compared to the complexity of the classifier, the predictive model frequently overfits the noise in the training data. Therefore, the successful control of the complexity of a classifier has a high impact on accomplishing good generalization. Several theoretical and experimental results [23] indicate that a local learning algorithm provides a reasonable solution to this problem. In local learning [1], each local model is built completely independently of all other models, in a way that the total number of local models in the learning method indirectly influences how complex a function can be estimated; complexity can only be controlled by the level of adaptability of each local model. This feature prevents overfitting if a strong learning pattern exists for training each local model.
Prototype selection is a technique that aims to decrease the training set size without sacrificing the prediction performance of a memory-based learner [18]. Besides this, by reducing the training set size it might decrease the computational cost that is incurred in the prediction phase.
Prototype selection techniques can be grouped into three categories: preservation techniques, which aim to find a consistent subset from the training data set, ignoring the presence of noise; noise removal techniques, which aim to remove noise; and hybrid techniques, which perform both objectives concurrently [22].
2.2 Boosting Classifiers

Experimental research works have shown that ensemble methods usually perform better, in terms of classification accuracy, than the individual base classifier [2], and lately, several theoretical explanations have been devised to explain the success of some commonly used ensemble methods [13]. In this work, a local boosting technique that is based on a reduced training set, after the usage of prototype selection [11], is proposed, and for this reason this section introduces the boosting approach.
Boosting constructs the ensemble of classifiers by subsequently tweaking the distribution of the training set based on the accuracy of the previously created classifiers. There are several boosting variants. These methods assign a weight to each training instance. Firstly, all instances are equally weighted. In each iteration a new classification model, named base classifier, is generated using the base learning algorithm. The creation of the base classifier has to consider the weight distribution. Then, the weight of each instance is adjusted, depending on the accuracy of the prediction of the base classifier for that instance. Thus, boosting attempts to construct new classification models that are able to better classify the "hard" instances for the previous ensemble members. The final classification is obtained from a weighted vote of the base classifiers.
AdaBoost [8] is the most well-known boosting method and the one that is used in the experimental analysis that is presented in Sect. 4. AdaBoost is able to use weights in two ways to generate a new training data set to provide to the base classifier. In boosting by sampling, the training instances are sampled with replacement with probability relative to their weights.
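To make the weight-update and sampling mechanics concrete, the following is a minimal Python sketch of a boosting-by-resampling loop. It follows the generic AdaBoost scheme described above rather than any specific implementation from the paper; the helper names (boost_by_resampling, weighted_vote) are ours.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_by_resampling(X, y, n_rounds=25, seed=0):
    """Generic AdaBoost-style loop using boosting by sampling."""
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start with uniform weights
    models, alphas = [], []
    for _ in range(n_rounds):
        # draw a training sample with replacement, proportional to the weights
        idx = rng.choice(n, size=n, replace=True, p=w)
        model = DecisionTreeClassifier(max_depth=1).fit(X[idx], y[idx])
        miss = model.predict(X) != y
        err = w @ miss                           # weighted error on the full set
        if err == 0 or err >= 0.5:               # weak-learning condition violated
            break
        alpha = 0.5 * np.log((1.0 - err) / err)  # confidence of this round
        w *= np.exp(np.where(miss, alpha, -alpha))  # up-weight the "hard" instances
        w /= w.sum()
        models.append(model)
        alphas.append(alpha)
    return models, alphas

def weighted_vote(models, alphas, X, classes):
    """Final decision: weighted vote of the base classifiers."""
    votes = sum(a * (m.predict(X)[:, None] == classes)
                for m, a in zip(models, alphas))
    return classes[votes.argmax(axis=1)]

Scikit-learn's AdaBoostClassifier, which is used in the experiments below, packages the same loop behind a fit/predict interface.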
In [26] the authors showed empirically that a local boosting-by-resampling technique is more robust to noise than the standard AdaBoost. The authors of [17] proposed a Boosted k-NN algorithm that creates an ensemble of models with locally modified distance weighting, which has increased generalization accuracy and never performs worse than standard k-NN. In [10] the authors presented a novel method for instance selection, based on boosting instance selection algorithms in the same way boosting is applied to classification.
3 The Proposed Algorithm

Two main disadvantages of simple local boosting are: (i) when the amount of noise is large, simple local boosting does not have the same performance [26] as Bagging [3] and Random Forest [4]; (ii) saving the data for each pattern increases the storage complexity. This might restrict the usage of this method to limited training sets [21]. The proposed algorithm incorporates prototype selection to handle, among others, the two previous problems. In the learning phase, a prototype selection [11] method based on the Edited Nearest Neighbor (ENN) [24] technique reduces the training set by removing the training instances that do not agree with the majority of their k nearest neighbors.
In the application phase, it constructs a model for each test instance to be estimated, considering only a subset of the training instances. This subset is selected according to the distance between the testing sample and the available training samples. For each testing instance, a boosting ensemble of a weak learner is built using only the training instances that lie close to the current testing instance. The prototype selection aims to improve the classification accuracy as well as the time that is needed to build a model for each test instance at prediction time.
The proposed ensemble method has some free parameters, such as the number of neighbors (k1) to be considered when the prototype selection is executed, the number of neighbors (k2) to be selected in order to build the local model, the distance metric and the weak learner. In the experiments, the well-known Euclidean distance was used as the distance metric.
In general, the distance between points x and y in a Euclidean space $\mathbb{R}^n$ is given by (1).

$$d(x, y) = \|x - y\|_2 = \sqrt{\sum_{i=1}^{n} |x_i - y_i|^2}. \qquad (1)$$

The most common value for the nearest neighbor rule is 5.
Thus, k1 was set to 5 and k2 to 50, since at about this number of instances it is appropriate for a simple algorithm to build a precise model [14].
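Before presenting the pseudocode, the neighborhood lookup that both phases rely on can be illustrated in a few lines of numpy; the function name k_nearest is illustrative only.

import numpy as np

def k_nearest(X_train, query, k):
    """Indices of the k training rows closest to `query` under Eq. (1)."""
    dists = np.sqrt(((X_train - query) ** 2).sum(axis=1))
    return np.argsort(dists)[:k]

# e.g. the local training set for one test point: X_train[k_nearest(X_train, x, 50)]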
The proposed method is presented in Algorithm 1.

Algorithm 1. PSLB(k1, k2, distanceMetric, weakLearner)

procedure Training(k1, distanceMetric)
    for each training instance do
        Find the k1 nearest neighbors using the selected distanceMetric
        if instance does not agree with the majority of the k1 then
            Remove this instance from the training set
        end if
    end for
end procedure

procedure Classification(k2, distanceMetric, weakLearner)
    for each testing instance do
        Find the k2 nearest neighbors using the selected distanceMetric
        Apply boosting to the base weakLearner using the k2 nearest neighbors
        The answer of the boosting ensemble is the prediction for the testing instance
    end for
end procedure
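A compact Python rendering of Algorithm 1 is sketched below. It is not the authors' released implementation (an initial version is linked in a footnote in Sect. 4); it assumes integer-encoded labels and uses scikit-learn's NearestNeighbors for the neighborhood queries, a hand-rolled ENN filter for the training phase, and AdaBoostClassifier over a decision stump for the local ensembles.

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

class PSLB(BaseEstimator, ClassifierMixin):
    """Sketch of Prototype Selection with Local Boosting (Algorithm 1)."""

    def __init__(self, k1=5, k2=50, n_estimators=25):
        self.k1 = k1
        self.k2 = k2
        self.n_estimators = n_estimators

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        # Training phase (ENN): drop every instance whose class disagrees
        # with the majority of its k1 nearest neighbors.
        idx = (NearestNeighbors(n_neighbors=self.k1 + 1).fit(X)
               .kneighbors(X, return_distance=False)[:, 1:])  # skip the point itself
        keep = np.array([np.bincount(y[nb]).argmax() == c
                         for nb, c in zip(idx, y)])
        self.X_, self.y_ = X[keep], y[keep]
        self.nn_ = NearestNeighbors(n_neighbors=self.k2).fit(self.X_)
        return self

    def predict(self, X):
        # Classification phase: boost a weak learner on each local region.
        X = np.asarray(X)
        regions = self.nn_.kneighbors(X, return_distance=False)
        out = []
        for x, region in zip(X, regions):
            ens = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                     n_estimators=self.n_estimators)
            ens.fit(self.X_[region], self.y_[region])
            out.append(ens.predict(x.reshape(1, -1))[0])
        return np.array(out)

The per-query ensemble is what makes plain local boosting slow at prediction time; the ENN step shrinks the training set once, up front, so each neighborhood query scans fewer instances.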
4 Numerical Experiments

In order to evaluate the performance of the proposed method, an initial version was implemented¹ and a number of experiments were conducted using several data sets from different domains. From the UCI repository [16] several data sets were chosen. Discrete features were transformed to numeric ones using a simple quantization. Each feature was scaled to have zero mean and standard deviation one. Also, all missing values were treated as zero.
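A plausible numpy/scikit-learn rendering of this preprocessing, assuming the "simple quantization" means mapping each distinct discrete value to an integer code (the function name is illustrative):

import numpy as np
from sklearn.preprocessing import StandardScaler

def preprocess(X, discrete_cols=()):
    """Quantize discrete columns, zero-fill missing values, standardize."""
    X = np.array(X, dtype=object)
    for j in discrete_cols:
        # simple quantization: each distinct value becomes an integer code
        X[:, j] = np.unique(X[:, j], return_inverse=True)[1]
    X = X.astype(float)
    X = np.nan_to_num(X)                      # missing values -> 0
    return StandardScaler().fit_transform(X)  # zero mean, unit variance

For brevity the scaler is fit on the full matrix here; in a cross-validation setting it would be fit on each training fold only.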
In Table 1 the name, the number of patterns, the number of attributes, as well as the number of different classes for each data set are shown.

Table 1. Benchmark data sets used in the experiments

Dataset            #patterns  #attributes  #classes
cardiotocography        2126           21        10
cylinder-bands           512           25         2
dermatology              366           24         6
ecoli                    336            7         8
energy-y1                768            8         3
glass                    214            9         6
low-res-spect            531          100         9
magic                  19020           10         2
musk-1                   476          166         2
ozone                   2536           72         2
page-blocks             5473           10         5
pima                     768            8         2
synthetic-control        600           60         6
tic-tac-toe              958            9         2

All experiments were run on an Intel Core i3-3217U machine at 1.8 GHz, with 8 GB of RAM, running Linux Mint 17.3 64-bit, using Python and the scikit-learn [19] library.
For the experiments, we used two variants of Decision Trees [25] as weak learners: one-level Decision Trees [12], also known as Decision Stumps, and two-level Decision Trees [20]. We used the Gini impurity [5] as the criterion to measure the quality of the splits in both algorithms. The boosting process for all classifiers was performed using the AdaBoost algorithm with 25 iterations in each model.
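In scikit-learn terms, the weak learners and the boosting wrapper described above correspond to configurations like the following (a sketch; the variable names are ours):

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# one-level tree (Decision Stump) and two-level tree, both split on Gini impurity
stump = DecisionTreeClassifier(max_depth=1, criterion="gini")
two_level_tree = DecisionTreeClassifier(max_depth=2, criterion="gini")

# AdaBoost with 25 iterations around each weak learner, as in the experiments
bds = AdaBoostClassifier(stump, n_estimators=25)
bdt = AdaBoostClassifier(two_level_tree, n_estimators=25)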
In order to calculate the classifiers' accuracy, the whole data set was divided into five mutually exclusive folds and for each fold the classifier was trained on the union of all of the other folds. Then, cross-validation was run five times for each algorithm and the mean value of the five folds was calculated.
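This 5×5-fold protocol can be written down directly with scikit-learn's model-selection helpers; the sketch below assumes the PSLB class from Sect. 3 and preprocessed arrays X, y:

from sklearn.model_selection import RepeatedKFold, cross_val_score

# five mutually exclusive folds, with the whole procedure repeated five times
cv = RepeatedKFold(n_splits=5, n_repeats=5, random_state=0)
scores = cross_val_score(PSLB(k1=5, k2=50), X, y, cv=cv, scoring="accuracy")
print(scores.mean(), scores.std())  # mean accuracy, as reported in Tables 3 and 7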
¹ https://bitbucket.org/chkoar/pslb
4.1 Prototype Selection

The prototype selection process is independent of the base classifier and it takes place once, in the training phase of the proposed algorithm. It depends only on the k1 parameter, the number of neighbors to be considered when the prototype selection is executed. In Table 2 the average number of training patterns, the average number of removed patterns, as well as the average reduction for each data set are presented. The average refers to the average over all training folds during the 5-fold cross-validation.
Table 2. Average reduction

Dataset            #avg training patterns  #avg removed patterns  %avg reduction
cardiotocography                     1701                    268           15.73
cylinder-bands                        410                     60           14.60
dermatology                           293                      7            2.53
ecoli                                 269                     29           10.86
energy-y1                             614                     17            2.77
glass                                 171                     40           23.13
low-res-spect                         425                     47           11.11
magic                               15216                   1798           11.81
musk-1                                381                     24            6.25
ozone                                2029                     50            2.48
page-blocks                          4378                    111            2.53
pima                                  614                    109           17.77
synthetic-control                     480                      8            1.67
tic-tac-toe                           766                      1            0.13
4.2 Using Decision Stump as Base Classifier

In the first part of the experiments, Decision Stumps [12] were used as weak learning classifiers. Decision Stumps (DS) are one-level Decision Trees that classify instances based on the value of just a single input attribute. Each node in a decision stump represents a feature in an instance to be classified and each branch represents a value that the node can take. Instances are classified starting at the root node and are sorted based on their attribute values. In the worst case, a Decision Stump will behave as a baseline classifier, and it will possibly perform better if the selected attribute is particularly informative.
The proposed method, denoted as PSLBDS, is compared with Boosting Decision Stumps, denoted as BDS, and the Local Boosting of Decision Stumps, denoted as LBDS. Since the proposed method uses fifty neighbors, a 50-Nearest Neighbors (50NN) classifier has been included in the comparisons. In Table 3 the average accuracy of the compared methods is presented. Table 3 indicates that the hypotheses generated by PSLBDS are apparently better, since the PSLBDS algorithm has the best mean accuracy score in nearly all cases.
Table 3. Average accuracy of the compared algorithms using a one-level decision tree as base classifier

Dataset            PSLBDS         BDS            LBDS           50NN
cardiotocography   0.682±0.028    0.548±0.088    0.659±0.015    0.607±0.029
cylinder-bands     0.613±0.030    0.560±0.037    0.584±0.014    0.582±0.017
dermatology        0.942±0.027    0.641±0.142    0.940±0.022    0.902±0.020
ecoli              0.821±0.029    0.622±0.129    0.794±0.026    0.780±0.050
energy-y1          0.844±0.090    0.706±0.050    0.836±0.092    0.822±0.091
glass              0.582±0.085    0.285±0.094    0.568±0.065    0.446±0.169
low-res-spect      0.850±0.025    0.584±0.069    0.846±0.012    0.851±0.023
magic              0.849±0.005    0.828±0.005    0.834±0.005    0.828±0.004
musk-1             0.727±0.096    0.727±0.052    0.718±0.085    0.618±0.096
ozone              0.966±0.008    0.960±0.010    0.887±0.133    0.971±0.001
page-blocks        0.954±0.012    0.853±0.163    0.950±0.013    0.942±0.007
pima               0.757±0.028    0.755±0.024    0.685±0.024    0.749±0.017
synthetic-control  0.947±0.011    0.472±0.074    0.943±0.020    0.887±0.030
tic-tac-toe        0.884±0.084    0.733±0.034    0.882±0.083    0.747±0.101
Demšar [6] suggests that non-parametric tests should be preferred over parametric ones in the context of machine learning problems, since they do not assume normal distributions or homogeneity of variance. Therefore, in the direction of validating the significance of the results, the Friedman test [9], which is a rank-based non-parametric test for comparing several machine learning algorithms on multiple data sets, was used, having the PSLBDS algorithm as the control method. The null hypothesis of the test states that all the methods perform equivalently and thus their ranks should be equivalent. The average rankings, according to the Friedman test, are presented in Table 4. Assuming a significance level of 0.05, the p-value of the Friedman test in Table 4 indicates that the null hypothesis has to be rejected. So, there is at least one method that performs statistically differently from the proposed method.
With the intention of investigating the aforementioned, Finner's [7] and Li's [15] post hoc procedures were used. In Table 5 the p-values obtained by applying the post hoc procedures over the results of the Friedman statistical test are presented. Finner's and Li's procedures reject those hypotheses that have a p-value ≤ 0.05. That said, the adjusted p-values obtained through the application of the post hoc procedures are presented in Table 6. Hence, both post hoc procedures agree that the PSLBDS algorithm performs significantly better than BDS and LBDS, as well as the 50NN rule.
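The Friedman statistic and the average ranks can be reproduced from the accuracy matrix with SciPy; a sketch follows (only the first rows of Table 3 are shown, so the printed values will not match Table 4 exactly):

import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# rows = data sets, columns = (PSLBDS, BDS, LBDS, 50NN); values from Table 3
acc = np.array([
    [0.682, 0.548, 0.659, 0.607],   # cardiotocography
    [0.613, 0.560, 0.584, 0.582],   # cylinder-bands
    [0.942, 0.641, 0.940, 0.902],   # dermatology
    [0.821, 0.622, 0.794, 0.780],   # ecoli
    # remaining Table 3 rows omitted here for brevity
])

stat, p = friedmanchisquare(*acc.T)                           # one sample per algorithm
avg_ranks = np.mean([rankdata(-row) for row in acc], axis=0)  # rank 1 = best accuracy
print(stat, p, avg_ranks)

The Finner and Li adjustments used in Tables 5, 6, 9 and 10 are step-down corrections of the pairwise p-values; SciPy does not ship them, so they would have to be coded separately.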
4.3 Using Two-Level Decision Tree as a Base Classifier

Afterwards, two-level Decision Trees were used as weak learning base classifiers. A two-level Decision Tree is a tree with max depth = 2. The proposed method, denoted as PSLBDT, is compared to the Boosting Decision Tree, denoted as BDT, and the Local Boosting of Decision Trees, denoted as LBDT. Since the proposed method uses fifty neighbors, a 50-Nearest Neighbors (50NN) classifier has been included in the comparisons.
In Table 7 the average accuracy of the compared methods is presented. Table 7 indicates that the hypotheses generated by PSLBDT are apparently better, since the PSLBDT algorithm has the best mean accuracy score in most cases. The average rankings, according to the Friedman test, are presented in Table 8. The proposed algorithm was ranked in the first place again. Assuming a significance level of 0.05, the p-value of the Friedman test in Table 8 indicates that the null hypothesis has to be rejected. So, there is at least one method that performs statistically differently from the proposed method. Aiming to investigate the aforesaid, Finner's and Li's post hoc procedures were used again. In Table 9 the p-values obtained by applying the post hoc procedures over the results of Friedman's statistical test are presented. Finner's and Li's procedures reject those hypotheses that have a p-value ≤ 0.05. That said, the adjusted p-values obtained through the application of the post hoc procedures are presented in Table 10. Both post hoc procedures agree that the PSLBDT algorithm performs significantly better than BDT and the 50NN rule, but not significantly better than LBDT, as far as the tested data sets are concerned.
4.4 Time Analysis

One of the two contributions of this study was to improve the classification time over the local boosting approach. In order to prove this, the total time that is required to predict all instances in the test folds was recorded.
Table 4. Average rankings of the Friedman test (DS)

Algorithm  Ranking
PSLBDS     1.1429
LBDS       2.4286
50NN       2.8571
BDS        3.5714

Statistic: 26.228571
p-value: 0.000009
Table 5. Post hoc comparison for the Friedman test (DS)

i  Algorithm  z = (R0 − Ri)/SE  p         Finner    Li
3  BDS        4.97709           0.000001  0.016952  0.052189
2  50NN       3.51324           0.000443  0.033617  0.052189
1  LBDS       2.63493           0.008415  0.05      0.05

Table 6. Adjusted p-values (DS)

i  Algorithm  p (unadjusted)  p (Finner)  p (Li)
3  BDS        0.000001        0.000002    0.000001
2  50NN       0.000443        0.000664    0.000446
1  LBDS       0.008415        0.008415    0.008415
008415Table7.
Averageaccuracyofthecomparedalgorithmsusingatwo-leveldecisiontreeasbaseclassierDatasetPSLBDTBDTLBDT50NNcardiotocography0.
683±0.
0200.
584±0.
0720.
686±0.
0170.
607±0.
029cylinder-bands0.
609±0.
0490.
608±0.
0340.
564±0.
0250.
582±0.
017dermatology0.
958±0.
0400.
800±0.
0410.
951±0.
0200.
902±0.
020ecoli0.
813±0.
0320.
753±0.
0360.
800±0.
0300.
780±0.
050energy-y10.
845±0.
0600.
830±0.
0640.
844±0.
0710.
822±0.
091glass0.
608±0.
0480.
569±0.
1120.
652±0.
0570.
446±0.
169low-res-spect0.
877±0.
0420.
573±0.
1360.
872±0.
0230.
851±0.
023magic0.
849±0.
0060.
856±0.
0070.
841±0.
0060.
828±0.
004musk-10.
738±0.
0720.
752±0.
0240.
746±0.
0740.
618±0.
096ozone0.
967±0.
0080.
925±0.
0640.
888±0.
1320.
971±0.
001page-blocks0.
960±0.
0100.
924±0.
0230.
956±0.
0100.
942±0.
007pima0.
763±0.
0230.
742±0.
0140.
730±0.
0180.
749±0.
017synthetic-control0.
950±0.
0110.
830±0.
0360.
953±0.
0160.
887±0.
030tic-tac-toe0.
893±0.
0780.
665±0.
1260.
889±0.
0810.
747±0.
Specifically, the prediction of each test fold was executed three times and the minimum time was recorded for each fold. Then, the average over all folds was calculated. In Table 11 the average prediction time in seconds of LBDS, PSLBDS, LBDT and PSLBDT is presented. In the case of one-level decision trees (LBDS, PSLBDS) the proposed method reduced the expected prediction time by more than 15% in 6 of 14 cases, while in the case of two-level decision trees (LBDT, PSLBDT) the proposed method reduced the expected prediction time by more than 15% in 7 of 14 cases.
In Fig. 1 the absolute percentage changes are presented.
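The percentage changes plotted in Fig. 1 are straightforward to recompute from Table 11; a small sketch over the one-level-tree columns (only the first rows are shown):

import numpy as np

# prediction times from Table 11: LBDS vs. PSLBDS, first rows only
lbds   = np.array([33.89, 8.16, 3.56, 5.00])   # remaining rows omitted
pslbds = np.array([33.26, 8.07, 3.52, 3.61])

change = 100.0 * (pslbds - lbds) / lbds        # negative = PSLB is faster
print(np.round(change, 2))                     # e.g. ecoli: about -27.8 %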
Table 8. Average rankings of the Friedman test (two-level tree)

Algorithm  Ranking
PSLBDT     1.5
LBDT       2.2857
50NN       3.0714
BDT        3.1429

Statistic: 15
p-value: 0.001817
Table 9. Post hoc comparison for the Friedman test (two-level tree)

i  Algorithm  z = (R0 − Ri)/SE  p         Finner    Li
3  BDT        3.366855          0.00076   0.016952  0.046982
2  50NN       3.22047           0.00128   0.033617  0.046982
1  LBDT       1.610235          0.107347  0.05      0.05
Table 10. Adjusted p-values (two-level tree)

i  Algorithm  p (unadjusted)  p (Finner)  p (Li)
3  BDT        0.00076         0.002279    0.000851
2  50NN       0.00128         0.002279    0.001432
1  LBDT       0.107347        0.107347    0.107347
Table 11. Average prediction times, in seconds

Dataset             LBDS    PSLBDS    LBDT    PSLBDT
cardiotocography    33.89    33.26    32.43    29.20
cylinder-bands       8.16     8.07     8.45     7.86
dermatology          3.56     3.52     3.28     3.20
ecoli                5.00     3.61     4.66     2.92
energy-y1            8.58     7.19     7.59     6.25
glass                3.39     3.37     3.46     3.16
low-res-spect        6.74     6.38     5.77     3.77
magic              257.14   160.31   213.59   107.98
musk-1               9.53     9.50     8.80     7.99
ozone               14.84     4.89     7.24     1.69
page-blocks         17.27     9.34    12.28     4.27
pima                11.72     8.90    11.07     7.56
synthetic-control    6.32     6.12     3.89     3.76
tic-tac-toe         13.56    13.56    12.18    12.00
Fig. 1. Percentage change of prediction time between Local Boosting and the proposed method

5 Synopsis and Future Work

Local memory-based techniques delay the processing of the training set until they receive a request for an action like classification or local modelling.
A data set of observed training examples is always retained and the estimate for a new test instance is obtained from an interpolation based on a neighborhood of the query instance. In this research work, a local boosting after prototype selection method is presented. Experiments on several data sets show that the proposed method significantly outperforms the boosting and local boosting methods, in terms of classification accuracy and the time that is required to build a local model and classify a test instance. Typically, boosting algorithms are well known to be susceptible to noise [2]. In the case of local boosting, the algorithm should handle a reasonable amount of noise and be at least as good as boosting, if not better. By means of the promising results obtained from the performed experiments, one can assume that the proposed method can be successfully applied to real-world classification tasks with more accuracy than the compared machine learning approaches. In a following work, the proposed method will be investigated as far as regression problems are concerned, as well as the problem of reducing the size of the stored set of instances by also applying feature selection instead of simple prototype selection.
References

1. Atkeson, C.G., Schaal, S., Moore, A.W.: Locally weighted learning. Artif. Intell. Rev. 11(1), 11–73 (1997)
2. Bauer, E., Kohavi, R.: An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Mach. Learn. 36(1), 105–139 (1999)
3. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
4. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
5. Breiman, L., Friedman, J., Stone, C., Olshen, R.: Classification and Regression Trees. Chapman & Hall, New York (1993)
6. Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
7. Finner, H.: On a monotonicity problem in step-down multiple test procedures. J. Am. Stat. Assoc. 88(423), 920–923 (1993)
8. Freund, Y., Schapire, R.E., et al.: Experiments with a new boosting algorithm. In: ICML, vol. 96, pp. 148–156 (1996)
9. Friedman, M.: The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. 32(200), 675 (1937)
10. García-Pedrajas, N., de Haro-García, A.: Boosting instance selection algorithms. Knowl. Based Syst. 67, 342–360 (2014)
11. García, S., Derrac, J., Cano, J.R., Herrera, F.: Prototype selection for nearest neighbor classification: taxonomy and empirical study. IEEE Trans. Pattern Anal. Mach. Intell. 34(3), 417–435 (2012)
12. Iba, W., Langley, P.: Induction of one-level decision trees. In: Proceedings of the Ninth International Workshop on Machine Learning, ML 1992, pp. 233–240. Morgan Kaufmann Publishers Inc., San Francisco (1992)
13. Kleinberg, E.M.: A mathematically rigorous foundation for supervised learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 67–76. Springer, Heidelberg (2000)
14. Kotsiantis, S.B., Kanellopoulos, D., Pintelas, P.E.: Local boosting of decision stumps for regression and classification problems. J. Comput. 1(4), 30–37 (2006)
15. Li, J.: A two-step rejection procedure for testing multiple hypotheses. J. Stat. Plann. Infer. 138(6), 1521–1527 (2008)
16. Lichman, M.: UCI Machine Learning Repository (2013)
17. Neo, T.K.C., Ventura, D.: A direct boosting algorithm for the k-nearest neighbor classifier via local warping of the distance metric. Pattern Recogn. Lett. 33(1), 92–102 (2012)
18. Olvera-López, J.A., Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Kittler, J.: A review of instance selection methods. Artif. Intell. Rev. 34(2), 133–144 (2010)
19. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
20. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986)
21. Rokach, L.: Ensemble-based classifiers. Artif. Intell. Rev. 33(1–2), 1–39 (2010)
22. Segata, N., Blanzieri, E., Delany, S.J., Cunningham, P.: Noise reduction for instance-based learning with a local maximal margin approach. J. Intell. Inf. Syst. 35(2), 301–331 (2010)
23. Vapnik, V.N.: Statistical Learning Theory: Adaptive and Learning Systems for Signal Processing, Communications, and Control. Wiley, New York (1998)
24. Wilson, D.L.: Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. Syst. Man Cybern. 2(3), 408–421 (1972)
25. Witten, I.H., Frank, E., Hall, M.A.: Data Mining: Practical Machine Learning Tools and Techniques, 3rd edn. Morgan Kaufmann Series in Data Management Systems. Morgan Kaufmann, Burlington (2011)
26. Zhang, C.X., Zhang, J.S.: A local boosting algorithm for solving classification problems. Comput. Stat. Data Anal. 52(4), 1928–1941 (2008)
