legraph

graphsearch  时间:2021-02-11  阅读:()
StructuredDuplicateDetectioninExternal-MemoryGraphSearchRongZhouandEricA.
HansenDepartmentofComputerScienceandEngineeringMississippiStateUniversityMississippiState,MS39762{rzhou,hansen}@cse.
msstate.
eduAbstractWeconsiderhowtouseexternalmemory,suchasdiskstor-age,toimprovethescalabilityofheuristicsearchinstate-spacegraphs.
TolimitthenumberofslowdiskI/Ooper-ations,wedevelopanewapproachtoduplicatedetectioningraphsearchthatlocalizesmemoryreferencesbypartitioningthesearchgraphbasedonanabstractionofthestatespace,andexpandingthefrontiernodesofthegraphinanorderthatrespectsthispartition.
Wedemonstratetheeffectivenessofthisapproachbothanalyticallyandempirically.
IntroductionHeuristicsearchinstate-spacegraphsisawidely-usedframeworkforproblemsolvinginAI,butitsscalabilityislimitedbymemoryrequirements.
Thishasledtoexten-siveresearchonhowtouseavailablememoryefcientlyinsearchinglargestate-spacegraphs.
Theconceptof"availablememory"isambiguous,sincememoryinmostcomputersystemshasahierarchicalstruc-tureinwhichfast,random-accessinternalmemoryiscom-plementedbylow-speedexternalmemory,suchasdiskstor-age.
Althoughexternalmemoryisvastlylarger(andmuchcheaper)thaninternalmemory,mostheuristicsearchal-gorithmsaredesignedtouseinternalmemoryonly,andmostresearchonmemory-efcientheuristicsearchassumesamemorymodelinwhichaccesstoallstoreddataitemsisequallyfast.
Algorithmsthatassumethismemorymodelperformextremelypoorlyifexternalmemoryisused,sincerandomaccessofexternalmemoryrequirestime-consumingI/Ooperationsthatareseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory.
Somerecentworkbeginstoaddressthisissue.
EdelkampandSchr¨odl(2000)consideruseofvirtualmemoryinA*graphsearch,andproposetechniquesthatchangethebest-rstorderofnodeexpansionsinawaythatimprovesrefer-encelocalityandreducesthenumberofpagefaultsinvirtualmemory.
AmoredirectapproachtolimitingslowdiskI/Oistodesignasearchalgorithmthatexplicitlymanagesaccesstoexternalmemory,sincethisapproachcanexploitknowl-edgeofhowthegraphisstructured,andisnotlimitedbythesizeofvirtualmemory,whichisusuallymuchsmallerthanexternalmemory.
InthetheoreticalcomputerscienceCopyrightc2004,AmericanAssociationforArticialIntelli-gence(www.
aaai.
org).
Allrightsreserved.
community,therehasbeenextensiveresearchonexternal-memoryalgorithms,includingdevelopmentofexternal-memorybreadth-rstsearchalgorithmsforexplicitlyrep-resentedgraphs(Munagala&Ranade1999;Mehlhorn&Meyer2002).
IntheAIcommunity,Korf(2003a;2003b)recentlydescribedanexternal-memorybreadth-rstsearchalgorithmforimplicitlyrepresentedgraphs.
Akeyissueindesigninganefcientexternal-memorygraphsearchalgorithmisduplicatedetection.
Ingraphsearch,thesamenodecanbereachedalongdifferentpaths,andpreventingredundantsearcheffortrequiresstoringalready-visitednodes,sothatthesearchalgorithmcanrec-ognizewhenitencountersanodethathasbeenalreadyexplored.
Becauseduplicatedetectionpotentiallyrequirescomparingeachnewlygeneratednodetoallstorednodes,itcanleadtocripplingdiskI/Oifalready-explorednodesarestoredondisk.
Allpreviousexternal-memorybreadth-rstsearchalgorithmslimitdiskI/Obyusingatechniquecalleddelayedduplicatedetectioninwhichanentiresetofnodesisexpandedbeforeperforminganyduplicatedetection.
Inthispaper,weintroduceanewapproach.
Insteadofdelayingduplicatedetection,welimitdiskI/Obylocalizingmem-oryreferencesbasedonthestructureofthesearchgraphasrevealedbyahigh-levelabstractionofthestatespace.
Com-putationalresultsonchallenginggraph-searchproblemsin-dicatethatthisapproach,whichwecallstructuredduplicatedetection,signicantlylimitsdiskI/O.
Forgraphswithsuf-cientlocalstructure,itallowsgraph-searchalgorithmstouseexternalmemoryalmostasefcientlyasinternalmemory.
BackgroundDevelopmentofexternal-memoryalgorithmsforvariouscomputationalproblemsisanactiveareaofresearch.
Be-causediskI/Oisseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory,theconceptofI/Ocom-plexityhasbeenintroducedtoanalyzetheperformanceofexternal-memoryalgorithms(Aggarwal&Vitter1988).
Itassumesatwo-levelmodelofmemoryinwhichinternalmemoryofsizeMissupplementedbyexternalmemory,andanI/OoperationmovesdatainblocksofsizeBbetweenin-ternalmemoryandexternalmemory,where1searchhasbeenconsid-eredbyseveralresearchers.
Inthetheoreticalcomputersci-SEARCH683encecommunity,researchershavefocusedonestablishingboundsonitsI/Ocomplexity.
Agraphisassumedtobeexplicitlyrepresentedusingadjacencylistsandstoredinex-ternalmemoryduetoitslargesize.
Insearchingexplicitgraphs,worst-caseI/Ocomplexitydependsonthemethodofgeneratingsuccessornodesaswellasthemethodofdu-plicatedetection,sinceadjacencylistsmustbereadintointernalmemorytodeterminethesuccessorsofanode.
MunagalaandRanade(1999)describeanexternal-memorybreadth-rstsearchalgorithmsthatadoptsanapproachtosuccessorgenerationthathaslinearcomplexityinthenum-berofedgesinthegraph.
MehlhornandMeyer(2002)achievesublinearcomplexityforsuccessorgenerationbyamoresophisticatedapproachthatinvolvespartitioningtheadjacencylistbasedonadecompositionofthegraphintoconnectedsubgraphs.
Bothalgorithmsusethesamemethodofduplicatedetection,calleddelayedduplicatedetection,inwhichasetofnodesisexpandedwithoutperformingdupli-catedetection,andthemulti-setofgeneratedsuccessorsissortedandcheckedforduplicatesinasingle,moreefcientoperation.
Thealgorithmsalternatebetweenthesetwosteps.
1.
successorgeneration,inwhichthealgorithmgeneratessuccessorsforasetofnodesonthesearchfrontierandappendsthesesuccessorstoale(orles)intheorderinwhichtheyaregenerated,withoutperforminganydupli-catedetection,and2.
delayedduplicatedetection,inwhichthele(s)ofsucces-sornodesaresorted(usuallybyanexternal-memorysortalgorithm)basedontheirindicesorstateencodings,fol-lowedbyascanandcompactionstageinwhichduplicatenodesinthesortedle(s)areeliminated.
IntheAIcommunity,Korf(2003a;2003b)recentlyde-scribedanexternal-memorybreadth-rstsearchalgorithmthatusesthesamemethodofdelayedduplicatedetection.
Hisalgorithmdiffersintwoways.
First,heconsiderssearchinimplicitgraphs,inkeepingwiththestandardAIapproachtostate-spacesearch.
Animplicitgraphisacompactrep-resentationofagraphintheformofastartnode,anodeexpansionfunctionthatgeneratestheimmediatesuccessorsofanode,andapredicatethattestswhetheranodeisagoalnode.
Insearchingimplicitgraphs,successorgener-ationdoesnotrequirediskI/O,andduplicatedetectionistheonlysourceofI/Ocomplexity.
AseconddifferenceisthatKorfbuildshisexternal-memorybreadth-rstsearchalgorithmontopoffrontiersearch,amemory-efcientsearchalgorithmthatdoesnotneedtostoreaClosedlist(Korf&Zhang2000).
StructuredduplicatedetectionWeproposeanalternativeapproachtoduplicatedetectioninexternal-memorygraphsearchthathassomesignicantadvantagesoverdelayedduplicatedetection.
Itcanalsobecombinedwithdelayedduplicatedetectioninsomecases.
Thecentralideaofourapproachistoexploitthestruc-tureofastate-spacegraphinordertolocalizememoryref-erencesduringduplicatedetection,andsowecalltheap-proachstructuredduplicatedetection.
Asanexample,con-siderthe(n21)sliding-tilepuzzle.
Whenanewnodeisgenerated,itcanonlydifferfromitsparentnodewithre-specttothepositionofthe"blank"byeitheroneroworonecolumn.
Therefore,incheckingforduplicates,itisnotnec-essarytocheckstorednodesforwhichthepositionofthe"blank"differsfromthatoftheparentnodebymorethanoneroworonecolumn.
Ifwepartitionthesetofstorednodesinthisway,wecansignicantlylimitthenumberofstorednodesthatneedtobecheckedtoguaranteethatallduplicatesarefound.
Wecontinuetousethismotivatingex-ampleinthefollowing,tohelpexplaintheidea.
Projectionfunctionandabstractstate-spacegraphOurapproachisbasedonusingastate-spaceprojectionfunctiontodecomposeastate-spacegraph,andcreateanab-stractstate-spacegraphthatrevealsthelocalstructureoftheoriginalgraphatahigh-level.
Stateabstractioninheuris-ticsearchiswellstudied,andisusedtocreateadmissibleheuristicsandtoorganizehierarchicalsearch(Holteetal.
1996).
Weuseanapproachtostateabstractionthatisverysimilartopreviousapproaches,buthasadifferentpurpose:localizingmemoryreferencesinduplicatedetection.
Astate-spaceprojectionfunctionisamany-to-onemap-pingfromtheoriginalstatespacetoanabstractstatespace,inwhicheachabstractstatecorrespondstoasetofstatesintheoriginalstatespace.
Ifastatexismappedtoanabstractstatey,thenyiscalledtheimageofx,andxiscalledthepre-imageofy.
Therearemanywaystodeneastate-spaceprojectionfunction.
Acommonapproachistoignoresomestatevariablesintheencodingoftheproblem.
Forexam-ple,asimplestate-spaceprojectionfunctionforthe(n21)sliding-tilepuzzlecanbedenedbyignoringthepositionsofalltilesandconsideringonlythepositionofthe"blank.
"Inthiscase,anabstractstatecorrespondstoallstateswiththesamepositionofthe"blank,"andtherearen2abstractstatescomparedton2!
/2statesintheoriginalstatespace.
State-spaceprojectionfunctionscanbedenedinasimilarwayforothersearchproblems(Klein&Manning2003).
Givenastate-spacegraphandstate-spaceprojectionfunc-tion,anabstractstate-spacegraphisconstructedasfollows.
1.
Thesetofnodes,calledabstractnodes,intheabstractstate-spacegraphcorrespondstothesetofabstractstates.
2.
Anabstractnodeyisasuccessorofanabstractnodeyiffthereexisttwostatesxandx,suchthata.
xisasuccessorofx,andb.
xandxarepreimagesofyandy,respectively.
Ify=y,itmeansthattheabstractnodeyhasaselfloop.
Forexample,Figure1(a)showstheninepossiblepositionsofthe"blank"intheEight-puzzle.
Figure1(b)showstheabstractstate-spacegraphcreatedbythesimplestate-spaceprojectionfunctionthatmapsastateintoanabstractstatebasedonlyonthepositionofthe"blank.
"EachabstractnodeBiinFigure1(b)correspondstothesetofstateswiththe"blank"locatedatpositioniinFigure1(a).
Duplicate-detectionscopeAlthoughtransformingastate-spacegraphintoanabstractstate-spacegraphcanresultinexponentialreductioninthe684SEARCHFigure1:Panel(a)showsallpossiblepositionsofthe"blank"fortheEight-puzzle.
Panel(b)showsanexampleofanabstractstate-spacegraphfortheEight-puzzle.
Thisdenitionofthestate-spaceprojectionfunctionisbasedonthepositionofthe"blank"only.
sizeofthegraph,thelocalstructureoftheoriginalstate-spacegraphcanbelargelypreserved.
Forexample,abstractnodeB0inFigure1(b)hasonlytwosuccessors;abstractnodesB1andB3.
Thiscapturesthefactthatasinglemoveofatilecanonlychangethepositionofthe"blank"byeitheronecolumn(B0→B1inthiscase)oronerow(B0→B3inthiscase).
Denition1Anabstractstate-spacegraphhaslocalstruc-tureifithasboundedout-degree.
Forthe(n21)sliding-tilepuzzleusingthissimplestate-spaceprojectionfunction,theout-degreeoftheabstractstate-spacegraphisboundedby4,foreveryn.
Thelocalstructureofanabstractstate-spacegraphcanbeusedtorestrictthescopeofduplicatedetectionsothatonlyafractionofstorednodesneedstobecheckedforduplicates,whilestillguaranteeingthatallduplicatesarefound.
Forexample,whengeneratingsuccessorsfornodesthatarepre-imagesofabstractnodeB0inFigure2,thealgorithmonlyneedstocheckforduplicatesagainststorednodesthatarepre-imagesofabstractnodeB1orB3.
Letanabstractnodey=p(x)betheimageofanodexunderastate-spaceprojectionfunctionp(·)andletsuccessors(y)bethesetofsuccessorabstractnodesofyintheabstractstate-spacegraph.
Denition2Theduplicate-detectionscopeofanodexun-derastate-spaceprojectionfunctionp(·)correspondstotheunionofsetsofstorednodesthatarepre-imagesofanab-stractnodeysuchthaty∈successors(p(x)),thatis,y∈successors(p(x))p1(y)wherep1(y)isthesetofstorednodesthatarepre-imagesofy.
Theorem1Theduplicate-detectionscopeofanodecon-tainsallstoredduplicatesofthesuccessorsofthenode.
Proof:Supposethatanodexhasapreviously-generatedsuccessornodexthatisnotincludedintheduplicate-detectionscopeofnodex.
Lety=p(x)betheimageofxunderthestate-spaceprojectionfunctionp(·).
Accord-ingtoDenition2,wehavey/∈successors(p(x)).
Butaccordingtothedenitionofanabstractstate-spacegraph,ifxisasuccessorofx,thenp(x)mustbeasuccessorofFigure2:Theduplicate-detectionscopeofnodesthatarepre-imagesofabstractnodeB0includesnodesthatarepre-imagesofabstractnodeB1orB3.
Whenexpandingnodesthatarepre-imagesofabstractnodeB0,thesearchalgorithmcanuseinternalmemorytostorenodesthatarepre-imagesofabstractnodeB0,B1,orB3,anduseexternalmemorytostoretherestofthenodes.
p(x).
Inotherwords,y∈successors(p(x)).
Sincethisleadstoacontradiction,Theorem1musthold.
2Theconceptofduplicate-detectionscopecanbeveryuse-fulinexternal-memorygraphsearch,becauseitsuggeststhatasearchalgorithmuseinternalmemorytostorenodeswithintheduplicate-detectionscopeofasetofexpandingnodes,anduseexternalmemorytostoreothernodes,wheninternalmemoryisfull.
Figure2illustratesthisidea.
Weusethetermnblocktorefertoaset(or"block")ofnodesintheoriginalstatespacethatcorrespondto(i.
e.
,arepre-imagesof)anabstractnode.
Ifanabstractstate-spacegraphhaslocalstructure,thentheduplicate-detectionscopeofanynodeconsistsofaboundednumberofnblocks,andthelargestduplicatedetectionscopecanbeasmallfractionoftheoverallnumberofstorednodes.
Notethatthelargestduplicatedetectionscopeestablishesaminimuminternal-memoryrequirementforstructuredduplicatedetection.
SearchusingstructuredduplicatedetectionWenowdescribehowtointegratestructuredduplicatede-tectionintoastate-spacesearchalgorithm.
Similartodelayedduplicatedetection,structureddupli-catedetectionisusedwithasearchalgorithmthatexpandsasetofnodesatatime,suchthattheunderlyingsearchstrat-egyisconsistentwithexpandingthenodesinthissetinanyorder.
Thisisstraightforwardinbreadth-rstsearch,whereasetofnodescorrespondstoallnodesinthesamelayerofthebreadth-rstsearchgraph.
(Welaterexplainhowitcanapplytobest-rstsearch.
)Giventhissetofnodes,struc-turedduplicatedetectiondeterminesanorderofexpansionthatminimizesI/Ocomplexity,byexploitinglocalstructurerevealedbytheabstractstate-spacegraph.
Tosupportstructuredduplicatedetection,thesetofstorednodes(i.
e.
,nodesintheOpenandClosedlists)ispartitionedbasedonanabstractstate-spacegraph.
Eachnblockinthepartitionconsistsofallstorednodesthatarepre-imagesofthesameabstractnode(i.
e.
,eachnblockcorrespondstoanabstractnode).
Thislocalstructureisexploitedasfollows.
First,nodesinthesamenblockareexpandedtogether,i.
e.
,consecutively.
Thisimproveslocalityofmemoryref-erencebecauseallnodesinthesamenblockhavethesameduplicate-detectionscope.
ButitleavesundeterminedtheSEARCH685orderinwhichtoconsidernblocks.
Ingeneral,iftheduplicate-detectionscopesoftwonblocksarealmostthesame,theyshouldbeexpandedclosetoeachother,inordertominimizethenumberofpossibleI/Ooperations.
Nblockscorrespondtoabstractnodes,andabstractnodesthatareneighborsintheabstractstate-spacegraphtendtohavesim-ilarduplicate-detectionscopes.
Therefore,wechoosetoex-pandnblocksinanorderthatreectsneighborrelationsintheabstractstate-spacespacegraph.
Asimplewaytodothisistoexpandthenblocksinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph.
Wheninternalmemoryisfull,thesearchalgorithmmustremovefrominternalmemoryoneormorenblocksthatdonotbelongtotheduplicate-detectionscopeofnodescurrentlybeingexpanded.
Immediatelybeforeexpandingnodesinadifferentnblock,itmustcheckifsomenblocksintheirduplicate-detectionscopearemissingfrominternalmemory,andifso,readthemfromexternalmemory.
Be-causereadingannblockintointernalmemoryoftenresultsinwritinganothernblocktoexternalmemory,werefertothesepairsofreadandwriteoperationsasnblockreplace-ments.
Exceptfornblocksthatarepartoftheduplicate-detectionscopeofnodesbeingexpanded,anynblockcanpotentiallybemovedfrominternalmemorytoexternalmemory.
De-cidingwhichnblockstoremoveisidenticaltothepage-replacementstrategyofvirtualmemory,exceptthatnowtheunitofreplacementisannblockinsteadofapage.
Thus,wecallitannblock-replacementstrategy.
Itiswell-knownthattheoptimalreplacementstrategyalwaysremovesthepage(ornblock)thatwillnotbeusedforthelongesttime(Be-lady1966).
Implementingsuchastrategyisgenerallyim-possible,asitrequiresfutureknowledgeoftheorderinwhichpages(nblocks)willbeneeded,andleast-recently-usedisoftenthebeststrategyinpractice(Sleator&Tar-jan1985).
However,itispossibletocomputeanoptimalnblock-replacementstrategyinourcase,sincetheorderinwhichnblocksareexpandedisgivenbythebreadth-rsttraversalorderofanabstractstate-spacegraph.
Thus,givenasetofnodestoexpand,structuredduplicatedetectionexpandstheminanordersuchthat(1)nodesinthesamenblockareexpandedtogether,(2)nblocksareconsid-eredinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph,and,(3)wheninternalmemorybecomesfull,selectionofwhichnblockstowritetodiskisbasedonapre-computedoptimalnblock-replacementstrat-egy,orelse,theleast-recentlyusedstrategy.
Ititstraightforwardtousestructuredduplicatedetectionwithbreadth-rstsearch,wherethesetofnodesbeingex-pandedisalayerofthebreadth-rstsearchgraph.
Itisalsopossibletouseitwithabest-rstsearchalgorithmsuchasA*.
Insolvingasearchproblemwithmanyties,theorderinwhichA*expandsnodeswiththesamef-valueisnon-deterministic,andstructuredduplicatedetectioncandeter-mineanorderthatminimizesI/Ocomplexity.
The(n21)sliding-tilepuzzleisanexampleofadomainwithmanyties.
Forsearchproblemsinwhichtherearenotmanyties,itispossibletousestructuredduplicatedetectionaspartofaslightly-modiedversionofA*,inwhichA*selectsasetofnodestoexpandcontainingnodeswithalmostthesamef-value–forexample,thebestknodesontheOpenlist,wherekisasuitablylargenumber.
Structuredduplicatede-tectioncandeterminetheorderinwhichtoexpandthissetofnodes.
Althoughitmayexpandnodesinanorderthatisnotstrictlybest-rst,improvedperformancefromstructureddu-plicatedetectionmayoutweighanincreaseinthenumberofnodesexpanded.
Thismodicationofthebest-rstexpan-sionorderofA*toimprovelocalityofmemoryreferenceswaspreviouslyproposedbyEdelkampandSchr¨odl(2000)asawayofimprovinguseofvirtualmemory.
I/OcomplexityI/Ocomplexityistheprimaryfactorthataffectsthespeedofexternal-memoryalgorithms.
TheI/OcomplexityofsearchinanimplicitgraphdependsentirelyontheI/Ocomplexityofduplicatedetection.
Theworst-caseI/OcomplexityofdelayedduplicatedetectionisO(n+mBlogMB(n+m)),wherenisthenumberofnodesinthestate-spacegraph,misthenumberofedges,Bisthesizeofadiskblock,Misthesizeofinternalmemory,andO(xBlogMBx)istheworst-caseI/Ocomplexityofexternallysortingxelements(Munagala&Ranade1999).
Forstructuredduplicatedetection,wehavethisresult.
Theorem2Ifeveryduplicate-detectionscopetsininter-nalmemory,theworst-caseI/Ocomplexityofstructureddu-plicatedetectionisO(nB·E).
Inthisresult,Eisthenumberofdirectededgesintheab-stractstate-spacegraph.
(Tokeeptheanalysissimple,weconsideranundirectededgeastworeciprocallydirectededges.
)ThetheoremfollowsfromthefactthatEistheworst-casenumberofnblockreplacementsthatmustbedoneintraversingtheabstractstate-spacegraph,andtheworst-casenumberofI/Ooperationsneededpernblockre-placementisthesizeofthelargestnblockdividedbythediskblocksize,wherenboundsthesizeofthelargestnblock.
Thefactor1Bisthesameinbothanalyses.
Sinceanab-stractstate-spacegraphistypicallyexponentiallysmallerthantheoriginalstate-spacegraph,thefactorEforstruc-turedduplicatedetectioniscomparabletologMB(n+m)fordelayedduplicateddetection.
ThedifferenceinI/Ocom-plexityisthedifferencebetweenthefactorn+mfordelayedduplicatedetection,andthefactornforstructuredduplicatedetection.
Thepresenceofn+minthecomplexityanalysisfordelayedduplicatedetection,incontrasttonforstruc-turedduplicatedetection,canbeinterpretedasapenaltyfordelayingduplicatedetection,sinceitboundsthecardinalityofthemulti-setofsuccessorsthatisgeneratedbyexpandingasetofnodesbeforeperformingduplicatedetection.
Inahighly-connectedgraph,mismuchlargerthann,andthepenaltyfordelayingduplicatedetectioncanbese-vere.
Itfollowsthatdelayedduplicatedetectionworksbestinsparsegraphs.
However,manyimportantsearchproblemsdonothavesparsegraphs.
Anexampleismultiplesequencealignment,amotivatingproblemforfrontiersearch(Korf&Zhang2000).
Anadvantageofstructuredduplicatede-tectionisthatitworksequallywellinsparseandhighly-connectedgraphs.
686SEARCH#SolIntMemExtMemExpSecs1766564,62816,389,872279,167,41180149591,382,50420,557,691345,487,8638985364481,53312,588,843224,545,8536415655228,33412,989,362208,969,4456655957414,77513,829,299228,945,35167160661,738,02256,436,901978,819,6463,06966611,355,26420,699,891368,181,73597382621,410,29246,329,201765,608,9892,38988651,808,59177,711,2351,360,544,0934,4569257371,77912,505,737213,445,215642Table1:Performanceonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle.
Columnsshowtheinstancenumber(#);solutionlength(Sol);peaknumberofnodesstoredininternalmemory(IntMem);peaknumberofnodesstoredinexternalmemory(ExtMem),numberofnodeexpansions(Exp),andrunningtimeinCPUseconds(Secs).
Unlikedelayedduplicatedetection,structuredduplicatedetectionhasaminimuminternal-memoryrequirement.
Theorem2holdsonlyifeveryduplicatedetectionscopetsininternalmemory.
So,whetheritholdsdependspartlyonthesizeofinternalmemory,andpartlyonthelocalityofthegraph,andhowwellitiscapturedinanabstractstate-spacegraph.
Ifthelargestduplicatedetectionscopedoesnottininternalmemory,itispossibletocombinestructureddupli-catedetectionwithdelayedduplicatedetection.
Givenasetofnodestoexpand,thesetcanbepartitionedbasedontheabstractstate-spacegraph,anddelayedduplicatedetectioncanbeperformedseparatelyonthenodesineachnblock,asawayofleveraginglocalstructuretoimproveperformance.
ComputationalresultsHowefcientlytheunderlyinggraph-searchalgorithmusesinternalmemoryhasaneffectondiskI/O,sinceitaf-fectshowmuchtotalmemoryisneeded.
Forthisreason,Korf(2003a;2003b)builthisexternal-memorygraphsearchalgorithmontopoffrontiersearch,averymemory-efcientgraph-searchalgorithm.
Inourexperiments,wealsocom-binestructuredduplicatedetectionwithmemory-efcientgraph-searchalgorithms.
Adifferenceisthatwecon-siderheuristicsearch,whereaspreviousworkonexternal-memorygraphsearchconsidersbreadth-rstsearchonly.
FortheresultsinTables1and2,weranthesearchalgo-rithmsinamodethatminimizesuseofinternalmemory.
Sothepeakamountofinternalmemoryusedcorrespondstotheminimuminternal-memoryrequirementsofthealgorithms,usingstructuredduplicatedetectionandagivenabstractionofthestatespace.
Ordinarily,thealgorithmswoulduseallavailableRAMbeforeusingdisk.
Thealgorithmswererunona2.
4GHzIntelPentiumwith2gigabytesofRAManda7200RPMSeagatediskwith120gigabytesofstorage.
Fifteen-puzzleTosolvetheFifteenpuzzle,weusestructuredduplicatedetectiontogetherwithabreadth-rstheuristicsearchal-gorithmthatusesdivide-and-conquersolutionreconstruc-tion,calledBreadth-FirstIterative-DeepeningA*(Zhou&Hansen2004).
Unlikebrute-forcebreadth-rstsearch,thisNameCostIntMemExtMemExpSecs1aboA8,4838K130K4,207K451amk33,96029K555K31,704K3731csy14,1606K52K1,943K251ezm40,73317K154K6,186K731gtr58,010188K7,468K1,094,936K22,3811idy7,88819K415K10,246K1101pfc15,71812K153K6,600K711tis37,58124K383K29,778K3571wit13,97953K1,513K80,591K1,192actin52,117102K2,783K240,168K3,972Table2:Performanceinaligninggroupsof5proteinsequencesfromreferenceset1ofBAliBASE(Thompson,Plewniak,&Poch1999).
Columnsshownameofinstance,costofoptimalalignment,peaknumberofnodesstoredininternalmemory(inthousands),peaknumberofnodesstoredinexternalmemory(inthousands),numberofnodeexpansions(inthousands),andCPUseconds.
algorithmusesupperandlowerboundstoprunenodesthatcannotbepartofanoptimalsolution.
Table1showshowthealgorithmperformsonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle(Korf1985).
Weusedastate-spaceprojectionfunctionthatgroupsto-getherstatesbasedonthepositionsoftiles15and8,plusthepositionofthe"blank,"creatinganabstractstate-spacegraphwith16·15·14=3360nodes.
Basedonthispartitionofthestatespace,structuredduplicatedetectionreducestheinternal-memoryrequirementofthesearchalgorithmbyafactorofbetween16and58times.
Inexchange,itincreasesrunningtimebyonlyabout24%insolvingtheseexam-ples.
Usingbothstructuredduplicatedetectionanddivide-and-conquersolutionreconstruction,all100instancesofthe15-puzzlearesolvedusingnomorethan42megabytesofRAM,withouteverre-expandinganodeinthesameitera-tion.
Basedonthenumberofdistinctnodesexpanded,A*wouldneed28gigabytesofRAMjusttostoretheClosedlistinsolvinginstance88.
MultiplesequencealignmentThemultiplesequencealignmentproblemisanexampleofasearchproblemthatcannotbesolvedefcientlyus-ingdelayedduplicatedetectionbecauseitssearchgraphisveryhighly-connected.
Thesearchgraphforthemulti-plesequencealignmentproblemisann-dimensionalhyper-lattice,wherenisthenumberofsequencesbeingaligned.
WeimplementedstructuredduplicatedetectionontopofasearchalgorithmcalledSweepA*,whichusesdivide-and-conquersolutionreconstructiontolimituseofmem-ory(Zhou&Hansen2003).
1Wetestedtheresultingal-gorithmonaselectionofdifcultmultiplesequencealign-mentproblemsfromreferenceset1ofBAliBASE,awidely-usedbenchmark(Thompson,Plewniak,&Poch1999).
All1SweepA*isaspecializedsearchalgorithmformultiplese-quencealignmentthatexpandsnodesonalayer-by-layerbasis,whichmakesitagoodtforstructuredduplicatedetection.
Itisalsoverymemory-efcient.
Forcomparison,frontiersearch(Korf&Zhang2000)cannotsolveinstances1gtr,1wit,oractininTa-ble2within2gigabytesofRAM,andusesanaverageof25timesmorememorythanSweepA*insolvingtheotherinstances.
SEARCH687problemsinvolvedaligning5proteinsequences.
OurcostfunctionwasaDayhoffsubstitutionmatrixwithlineargappenaltyof8,andweusedapairwisealignmentadmissibleheuristic.
Tocreateanabstractstate-spacegraph,weusedastate-spaceprojectionfunctionthatignoresallbut3se-quences.
ThiscreatesanabstractstatespacewithO(l3)ab-stractnodes,wherelistheaveragelengthofasequence,comparedtoO(l5)nodesintheoriginalstate-spacegraph.
Table2showsthatstructuredduplicatedetectionreducestheinternal-memoryrequirementsofSweepA*byafactorofbetween10and40times.
Usingdisk,thealgorithmneedsonly20megabytesofRAMtosolveallinstancesinTable2,includingthespaceforthepairwiseheuristic.
Interestingly,theexternal-memoryversionofSweepA*runsfasterthananinternal-memoryversionofSweepA*thatdoesnotusestructuredduplicatedetection,byanaverageof72%!
Thereasonisthatstructuredduplicatedetectionrunsfasterinin-ternalmemorythanun-structuredduplicatedetectionduetolocalityofmemoryreferences,andthisspeedupoutweighstheextratimefordiskI/O.
Thespeedupismoreevidentinmultiplesequencealignmentthanthesliding-tilepuz-zlebecausethemultiplesequencealignmentsearchgraphismuchmorehighly-connected,whichsignicantlyincreasesthenumberofduplicatesgeneratedpernodeexpansion.
Speedingupinternal-memorysearchWeemphasizethatforboththeFifteenpuzzleandmulti-plesequencealignment,structuredduplicatedetectionim-provestheperformanceofinternal-memorygraphsearch,inadditiontousingexternalmemoryefciently.
Thisisbe-causeeachtimethesearchalgorithmchecksforduplicates,itonlyneedstocheckasmallsubsetofthestorednodes.
FortheFifteenpuzzle,thisleadstoa16%speedupininternal-memoryperformance.
Formultiplesequencealignment,itcutstherunningtimeofinternal-memorysearchinhalf.
Again,thespeedupismuchgreaterformultiplesequencealignmentbecauseinaveryhighly-connectedgraph,manymoreduplicatesaregenerated.
Thisunderscoresabenetofstructuredduplicatedetection.
Unlikedelayedduplicatedetection,itismoreeffectiveinhighly-connectedgraphs,wheretheproblemofduplicatedetectionismorecrucial.
ConclusionWehaveintroducedstructuredduplicatedetection,anovelapproachtolocalizingmemoryreferencesinduplicatede-tection,andshowedthatitcanreducetheI/Ocomplexityofexternal-memorygraphsearch,aswellasincreasethespeedofduplicatedetectionininternal-memorysearch.
Structuredduplicatedetectionhassomeadvantagesoverdelayedduplicatedetection,theonlypreviousapproachtoexternal-memorygraphsearch.
Inparticular,itismoreef-fectiveinhighly-connectedgraphs,anditcanexploitlocalgraphstructurethatisnotconsideredbydelayedduplicatedetection.
ButstructuredduplicatedetectionanddelayedduplicatedetectioncanalsobeviewedascomplementaryapproachestoreducingI/Ocomplexity,andcanbeusedto-getherfordifcultsearchproblems.
Ifasearchalgorithmcanleverageenoughlocalityinastate-spacegraphsothateveryduplicate-detectionscopetsininternalmemory,itisunnecessarytoeverdelayduplicatedetection,andstruc-turedduplicatedetectioncanmanageexternalmemorybyitself.
Butifanyduplicate-detectionscopedoesnottinin-ternalmemory(perhapsduetolackofsufcientlocalstruc-tureinthegraph),structuredduplicatedetectionbyitselfisnotsufcient.
Sincedelayedduplicatedetectiondoesnothaveaminimummemoryrequirement,itcanbeusedinthiscase.
Structuredduplicatedetectioncanbeusedtogetherwithdelayedduplicatedetectiontoimproveitsperformance,byleveraginggraphlocalityasmuchaspossible.
AcknowledgmentsWethanktheanonymousreviewersforhelpfulcomments.
ThisworkwassupportedinpartbyNSFgrantIIS-9984952andNASAgrantNAG-2-1463.
ReferencesAggarwal,A.
,andVitter,J.
1988.
Theinput/outputcomplexityofsortingandrelatedproblems.
Comm.
oftheACM31(9):1116–27.
Belady,L.
1966.
Astudyofreplacementalgorithmsforvirtualstorage.
IBMSystemsJournal5:78–101.
Edelkamp,S.
,andSchr¨odl,S.
2000.
LocalizingA*.
InProc.
ofthe17thNationalConferenceonArticialIntelligence,885–890.
Holte,R.
;Mkadmi,T.
;Zimmer,R.
;andMacDonald,A.
1996.
Speedingupproblemsolvingbyabstraction:Agraph-orientedapproach.
ArticialIntelligence85(1–2):321–361.
Klein,D.
,andManning,C.
2003.
FactoredA*searchformodelsoversequencesandtrees.
InProceedingsofthe17thInternationalJointConferenceonArticialIntelligence,1246–1251.
Korf,R.
,andZhang,W.
2000.
Divide-and-conquerfrontiersearchappliedtooptimalsequencealignment.
InProceedingsofthe17thNationalConferenceonArticialIntelligence,910–916.
Korf,R.
1985.
Depth-rstiterativedeepening:Anoptimalad-missibletreesearch.
ArticialIntelligence27:97–109.
Korf,R.
2003a.
Breadth-rstfrontiersearchwithdelayeddupli-catedetection.
InProceedingsoftheWorkshoponModelCheck-ingandArticialIntelligenceatIJCAI-03,87–92.
Korf,R.
2003b.
Delayedduplicatedetection:Extendedabstract.
InProceedingsofthe17thInternationalJointConferenceonAr-ticialIntelligence,1539–1541.
Mehlhorn,K.
,andMeyer,U.
2002.
External-memorybreadth-rstsearchwithsublinearI/O.
InProceedingsofthe10thAnnualEuropeanSymposiumonAlgorithms,723–735.
Munagala,K.
,andRanade,A.
1999.
I/O-complexityofgraphalgorithms.
InProceedingsofthe10thSymposiumondiscretealgorithms,687–694.
ACM-SIAM.
Sleator,D.
,andTarjan,R.
1985.
Amortizedefciencyoflistupdateandpagingrules.
CommunicationsoftheACM28:202–8.
Thompson,J.
;Plewniak,F.
;andPoch,O.
1999.
BAliBASE:Abenchmarkalignmentdatabasefortheevaluationofmultiplealignmentprograms.
Bioinformatics15(1):87–88.
Zhou,R.
,andHansen,E.
2003.
SweepA*:Space-efcientheuristicsearchinpartiallyorderedgraphs.
InProc.
of15thIEEEInternationalConf.
onToolswithArticialIntelligence,427–434.
Zhou,R.
,andHansen,E.
2004.
Breadth-rstheuristicsearch.
InProceedingsofthe14thInternationalConferenceonAutomatedPlanningandScheduling.
688SEARCH

PQ.hosting:香港HE/乌克兰/俄罗斯/荷兰/摩尔多瓦/德国/斯洛伐克/捷克vps,2核/2GB内存/30GB NVMe空间,€3/月

PQ.hosting怎么样?PQ.hosting是一家俄罗斯商家,正规公司,主要提供KVM VPS和独立服务器,VPS数据中心有香港HE、俄罗斯莫斯科DataPro、乌克兰VOLIA、拉脱维亚、荷兰Serverius、摩尔多瓦Alexhost、德国等。部分配置有变化,同时开通Paypal付款。香港、乌克兰、德国、斯洛伐克、捷克等为NVMe硬盘。香港为HE线路,三网绕美(不太建议香港)。免费支持wi...

青果云(59元/月)香港多线BGP云服务器 1核 1G

青果云香港CN2_GIA主机测评青果云香港多线BGP网络,接入电信CN2 GIA等优质链路,测试IP:45.251.136.1青果网络QG.NET是一家高效多云管理服务商,拥有工信部颁发的全网云计算/CDN/IDC/ISP/IP-VPN等多项资质,是CNNIC/APNIC联盟的成员之一。青果云香港CN2_GIA主机性能分享下面和大家分享下。官方网站:点击进入CPU内存系统盘数据盘宽带ip价格购买地...

CYUN(29元/月)美国、香港、台湾、日本、韩国CN2,续费原价

关于CYUN商家在之前有介绍过一次,CYUN是香港蓝米数据有限公司旗下的云计算服务品牌,和蓝米云、蓝米主机等同属该公司。商家主要是为个人开发者用户、中小型、大型企业用户提供一站式核心网络云端部署服务,促使用户云端部署化简为零,轻松快捷运用云计算。目前,CYUN主要运营美国、香港、台湾、日本、韩国CN2线路产品,包括云服务器、站群服务器和独立服务器等。这次看到CYUN夏季优惠活动发布了,依然是熟悉的...

graphsearch为你推荐
徐州微信5Applicationto支持ipaditunes备份怎样用itunes备份iphonegoogle中国地图谷歌退出中国,地图要是关了就太可惜了!手机谷歌地图还能用吗?googleadsencegoogle adsense打不开怎么办css选择器CSS中的选择器分几种?google统计怎样获得google ga 统计代码苹果5.1.1越狱iphone5.1.1越狱老是越狱失败,说要抹掉数据,怎么抹掉数据不懂,接下来该怎么弄 求大神指教android5.1安卓系统5.1好吗
vps代理 查询ip地址 中文域名申请 t牌 kddi godaddy优惠码 监控宝 抢票工具 sub-process godaddy 一元域名 500m空间 最好的空间 免费mysql web服务器搭建 789 godaddyssl fatcow phpwind论坛 防盗链 更多