legraph

graphsearch  时间:2021-02-11  阅读:()
StructuredDuplicateDetectioninExternal-MemoryGraphSearchRongZhouandEricA.
HansenDepartmentofComputerScienceandEngineeringMississippiStateUniversityMississippiState,MS39762{rzhou,hansen}@cse.
msstate.
eduAbstractWeconsiderhowtouseexternalmemory,suchasdiskstor-age,toimprovethescalabilityofheuristicsearchinstate-spacegraphs.
TolimitthenumberofslowdiskI/Ooper-ations,wedevelopanewapproachtoduplicatedetectioningraphsearchthatlocalizesmemoryreferencesbypartitioningthesearchgraphbasedonanabstractionofthestatespace,andexpandingthefrontiernodesofthegraphinanorderthatrespectsthispartition.
Wedemonstratetheeffectivenessofthisapproachbothanalyticallyandempirically.
IntroductionHeuristicsearchinstate-spacegraphsisawidely-usedframeworkforproblemsolvinginAI,butitsscalabilityislimitedbymemoryrequirements.
Thishasledtoexten-siveresearchonhowtouseavailablememoryefcientlyinsearchinglargestate-spacegraphs.
Theconceptof"availablememory"isambiguous,sincememoryinmostcomputersystemshasahierarchicalstruc-tureinwhichfast,random-accessinternalmemoryiscom-plementedbylow-speedexternalmemory,suchasdiskstor-age.
Althoughexternalmemoryisvastlylarger(andmuchcheaper)thaninternalmemory,mostheuristicsearchal-gorithmsaredesignedtouseinternalmemoryonly,andmostresearchonmemory-efcientheuristicsearchassumesamemorymodelinwhichaccesstoallstoreddataitemsisequallyfast.
Algorithmsthatassumethismemorymodelperformextremelypoorlyifexternalmemoryisused,sincerandomaccessofexternalmemoryrequirestime-consumingI/Ooperationsthatareseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory.
Somerecentworkbeginstoaddressthisissue.
EdelkampandSchr¨odl(2000)consideruseofvirtualmemoryinA*graphsearch,andproposetechniquesthatchangethebest-rstorderofnodeexpansionsinawaythatimprovesrefer-encelocalityandreducesthenumberofpagefaultsinvirtualmemory.
AmoredirectapproachtolimitingslowdiskI/Oistodesignasearchalgorithmthatexplicitlymanagesaccesstoexternalmemory,sincethisapproachcanexploitknowl-edgeofhowthegraphisstructured,andisnotlimitedbythesizeofvirtualmemory,whichisusuallymuchsmallerthanexternalmemory.
InthetheoreticalcomputerscienceCopyrightc2004,AmericanAssociationforArticialIntelli-gence(www.
aaai.
org).
Allrightsreserved.
community,therehasbeenextensiveresearchonexternal-memoryalgorithms,includingdevelopmentofexternal-memorybreadth-rstsearchalgorithmsforexplicitlyrep-resentedgraphs(Munagala&Ranade1999;Mehlhorn&Meyer2002).
IntheAIcommunity,Korf(2003a;2003b)recentlydescribedanexternal-memorybreadth-rstsearchalgorithmforimplicitlyrepresentedgraphs.
Akeyissueindesigninganefcientexternal-memorygraphsearchalgorithmisduplicatedetection.
Ingraphsearch,thesamenodecanbereachedalongdifferentpaths,andpreventingredundantsearcheffortrequiresstoringalready-visitednodes,sothatthesearchalgorithmcanrec-ognizewhenitencountersanodethathasbeenalreadyexplored.
Becauseduplicatedetectionpotentiallyrequirescomparingeachnewlygeneratednodetoallstorednodes,itcanleadtocripplingdiskI/Oifalready-explorednodesarestoredondisk.
Allpreviousexternal-memorybreadth-rstsearchalgorithmslimitdiskI/Obyusingatechniquecalleddelayedduplicatedetectioninwhichanentiresetofnodesisexpandedbeforeperforminganyduplicatedetection.
Inthispaper,weintroduceanewapproach.
Insteadofdelayingduplicatedetection,welimitdiskI/Obylocalizingmem-oryreferencesbasedonthestructureofthesearchgraphasrevealedbyahigh-levelabstractionofthestatespace.
Com-putationalresultsonchallenginggraph-searchproblemsin-dicatethatthisapproach,whichwecallstructuredduplicatedetection,signicantlylimitsdiskI/O.
Forgraphswithsuf-cientlocalstructure,itallowsgraph-searchalgorithmstouseexternalmemoryalmostasefcientlyasinternalmemory.
BackgroundDevelopmentofexternal-memoryalgorithmsforvariouscomputationalproblemsisanactiveareaofresearch.
Be-causediskI/Oisseveralordersofmagnitudeslowerthanrandomaccessofinternalmemory,theconceptofI/Ocom-plexityhasbeenintroducedtoanalyzetheperformanceofexternal-memoryalgorithms(Aggarwal&Vitter1988).
Itassumesatwo-levelmodelofmemoryinwhichinternalmemoryofsizeMissupplementedbyexternalmemory,andanI/OoperationmovesdatainblocksofsizeBbetweenin-ternalmemoryandexternalmemory,where1searchhasbeenconsid-eredbyseveralresearchers.
Inthetheoreticalcomputersci-SEARCH683encecommunity,researchershavefocusedonestablishingboundsonitsI/Ocomplexity.
Agraphisassumedtobeexplicitlyrepresentedusingadjacencylistsandstoredinex-ternalmemoryduetoitslargesize.
Insearchingexplicitgraphs,worst-caseI/Ocomplexitydependsonthemethodofgeneratingsuccessornodesaswellasthemethodofdu-plicatedetection,sinceadjacencylistsmustbereadintointernalmemorytodeterminethesuccessorsofanode.
MunagalaandRanade(1999)describeanexternal-memorybreadth-rstsearchalgorithmsthatadoptsanapproachtosuccessorgenerationthathaslinearcomplexityinthenum-berofedgesinthegraph.
MehlhornandMeyer(2002)achievesublinearcomplexityforsuccessorgenerationbyamoresophisticatedapproachthatinvolvespartitioningtheadjacencylistbasedonadecompositionofthegraphintoconnectedsubgraphs.
Bothalgorithmsusethesamemethodofduplicatedetection,calleddelayedduplicatedetection,inwhichasetofnodesisexpandedwithoutperformingdupli-catedetection,andthemulti-setofgeneratedsuccessorsissortedandcheckedforduplicatesinasingle,moreefcientoperation.
Thealgorithmsalternatebetweenthesetwosteps.
1.
successorgeneration,inwhichthealgorithmgeneratessuccessorsforasetofnodesonthesearchfrontierandappendsthesesuccessorstoale(orles)intheorderinwhichtheyaregenerated,withoutperforminganydupli-catedetection,and2.
delayedduplicatedetection,inwhichthele(s)ofsucces-sornodesaresorted(usuallybyanexternal-memorysortalgorithm)basedontheirindicesorstateencodings,fol-lowedbyascanandcompactionstageinwhichduplicatenodesinthesortedle(s)areeliminated.
IntheAIcommunity,Korf(2003a;2003b)recentlyde-scribedanexternal-memorybreadth-rstsearchalgorithmthatusesthesamemethodofdelayedduplicatedetection.
Hisalgorithmdiffersintwoways.
First,heconsiderssearchinimplicitgraphs,inkeepingwiththestandardAIapproachtostate-spacesearch.
Animplicitgraphisacompactrep-resentationofagraphintheformofastartnode,anodeexpansionfunctionthatgeneratestheimmediatesuccessorsofanode,andapredicatethattestswhetheranodeisagoalnode.
Insearchingimplicitgraphs,successorgener-ationdoesnotrequirediskI/O,andduplicatedetectionistheonlysourceofI/Ocomplexity.
AseconddifferenceisthatKorfbuildshisexternal-memorybreadth-rstsearchalgorithmontopoffrontiersearch,amemory-efcientsearchalgorithmthatdoesnotneedtostoreaClosedlist(Korf&Zhang2000).
StructuredduplicatedetectionWeproposeanalternativeapproachtoduplicatedetectioninexternal-memorygraphsearchthathassomesignicantadvantagesoverdelayedduplicatedetection.
Itcanalsobecombinedwithdelayedduplicatedetectioninsomecases.
Thecentralideaofourapproachistoexploitthestruc-tureofastate-spacegraphinordertolocalizememoryref-erencesduringduplicatedetection,andsowecalltheap-proachstructuredduplicatedetection.
Asanexample,con-siderthe(n21)sliding-tilepuzzle.
Whenanewnodeisgenerated,itcanonlydifferfromitsparentnodewithre-specttothepositionofthe"blank"byeitheroneroworonecolumn.
Therefore,incheckingforduplicates,itisnotnec-essarytocheckstorednodesforwhichthepositionofthe"blank"differsfromthatoftheparentnodebymorethanoneroworonecolumn.
Ifwepartitionthesetofstorednodesinthisway,wecansignicantlylimitthenumberofstorednodesthatneedtobecheckedtoguaranteethatallduplicatesarefound.
Wecontinuetousethismotivatingex-ampleinthefollowing,tohelpexplaintheidea.
Projectionfunctionandabstractstate-spacegraphOurapproachisbasedonusingastate-spaceprojectionfunctiontodecomposeastate-spacegraph,andcreateanab-stractstate-spacegraphthatrevealsthelocalstructureoftheoriginalgraphatahigh-level.
Stateabstractioninheuris-ticsearchiswellstudied,andisusedtocreateadmissibleheuristicsandtoorganizehierarchicalsearch(Holteetal.
1996).
Weuseanapproachtostateabstractionthatisverysimilartopreviousapproaches,buthasadifferentpurpose:localizingmemoryreferencesinduplicatedetection.
Astate-spaceprojectionfunctionisamany-to-onemap-pingfromtheoriginalstatespacetoanabstractstatespace,inwhicheachabstractstatecorrespondstoasetofstatesintheoriginalstatespace.
Ifastatexismappedtoanabstractstatey,thenyiscalledtheimageofx,andxiscalledthepre-imageofy.
Therearemanywaystodeneastate-spaceprojectionfunction.
Acommonapproachistoignoresomestatevariablesintheencodingoftheproblem.
Forexam-ple,asimplestate-spaceprojectionfunctionforthe(n21)sliding-tilepuzzlecanbedenedbyignoringthepositionsofalltilesandconsideringonlythepositionofthe"blank.
"Inthiscase,anabstractstatecorrespondstoallstateswiththesamepositionofthe"blank,"andtherearen2abstractstatescomparedton2!
/2statesintheoriginalstatespace.
State-spaceprojectionfunctionscanbedenedinasimilarwayforothersearchproblems(Klein&Manning2003).
Givenastate-spacegraphandstate-spaceprojectionfunc-tion,anabstractstate-spacegraphisconstructedasfollows.
1.
Thesetofnodes,calledabstractnodes,intheabstractstate-spacegraphcorrespondstothesetofabstractstates.
2.
Anabstractnodeyisasuccessorofanabstractnodeyiffthereexisttwostatesxandx,suchthata.
xisasuccessorofx,andb.
xandxarepreimagesofyandy,respectively.
Ify=y,itmeansthattheabstractnodeyhasaselfloop.
Forexample,Figure1(a)showstheninepossiblepositionsofthe"blank"intheEight-puzzle.
Figure1(b)showstheabstractstate-spacegraphcreatedbythesimplestate-spaceprojectionfunctionthatmapsastateintoanabstractstatebasedonlyonthepositionofthe"blank.
"EachabstractnodeBiinFigure1(b)correspondstothesetofstateswiththe"blank"locatedatpositioniinFigure1(a).
Duplicate-detectionscopeAlthoughtransformingastate-spacegraphintoanabstractstate-spacegraphcanresultinexponentialreductioninthe684SEARCHFigure1:Panel(a)showsallpossiblepositionsofthe"blank"fortheEight-puzzle.
Panel(b)showsanexampleofanabstractstate-spacegraphfortheEight-puzzle.
Thisdenitionofthestate-spaceprojectionfunctionisbasedonthepositionofthe"blank"only.
sizeofthegraph,thelocalstructureoftheoriginalstate-spacegraphcanbelargelypreserved.
Forexample,abstractnodeB0inFigure1(b)hasonlytwosuccessors;abstractnodesB1andB3.
Thiscapturesthefactthatasinglemoveofatilecanonlychangethepositionofthe"blank"byeitheronecolumn(B0→B1inthiscase)oronerow(B0→B3inthiscase).
Denition1Anabstractstate-spacegraphhaslocalstruc-tureifithasboundedout-degree.
Forthe(n21)sliding-tilepuzzleusingthissimplestate-spaceprojectionfunction,theout-degreeoftheabstractstate-spacegraphisboundedby4,foreveryn.
Thelocalstructureofanabstractstate-spacegraphcanbeusedtorestrictthescopeofduplicatedetectionsothatonlyafractionofstorednodesneedstobecheckedforduplicates,whilestillguaranteeingthatallduplicatesarefound.
Forexample,whengeneratingsuccessorsfornodesthatarepre-imagesofabstractnodeB0inFigure2,thealgorithmonlyneedstocheckforduplicatesagainststorednodesthatarepre-imagesofabstractnodeB1orB3.
Letanabstractnodey=p(x)betheimageofanodexunderastate-spaceprojectionfunctionp(·)andletsuccessors(y)bethesetofsuccessorabstractnodesofyintheabstractstate-spacegraph.
Denition2Theduplicate-detectionscopeofanodexun-derastate-spaceprojectionfunctionp(·)correspondstotheunionofsetsofstorednodesthatarepre-imagesofanab-stractnodeysuchthaty∈successors(p(x)),thatis,y∈successors(p(x))p1(y)wherep1(y)isthesetofstorednodesthatarepre-imagesofy.
Theorem1Theduplicate-detectionscopeofanodecon-tainsallstoredduplicatesofthesuccessorsofthenode.
Proof:Supposethatanodexhasapreviously-generatedsuccessornodexthatisnotincludedintheduplicate-detectionscopeofnodex.
Lety=p(x)betheimageofxunderthestate-spaceprojectionfunctionp(·).
Accord-ingtoDenition2,wehavey/∈successors(p(x)).
Butaccordingtothedenitionofanabstractstate-spacegraph,ifxisasuccessorofx,thenp(x)mustbeasuccessorofFigure2:Theduplicate-detectionscopeofnodesthatarepre-imagesofabstractnodeB0includesnodesthatarepre-imagesofabstractnodeB1orB3.
Whenexpandingnodesthatarepre-imagesofabstractnodeB0,thesearchalgorithmcanuseinternalmemorytostorenodesthatarepre-imagesofabstractnodeB0,B1,orB3,anduseexternalmemorytostoretherestofthenodes.
p(x).
Inotherwords,y∈successors(p(x)).
Sincethisleadstoacontradiction,Theorem1musthold.
2Theconceptofduplicate-detectionscopecanbeveryuse-fulinexternal-memorygraphsearch,becauseitsuggeststhatasearchalgorithmuseinternalmemorytostorenodeswithintheduplicate-detectionscopeofasetofexpandingnodes,anduseexternalmemorytostoreothernodes,wheninternalmemoryisfull.
Figure2illustratesthisidea.
Weusethetermnblocktorefertoaset(or"block")ofnodesintheoriginalstatespacethatcorrespondto(i.
e.
,arepre-imagesof)anabstractnode.
Ifanabstractstate-spacegraphhaslocalstructure,thentheduplicate-detectionscopeofanynodeconsistsofaboundednumberofnblocks,andthelargestduplicatedetectionscopecanbeasmallfractionoftheoverallnumberofstorednodes.
Notethatthelargestduplicatedetectionscopeestablishesaminimuminternal-memoryrequirementforstructuredduplicatedetection.
SearchusingstructuredduplicatedetectionWenowdescribehowtointegratestructuredduplicatede-tectionintoastate-spacesearchalgorithm.
Similartodelayedduplicatedetection,structureddupli-catedetectionisusedwithasearchalgorithmthatexpandsasetofnodesatatime,suchthattheunderlyingsearchstrat-egyisconsistentwithexpandingthenodesinthissetinanyorder.
Thisisstraightforwardinbreadth-rstsearch,whereasetofnodescorrespondstoallnodesinthesamelayerofthebreadth-rstsearchgraph.
(Welaterexplainhowitcanapplytobest-rstsearch.
)Giventhissetofnodes,struc-turedduplicatedetectiondeterminesanorderofexpansionthatminimizesI/Ocomplexity,byexploitinglocalstructurerevealedbytheabstractstate-spacegraph.
Tosupportstructuredduplicatedetection,thesetofstorednodes(i.
e.
,nodesintheOpenandClosedlists)ispartitionedbasedonanabstractstate-spacegraph.
Eachnblockinthepartitionconsistsofallstorednodesthatarepre-imagesofthesameabstractnode(i.
e.
,eachnblockcorrespondstoanabstractnode).
Thislocalstructureisexploitedasfollows.
First,nodesinthesamenblockareexpandedtogether,i.
e.
,consecutively.
Thisimproveslocalityofmemoryref-erencebecauseallnodesinthesamenblockhavethesameduplicate-detectionscope.
ButitleavesundeterminedtheSEARCH685orderinwhichtoconsidernblocks.
Ingeneral,iftheduplicate-detectionscopesoftwonblocksarealmostthesame,theyshouldbeexpandedclosetoeachother,inordertominimizethenumberofpossibleI/Ooperations.
Nblockscorrespondtoabstractnodes,andabstractnodesthatareneighborsintheabstractstate-spacegraphtendtohavesim-ilarduplicate-detectionscopes.
Therefore,wechoosetoex-pandnblocksinanorderthatreectsneighborrelationsintheabstractstate-spacespacegraph.
Asimplewaytodothisistoexpandthenblocksinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph.
Wheninternalmemoryisfull,thesearchalgorithmmustremovefrominternalmemoryoneormorenblocksthatdonotbelongtotheduplicate-detectionscopeofnodescurrentlybeingexpanded.
Immediatelybeforeexpandingnodesinadifferentnblock,itmustcheckifsomenblocksintheirduplicate-detectionscopearemissingfrominternalmemory,andifso,readthemfromexternalmemory.
Be-causereadingannblockintointernalmemoryoftenresultsinwritinganothernblocktoexternalmemory,werefertothesepairsofreadandwriteoperationsasnblockreplace-ments.
Exceptfornblocksthatarepartoftheduplicate-detectionscopeofnodesbeingexpanded,anynblockcanpotentiallybemovedfrominternalmemorytoexternalmemory.
De-cidingwhichnblockstoremoveisidenticaltothepage-replacementstrategyofvirtualmemory,exceptthatnowtheunitofreplacementisannblockinsteadofapage.
Thus,wecallitannblock-replacementstrategy.
Itiswell-knownthattheoptimalreplacementstrategyalwaysremovesthepage(ornblock)thatwillnotbeusedforthelongesttime(Be-lady1966).
Implementingsuchastrategyisgenerallyim-possible,asitrequiresfutureknowledgeoftheorderinwhichpages(nblocks)willbeneeded,andleast-recently-usedisoftenthebeststrategyinpractice(Sleator&Tar-jan1985).
However,itispossibletocomputeanoptimalnblock-replacementstrategyinourcase,sincetheorderinwhichnblocksareexpandedisgivenbythebreadth-rsttraversalorderofanabstractstate-spacegraph.
Thus,givenasetofnodestoexpand,structuredduplicatedetectionexpandstheminanordersuchthat(1)nodesinthesamenblockareexpandedtogether,(2)nblocksareconsid-eredinanorderthatreectsabreadth-rsttraversaloftheabstractstate-spacegraph,and,(3)wheninternalmemorybecomesfull,selectionofwhichnblockstowritetodiskisbasedonapre-computedoptimalnblock-replacementstrat-egy,orelse,theleast-recentlyusedstrategy.
Ititstraightforwardtousestructuredduplicatedetectionwithbreadth-rstsearch,wherethesetofnodesbeingex-pandedisalayerofthebreadth-rstsearchgraph.
Itisalsopossibletouseitwithabest-rstsearchalgorithmsuchasA*.
Insolvingasearchproblemwithmanyties,theorderinwhichA*expandsnodeswiththesamef-valueisnon-deterministic,andstructuredduplicatedetectioncandeter-mineanorderthatminimizesI/Ocomplexity.
The(n21)sliding-tilepuzzleisanexampleofadomainwithmanyties.
Forsearchproblemsinwhichtherearenotmanyties,itispossibletousestructuredduplicatedetectionaspartofaslightly-modiedversionofA*,inwhichA*selectsasetofnodestoexpandcontainingnodeswithalmostthesamef-value–forexample,thebestknodesontheOpenlist,wherekisasuitablylargenumber.
Structuredduplicatede-tectioncandeterminetheorderinwhichtoexpandthissetofnodes.
Althoughitmayexpandnodesinanorderthatisnotstrictlybest-rst,improvedperformancefromstructureddu-plicatedetectionmayoutweighanincreaseinthenumberofnodesexpanded.
Thismodicationofthebest-rstexpan-sionorderofA*toimprovelocalityofmemoryreferenceswaspreviouslyproposedbyEdelkampandSchr¨odl(2000)asawayofimprovinguseofvirtualmemory.
I/OcomplexityI/Ocomplexityistheprimaryfactorthataffectsthespeedofexternal-memoryalgorithms.
TheI/OcomplexityofsearchinanimplicitgraphdependsentirelyontheI/Ocomplexityofduplicatedetection.
Theworst-caseI/OcomplexityofdelayedduplicatedetectionisO(n+mBlogMB(n+m)),wherenisthenumberofnodesinthestate-spacegraph,misthenumberofedges,Bisthesizeofadiskblock,Misthesizeofinternalmemory,andO(xBlogMBx)istheworst-caseI/Ocomplexityofexternallysortingxelements(Munagala&Ranade1999).
Forstructuredduplicatedetection,wehavethisresult.
Theorem2Ifeveryduplicate-detectionscopetsininter-nalmemory,theworst-caseI/Ocomplexityofstructureddu-plicatedetectionisO(nB·E).
Inthisresult,Eisthenumberofdirectededgesintheab-stractstate-spacegraph.
(Tokeeptheanalysissimple,weconsideranundirectededgeastworeciprocallydirectededges.
)ThetheoremfollowsfromthefactthatEistheworst-casenumberofnblockreplacementsthatmustbedoneintraversingtheabstractstate-spacegraph,andtheworst-casenumberofI/Ooperationsneededpernblockre-placementisthesizeofthelargestnblockdividedbythediskblocksize,wherenboundsthesizeofthelargestnblock.
Thefactor1Bisthesameinbothanalyses.
Sinceanab-stractstate-spacegraphistypicallyexponentiallysmallerthantheoriginalstate-spacegraph,thefactorEforstruc-turedduplicatedetectioniscomparabletologMB(n+m)fordelayedduplicateddetection.
ThedifferenceinI/Ocom-plexityisthedifferencebetweenthefactorn+mfordelayedduplicatedetection,andthefactornforstructuredduplicatedetection.
Thepresenceofn+minthecomplexityanalysisfordelayedduplicatedetection,incontrasttonforstruc-turedduplicatedetection,canbeinterpretedasapenaltyfordelayingduplicatedetection,sinceitboundsthecardinalityofthemulti-setofsuccessorsthatisgeneratedbyexpandingasetofnodesbeforeperformingduplicatedetection.
Inahighly-connectedgraph,mismuchlargerthann,andthepenaltyfordelayingduplicatedetectioncanbese-vere.
Itfollowsthatdelayedduplicatedetectionworksbestinsparsegraphs.
However,manyimportantsearchproblemsdonothavesparsegraphs.
Anexampleismultiplesequencealignment,amotivatingproblemforfrontiersearch(Korf&Zhang2000).
Anadvantageofstructuredduplicatede-tectionisthatitworksequallywellinsparseandhighly-connectedgraphs.
686SEARCH#SolIntMemExtMemExpSecs1766564,62816,389,872279,167,41180149591,382,50420,557,691345,487,8638985364481,53312,588,843224,545,8536415655228,33412,989,362208,969,4456655957414,77513,829,299228,945,35167160661,738,02256,436,901978,819,6463,06966611,355,26420,699,891368,181,73597382621,410,29246,329,201765,608,9892,38988651,808,59177,711,2351,360,544,0934,4569257371,77912,505,737213,445,215642Table1:Performanceonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle.
Columnsshowtheinstancenumber(#);solutionlength(Sol);peaknumberofnodesstoredininternalmemory(IntMem);peaknumberofnodesstoredinexternalmemory(ExtMem),numberofnodeexpansions(Exp),andrunningtimeinCPUseconds(Secs).
Unlikedelayedduplicatedetection,structuredduplicatedetectionhasaminimuminternal-memoryrequirement.
Theorem2holdsonlyifeveryduplicatedetectionscopetsininternalmemory.
So,whetheritholdsdependspartlyonthesizeofinternalmemory,andpartlyonthelocalityofthegraph,andhowwellitiscapturedinanabstractstate-spacegraph.
Ifthelargestduplicatedetectionscopedoesnottininternalmemory,itispossibletocombinestructureddupli-catedetectionwithdelayedduplicatedetection.
Givenasetofnodestoexpand,thesetcanbepartitionedbasedontheabstractstate-spacegraph,anddelayedduplicatedetectioncanbeperformedseparatelyonthenodesineachnblock,asawayofleveraginglocalstructuretoimproveperformance.
ComputationalresultsHowefcientlytheunderlyinggraph-searchalgorithmusesinternalmemoryhasaneffectondiskI/O,sinceitaf-fectshowmuchtotalmemoryisneeded.
Forthisreason,Korf(2003a;2003b)builthisexternal-memorygraphsearchalgorithmontopoffrontiersearch,averymemory-efcientgraph-searchalgorithm.
Inourexperiments,wealsocom-binestructuredduplicatedetectionwithmemory-efcientgraph-searchalgorithms.
Adifferenceisthatwecon-siderheuristicsearch,whereaspreviousworkonexternal-memorygraphsearchconsidersbreadth-rstsearchonly.
FortheresultsinTables1and2,weranthesearchalgo-rithmsinamodethatminimizesuseofinternalmemory.
Sothepeakamountofinternalmemoryusedcorrespondstotheminimuminternal-memoryrequirementsofthealgorithms,usingstructuredduplicatedetectionandagivenabstractionofthestatespace.
Ordinarily,thealgorithmswoulduseallavailableRAMbeforeusingdisk.
Thealgorithmswererunona2.
4GHzIntelPentiumwith2gigabytesofRAManda7200RPMSeagatediskwith120gigabytesofstorage.
Fifteen-puzzleTosolvetheFifteenpuzzle,weusestructuredduplicatedetectiontogetherwithabreadth-rstheuristicsearchal-gorithmthatusesdivide-and-conquersolutionreconstruc-tion,calledBreadth-FirstIterative-DeepeningA*(Zhou&Hansen2004).
Unlikebrute-forcebreadth-rstsearch,thisNameCostIntMemExtMemExpSecs1aboA8,4838K130K4,207K451amk33,96029K555K31,704K3731csy14,1606K52K1,943K251ezm40,73317K154K6,186K731gtr58,010188K7,468K1,094,936K22,3811idy7,88819K415K10,246K1101pfc15,71812K153K6,600K711tis37,58124K383K29,778K3571wit13,97953K1,513K80,591K1,192actin52,117102K2,783K240,168K3,972Table2:Performanceinaligninggroupsof5proteinsequencesfromreferenceset1ofBAliBASE(Thompson,Plewniak,&Poch1999).
Columnsshownameofinstance,costofoptimalalignment,peaknumberofnodesstoredininternalmemory(inthousands),peaknumberofnodesstoredinexternalmemory(inthousands),numberofnodeexpansions(inthousands),andCPUseconds.
algorithmusesupperandlowerboundstoprunenodesthatcannotbepartofanoptimalsolution.
Table1showshowthealgorithmperformsonthe10mostdifcultinstancesofKorf's100randominstancesofthe15-puzzle(Korf1985).
Weusedastate-spaceprojectionfunctionthatgroupsto-getherstatesbasedonthepositionsoftiles15and8,plusthepositionofthe"blank,"creatinganabstractstate-spacegraphwith16·15·14=3360nodes.
Basedonthispartitionofthestatespace,structuredduplicatedetectionreducestheinternal-memoryrequirementofthesearchalgorithmbyafactorofbetween16and58times.
Inexchange,itincreasesrunningtimebyonlyabout24%insolvingtheseexam-ples.
Usingbothstructuredduplicatedetectionanddivide-and-conquersolutionreconstruction,all100instancesofthe15-puzzlearesolvedusingnomorethan42megabytesofRAM,withouteverre-expandinganodeinthesameitera-tion.
Basedonthenumberofdistinctnodesexpanded,A*wouldneed28gigabytesofRAMjusttostoretheClosedlistinsolvinginstance88.
MultiplesequencealignmentThemultiplesequencealignmentproblemisanexampleofasearchproblemthatcannotbesolvedefcientlyus-ingdelayedduplicatedetectionbecauseitssearchgraphisveryhighly-connected.
Thesearchgraphforthemulti-plesequencealignmentproblemisann-dimensionalhyper-lattice,wherenisthenumberofsequencesbeingaligned.
WeimplementedstructuredduplicatedetectionontopofasearchalgorithmcalledSweepA*,whichusesdivide-and-conquersolutionreconstructiontolimituseofmem-ory(Zhou&Hansen2003).
1Wetestedtheresultingal-gorithmonaselectionofdifcultmultiplesequencealign-mentproblemsfromreferenceset1ofBAliBASE,awidely-usedbenchmark(Thompson,Plewniak,&Poch1999).
All1SweepA*isaspecializedsearchalgorithmformultiplese-quencealignmentthatexpandsnodesonalayer-by-layerbasis,whichmakesitagoodtforstructuredduplicatedetection.
Itisalsoverymemory-efcient.
Forcomparison,frontiersearch(Korf&Zhang2000)cannotsolveinstances1gtr,1wit,oractininTa-ble2within2gigabytesofRAM,andusesanaverageof25timesmorememorythanSweepA*insolvingtheotherinstances.
SEARCH687problemsinvolvedaligning5proteinsequences.
OurcostfunctionwasaDayhoffsubstitutionmatrixwithlineargappenaltyof8,andweusedapairwisealignmentadmissibleheuristic.
Tocreateanabstractstate-spacegraph,weusedastate-spaceprojectionfunctionthatignoresallbut3se-quences.
ThiscreatesanabstractstatespacewithO(l3)ab-stractnodes,wherelistheaveragelengthofasequence,comparedtoO(l5)nodesintheoriginalstate-spacegraph.
Table2showsthatstructuredduplicatedetectionreducestheinternal-memoryrequirementsofSweepA*byafactorofbetween10and40times.
Usingdisk,thealgorithmneedsonly20megabytesofRAMtosolveallinstancesinTable2,includingthespaceforthepairwiseheuristic.
Interestingly,theexternal-memoryversionofSweepA*runsfasterthananinternal-memoryversionofSweepA*thatdoesnotusestructuredduplicatedetection,byanaverageof72%!
Thereasonisthatstructuredduplicatedetectionrunsfasterinin-ternalmemorythanun-structuredduplicatedetectionduetolocalityofmemoryreferences,andthisspeedupoutweighstheextratimefordiskI/O.
Thespeedupismoreevidentinmultiplesequencealignmentthanthesliding-tilepuz-zlebecausethemultiplesequencealignmentsearchgraphismuchmorehighly-connected,whichsignicantlyincreasesthenumberofduplicatesgeneratedpernodeexpansion.
Speedingupinternal-memorysearchWeemphasizethatforboththeFifteenpuzzleandmulti-plesequencealignment,structuredduplicatedetectionim-provestheperformanceofinternal-memorygraphsearch,inadditiontousingexternalmemoryefciently.
Thisisbe-causeeachtimethesearchalgorithmchecksforduplicates,itonlyneedstocheckasmallsubsetofthestorednodes.
FortheFifteenpuzzle,thisleadstoa16%speedupininternal-memoryperformance.
Formultiplesequencealignment,itcutstherunningtimeofinternal-memorysearchinhalf.
Again,thespeedupismuchgreaterformultiplesequencealignmentbecauseinaveryhighly-connectedgraph,manymoreduplicatesaregenerated.
Thisunderscoresabenetofstructuredduplicatedetection.
Unlikedelayedduplicatedetection,itismoreeffectiveinhighly-connectedgraphs,wheretheproblemofduplicatedetectionismorecrucial.
ConclusionWehaveintroducedstructuredduplicatedetection,anovelapproachtolocalizingmemoryreferencesinduplicatede-tection,andshowedthatitcanreducetheI/Ocomplexityofexternal-memorygraphsearch,aswellasincreasethespeedofduplicatedetectionininternal-memorysearch.
Structuredduplicatedetectionhassomeadvantagesoverdelayedduplicatedetection,theonlypreviousapproachtoexternal-memorygraphsearch.
Inparticular,itismoreef-fectiveinhighly-connectedgraphs,anditcanexploitlocalgraphstructurethatisnotconsideredbydelayedduplicatedetection.
ButstructuredduplicatedetectionanddelayedduplicatedetectioncanalsobeviewedascomplementaryapproachestoreducingI/Ocomplexity,andcanbeusedto-getherfordifcultsearchproblems.
Ifasearchalgorithmcanleverageenoughlocalityinastate-spacegraphsothateveryduplicate-detectionscopetsininternalmemory,itisunnecessarytoeverdelayduplicatedetection,andstruc-turedduplicatedetectioncanmanageexternalmemorybyitself.
Butifanyduplicate-detectionscopedoesnottinin-ternalmemory(perhapsduetolackofsufcientlocalstruc-tureinthegraph),structuredduplicatedetectionbyitselfisnotsufcient.
Sincedelayedduplicatedetectiondoesnothaveaminimummemoryrequirement,itcanbeusedinthiscase.
Structuredduplicatedetectioncanbeusedtogetherwithdelayedduplicatedetectiontoimproveitsperformance,byleveraginggraphlocalityasmuchaspossible.
AcknowledgmentsWethanktheanonymousreviewersforhelpfulcomments.
ThisworkwassupportedinpartbyNSFgrantIIS-9984952andNASAgrantNAG-2-1463.
ReferencesAggarwal,A.
,andVitter,J.
1988.
Theinput/outputcomplexityofsortingandrelatedproblems.
Comm.
oftheACM31(9):1116–27.
Belady,L.
1966.
Astudyofreplacementalgorithmsforvirtualstorage.
IBMSystemsJournal5:78–101.
Edelkamp,S.
,andSchr¨odl,S.
2000.
LocalizingA*.
InProc.
ofthe17thNationalConferenceonArticialIntelligence,885–890.
Holte,R.
;Mkadmi,T.
;Zimmer,R.
;andMacDonald,A.
1996.
Speedingupproblemsolvingbyabstraction:Agraph-orientedapproach.
ArticialIntelligence85(1–2):321–361.
Klein,D.
,andManning,C.
2003.
FactoredA*searchformodelsoversequencesandtrees.
InProceedingsofthe17thInternationalJointConferenceonArticialIntelligence,1246–1251.
Korf,R.
,andZhang,W.
2000.
Divide-and-conquerfrontiersearchappliedtooptimalsequencealignment.
InProceedingsofthe17thNationalConferenceonArticialIntelligence,910–916.
Korf,R.
1985.
Depth-rstiterativedeepening:Anoptimalad-missibletreesearch.
ArticialIntelligence27:97–109.
Korf,R.
2003a.
Breadth-rstfrontiersearchwithdelayeddupli-catedetection.
InProceedingsoftheWorkshoponModelCheck-ingandArticialIntelligenceatIJCAI-03,87–92.
Korf,R.
2003b.
Delayedduplicatedetection:Extendedabstract.
InProceedingsofthe17thInternationalJointConferenceonAr-ticialIntelligence,1539–1541.
Mehlhorn,K.
,andMeyer,U.
2002.
External-memorybreadth-rstsearchwithsublinearI/O.
InProceedingsofthe10thAnnualEuropeanSymposiumonAlgorithms,723–735.
Munagala,K.
,andRanade,A.
1999.
I/O-complexityofgraphalgorithms.
InProceedingsofthe10thSymposiumondiscretealgorithms,687–694.
ACM-SIAM.
Sleator,D.
,andTarjan,R.
1985.
Amortizedefciencyoflistupdateandpagingrules.
CommunicationsoftheACM28:202–8.
Thompson,J.
;Plewniak,F.
;andPoch,O.
1999.
BAliBASE:Abenchmarkalignmentdatabasefortheevaluationofmultiplealignmentprograms.
Bioinformatics15(1):87–88.
Zhou,R.
,andHansen,E.
2003.
SweepA*:Space-efcientheuristicsearchinpartiallyorderedgraphs.
InProc.
of15thIEEEInternationalConf.
onToolswithArticialIntelligence,427–434.
Zhou,R.
,andHansen,E.
2004.
Breadth-rstheuristicsearch.
InProceedingsofthe14thInternationalConferenceonAutomatedPlanningandScheduling.
688SEARCH

CloudCone:$14/年KVM-512MB/10GB/3TB/洛杉矶机房

CloudCone发布了2021年的闪售活动,提供了几款年付VPS套餐,基于KVM架构,采用Intel® Xeon® Silver 4214 or Xeon® E5s CPU及SSD硬盘组RAID10,最低每年14.02美元起,支持PayPal或者支付宝付款。这是一家成立于2017年的国外VPS主机商,提供VPS和独立服务器租用,数据中心为美国洛杉矶MC机房。下面列出几款年付套餐配置信息。CPU:...

819云互联(800元/月),香港BGP E5 2650 16G,日本 E5 2650 16G

819云互联 在本月发布了一个购买香港,日本独立服务器的活动,相对之前的首月活动性价比更高,最多只能享受1个月的活动 续费价格恢复原价 是有些颇高 这次819云互联与机房是合作伙伴 本次拿到机房 活动7天内购买独立服务器后期的长期续费价格 加大力度 确实来说这次的就可以买年付或者更长时间了…本次是5个机房可供选择,独立服务器最低默认是50M带宽,不限制流量,。官网:https://ww...

CYUN专注海外精品服务器资源 国庆钜惠 最低5折起 限量促销

国庆钜惠 最低5折起 限量促销CYUN专注海外精品服务器资源,主营香港CN2 GIA、美国CERA、美国高防服务器资源,实体公司,ISP/IDC资质齐全,客服配备齐全。本次针对国庆推出非常给力的促销活动,旗下所有平台同享,新老客户同享,限时限量,售完截止。活动截止时间:2021年10月9日官网地址:www.cyun.net参与机型:香港CN2 GIA云服务器、香港双程CN2云服...

graphsearch为你推荐
Committeesios11内存nod32支持ipad买家google流量支付宝重庆宽带测速重庆云阳电信宽带测速网址谁知道,帮个忙?ipad如何上网苹果ipad无线上网卡怎么设置?ipad上网为什么ipad网速特别慢ipad上网ipad上网速度很慢怎么回事?win7telnet怎样开启Windows7系统中的Telnet服务
虚拟空间哪个好 免费名片模板 ubuntu更新源 java虚拟主机 闪讯官网 网站加速软件 我的世界服务器ip 西安主机 supercache 广东主机托管 免备案cdn加速 SmartAXMT800 hosts文件修改 卡巴斯基免费版下载 vpsaa kosskeb4 xendesktop 大容量存储方案 竞彩论坛空间 上海服务器托管 更多