interacp2pover

p2pover  时间:2021-05-21  阅读:()
JInternetServAppl(2012)3:329–346DOI10.
1007/s13174-012-0072-0SI:DATAINTENSIVECOMPUTINGLarge-scalevolunteercomputingovertheInternetFernandoCosta·JooNunoSilva·LuísVeiga·PauloFerreiraReceived:10July2012/Accepted:4October2012/Publishedonline:25October2012TheBrazilianComputerSociety2012AbstractCyclesharingovertheInternethasincreasedinpopularityduringthelastdecade,withincreasinglypowerfulmachinesbeingmadeavailabletoexistingprojects.
Inthispaper,wepresentGiGi-MR,aframeworkthatallowsnon-expertuserstorunCPU-intensivejobsontopofvolunteerresourcesovertheInternet.
GiGi-MRhasseveraldistinctivefeatures:itallowsnon-expertuserstoeasilypartitiontheirjobsinseveralparalleltasks;suchBag-of-Tasks(BoT)areexecutedinparallelasasetofMapReduceapplications;thevolunteerresourcesthatprovidethebestmatchforthetasksbeingexecutedarechosen(usingattenuatedbloomlters);itprovidesaportablecheckpointingfault-tolerancemecha-nismbasedonvirtualization;itdoesnotrelyexclusivelyonacentralserver(orservers)atalltimes(thusminimizingthebottleneckeffect);itdealswithmaliciousparticipants(pos-siblybyzantine)usinganefcientpartialreplicationmecha-nismtovalidatetheresultsobtained;anditiscompatiblewithBOINC(oneofthemostpopularopen-sourcesoft-wareplatformsforcomputingusingvolunteeredresources).
WedescribeGiGi-MR'sarchitectureandevaluateitsperfor-mancebyexecutingseveralMapReduceapplicationsonawideareatestbed.
Furthermore,weusemicro-benchmarkstoassesseachoneofGiGi-MR'scomponentsindependently.
Thesystem'soverheadisminimal.
Whencomparedtoanunmodiedvolunteercomputingsystem,GiGi-MRobtainsaperformanceincreaseofover60%inapplicationturnaroundThisworkwaspartiallysupportedbynationalfundsthroughFCT-FundaoparaaCiênciaeTecnologia,underprojectsPTDC/EIA-EIA/102250/2008,PTDC/EIA-EIA/108963/2008,PTDC/EIA-EIA/113993/2009andPEst-OE/EEI/LA0021/2011.
F.
Costa·J.
N.
Silva·L.
Veiga·P.
Ferreira(B)DistributedSystemsGroup,INESC-ID,TechnicalUniversityofLisbon,R.
AlvesRedol,9,1000-029Lisboa,Portugale-mail:paulo.
ferreira@inesc-id.
pttime,whilereducingthebandwidthusedbyanorderofmag-nitude.
KeywordsVolunteercomputing·Distributedsystems·MapReduce·Adaptivemiddleware1IntroductionTheuseofvolunteerPCsacrosstheInternettoexecutedistributedapplicationshasbeenincreasinginpopularitysinceitsinceptionintheearly1990s,withthecreationofprojectssuchasDistributed.
net,1Seti@home[3]orFold-ing@home[19].
TheseVolunteerComputing(VC)systemsharnesscomputingresourcesfrommachinesrunningcom-modityhardwareandsoftware,andperformhighlyparallelcomputations,alsocalledBag-of-Tasks(BoT),thatdonotrequireanyinteractionbetweennetworkparticipants.
ExistingVCsystemssupportover60scienticprojects,2andhaveoveramillionparticipants,rivalingsupercomputersincomputingpower.
Themostpopularmiddleware,BOINC[2],iscurrentlybeingusedbyover40projects,fromscienticeldsrangingfromclimatepredictiontoproteinfolding.
Projectsmusthavealargevisibilitytoattractenoughcycledonorsandbecomposedofhundredsofindividualtasksorworkunits.
Furthermore,projectcreatorsmusthavealargeknowledgeonC++orFortranprogramming.
Toachievefaulttoleranceduringtaskexecution,developersmustmod-ifytheirapplicationcodeandinsertexplicitcheckpoints.
Usersnotsatisfyingtheserequirementscannottakeadvan-tageofavailableremotecycles.
Eveniftheuserhasenough1Distributed.
netwebsite.
http://www.
distributed.
net.
2ListofactiveVCprojects.
http://www.
distributedcomputing.
info/projects.
123330JInternetServAppl(2012)3:329–346programmingknowledgetocreateaproject,iftheprojectisshortlengthenedornotcapableofattractingenoughdonors,thegainswillbelow.
Thiskindofoperationgreatlylimitsthescopeofuserscapableofcreatingprojectstoberemotelyexecuted.
1.
1GoalOurgoalistocreateaframework(calledGiGi-MR)thatallowsnon-expertuserstocreatejobsandsubmitthecor-respondingBag-of-TaskstoaVCsystem,supportingtheMapReduceparadigmandmakinganefcientusageoftheresourcesavailable,whilebeingfault-tolerantandresilienttobyzantineclientsandcompatiblewithBOINC.
Thereareseveralchallengesandrequirementstoconsider,inordertoachieveourgoal.
Firstandforemost,GiGi-MRmustbeabletotakeadvantageofthehugeamountofVCresourcesthatwepreviouslymentioned.
Wemustconsiderboththehardwarecapabilitiesofindividualmachinesandthenetworkbandwidththatisatourdisposal,atthelastmileoftheInternet.
Theplatformneedstobeportable,inordertohandletheheterogeneityofmachines,andadaptabletoenvi-ronmentalchanges(i.
e.
,resourceavailability).
Tothatend,itmustabletoorganizeclientsintoavirtualnetwork,andhavethemexchangeinformationthatisthenusedbytheserver.
OursystemmustalsobecompatiblewithexistingVCsolutions(e.
g.
,BOINC[2]).
Developingawholenewplat-formfromscratchwouldbeofnopracticaluse.
Therefore,wemusttakeintoaccountexistingsystemsandusetheirexistinginfrastructuretocomeupwithanalprototypethatcanactuallybeused,inareal-worldscenario.
Infact,oursolutionwouldundoubtedlybringsignicantdisadvantagesifitrequiredthatonlyoursystem'sclientswereattachedtoaproject.
3Toavoidthissituation,wemustguaranteecom-patibilitywithexistingprojects.
Anyclientmustbeabletorunanyprojectapplication.
Ontheotherhand,oursolutionmustsupportexistingapplications,andsuccessfullysched-uletasksonexistingclients.
Toincludenon-expertusersasjobcreators,twokeyrequirementsaretobemet:(1)theusersshouldbeallowedtousetheapplicationsorprogramminglanguagestheyareliterateon,and(2)thereshouldbeenoughcycledonorstospeedevensmalljobs.
ThesystemmustalsobeabletotakesequentialapplicationsrepresentativeofBoTproblems(withiterationsthatprocessdifferentdatasets)asinput,andmodifythemintoparalleltaskswithoutuserintervention.
Someapplications,duetobeingmorecomplexandnoteasily3AVCProjectrunsontopofexistingmiddleware(e.
g.
,BOINC)bydevelopinganapplicationanddeningallparametersconcerningitsexecution.
ProjectdevelopersonlyhavetomakesuretheirtasksareproperlyconguredandprovideapubliclyaccessiblemachinetoactastheVCserver.
transformedintoasetofmapandreducetasks,dorequiresomemanualintervention.
Thisisprovidedbymeansofasimpleinterfacethatnon-expertsuserscanuse(e.
g.
,todenewhichexecutableshouldrunwithwhichsetofdata).
Theexecutionofoursystemonunreliable,non-dedicatedresourcesrequiresfaulttolerancemechanisms.
Thismeansitmustaccountforunreachableclients,whichhavediscon-nectedfromtheserver,oraresimplyofine.
Oursolutionmustbeabletowithstandtransientserverfailures.
Thisisparticularlyimportantinourcasebecausewewillbedealingwithlongrunningapplications,withapotentiallyhighlevelofserverinteractions.
Weneedtopreventtheexecutionontheclientstocometoahalt,astheywaitfortheservertocomebackup.
Finally,wemustalsoconsiderbyzantinebehaviour.
Clientsmaymaliciouslyreturnincorrectresults,orinadver-tentlyproduceanincorrectoutputbyencounteringerrorsduringthecomputationordatatransfers.
Therefore,wemustprovideresultvalidationthataccountsforthisenvironmentandprovidesreliability.
1.
2ShortcomingsofcurrentsolutionsExistingsolutionsdonotfulllourgoalwhileensuringtherequirementsmentionedabove.
Wehighlightsomeofthoseshortcomingsinthissection(moredetailsinSect.
4).
Althoughcreation,distributionandexecutionoftasksovertheInternetarehandledbyexistingmiddleware,thereisstillasteepentrybarrierforanyonetryingtostartaVCproject.
ThismakescyclesharingovertheInternetaone-waydeal.
Computerownersonlyhaveoneroleintheprocess:todonatetheircomputers'idletime.
ThedevelopmentofBag-of-Tasksapplicationsforexecu-tiononmultiprocessorsorclustersrequirestheuseofAPIsnotdesignedforthiskindofproblem.
Forinstance,MPI[29]allowstheparallelexecutionoftasks,butwasdevelopedformuchmorecomplexparallelapplications,withhighdatacommunicationbetweentasks.
TheuseofsuchAPIsrequirestheprogrammerstolearnthem,andaddcomplexitytothenalparallelsolution.
ExistingVCsystemstypicallydonotprovideanytooltoconvertsimpler,sequentialapplicationstoparallelBoT.
AconsiderablelimitationofexistingVCsystemsistheirfocusonBoTapplications,withlittlecommunicationandwithoutdependenciesbetweenthetasks.
Asparallelanddis-tributedcomputingbecomestheanswerforincreasedscala-bilityforvariedcomputationalproblems,severalparadigmsandsolutionshavebeencreatedduringthelastdecade.
Inparticular,MapReduce[11]hastakenitsplaceasoneofthemostwidelyusedparadigmsincloudcomputingenvi-ronments,suchasAmazon'sEC2.
4Itswideuse,simplicity,andscalabilitymakeitaprimecandidateforexecutionon4AmazonEC2.
http://aws.
amazon.
com/ec2.
123JInternetServAppl(2012)3:329–346331VCsystems.
NoneofthecurrentVCplatformssupportMapReduce,aprogrammingmodelthatadaptswelltoadata-intensiveclassofapplications.
SupportingMapReducerequiresfundamentalchangesonexistingalgorithms,andtheintroductionofon-the-ytaskcreation.
Thisiscurrentlynotavailableonanypresentsystem.
TodealwithBoTapplications,schedulingandresourcediscoveryalgorithmsaredesignedwiththeleastcomplex-itypossible.
Despitereducingtheprobabilityofintroducingerrorsincomputationorvalidation,thisapproachunderes-timatesthebenetsoftakingadvantageofuserresources.
Currentsystemsarelimitedtospecifyingtheminimumhard-warerequirementsforeachcomputation,andtypicallydonotconsideradaptivealgorithmstodealwithever-changingmachineavailabilityandresources.
AserverinexistingVCsystemsisonlycapableofusinghostinformationperiodicallyreportedbyeachclientwhenrequestingwork.
Afterassigningaworkunit,theschedulercanmakeaneducatedguessonwhentheclientwillnishexecutionandrequestfurtherwork,basedonpastbehaviourandtaskdeadlines.
However,thereisnofurtherupdateofthisscheduleuntilthereisanotherrequest.
Thisgreatlyreducesthesystem'scapacitytopredictfutureworkrequestsandscheduletasksaccordingly.
MostVCsystemshaveacentralizedarchitecture,withallcommunicationgoingthroughasingleserver(orclus-ter).
Therearefewexceptionsandtheywerecreatedwithasmallerscopeorenvironmentinmind[8].
InBOINC[2],XtremWeb[5]andFolding@home[19],theserverorcoordi-natormustfullltheroleofjobscheduler,byhandlingallthetaskdistributionaspectsandresultvalidation.
Thisapproachinevitablycreatesabottleneck,asprojectsexpandandstor-ageandnetworkrequirementsbecomemoredemanding.
ExistingprojectssuchasClimateprediction.
netandMilky-Way@homehaveencounteredscalabilityproblemswhendealingwithlargelesorhavingthesamedatasharedbymanyclients[9].
Althoughsomepotentialsolutionshavebeenproposed[10,13],theyhavenotbeendeployedinthemostwidelyusedsystems.
Faulttoleranceismostlyconnedtotheclient-sideincurrentVCsystems.
Althoughsomeprojectsdohaveasetofmirrorsthatactasdatarepositories,allclientrequestsandtaskschedulinggoesthroughthecentralserver.
Therefore,anyserverfaultthatpreventsitfromcommunicatingwithclientshasaveryhighprobabilityofdisruptingclientsandstoppingfurthertaskexecution.
Finally,thereisaconsiderablelimitationwithrespecttoresultvalidationmechanisms.
Mostexistingsystemsarecontentwithprovidingintegralreplicationofdata,withoutconsideringcommunicationoverheadorpotentiallymoreattractivealternatives.
Thereisalsolittleornouseofredun-danttaskexecution(wecallthissamplingtechnique-moredetailsinSect.
2.
3)whichcanconstitutedeniteproofincasesofmaliciousbehaviour(userreturninganincorrectresult).
Insummary,existingVCsolutionsallowtheexecutionofBoTinamaster/workermodel,withsimplereplicationandfaulttolerancemechanisms.
Theyguaranteevalidresultsbutdonottakeadvantageoftheampleclientresources,andcre-ateahighentrybarrierforanyonewishingtotakeadvantageoftheirplatform.
1.
3Oursolution:GiGi-MRInthispaperwepresentGiGi-MR,aframeworkthatallowsordinaryuserstoexecuteMapReducetasksoverthelargescaleInternet,ontopofvolunteerresources.
MapReduceisattingchoiceforrunningdata-intensiveapplicationsontopofvolunteerresources,sinceitisapopularparadigm,representativeofdifferenttasks.
MapReduceleveragestheconceptofMapandReducecommonlyusedinfunctionallanguages:amaptaskrunsthrougheachelementofalistandproducesanewlist;reduceappliesanewfunctiontoalist,reducingittoasinglenalvalueoroutput.
InMapReduce,theuserspeciesamapfunc-tionthatprocessestuplesofkey/valuesgivenasinput,andgeneratesanewintermediatelistofkey/valuepairs.
Thismapoutputisthenusedasinputbyareducefunction,alsopredenedbytheuser,thatmergesallintermediatevaluesthatbelongtothesamekey.
Therefore,allreduceinputsareoutputsfromthepreviousmaptask.
Throughouttherestofthepaper,wewillrefertothemasmapoutputs.
Oursystemiscompatiblewithexistingsolutions(inpar-ticularBOINC),andprovidesuserswiththeabilitytosub-mitjobsthroughawebinterface.
GiGi-MRsupportsclienttoclienttransfers,thusminimizingthevolumeofdatasentthroughtheserver.
ThisalsoallowsGiGi-MRtotoleratetran-sientserverfailures,astheclientsdependmerelyonotherpeersfordata.
ItisalsocapableoftoleratingVCclients'failureusingreplication(i.
e.
,runningthesametaskonsev-eralVCmachines).
Byincreasingthereplicationfactor,theprobabilityofafailureofallclientsrunningacertaintaskislowered.
Byzantineclientbehaviouriscontrolledthroughtheuseoftaskvalidationintheserver.
Differentdatapartitioningavoursamongthetasksaresupported,andtheuseofsam-plingontheserverfurtherincreasessecurity.
Byreplicatingeachtask,itispossibletocomparetheoutcomeandacceptonlytheresultsinwhichaquorumhasbeenreached.
Ourframeworkfollowsalayeredapproach,rangingfromtop-leveluserinteractiontoolstolower-levelmodicationsthatarrangeclientsintoaconnectedtopology.
Wedecen-tralizesomeofthemechanismsofexistingsystemsthatplaceanexcessiveburdenonthecentralserver,bytakingadvantageofuserresources.
Additionally,weintroducenew123332JInternetServAppl(2012)3:329–346algorithmsforschedulingandvalidationthatincreaseoursystem'sadaptabilityandusefulness.
Taskschedulingisimprovedthroughtheuseofinforma-tionprovidedbyrunningclients,whichareorganizedinanoverlaynetwork[27].
Severalcriteria,suchasbandwidthorresourceavailability,aresubjecttoanalysisforthechoiceofneighbours.
Bloomlters[4]areusedtoidentifydifferenttypesofresources,fromapplicationstolibrariesorservices.
Thesystem'sresourcediscoverymechanismiscoupledwitharesourceevaluationalgorithmthatusesfuzzylogicandcombinedutilityfunctionstoprioritizehosts[28].
Thispaperisorganizedasfollows:GiGi-MRispresentedinmoredetailinSect.
2;Sect.
3describessomeimplementa-tiondetails,andpresentsmicro-benchmarksandexperimen-talresults,conductedwithseveralMapReduceapplications,onalargescaletestbed[7];relatedworkisdiscussedinSect.
4;andSect.
5concludes.
2GiGi-MRarchitectureGiGi-MR'shigh-levelarchitectureispresentedinFig.
1.
Aserverisresponsibleforschedulingandvalidatingtasks,whiletakingadvantageofinformationprovidedbyhostclients.
Clientsareorganizedintoanetworkoverlay,whichallowsthemtoexchangeinformationindependentlyfromtheserver.
GiGi-MRiscompatiblewithBOINC(BerkeleyOpenInfrastructureforNetworkComputing),themostsuccessfulandpopularvolunteercomputingmiddlewaretodate.
Con-sequently,ourclientcanparticipateinGiGi-MRaswellasinBOINCprojects,andborrowsmanyprimitivesandalgo-rithmsavailabletoBOINCclients.
TheGiGi-MRclientsoftwareinshowninFig.
1.
Thetoplayer,UserInterface,isresponsibleforuserinterfaceontheclient.
Userscanuseittotransformsequentialapplicationsintoparalleltasks,thusmakingthemrunnableonGiGi-MR.
Inaddition,thislayeralsoletsordinaryuserssubmittheirjobsfromtheirmachine,byregisteringtheapplication'sexe-cutablele.
Ontheserver,theWebInterfaceprovidesawebpageforuserstosubmitjobs,anddenetheirparametersandinputles(whicharethenuploadedtotheDataServer).
TheRPCInterfaceisresponsibleforinteractingwiththeclientwhenregisteringnewapplications.
TheMapReduceVClayerenablestheexecutionofMapReducetasksonthesystem.
Theserverstoresinforma-tiononeachjob'sparameters(e.
g.
,numberofmapandreducetasks)inacongurationle,whichisaccessedwhencreat-ingtasks.
Maptasksaredistributedtoclients,andonceallmappershavereturnedtheirresult,thereducetasksarecre-atedandscheduledforexecutiononreducers.
Aspreviouslymentioned,thetransferofmapoutputstoreducersisdonethroughinter-clienttransfers,withoutserverinterference.
Fig.
1GiGi-MRmodelThefollowinglayer,CheckpointandReplication,pro-videsacheckpointingmechanism,throughtheuseofvirtualmachines(VMs),andprovidesseveraloptionsforpartition-ingandreplicatinginputdata.
UsingVMsremovestheneedforchangestotheapplicationsourcecodetoachievetaskfaulttolerance.
ResourceDiscoveryisusedforenhancingtheserver'sschedulingperformance.
Clientsexchangemessageswithintheiroverlaynetwork,concerningtheircurrentavailabilityandvolunteeredresources.
ThisinformationisthensenttotheResourceUpdatesmodulewheneverthereisaninterac-tionwiththeserver(e.
g.
,workrequest).
Thebottomlayer,OverlayManagement,isresponsibleforroutingandaddressingintheoverlaynetwork.
Whenchangesinvolunteeredresourcesoccur,theyareannouncedtothenodesofthelocalnodeneighboursetthroughoutupdatemessages.
Theneighboursetisestablishedandmanagedatthislevel.
Moreover,thislayermaintainsalltheinforma-tionabouttheavailabilityofresourcesthateachnodeofitsneighboursethas.
Thislayerseparatesthesystemfromtheoverlaynetworkused,thusprovidingthefreedomofchoos-ingthemostappropriatesolution(e.
g.
,CAN[25],Chord[30],Pastry[27],etc).
123JInternetServAppl(2012)3:329–346333Eachlayerisdescribedinmoredepthinthefollowingsubsections.
2.
1UserInterfaceThetoplayerprovidestwofeatures:(1)transformationofsequentialapplicationsintoparalleltasks,and(2)theirsub-missiontoGiGi-MRbyordinaryusers.
Toperformatransformation,theusermustdenewhichmethodsandclassesshouldbeparallelized.
Thisinformationissavedinacongurationle,whichisreadbytheGiGi-MRclient.
Afterwards,itloadstheapplication,andtransformsitinrun-timesothatthespeciedmethodsareexecutedcon-currently.
Thetransformationitselfisperformedwithoutuserintervention.
Theresultingtasksaresubmittedtothesystemandexecutedremotely.
Thislayerisresponsibleforspawn-ingthenecessarythreads,andsynchronizingtheinvocationofthemethods.
TheproposedsolutionisimplementedinPythonandusesmetaclasses,allowingthemodicationofthecodetobedoneinrun-time,withoutanyneedtotransformandrecompilethesourcecode.
Thedevelopedmetaclassinterceptsallclasscreationsandmodiestheimplementationofthosethataretobeparallel,withoutanyuserintervention:theusermustonlystatewhatclasseshavemethodsthatcanbeexecutedconcurrentlywiththerestofthecode.
ThedistributionofworkamongseveralcomputersorprocessorsbyexistingsystemscanbedoneusinglibrariessuchasMapReduce,butrequirestheprogrammertoknowtheirAPI.
Oursystemremovesthisburdenfromtheapplica-tiondeveloperthroughrun-timecodeadaptation,andallowsthesubmissionofsequentialapplications.
Itisworthnotingthatusersmayskipthetransformationstep,asoursystemsupportsthedeploymentofparalleltasksandMapReduceapplications.
Tosubmittasks,andmakethemavailableforexecution,adeveloperwouldtypicallyhavetorunscriptsandconsolecommandsfromtheserver.
However,anordinaryusercantakeadvantageoftheUserInterfacelayer,whichprovidesaclientGUIandawebinterfaceontheservertofacilitatethesubmissionprocess.
GiGi-MRsupportsefcientexecutionofusersubmittedjobs,whileallowinganyusertohavetwocomplementaryroles:ownerofthejobsthatareexecutedonremotecom-putersandownerofthecomputerswherejobswillbeexe-cuted.
Toaccomplishthis,wemodiedboththeclientandserversoftware,anddevelopedacustomapplication.
Thedataprocessingcodeusedbythesejobscomprisescommod-ityapplicationsthatareinstalledintheremotecomputers,onlyaftertheirownersallowtheiruse.
ThejobsubmissionprocessisshowninFig.
2.
Tosubmitandcreatenewjobs,usersmust:(1)selectthecommodityapplicationthatshouldbeusedtoprocessthedataandregisterFig.
2UserjobsubmissiontoGiGi-MRFig.
3UserjobsubmissioninterfaceitthroughtheApplicationRegistrarGUI;(2)providetheinputles(dataorcode)totheDataServer,and(3)usetheserver'sWebInterfacetodenethenumberoftaskstocreate,thenameoftheoutputlesandtheargumentsthatshouldbeusedtoinvokethecommodityapplication.
ForaMapReducejob,theusermustprovideboththemapandreduceapplicationtobeused.
ThewebinterfaceisshowninFig.
3.
Inthispage,theuseruploadstheinputlesandselectstheapplicationthatshouldbeusedtoprocessthem.
Intheexample,theuserwantstoprocessale(anim.
pov)withthePOVrayraytracerandgen-erateamoviewith200frames.
InordertosubmitaMapRe-ducejob,theusermustprovideadditionalinformationsuchasthenumberofmapandreducetasks.
Aftercreatingandstoringtheinformationforeachjob,theserverwaitsforclientworkrequeststodistributetasks.
Onceitreceivesaworkrequestfromaclientwiththerequiredcom-modityapplication,itreplieswithtaskinformation(inputlesandarguments).
Onceallrequiredleshavebeendown-123334JInternetServAppl(2012)3:329–346loadedfromtheserver,theclientinvokesthecorrectcom-modityapplicationtoprocesstheinputles.
Aftereachjobcompletion,theclientsubmitstheoutputtotheserver,asanormalapplication.
Ingeneral,theapplicationsthatoursystemhandlesbestarethosewhichcanbeeasilydecomposedinasetofmapandreducetasks;thus,asanexample,Monte-Carlobasedapplicationsaregoodcandidates.
2.
2MapReduceVCThislayerisresponsibleforhandlingallaspectsofexecu-tionandmanagementofMapReducejobsonthesystem.
Aspreviouslymentioned,ausermustdenetheparametersoftheMapReducejobthroughtheUserInterfacelayer.
ThisinformationisstoredintheGiGi-MRserver.
OncealltheMapReducejobcharacteristicshavebeendened,theservercreatesthemaptasks,andstoresthisinformationintheGiGi-MRserver'sdatabase—theGiGi-MRdatabaseisresponsibleforholdingallpersistentinformationontasks,clients,andapplicationsbeingexecuted.
TheoverallGiGi-MRexecutionmodelforaMapReducejobispresentedinFig.
4.
WeconsidertwotypesofclientsinGiGi-MR:mappers,whichareresponsibleforbag-of-tasksinthemapstage;andreducers,whichperformtheaggregationofallmapoutputinthereducestep.
Agroupofmappersrstrequestsworkfromtheserver(1).
Theserverfollowsaschedulingprocedurewhichtakesintoaccounthostresourcesandavailability(seeSect.
2.
4forfurtherdetails)whenselectingwhichavailabletaskisassigned.
Wheneveritreceivesaworkrequest,itmatcheseachtask'spredenedhardwareorsoftwarerequirementstotheclient'smachinecharacteristics.
Iftheclientisthemostsuitableforthetask,theserverassignsitthetaskandsavesthisinformationinitsdatabase.
Afterselectinganappropriatemaptaskfortherequestingmapper,theserversendsbackinformationonthetaskthatthemappermustexecute.
Thisinformationincludesthelocationofinputandexecutableles,thedeadlinefortaskcompletionandthepreviouslymentionedtaskrequirements.
Themachinesholdinginputandexecutablelesarecalleddataservers.
AlthoughsomeVCprojectsdouseasetofmir-rorstoactasdataservers,moststorethedatainthecentralserver,asrepresentedinFig.
4.
Themappermustthendownloadtherequireddatafromthedataserver(2)beforestartingthecomputation(3).
Afterthetaskexecutioniscompleted,themappercreatesanMD5hashforeachofthemapoutputles.
Therefore,attheendofthecomputation,eachmapperisleftwithboththemapoutputlesandthesamenumberofcorrespondinghashes.
Thesehashsumsaresentbacktotheserverinplaceoftheoutputles(4)(soitiscompatiblewithcurrentVCsolutions,e.
g.
,BOINC).
It'sworthytonotethatthisgreatlyreducestheuploadvolumefrommapperstotheserver.
Fig.
4GiGi-MRMapReducejobexecutionThehashesarecomparedattheservertovalidateeachcorrespondingtask(5).
Iftheresultisvalid,themapper'saddressisstoredinGiGi-MR'sdatabase(6).
Eachtimeamapresultisvalidated,theGiGi-MRserverchecksifallmaptaskshavebeenexecutedandvalidated.
Oncethisconditionismet,theservercreatesthepredenednumberofreducetasks.
Uponreceivingaworkrequestfromareducer(7),theserverfollowstaskschedulingprocedurementionedearlierinthissectionandlooksthroughthedatabasetondataskthatcanbeassigned.
Onceithasascertainedthatthereducermeetsallthehardwareandavailabilityrequirements,theserverreplieswithareducetaskthattstherequest.
MapReducejobsrequirecommunicationbetweenmapandreducestagessincemapoutputsareusedasinputforreducetasks.
Inthereducestep,eachtaskperformsjoinoper-ationsonthemapoutputs.
Therefore,eachreducetaskmustobtainallthemapoutputsthatcorrespondtothekeyrangeitisresponsiblefor.
ToachievegoodperformanceinMapReducejobs,weleverageclients'resourcesbymovingasmuchofthecommunicationaspossibletotheclient-side.
Thishelpsreducetheloadonthecentralserver,andcreatesamoresuit-abledecentralizedmodelfordata-intensivescenarios,typicalofMapReduce.
Notethat,aspreviouslystated,incurrentVCsystemsalldatawouldhavetobeuploadedanddownloadedfromtheserver.
However,theGiGi-MRserverstorestheaddressofallmappersthatreturnedvalidmapresults.
Thisinformationisincludedintheworkrequestreply,andallowsreducerstodownloadthemapoutputdirectlyfromthemappers,withouthavingtogothroughtheserver(8).
Oncetheinputleshavebeendownloaded,thereducetaskisexecuted(9)andthenalresultisreturnedtotheserver(10)forvalidation.
2.
3CheckpointandreplicationThislayerisresponsiblefor:(1)checkpointingtaskstoaccountfortaskfailureandallowrestartsinremotenodes,123JInternetServAppl(2012)3:329–346335withoutanysourcecodemodication,throughtheuseofVMs;(2)providingdifferentoptionsforpartitioninginputdata,chosenbytheuser;(3)localsamplingattheserver(moredetailsafterwardsinthissection),forvalidationpur-poses.
InFig.
1,theReplicationandSamplingmodulesrep-resentthislayerintheserver.
Anapplicationcanbecheckpointedifwerunitontopofavirtualmachine(VM)withcheckpoint/restartcapabilities(e.
g.
,qemu5),astheapplication'sstateissavedwithinthevirtualmachine'sstate.
Thisalsoprovidessomeextrasecu-ritytotheclients,sincetheywillbeexecutinguntrustedcodewithahighlevelofconnement.
Furthermore,usingvirtualmachinesallowsustoreducetheimpactofbyzantinebehaviour,causedbythedifferentsoftwareandhardwarecongurationsfoundateachmachine.
ByrunningtasksontopofVMs,thesamesoftwaredriversandprogramsareusedduringexecution.
Thisguaranteesthateachtaskproducesthesameresultregardlessoftheunder-lyingsystem.
VMsalsohelpdevelopersbyremovingtheneedforbuildingmultipleapplicationversionsfordifferentarchitectures.
Currently,therearemanyVMsavailablethatcanbeusedindesktopcomputers.
TheoverheadofsuchaVM,whencomparedtoacaseinwhichthereisnosuchsoftwarelayer,isnegligibleasismostlyprovedbythelargeamountofinstal-lationsusedbothinacademicandnon-academicsettings.
Themajordrawbackofthisapproachisthesizeofthecheckpointdata,incurringconsiderabletransmissionover-head.
Toattenuatethis:(1)weassumethatonebase-genericrunningcheckpointimageisaccessibletoalltheclients;(2)theapplicationsstarttheirexecutionontopofthisimageonceitislocallyresumed;and(3)atcheckpointtime,weonlytransmitthedifferencesbetweenthecurrentimageandthebaseimage.
GiGi-MRprovidesredundantcomputinginwhicheachcomputationisperformedonmultipleclientsthroughthereplicationofinputles.
Whenasufcientnumberofsuc-cessfulresultshavebeenreturned,theGiGi-MRservercom-paresthemandseesifthereisaconsensus.
Inthatcase,thecorrespondingoutputsareconsideredvalid.
EachreplicationmethodprovidedbyGiGi-MRisbasedonadifferentdatapartitioningtechnique,whichconsistsofdividingataskintomultiplesubtasksthatexecuteseparately.
Thisisachievedbysplittingtheinitialinputleintoseveralsmallerchunks,andrequirestheapplicationtobecompletelyparallel.
Foranapplicationtobeamenabletodistributedcomputation,itmustbepossibletohaveitsworkpartitionedinmultipletasksthatrunseparately.
Throughtheuseofdatapartitioningandtaskreplication,GiGi-MRisabletodetectcollusionandvalidateresultsby5Qemuisagenericandopensourcemachineemulatorandvirtualizer.
http://wiki.
qemu.
org/Fig.
5Thesameworkdivideddifferently,creatinganoverlappedpar-titioningcomparingtheoutputsofredundantcomputations.
However,thetechniquesusedtoidentifyincorrectresultsincurcon-siderableoverhead.
Noneoftheexistingresultvericationtechniquesisabletoensurewith100%certaintythataresultiscorrect,thoughinsomecasestheycanidentifyanincorrectone.
Thedegreeofcertaintythataresultiscorrectusuallygrowsalongwiththeoverheadthetechniqueincurs.
There-fore,acompromisebetweentheoverheadandthereliabilityoftheresultscanbefound,andmustbedynamicallyadapt-abletothevariableconditions/resourcesofthesystem.
Thislayerproposesanumberofdatapartitioningapproachesandacomplementarysamplingtechnique,whichgivetheuseramplechoiceonhowtoreachthedesiredcom-promise.
Thesupportedpartitioningtechniquesarepresentedinthefollowingsections.
2.
3.
1OverlappedpartitioningUsingoverlappedpartitioning,thetasksareneverexactlyequal,eventhougheachindividualpieceofdataisstillrepli-catedwiththepredeterminedfactor.
Colludersmustalwaysexecutepartofthetask,evenwhentheyaretryingtoreturnforgedresults.
Figure5depictsthesamework(inputle)dividedintotwodifferentoverlappedpartitionings,withtwodifferentsetsofpartitions.
Theleisdividedintosixchunks,butfollowingdifferentdivisionoffsets(wheretosplittheinitialle).
Thereare11differentcomparisonpoints(com-monchunksbetweentwopartitions)betweeneachsetofpartitions,insteadofthetypical6ofanintegralreplication(assumingadivisionoftheleinsixdifferentpartitions).
Theseoverlappedpartitionscanusearandomoffsetandrequirestrongcommunicationamongthecolluderstoiden-tifythecommonpartofthejob.
Althoughitismoreprobableforcolluderstohavecommonpartsofthetasks,thesecom-monpartsaresmaller.
2.
3.
2RelaxedpartitioningOverlappedpartitioningcanbeimplementedinarelaxedavour,whereonlysomepartsofthejobareexecutedredun-dantly.
Thislowerstheoverhead,butalsolowerstherelia-bilityoftheresults.
However,itcanbeusefulifthesystemhaslowcomputationalpoweravailable.
Maliciouspartici-pantsareabletodetectthecommonpartofthejob,howevertheycanneverbesurethatthenon-commonpartisnotbeing123336JInternetServAppl(2012)3:329–346Fig.
6OverlappedtasksforrelaxedreplicationFig.
7Meshedpartitioningusingreplicationfactor2executedredundantly.
Figure6depictsarelaxedoverlappedpartitioning.
Wecanseethatineachpartitioningscheme,theleisdividedintothreepartitions.
However,thosepartitionsdonotencompassthewholele(i.
e.
,therearepartsofthelethatarenotreplicated).
Therefore,thecomparisonpointsinwhichwecanvalidatetheoutputaremuchsmallerthanthewholele.
2.
3.
3MeshedpartitioningSomeapplicationscanhavetheirworkdividedinmorethanonedimension.
Figure7depictsthepartitioningoftheworkforaray-tracer.
Theinitialinputleissplithorizontallytocreatetherstfourpartitions,andthenverticallytocreatetheremaining4.
Whenvalidatingtheresults,thereare16(4*4)comparisonpointsbetweenpartitions.
Liketheover-lappedpartitioning,thisinuencesthewaycolludersareabletointroducebadresults:morepointswheretheycancollude,withasmallersizetoo.
Thispartitioningprovidesanum-berofpointsofcomparison,whichareusedtoestablishthe"reputation"ofaresult.
Eachtask'soutputiscomparedto4othertasks'outputs,accordingtotheexistingcomparisonpoints,andisevaluatedaccordingtothenumberofconsen-sualresults.
Forexample,inFig.
7the1stpartitionof"Par-titioning1"willhavecomparisonpoints1,2,3and4(eachoneforadifferentpartitionof"Partitioning2").
Thealgorithmforcalculatingthe"reputation"ofaresultmusttakeintoaccounttheoutcomeofcomparisonpoints(i.
e.
,equalordifferentoutput).
Sincethemajorityofthepar-ticipantsisexpectedtobehonest,ndingthesameresult(equal)addspositivereputationwhileadifferentoutcomeaddsnegativereputation.
Fortheacceptanceofeachpoint,equaloutputsfrombothtasksareacceptedonthey,whiledisparateoutputsaredisambiguatedaccordingtothecom-binedreputationofthetwotasksthatproducedit.
Forexample,iftask1produces4correctoutcomes,whiletask2produces2incorrectand2correct,thentask1wouldhaveabetterreputation(consideringothertasksalsoproducedcorrectresults).
Thus,inthediscrepantcomparisonpointbetweentask1and2,task1'soutputwouldbeaccepted.
Ifthereputationofbothtasksisthesame,thecommonpor-tionofbothresultsmustbere-executedtoachieveavotingquorum.
2.
3.
4SamplicationSamplingconsistsonthelocal(intheserver,inourcase)executionofafragment,assmallaspossible,ofeachtasktobecomparedwiththereturnedresult.
Inessence,samplingpointsactashiddenembeddedquizzes.
Replicationbasesallitsresultvericationdecisionsinresults/infoprovidedbythirdparties,i.
e.
,theparticipantworkers.
Inanunreli-ableenvironment,thismaynotbeenough.
Therefore,localsamplingbytheservercanhaveanimportantplaceinthevericationofresults.
Samplingensuresthatthemaliciousparticipantsexecutepartofthetaskforthistohaveanychanceofbeingaccepted.
Althoughrandomsamplingcanonlyensurethataresultiscorrectwithagivenprobability(basedonthesizeofthework,thenumberofsamplesandthepercentageoftheworkthatiscorrupted),itcanidentifywrongresultswithcertaintyanddeliververyusefulinformationtoareputationmechanism.
WedeneSamplicationasthecombinationofreplicationandrandomsampling,usedsequentiallytoachievehigherreliabilityoftheresults:thewinningresultofthevotingquo-rumsisconsideredcorrectifitmatchesarandomsamplethatwasexecutedbytheserver.
ThistechniqueisapplicabletoMapReducejobsbyhavingtheserverrunasmallpartofamapinputandthencheckagainstthereturnedoutputs.
Forexample,ifrunningawordcountapplication,theserverwouldcountthewordspresentinasmallpartofaninputle,andcheckiftheywerepresentinatleastthesamenumberinsidethereturnedoutputles.
Samplicationallowsthesys-temtotakeadvantageofthebestofbothmechanisms,whileaddingonlymarginaloverhead(denedbytheapplicationowner).
Finally,samplicationisalsousedtomakesurethattheparallelizationofsequentialtasks(providedintheUserInter-facelayer,Sect.
2.
1)doesnotaltertheexpectedresult.
Tothatend,theGiGi-MRperiodicallyexecutestheoriginalappli-cationinthebackground,ofine,sequentiallyandcomparesitsresultswiththedistributedversion.
2.
4ResourcediscoveryThislayerisresponsibleforimplementingtheResourceDis-coverymechanismsontheGiGi-MRclients(thatexecuteeithermaporreducetasks).
Itisextremelyimportantfortheschedulingalgorithmusedbytheserversincetheinformation123JInternetServAppl(2012)3:329–346337obtainedbytheclients,throughtheexchangeofresourceandavailabilitydata,issentbacktotheserver,totheResourceUpdatesmodule.
Thismoduleupdatestheserverdatabasewithhosts'updateddata,andisaccessedbytheSchedulerwheneverreplyingtoaclientworkrequest.
Thisway,theserverismorefrequentlyupdatedwithcurrentknowledgeonhosts,andisabletoperformmorereliableschedulingdecisions.
OurResourceDiscoverylayerisalsocapableofsearchingnotonlyforphysicalresources(e.
g.
,CPU,Memory,etc.
),butalsoservices(e.
g.
,facialrecognition,high-resolutionren-dering,etc.
)andapplications(e.
g.
,ffmpegvideoencoder,programminglanguagecompilers,etc.
)IntheGiGi-MRclient,eachtypeofresourceisassignedavaluefrom0to1,where0meansthattheresourceisunavailableand1thattheresourceispowerfulandhasgoodavailability.
Theglobal(amongalltypesofresources)avail-abilityvalueofaremotenodemaybeobtainedthroughasimpleadditivemodel[15].
Inthisway,wedenetherela-tiveimportanceofeachtypeofresourcebydeningweights(usingmethodsliketheswingweights).
Withthem,itisthenpossibletomakeaweightedsumandobtaintheglobalavail-abilityvalue,whichwouldbethenoderate.
AsalreadymentionedinSect.
2.
2,GiGi-MRsupportsinter-clienttransfers,whichreducetheburdenontheserver,andimproveperformanceonmoredata-intensivescenar-ios,suchasMapReducejobs.
Therefore,determiningtheavailablebandwidthbetweennodescanbeoftheutmostimportance(e.
g.
,whenschedulingreducetasks).
However,measuringbandwidthofasinglenodeintheseenvironmentscanyielddisparagingresults.
Ourapproachistocheckthetimeforamessagetotravelfromonenodetoanotherandbackagain(i.
e.
,theround-triptime,RTT).
Toavoidoodingthenetwork,weonlyhaveeachclientcontactasmallsub-setofremotenodes,calleditsneighbourset.
Withinashortperiodoftime,theminimumRTTvalueobtainediskeptandthebandwidthiscalculated.
Theresultsobtainedfromthisprocessarethenpassedontheserver.
Withoutproperneighbourselection,thisinformationwouldnotbeveryhelpful.
Slowernodescouldbecoupledwithfarawaynodes,ormachineswithfasterconnectionsthatwouldnotbetakenadvantageof.
Therefore,thislayerprovidesGiGi-MRclientswithaneighbourselectionmech-anismthatmaximizesthesystemperformancemetrics.
Ouralgorithmconsiderstwoparametersassignicant:proximityandresourceavailability.
ProximityismeasuredthroughRTT,andincludesbandwidth.
Eachpeercontactsothernodesuponbootstrapand,periodically,onceithasenteredthenetwork,recordstheRTT.
Theavailableband-widthisinferredfromthesecontacts,aswellasfromanyinter-clienttransfersthatoccurwhenexecutingaMapRe-ducejob.
Theresourceavailabilityparameterisdenedthroughthepreviouslymentionednoderate(additivemodelofaremotenode'sresources),andisincludedinthesecon-tactmessages.
Theselectionofneighboursisthenbasedonaweightedmeasureofbothproximityandnoderate.
Theweightofeachparameterisdenedbytheapplicationdevel-oper(defaultsto0.
5each).
Oncereportedtotheserver,theneighboursetinformationisextremelyusefulfortheserverwhenschedulingtasks.
Asanexample,whensubmittingareducetask,theserverisabletocheckifanyoftheneighboursofthenoderequestingworkisexecutingamaptask.
Ifthisistrue,andtheavailablebandwidthbetweenbothislargeenough,theservercanmakethisnodeareducer.
If,ontheotherhand,therequestingnodehasverylowbandwidthtoallitsneighbours,theserverisabletodeducethatthisnodehaslowuploadbandwidth.
Itismarkedasuntforadata-intensivereducetask,andamorecomputeintensiveapplicationisselectedinstead.
2.
4.
1UsingbloomltersAttenuatedbloomlters(ABF)wereproposedin[26]tooptimizelocationperformance.
ItusesanarrayofBloomFilterswithdepthd,whereeachrowi,for1≤i≤d,cor-respondstotheinformationstoredatnodesihopsaway.
Asthedepthincreases,moreinformationwillbestoredinthatBloomFilterrow,makingtherespectiveltermoreattenu-atedandresultinginahigherprobabilityoffalsepositives.
Therefore,informationclosesttothenodeismoreaccurate,andbecomeslesssoasthedistancebetweennodesincreases.
Usingitinoursystem,eachnodeinthenetworkkeepsacachedversionoftheABFofitsneighbours.
Thisinforma-tionisthencombinedintoonesingleABFbycalculatingtheunionofeachBloomFilteratthesamedepthfromallneigh-bours.
Forinstance,saynodeAreceivesthefollowingABFfromitsneighbourswithdepthd=2:(00011,10000)and(11001,00001).
Tocombinetheinformation,theORopera-tionisperformedforeachdepth.
So,ford=1,theresultinginformationis11011,andford=2itis10001.
TheseaggregatedABFaresenttotheserver,onceineverynworkrequests(iftherehavebeennochangessincethelastrequests,theyarenotincluded),andsavedbytheResourceUpdatesmodule.
Thismoduleordersthemaccordingtothenode'sexpectedavailability(howsoonitisexpectedtobeavailableforexecution),andtheFilters'depth(lowertohigher).
SavingallthereceivedABFwouldbeimpossible,andcreateincredibleoverhead.
Therefore,theserverusestimestampstomarkthevalidityofeachone.
WheneveranABFhasbeeninthesystemformorethanthetime-outinter-valspecied,itisdiscarded.
ThiskeepsthenumberofABFtoareasonablenumber,whilestillbeingusefulforscheduling.
Whenevertheserverreceivesaworkrequest,itcheckstheavailabletasksand,ifthereareanygoodmatcheswiththerequestinghost,theyaresentinreply.
However,inthecaseofamismatch,theSchedulercontactsResourceUpdatesand123338JInternetServAppl(2012)3:329–346checksifthereisanynodewhichisabettermatch,andthatisexpectedtobecomeavailablewithinashorttimeframe.
ThissearchisconductedbystartingwithanABFwithadepthof1(neighbourstothenodethatsubmittedthem).
Ifthereisahit,theserverlooksintheDatabase(DB)forothertasksmoresuitableforthishost.
However,ifthequerydoesnotreturnanymatches,thetaskswhoseminimumrequirementsarefullledbytherequestingclientaresubmitted.
Inthisway,thetypicalschedulingalgorithmservesasafail-safemethod,ensuringthattasksareexecutedeveniftherearenooptimalhoststorunthem.
Informationaboutresources,applications,andservicesofferedbyeachnodearerepresentedinsideaBloomFilter.
However,becauseaBloomFilterisonlycapableofperform-ingmembershiptestsgivenakey,weneedtostoreinforma-tionaboutthoseresourcesintheactualkey.
Forexample,sayanodehasaCPUof3GHz,wecannotsimplystorethename"CPU"intheBloomFilter,astheonlyinformationwecanextractfromthatisthatanodehasaCPU.
Weneedtoaddinformationabouttheactualresource(e.
g.
,itsvalue:3,000MHz)tothekeythatisinsertedintheBloomFilterforittobeuseful.
BloomFilterkeysstoreresourceinformationbyfollowinganamingconvention,andareusedtodifferentiatebetweenresourcesandtheirvalues.
Ournamingconventionusesa3-levelnamespace,eachseparatedusingthecolon(":")asadelimiter,withthefollowingrules:Level1—NameoftheResource,Service,orApplication(e.
g.
,CPU,ffmpeg,etc);Level2—TypeoftheResource,Service,orApplica-tion(e.
g.
,MHz,version,etc.
);Level3—ActualvalueoftheResource,Service,orApplication.
Forinstance,ifwewantedtostorethefactthatanodehasaCPUof3GHz,thekeywewouldinsertintotheBloomFilterwouldbe:"CPU:GHz:3".
Thenamespacedenitioninstoredinacongurationleintheserver,whichisprovidedtotheclients.
Someresourcesaremostlystaticanddonotchangeoften,liketheOperatingSystem,orCPUandDiskspeed.
How-ever,thereareotherresourceswhosevaluescanchangequiteoften,suchasamountofRAMoccupied,ortheamountofCPUinuse.
Forthosecases,ifweusedaclassicBloomFilterthenitwouldneedtoberebuiltperiodicallysinceitdoesnotsupporttheremovalofelements.
Moreover,thisrebuildingprocedurewouldrequireresendinginformationaboutresourcesthatarenotexpectedtochange,thuswast-ingbandwidth.
Therefore,insteadofusingaclassicABFtostoretheinformationaboutthedynamicresources,aseparateCountingABF[12]isused.
2.
5OverlaymanagementAswementionedpreviously,thislayeractsasaninterfacebetweenthesystemandanunderlyingnetworkthatconnectsGiGi-MRclients.
ThisrequirestheuseofarobustP2Pover-lay.
Inourexample,weusePastry[27],ageneric,scalableandefcientDistributedHashTable(DHT),butanyothercouldbeused.
NodeidentiersarerandomlygeneratedandassignedtoapreciselocationonthecircularaddressingspaceofPastry.
Bydoingso,themachinesholdingadjacentnodescouldbecompletelygeographicallydispersed.
Asabootstrapmechanism,theGiGi-MRserverprovidestonewclientsalistofentrypoints(bootnodes'IPaddressandport),correspondingtosomehostswithhighuptime(possi-blyservers).
Eachnodeinsidetheoverlayreceivesinforma-tionontheresourcesofasmallnumberofremotepeers,partoftheirneighbourset.
Theneighboursetisextremelyimpor-tantforoursystemsince,aswementionedbefore,itidentieswhichremotenodes'informationissentbacktotheserveroneachworkrequest.
Nodesadvertisethemselvesbysendingupdatemessagestotheirneighbourswheneverthereisasig-nicantchangeinresourceavailability.
Thesemessagesarealsosentperiodicallytokeepthemupdated.
Therefore,anychangesinresourceavailabilityareannouncedtothenode'sneighbours.
Thesemessagescontainthesendernode'srelatedinfor-mation:itsidentier,itssupportedapplicationidentiers,thetimerequiredforthisinformationtoexpire,anditsresourceavailability(e.
g.
,CPU,bandwidth).
Uponreceivingthisinformation,aneighbournodecalculates,withitsownjudgement,theglobalrateoftheannouncernode.
Thisjudge-ment,asdescribedbefore,consistsofassociatingweightswiththemeasuredavailabilityofeverysingleresource.
Theproximitylevelbetweentheannouncernodeanditsneigh-bourisalsotakenintoaccount.
Insummary,thislayerhandlesallcommunicationbetweenGiGi-MRclientsandtheoverlaynetwork.
Allmessagesreceivedfromtheupperlayersaresenttothenetwork.
Theoverlaycontactsthislayerwheneverthereisamessagemeantforthenoderelatedtoresourceupdates.
Finally,allchangestothenode'sneighbourset(e.
g.
,remotenodeleaving)arereported.
TheResourceDiscoverylayerdealswiththosechangesappropriately.
3ImplementationandevaluationThissectionrevealssomeoftheimplementationdetails,presentstheresultsofourexperimentsanddescribestheapplicationsweuse.
3.
1ImplementationGiGi-MRisdesignedontopofaBOINCclientversion6.
11.
1andserverversion6.
11.
0.
Forthenetworkmanagement,theOverlayManagementlayerusestheFreePastry6toolwhichisaJavaimplementa-tionofthePastryoverlay.
6FreePastry.
http://freepastry.
rice.
edu.
123JInternetServAppl(2012)3:329–346339Tomeasureresourcessothattheycouldbecomparedagainsteachotherinasimpleadditivemodel(usedintheResourceDiscoverylayer,describedinSect.
2.
4),wehavetoconvertdirectindicatorsofavailabilityintoacommonscale,ratedfrom0to1.
Therefore,werelyonthefollowingexpressiontodothatconversion:fr(x)=min(1,x/MAXr).
MAXristhevaluethatweconsiderasverygoodfortheresourcer,andxisthedirectmeasuredvalue.
Forexample,ifweconsiderMAXCPU=500andx=250,weobtainf(250)=0.
5.
Inaddition,FreePastryprovidesaproximitymetric(basedontheRTTvalue)thatisalsoconvertedtothecommonscaleandusedintheadditivemodel.
Therefore,theglobalavail-abilityvalue(i.
e.
,theglobalnoderate)iscalculatedthroughthefollowingexpression:NR(a)=kr(a)·vr(a);kr(a)istheweightoftheresourcerinthenodea,andvr(a)thevalueoftheresourcerinthenodea(i.
e.
,fr(x)).
Furthermore,theuserisfreetodenetheweightsandtheverygoodreferencevalueassociatedwitheachresource.
Todifferentiatemaptasksfrom"normal"ones(i.
e.
,non-MapReducetasks),theMapReduceVClayermodiestheirtemplatesbyadding""tagswithadditionalinformationsuchasjobidandstage.
TheGiGi-MRserverusesanadditionalgeneralcongurationle(inXML)tocoordinatebetweenstagesandhandletaskcreation.
GiGi-MRclientsuseTCPforinter-clienttransfers(betweenmap-persandreducers),duetoitsreliabilityandsimplicity.
AmapperopensaTCPsockettolistenforincomingconnec-tionswheneverithasnishedamaptaskanditsoutputisavailable.
Incomingrequestsfromreducersareacceptedonlyforspeciedmaples,andthesocketisclosedwhentherearenomorelesavailableforupload.
IntheUserInterfacelayer,theinteractionbetweentheApplicationRegistrarandtheGiGi-MRserverismadebyXML-RPCcalls.
JobinformationorganizationwithintheGiGi-MRserverimpliesonemodication:allusersubmittedjobsareprocessedwithinthesameGiGi-MRprojectbutmaybelongtodifferentuserprojects.
Toaccommodatethisnewinformation,anewtable(UserProject)hadtobeaddedtotheserverdatabase.
Furthermore,aCommodityApplicationtablewasaddedtoaccommodatethenamesandversionsofthecommodityapplicationsavailableonremotehosts.
3.
2EvaluationWeevaluateGiGi-MRbyrunningseveraltestsovertheInter-net,inascenariothatresemblesatypicalVCenvironment.
WerunexperimentswiththreedifferentMapReduceappli-cations(wordcount,invertedindex,andN-Gram)togaugeoursystem'sperformanceunderdifferentconditions.
Inaddi-tion,inordertoevaluateeachcomponentindependentlywerunmicro-benchmarks,tailoredtomeasuretheimpactandTable1EvaluationofapplicationtransformationOutsideGiGi-MRInsideGiGi-MROriginalModied1CPU2CPU3CPUTime(s)60.
0161.
9364.
0832.
5916.
97overheadofthedifferentlayersinoursystem.
Thissectionpresentstheresultsofourexperiments.
3.
3UserInterfaceInthissection,wepresenttheexperimentsforthetwofea-turessupportedbytheUserInterfacelayer:transformationofsequentialapplicationsintoparallelBoTs(Sect.
3.
3.
1);andsubmissionofjobsbynon-expertusers(Sect.
3.
3.
2).
3.
3.
1ApplicationtransformationOurevaluationistwofold:(1)functional,developingsampleapplicationsandexecutingondifferentenvironments,and(2)quantitative,whereweshowtheoverheadincurredusingoursolution.
WeparallelizeaMonte-Carlo[23]computationtointe-grateonefunction.
Insteadoftreatingeachrandomvalueinasequentialway,eachtaskisresponsibleforobtainingpartofthesolution.
Inordertousethisfeature,thedenitionofaclassisnecessary,whileamoresimplesolutionwouldonlyrequirealoopwiththecomputationcodeinside.
Theover-headincurredusingGiGi-MRisminimalandeasilyoutdonebytheparallelizationgains.
Table1showstheoverheadwhenrunningitonasinglemachine.
ThisevaluationwasperformedonanIntel(R)Core2QuadCPUwith4coresrunningat2.
40GHz.
Thetestedapplica-tionintegratesonecomplexfunctionusingtheMonte-Carlomethodwhilegenerating50millionrandompoints.
AsseeninTable1,thereisanincreaseofexecutiontimewhenrun-ningthemodiedversionandusingGiGi-MR.
Oneofthereasonsfortheexecutiontimeincreaseisfromtherewritingoftheapplication:theinclusionofobjects,andtheincreaseofcycleinteractionandmethodcalls.
Moreoverheadisaddedbyoursystem.
Intheversionwith1CPU,differentthreadsforeachobjectwerecreatedbutserializedwiththehelpofalock,guaranteeingthattheyallexecutedonthesameproces-sor.
Itisobservableanincreaseofabout2sontheexecutiontimeleadingtoanoverheadofabout1/8ofasecondforeachparallelobject.
Iftasksarelonger,theseoverheadswillhavealowerimpact.
Furthermore,withconcurrentworkingprocessorsalloverheadissubduedbythegainsofconcurrentprocessing.
123340JInternetServAppl(2012)3:329–346Fig.
8Animationmovierenderingtimes3.
3.
2JobsubmissionbyordinaryusersToevaluatetheusabilityandperformancegains,wedeployedaGiGi-MRserverandallowedsomeclientstouseit.
Theexperimentsweredoneonourlocalnetwork,topinpointtheoverheadbroughtonbyoursystemmoreprecisely.
Theexperimentconsistsonusingaraytracertogenerateanani-mationwith100frames.
OnaPentium4runningat3.
2GHzwithLinux,eachframetookbetween3and100s,givingatotalrenderingtimeofabout127min.
Thetimesfortheexe-cutionofthesejobsonseveralcomputers,showninFigure8,aremeasuredwithidenticalcomputersconnectedbya100Mbit/slocalnetwork.
Wepresentthetimetoexecutethejobssequentiallyononecomputer(bothlocallyandbymeansoftheGiGi-MRinfrastructure)andonseveralcomputers.
Asexpected,thespeedupsareinlinewiththenumberofcycledonorhosts.
Theoverheadincurredusingourjobdis-tributionplatformisminimal,only2min.
Thisiscausedbythejobsubmissionandclientstartup.
Withtheparticipationofanotherhost,evenduringasmallperiod,thisoverheadisnotnoticeable.
Onanwideareanetwork,orwithlargerinputles,thisoverheadislargerbutiseasilysurpassedwiththecontributionofanotheruser.
3.
4MapReduceVCWeevaluatetheperformanceofGiGi-MRintermsofappli-cationturnaroundandnetworkusebyrunningseveraltestsovertheInternet,inascenariothatresemblesatypicalVCenvironment.
WecompareourresultswithanexistingVCsystem(BOINC),referredtoasVCSthroughoutthissection.
BOINCclientshavethelimitationsmentionedpreviously(inSects.
1and2),anddonotsupportinter-clienttransfers.
TheGiGi-MRserverisabletosupportMapReducejobseveninanenvironmentcomposedsolelyofunmodiedBOINCclients.
Thismeansthat,eventhoughtheserverisnotabletoleverageclients'resources(forschedulingandinter-clienttransfers),itisstillabletodistributemapandreducetasksandobtainavalidnaloutput.
However,allcommunicationmustgothroughthecentralserver,andassuchthereisnotolerancetoservertransientfailures.
Toeval-uatethishypothesis,intheVCSscenariowedeployaGiGi-MRserverandunmodiedBOINCclients(version6.
13.
0).
Werunexperimentswiththreedifferentapplications(wordcount,invertedindex,andN-Gram),inordertogaugeoursystem'sperformanceunderdifferentconditions.
Duetospaceconstraints,weonlypresenttheresultsfromtheN-Gramapplication.
Note,however,thattheothertwoappli-cationsshowasimilarperformance.
Wemeasureapplicationturnaround,whiledifferentiatingbetweenmapandreducestagestopinpointpotentialbottle-necksandareasthatwouldbenetmostfromimprovement.
Additionally,wemonitornetworktrafcontheserver.
Thisallowsustoidentifythebenetsofreducingthedependenceonthecentralserver.
WerunourexperimentsonPlanetLab,awide-areatestbedthatsupportsthedevelopmentofdistrib-utedsystemsandnetworksservices.
Intheseexperiments,weuse50nodesthatworkastheclients,andonenodetoactasserver.
TheResourceDiscovery(Sect.
2.
4)layeremploysaneigh-bourselectionmechanismthatcouplesnodeswiththebestavailablebandwidth,promotinganhomogenousnetworkbandwidth;thus,theevaluationresultswereobtainedforanetworkdownloadbandwidthofapproximately700KB/s.
Ifotherconditions(e.
g.
,heavychurnorfailurerate)aremet,thenetworkbandwidthmaychangeaccordinglythusrequir-ingspecictechniquestoavoidsuchslownodes(asshownin[17]).
3.
4.
1ApplicationturnaroundWebeginbymeasuringapplicationturnaround.
WemeasurethetimeittookeachMapReducejobtonish,startingfromtheinitialdownloadofmapinputles,andendingwiththeuploadofthelastreduceoutput.
Weseparatethemapandreducestepstoidentifytheirrespectiveweightinregardstotheoverallapplicationturnaroundtime.
Themapstageisconsideredtobenishedonceallitsoutputhasbeenvalidatedintheserver.
TheresultsobtainedwithN-GramareshowninFig.
9.
TherstconclusionisthatGiGi-MRisabletonishtheMapRe-ducejobinhalfthetimeofVCS.
WecanalsoobservethatthereducestageonGiGi-MRisonlyslightlyfasterthanVCS.
Thiscanbeexplainedbythefastnetworkconnectionoftheserver.
Despiteitslargebandwidth,inter-clienttransfersstillperformbetterthanthecentralizedsystem.
Ontheotherhand,thedifferencesinthemapstepare,asexpected,muchmoresignicant.
GiGi-MRisfourtimesfasterinexecutingthemapstage,whichtranslatestojustaquarteroftimeneededbyVCStovalidateallitsmaptasks.
ThisresultshowsthatGiGi-MRperformsbetterwithapplicationsthatcreatelargeintermediateles.
123JInternetServAppl(2012)3:329–346341Fig.
9TurnaroundofN-GramapplicationbystageFig.
10UploadtrafcforVCSandGiGi-MRserverwithN-Gramapplication3.
4.
2NetworktrafcWemeasureuploadanddownloadtrafcintheserver,forGiGi-MRandVCSwhilerunningtheapplications.
Monitor-ingthenetworktrafcontheserverprovidesamoreaccu-ratemeasureofitsoverhead.
ItalsoallowsustoquantifytheimpactofoursolutionconcerningthedecentralizationoftheVCmodel.
Wepresenttheamountofdatadownloadedfromtheclientsbytheserver,aswellastheamountuploadedbytheservertotheclients.
TheuploadtrafcforaserverrunningN-GramisshowninFig.
10.
Notethat,asmentionedintheprevioussection,GiGi-MRhasamuchlowerapplicationturnaroundthanVCS.
ThisiswhytheGiGi-MRlineinFig.
10stopsaroundsecond3,000(thesamehappensinFig.
11),whileVCSonlynishesitsexecutionmuchlater.
ItisclearthatthereisasignicantdifferenceintheamountofdatauploadedbyGiGi-MRandVCS.
Thisisduetothelargesizeofintermediateles,whichcausestheVCSservertosendalmostvetimesmoredatatotheclientsthantheGiGi-MRserverinthereducestep.
Theserver'sdownloadtrafcisexhibitedinFig.
11.
Here,wecanseethebenetsofusinghashesformaptaskvalidationFig.
11DownloadtrafcforVCSandGiGi-MRserverwithN-Gramapplication(describedinSect.
2.
2).
Upuntilsecond2000,theGiGi-MRserverhasreceivedalmostnodatafromtheclients.
Ataroundthattimeintheexperiment,reducersthatnishedtheirtaskbegansendingtheoutputbacktotheserver.
TheGiGi-MRserverdownloadsatotalof820MBfromtheclients.
Ontheotherhand,theVCSserverisresponsiblefordownloadingallmapoutputsfrommappers,whichcorrespondstothesteepincreaseupuntilsecond4000.
TheVCSserverisrequiredtodownloadsixtimesmoredatathanGiGi-MR.
Theinvertedindexandwordcountexperimentsyieldverysimilarresults,sotheyarenotshownhere.
Ininvertedindex,GiGi-MRisabletoreducetheamountofdatasentfromtheserverfrom6.
5to2.
3GBandcutdatareceivedbytheserverby96%.
Inthewordcountapplication,theGiGi-MRserverreceivesamere250MB,avaluetentimessmallerthanVCS's3GB,andisrequiredtosend2.
5GB,whereastheVCSserversendsmorethandoublethatamounttoclients.
Therefore,wecanconcludethatGiGi-MRnotonlyperformsbetterthanVCSwhenrunningjobswithlargeintermediateles,butisalsoabletoalleviatetheserver'snetworkconnection.
3.
5Checkpoint/restartandpartitioningInthissection,wefocusontheoverheadofdistributingtasksinsideVirtualMachines,andtheperformanceofdifferentdatapartitioningtechniquesprovidedbyoursystemwhenvalidatingresults.
3.
5.
1VirtualmachinecheckpointingThemostrelevantissueofthecheckpoint/restarttechniqueisthesizeofthecheckpointdata.
ThepotentiallyprohibitiveVMimagesizeismitigatedbytheuseofdifferentialdiskimages(efcientrepresentationofthemodicationsmadetothevirtualdisksupportedbytheVMimplementationswith123342JInternetServAppl(2012)3:329–346Table2Checkpoint/RestartthroughaVMimage:checkpointdatasizeusingVirtualBoxandUbuntuDesktop9.
10Datasize(KB)BasePoweredoffDiskimage2,651,1692,768,998ImageAfterBootDiskimage(differential)33Volatilestate117,796CurrentRunningDiskimage(differential)16,417154,403ImageAApplicationAVolatilestate137,986CurrentRunningDiskimage(differential)23,585209,597ImageBApplicationBVolatilestate186,012specicdiskimageformatles,suchasQCOW27).
Table2depictsthesizeofthecheckpointdatafromtheexecutionofaray-tracer(POV-Ray)ontwodifferentinputs,attenuatedwiththeuseofdifferentialdiskimages.
CheckpointAreducesthesizeabout17times(154,403KBinsteadof2,768,998KB),checkpointBabout14times(209,597KBinsteadof2,768,998KB),bringingtransmis-sionand/orstoragecoststoreasonablevalues.
Thedifferen-tialdiskisanefcientrepresentationofthedifferentamountsofmodicationsmadetothevirtualdisk,whichexplainsthedifferenceweobserve.
3.
5.
2ResultvericationthroughreplicationWhenusingreplicationtovalidatereturnedresults,GiGi-MRisabletotakeadvantageofdifferentdatapartitioningtech-niques(asdescribedinSect.
2.
3).
WepresentresultsfromexperimentsusingOverlappedandMeshedpartitioning.
Fur-thermore,oursamplicationtechniqueisalsoevaluated.
WeanalysetheperformanceofGiGi-MR'sresultveri-cationalgorithmbyidentifyingthepercentageofwrongresultsthatarenotdetected.
Weuseasimulatortotestresultvericationapproacheswithlargepopulations.
Thesimula-torisaJavaapplicationthatsimulatesascenariowhereann-dimensionaljobisbrokenintoworkunitsthatareran-domlyassigned.
Amongtheparticipants,thereisagroupofcolludersthatattempttoreturnthesamebadresult(basedoncompleteorimperfectknowledge,dependingonthepartitionoverlapping),inordertofoolthereplicationbasedverica-tionmechanisms.
Thesimulatorreturnsthepercentageofwrongresultsthatwerenotdetectedbytheserver.
Figure12showstheresultsforoverlappedpartitioning.
Wecanseethatoverlappedpartitioningperformsaswellasstandardpartitioning(i.
e.
,exactreplicasofthewholele),inascenariowherethecolludersarefullyabletoidentifythecommonpartandcolludeit,whilestillexecutingtherestofthetask.
Thisispossibleintheory,buthardertoachievein7TheQCOW2ImageFormat.
http://people.
gnome.
org/~markmc/qcow-image-format.
html.
Fig.
12Replicationw/standardpartitioningvs.
replicationw/over-lappedpartitioning,usingreplicationfactor3Fig.
13Percentageofresultsusingbi-dimensionalmeshedpartition-ingbeforerescheduling,inascenariowherecolludersreturnresults100%forgedpracticeasthismayrequireglobalknowledgeandimposeheaviercoordinationandmatchingofinformationamongthecolluders.
Thisistheworstcasescenario,thereforeover-lappedpartitioningmayimprovethereliabilityoftheresults,dependingonhowsmartthecolludersare.
MeshedPartitioningsplitsthetaskinmorethanonedimensionandprovidesmanypointsofcomparison,whicharethenusedtodecideonthecorrectnessofaresult.
Figure13showsthatthepercentageofundetectedwrongresultsisverylow,andalmostnullwhendealingwithunder40%ofcolluders.
However,asmallnumberofresultsmustberescheduledtoreachaverdict.
Theworkthathastobe123JInternetServAppl(2012)3:329–346343Fig.
14Samplication:percentageofwrongresultsnotdetectedinascenariowherecolludersreturnresults50%corruptedrescheduledismostlycomposedbytheportionswherewrongresultsoverlap.
Therefore,thoseresultscannotbeaccepted,andreschedulingistheonlysolution.
Thistechniqueprovestobeveryefcientasweareonlyusingtwicethebaseamountofwork.
Samplicationisatechniquethatcombinessamplingandreplicationwithoutusingvotingquorums.
Plus,thistech-niqueworkswithevenreplicationfactors.
Itusesinforma-tionfromreplicationtodecidewheretochoosesamples,ratherthanselectingsamplesrandomly.
Itselectsthesam-pleswithinareplicationmismatchareaanddiscardstheresultsthatmismatchthechosensample.
Ifthereisnomis-matchinreplicationitresortstorandomsampling.
AsseeninFigure14,whichshowsscenarioswithdifferentreplicationfactors(R.
F.
),thistechniqueisquiteeffective,asitkeepsthepercentageofundetectedwrongresultsverylow,evenforenvironmentswithupto60%ofcolluders.
3.
6ResourcediscoveryToevaluateGiGi-MR'sdiscoverymechanismbasedonABF,wecompareittoasimplermechanism,randomwalk(RW)[24],whichactedasourbaseline.
WechoseRWforbeingasimple,widelyuseddiscoveryalgorithm.
WewanttoassessGiGi-MR'sefciencyandeffectivenessinndingavailableresources,sincethiscanhaveadirecteffectintheserver'sschedulingperformance.
ThetestswereranusingthePeer-Sim8simulatorwithitsEventDrivencapabilities,approx-imatingthesimulationmoretoreal-lifeasopposedtoaCycleDrivensimulation.
WeuseanopensourceBloomFil-terimplementationfromthewellknownHadoopproject.
9Thetestsareexecutedwiththerandomwalkprotocolandthreevariationsofouralgorithmforscalableandefcient8PeerSim.
http://peersim.
sourceforge.
net/.
9ApacheHadoop.
http://hadoop.
apache.
org/.
Fig.
15Querysatisfactionforstaticscenariosresourcediscovery(SERD):SERD1,SERD2,andSERD3whichcorrespondtotheABFdepthsof1,2,and3,respec-tively.
Inourexperiments,wesetthenetworksizeto5,000or10,000nodes,witheitherthreeorsixneighbourspernode.
Wedenethreeresourcedistributioncategories:50%(veryabundantresource),25%(abundantresource),and5%(scarceresource).
Inaddition,weconsidertheresourcestobeoftwotypes:static(e.
g.
,OperatingSystem,orCPU)ordynamic(e.
g.
,memoryused).
Everyvesimulationcycles,10%ofthenodesinthenetworksentresourcequeriesthatcouldbesatisedbyatleastonenodeinthenetwork,andwemeasurethepercentageofsuccessfulqueries.
Regardingthestaticscenario(seeFig.
15)SERD1andSERD2consistentlyshowapercentagerateabove90%exceptforthescarcescenarioswithamaximumofthreeneighbours.
ThiscanbeexplainedbythefactthatthedepthoftheABFdidnotallowtheforwardingofquerieswithmuchhindsight,especiallyinascenariowhereveryfewnodesactu-allycontaintheresourceandwhereeachnodeonlyhasamaximumofthreeneighbours,thusfurtherlimitinganode'sknowledgeaboutthenetwork.
SERD3hasasatisfactionrateof100%inalmostallscenarios,and99%intherest.
Asthealgorithmhadagreaterdepth,itwasabletodirectqueriesintherightdirectionforthemtobesatised.
TheRWalgo-rithm'slackofintelligenceintheforwardingofqueriesisagreatcontrast,withalmostallsatisfactionratesbeloworaround80%.
Figure16showsthequerysatisfactionforthedynamicresourcescenarios,whichareexpectedtonotbeashighasthestaticscenariosduetothevaryingvaluesoftheresources.
Onceagain,SERDoutperformstheRWprotocol,whichdis-playasuccessrateof80%andlower.
Inalmostalltests,theSERDprotocolswereabove80%,exceptforthescarcescenariotests.
Inthose,SERD1struggledthemostbecauseithaslittleinformationabouttheneighbourhood.
SERD2andSERD3onlydisplayasatisfactionratelowerthan80%123344JInternetServAppl(2012)3:329–346Fig.
16Querysatisfactionfordynamicscenarioswhenthescarcescenariowascombinedwithamaximumof3neighbours,whichlimitedtheavailableoptionswhenfor-wardingquerymessages.
RWinthosecaseswashardlyabletoreach20%querysatisfaction,makingitslackofintelli-genceeversoapparent.
Analysis:Inthissection,wepresentedtheevaluationofGiGi-MR.
Wesummarizebrieyitskeyaspects.
First,weexperiencedperformanceimprovements(intheexecutionandturnaroundtimes)onasetofapplicationsthatarerep-resentativeofthosecurrentlyused,bothinacademicandcommercialenvironments,suchase-science(raytracing,imaging)andbigdataanalytics(namelyMapReduceasusedbyGoogleandothersinproductionsettings).
Second,weobtainedsignicantreductionsinnetworktrafcdirectedtoservers,duringexecution,whichimproveserverscalabilityandalloweachservertohandlelargercomputationswithmoreparticipantnodes,i.
e.
,scaletolargernumbersofslavestoexecutemoretasksconcurrently.
Third,weimprovethereliabilityofvoluntarycomputingwithasetofnovelreplica-tionandsamplingtechniques,thatrequirecolludersalwaystoexecutepartoftheirtasks,andbyimposingmorecoor-dinationandoverheadtosuccessfullyforgeresults,allthiscombinedwithefcientandlowoverheadcheckpointing.
Finally,weshowedhowoursystemcanscaletolargepopu-lationsofvolunteers,whileachievingefcientresourcedis-coveryandhighresourceutilization,thustakingadvantageofidleresourcesscatteredontheInternet,bymeansoftheSERDresourcediscoveryprotocol.
4RelatedworkXtremWeb[5]andLeidenClassical10aredistributedcom-putingprojectsthatallowregistereduserstosubmittheirjobs,asopposedtoplainBOINCinstallationswhereonlythe10UniversityofLeiden.
Leidenclassical.
http://boinc.
gorlaeus.
net/.
systemadministratorcreatesjobs.
InLeidenClassical,thereisonlyonedataprocessingapplicationandusersonlysubmitinputlestobeprocessedbythatapplication.
XtremWebismoreversatileasithostsseveralinstalledapplications.
InXtremWeb,usersprovidetheinputlesanddenethecommandlineargumentsusedtoinvoketheapplication.
XtremWeballowstheuseofabroadersetofapplications,butstillrequiresthesystemadministratortoinstallthem.
Auserisnotallowedtoinstallanewdataprocessingapplica-tiontosolvehisproblems.
SupercomputinganddatacenterstypicallyemployMPItaskfarmerswhenrunningBoTapplications[16].
Taskfarm-ingfollowsamaster/workermodelinwhichthemastercoor-dinatestaskcreationandscheduling,distributestasksamongworkers,andreceivestheresults.
In[16],MPIisextendedtosupportdynamicprocessmanagementandtaskcreationinclient/serverapplications.
Despitehavingseveralcommongoalstooursolutionsuchasadaptiveexecution,ormaximiz-ingresourceutilization,thesesystemsoperateintightlycou-pledenvironmentssuchasclusters.
GiGi-MR'sdeploymentovertheInternetcreatesanentirelynewsetofrequirementsandchallenges,whichpreventsusfromadaptingexistingMPITaskFarmingsolutions.
Nimrod[1]istargetedatparametersweepapplications,andfollowsamodelsimilartotaskfarming.
InNimrod,theuserdenestheinputles,thetypeofparametersandhowtheyvary.
Nimrodthengeneratesallparametercom-binationsandassignseachparametercombinationtoatask.
EventhoughNimrodhelpsonthecombinationofallparame-ters,theusermuststillhavesomeprogrammingknowledge,becausetheprocessingapplicationmustbecodedandthedatatypeofeachparametermustbedened.
CombiningtheconceptsofCloudandVolunteerComput-inghasbeenproposedin[18],inwhichtheauthorsstudiedthecostandbenetsofusingcloudsasasubstituteforvol-unteersorservers.
In[22],theauthorsdeneaP2PmodelundertheMapRe-duceframework.
Theirsystemistailoredtoadynamiccloudenvironment,creatingacloudofclouds.
Ithasasimilarorga-nizationtoexistingGridinfrastructures,but,muchlikeOur-Grid[8],ismeanttocreateafederationorclusterofdatacentersthroughaP2Poverlaynetwork.
MOON(MapReduceOnopportunisticeNvironments)[20]proposesanextensiontoHadoopthatimplementsadap-tivetaskschedulingtoaccountfornodefailure.
However,MOONistailoredforaclusterenvironment,suchasaresearchlab,inwhichnodesaretrustedorevendedicated.
MapReducewasalsoadaptedtodesktopgridsin[31].
ThesystemwasdesignedontopofBitDew[13],amiddlewarethehandlesdatamanagementthroughtheuseofvarioustransferprotocols.
TheauthorsclaimthatitisabletorunMapRe-ducejobsonXtremWeb[5],overtheInternet.
However,theirexperimentswereconductedinaclusterinterconnectedby123JInternetServAppl(2012)3:329–346345GigabitEthernet.
ThisenvironmentmorecloselyresemblesthecommonscenarioofXtremWeb,whichconsistsofafed-erationofresearchlabs.
BOINC,ontheotherhand,hasmillionofusers,andisactuallytailoredforatrulyvolunteerenvironmentovertheInternet.
Bymovingfrombenchmarksandproof-of-conceptstoactualapplicationsinarealistictestbed,wecanstatewithmorecertaintywhataretheadvantagesandshortcomingsofthisparadigmonavolunteercomputingenvironment.
BloomFiltershavebeenappliedinavarietyofsystems[6],suchasdictionaries,databases,andnetworkapplications.
Theyareimplementedasbitarrays,therefore,theunionoftwosetscanbecomputedbyperformingtheORoperationbetweenthetwo,whiletheirapproximateintersectionscanbecomputedusingtheANDoperation.
Totestwhetheranelementisinthesetornot,ithastobepassedthroughallhashfunctionsandifalltheresultingpositionsinthearrayaresettoone,thentheelementhashahighprobabilityofbeingintheset.
Ifanypositionhasthevaluezero,thenweknowthatitisdenitelynotintheset.
Thesmallfalsepositiveratearisesfromthefactthatwhenqueryingforanelementthatisnotintheset,somehashfunctionsmayresultinpositionsthatwerealreadyused(havethevalueone)forapreviouslyinserteditem.
Therefore,themoreelementsareinsertedintotheBloomFilter,thehigherthechanceofaqueryresultinginafalsepositive.
AnothershortcomingistheinabilitytoremoveanelementfromtheBloomFilter,assimplysettingthepositionsgivenbythekhashfunctionstozerohavethesideeffectofremovingotherelementsaswell.
Oursolutionisdifferenttotheexistingsystemsbecauseitcombinesalltypesofdifferentresourcesintoonediscoverymechanism.
Itisespeciallydifferenttotheworks[14,21]thatalsomakeuseofABFduetototheusageofoneaggre-gatedABF(explainedinSect.
2.
4.
1),andthefactthatallthedifferenttypesofbasicresources,services,andapplicationsareencodedintheBloomFilter.
5ConclusionWehavepresentedGiGi-MR,aVolunteerComputingplat-formthatallowsordinaryuserstocreateandsubmitjobsforexecutiononvolunteermachinesovertheInternet.
OursystemisabletoexecuteMapReduceapplicationsovertheInternetandtoleratevolunteerfaults,andtransientserverfail-ures.
Furthermore,itiscompatiblewithexistingVCsystems(inparticularBOINC).
Itsignicantlyreducesthedepen-denceonthecentralserver,whichistypicallyoverburdenedincurrentVCplatforms,thusallowingittoobtainbetterperformance.
GiGi-MRenhancestaskschedulingusinginformationexchangedbyclientswithinanoverlaynetwork.
Neighbourselectionisbasedonresourcesandavailabilityinformationisprovidedthroughanovelresourcediscoverymechanism.
Itiscapableoflocatingphysicalresources,services,andappli-cationsfrommanycomputersconnectedtothesameoverlay.
Thisisdoneinanovelwaybystoringallresource,applica-tion,andserviceinformationinABF.
GiGi-MRisabletodistributetasksinsideVirtualMachines,andsupportssev-eralpartitioningmechanisms,thusincreasingthesystem'sadaptabilityandusefulness.
WeevaluatedGiGi-MRbymeasuringtheapplicationturn-aroundandservernetworktrafcwhilerunningthreediffer-entMapReduceapplications.
Wealsoranmicro-benchmarkstoassesstheimpactofeachofoursystem'scomponents.
TheexperimentsshowthattheoverheadoftheUserInter-facelayerisminimal,andthatitispossibletotakeadvantageofparallelprocessingenvironmentswithouttheuseofcom-plexAPIs.
Wecanalsoconcludethatitallowsthedenitionandexecutionofamyriadofjobsthatcantakeadvantageofremoteidlecycles.
Wemanagedtoexecuteabatchofimagerendering,necessarytocreateananimationvideo,aswellasprocessseveralMapReducejobs.
Ingeneral,theapplica-tionsthatoursystemshandlesbestarethosewhichcanbedescribedasBag-of-Tasksproblems,oreasilydecomposedinasetofmapandreducetasks;thus,Monte-Carlobasedapplicationsaregoodcandidates.
GiGi-MR'sdiscoverymechanismperformedwellinthevarioustestscenariosthatincludedstaticanddynamicresoures,andoutperformedtheRWprotocolwhichwasourbaseline.
Oursystemprovedtobeeffectiveinlocatingvari-oustypesofresources,andscalableasthenumberofnodesinthenetworkdidnotaffectthemechanism'sresourcequerysatisfaction.
Ourresultvericationschemesvaryintheircomplex-ityandoverhead.
Replicationwithoverlappedpartitioningsmakescollusionhardertoachieve,whileensuringthatthereliabilityoftheresultsisthesameasusingstandardparti-tionings.
Replicationwithmeshedpartitioningsenablestheuseofevenreplicationfactorsandimprovesthereliabilityoftheresultsusingitsstatelessresultreputationalgorithm.
Samplicationcombinesreplicationandsamplinginanele-gantmanner,ensuringittakesthebestadvantageofredun-dantexecutionthroughthecomparisonwithlocalsamplesratherthanusingvotingquorums.
Ourcheckpoint/restartthroughavirtualmachineovercameitsbiggestobstacle,checkpointdatasize,throughdifferentialdiskimagesandcompression.
Wewereabletominimizethecheckpointsizeabout17times,toatransmittableamountofdata.
OursolutionwasabletoimprovetheperformanceofalltheMapReducejobswetested.
Themapstagewasupto4timesfasterthaninanexistingVCsystem.
Thereducestepalsoshowedanimprovement,thusreducingeachMapRe-ducejob'sexecutiontimedowntolessthanhalf.
Ourexper-imentsregardingtheserver'snetworktrafcalsogaveus123346JInternetServAppl(2012)3:329–346someinterestingresults.
Wewereabletoreduceserverdown-loadtrafcbyanorderofmagnitudeonthewordcountandinvertedindexapplications.
Therefore,wewereabletowit-nessadecreaseinuploadeddatato20%oftheexistingVCsystemserver'svalue.
AcknowledgmentsTheauthorswouldliketothankstudentsFilipeParedes,JooPaulinoandRaoulFelix,fortheirworkandenthusiasmduringtheproject.
References1.
AbramsonD,SosicR,GiddyJ,HallB(1995)Nimrod:atoolforperformingparametrisedsimulationsusingdistributedworksta-tions.
In:Proceedingsofthe4thIEEEinternationalsymposiumonhighperformancedistributedcomputing,HPDC'95.
IEEECom-puterSociety,Washington,DC,USA,pp112–121.
2.
AndersonDP(2004)Boinc:Asystemforpublic-resourcecomput-ingandstorage.
In:Proceedingsofthe5thIEEE/ACMinternationalworkshopongridcomputing,GRID'04.
IEEEComputerSociety,Washington,DC,USA,pp4–103.
AndersonDP,CobbJ,KorpelaE,LebofskyM,WerthimerD(2002)Seti@home:anexperimentinpublic-resourcecomputing.
Com-munACM45:56–614.
BloomBH(1970)Space/timetrade-offsinhashcodingwithallow-ableerrors.
CommunACM13(7):422–4265.
CappelloF,DjilaliS,FedakG,HeraultT,MagnietteF,NériV,LodygenskyO(2005)Computingonlarge-scaledistributedsys-tems:Xtremwebarchitecture,programmingmodels,security,testsandconvergencewithgrid.
FutureGenerComputSyst21:417–4376.
ChazelleB,KilianJ,RubinfeldR,TalA(2004)Thebloomierl-ter:anefcientdatastructureforstaticsupportlookuptables.
In:ProceedingsofthefteenthannualACM-SIAMsymposiumondis-cretealgorithms,SocietyforIndustrialandAppliedMathematics,SODA'04,Philadelphia,PA,USA,pp30–397.
ChunB,CullerD,RoscoeT,BavierA,PetersonL,WawrzoniakM,BowmanM(2003)Planetlab:anoverlaytestbedforbroad-coverageservices.
SIGCOMMComputCommunRev33:3–128.
CirneW,BrasileiroF,AndradeN,CostaL,AndradeA,NovaesR,MowbrayM(2006)Labsoftheworld,unite!
!
!
.
JGridComput4:225–2469.
CostaF,KelleyI,SilvaL,FedakG(2008a)Optimizingdatadis-tributionindesktopgridplatforms.
ParallelProcessLett(PPL)18(3):391–41010.
CostaF,SilvaL,FedakG,KelleyI(2008b)Optimizingthedatadistributionlayerofboincwithbittorrent.
In:Internationalsympo-siumonparallelanddistributedprocessingsymposium,pp1–811.
DeanJ,GhemawatS(2008)Mapreduce:simplieddataprocessingonlargeclusters.
CommunACM51:107–11312.
FanL,CaoP,AlmeidaJ,BroderAZ(2000)Summarycache:ascalablewide-areawebcachesharingprotocol.
IEEE/ACMTransNetw8(3):281–29313.
FedakG,HeH,CappelloF(2008)Bitdew:aprogrammableenvi-ronmentforlarge-scaledatamanagementanddistribution.
In:Pro-ceedingsofthe2008ACM/IEEEconferenceonsupercomputing,SC'08.
IEEEPress,Piscataway,pp45:1–45:1214.
GoeringP,HeijenkG(2006)Servicediscoveryusingblooml-ters.
In:Proceedingsoftwelfthannualconferenceoftheadvancedschoolforcomputingandimaging,pp14–1615.
GoodwinP,WrightG(2004)DecisionAnalysisforManagementJudgment.
Wiley,NewYork16.
GroppW,LuskE(1995)Dynamicprocessmanagementinanmpisetting.
In:ProceedingsoftheseventhIEEEsymposiumonparallelanddistributedprocessing,1995,pp530–53317.
GuoZ,FoxG,ZhouM(2012)Investigationofdatalocalityinmapreduce.
In:Proceedingsofthe201212thIEEE/ACMinter-nationalsymposiumoncluster,cloudandgridcomputing(ccgrid2012),CCGRID'12.
IEEEComputerSociety,Washington,DC,pp419–42618.
KondoD,JavadiB,MalecotP,CappelloF,AndersonDP(2009)Cost-benetanalysisofcloudcomputingversusdesktopgrids.
In:Proceedingsofthe2009IEEEinternationalsymposiumonparallelanddistributedprocessing,IPDPS'09.
IEEEComputerSociety,Washington,DC,USA,pp1–1219.
LarsonSM,SnowCD,ShirtsM,PandeVS(2002)Folding@homeandgenome@home:Usingdistributedcomputingtotacklepre-viouslyintractableproblemsincomputationalbiology.
Comput.
Genomics.
20.
LinH,MaX,ArchuletaJ,FengWc,GardnerM,ZhangZ(2010)Moon:mapreduceonopportunisticenvironments.
In:Proceedingsofthe19thACMinternationalsymposiumonhighperformancedistributedcomputing,HPDC'10.
ACM,NewYork,pp95–10621.
LvQ,CaoQ(2007)Servicediscoveryusinghybridbloomltersinad-hocnetworks.
In:InternationalConferenceonwirelesscommu-nications,networkingandmobilecomputing,2007.
WiCom2007,pp1542–154522.
MarozzoF,TaliaD,TrunoP(2008)Adaptingmapreducefordynamicenvironmentsusingapeer-to-peermodel.
In:Proceedingsoftherstworkshoponcloudcomputinganditsapplications(CCA2008),Chicago,USA23.
MetropolisN,UlamS(1949)TheMonteCarlomethod.
JAmStatAssoc44(247):335–34124.
PearsonK(1905)Theproblemoftherandomwalk.
Nature7225.
RatnasamyS,FrancisP,HandleyM,KarpR,ShenkerS(2001)Ascalablecontent-addressablenetwork.
In:Proceedingsofthe2001conferenceonapplications,technologies,architectures,andprotocolsforcomputercommunications,SIGCOMM'01.
ACM,NewYork,pp161–17226.
RheaS,KubiatowiczJ(2002)Probabilisticlocationandrouting.
In:ProceedingsofINFOCOM2002.
Twenty-rstannualjointcon-ferenceoftheIEEEcomputerandcommunicationssocieties,vol3.
IEEE,NewYork,pp1248–125727.
RowstronA,DruschelP(2001)Pastry:Scalable,decentralizedobjectlocation,androutingforlarge-scalepeer-to-peersystems.
In:GuerraouiR(ed)Middleware2001.
LectureNotesinComputerScience,vol2218,Springer,Berlin,pp329–35028.
SilvaJ,FerreiraP,VeigaL(2010)Serviceandresourcediscov-eryincycle-sharingenvironmentswithautilityalgebra.
In:2010IEEEinternationalsymposiumonparalleldistributedprocessing(IPDPS),pp1–1129.
SnirM,OttoSW,WalkerDW,DongarraJ,Huss-LedermanS(1995)MPI:thecompletereference.
MITPress,Cambridge30.
StoicaI,MorrisR,KargerD,KaashoekMF,BalakrishnanH(2001)Chord:ascalablepeer-to-peerlookupserviceforinternetappli-cations.
In:Proceedingsofthe2001conferenceonapplications,technologies,architectures,andprotocolsforcomputercommuni-cations,SIGCOMM'01.
ACM,NewYork,pp149–16031.
TangB,MocaM,ChevalierS,HeH,FedakG(2010)Towardsmapreducefordesktopgridcomputing.
In:Proceedingsofthe2010internationalconferenceonP2P,parallel,grid,cloudandinternetcomputing,3PGCIC'10.
IEEEComputerSociety,Washington,DC,pp193–200123

CYUN(29元/月)美国、香港、台湾、日本、韩国CN2,续费原价

关于CYUN商家在之前有介绍过一次,CYUN是香港蓝米数据有限公司旗下的云计算服务品牌,和蓝米云、蓝米主机等同属该公司。商家主要是为个人开发者用户、中小型、大型企业用户提供一站式核心网络云端部署服务,促使用户云端部署化简为零,轻松快捷运用云计算。目前,CYUN主要运营美国、香港、台湾、日本、韩国CN2线路产品,包括云服务器、站群服务器和独立服务器等。这次看到CYUN夏季优惠活动发布了,依然是熟悉的...

丽萨主机:美国CN2 GIA精品网/KVM/9折,美国原生IP,最低27元/月

丽萨主机怎么样?丽萨主机,团队于2017年成立。成立之初主要做的是 CDN 和域名等相关业务。最近开辟新领域,新增了独立服务器出租、VPS 等业务,为了保证业务质量从一开始就选择了中美之间的 CN2 GIA 国际精品网络,三网回程 CN2 GIA,电信去程 CN2 GIA + BGP 直连智能路由,联通移动去程直连,原生IP。适合对网络要求较高的用户,同时价格也比较亲民。点击进入:丽萨主机官方网站...

硅云香港CN2+BGP云主机仅188元/年起(香港云服务器专区)

硅云怎么样?硅云是一家专业的云服务商,硅云的主营产品包括域名和服务器,其中香港云服务器、香港云虚拟主机是非常受欢迎的产品。硅云香港可用区接入了中国电信CN2 GIA、中国联通直连、中国移动直连、HGC、NTT、COGENT、PCCW在内的数十家优质的全球顶级运营商,是为数不多的多线香港云服务商之一。目前,硅云香港云服务器,CN2+BGP线路,1核1G香港云主机仅188元/年起,域名无需备案,支持个...

p2pover为你推荐
参考手册NDXS和ND5XS网络音频播放器中文目录generatedgooglecontentcss我研制千万亿次超级电脑支持ipad支持ipad更新iphonenetbios端口26917 8000 4001 netbios-ns 端口 是干什么的windows键是哪个Windows快捷键是什么127.0.0.1为什么输入127.0.0.1无法打开页面
金万维动态域名 hostmonster vps.net mediafire idc测评网 42u机柜尺寸 12306抢票攻略 鲜果阅读 免费ddos防火墙 警告本网站美国保护 godaddy域名证书 创梦 大容量存储器 免费申请个人网站 免费智能解析 多线空间 西安服务器托管 石家庄服务器托管 畅行云 工信部icp备案查询 更多