supportssisoftwaresandra

sisoftwaresandra  时间:2021-04-01  阅读:()
PACUE:ProcessorAllocatorConsideringUserExperienceTetsuroHorikawa1,MichioHonda1,JinNakazawa2,KazunoriTakashio2,andHideyukiTokuda2,31GraduateSchoolofMediaandGovernance,KeioUniversity2FacultyofEnvironmentandInformationStudies,KeioUniversity,5322,Endo,Fujisawa,Kanagawa252-8520,Japan3JST-CREST,Japan{techi,jin,kaz,hxt}@ht.
sfc.
keio.
ac.
jp,micchie@sfc.
wide.
ad.
jpAbstract.
GPUacceleratedapplicationsincludingGPGPUonesarecommonlyseeninmodernPCs.
IfmanyapplicationscompeteonthesameGPU,theperfor-mancewilldecreasesignicantly.
Someapplicationshavealargeimpactonuserexperience.
Therefore,forsuchapplications,wehavetolimitGPUutilizationbytheotherapplications.
Itmightbestraightforwardtomodifyapplicationstoswitchcomputedevicedynamicallyforintelligentresourcesallocation.
Unfortu-nately,wecannotdosoduetosoftwaredistributionpolicyortheotherreasons.
Inthispaper,weproposePACUE,whichallowstheendsystemtoallocatecomputedevicesarbitrarytoapplications.
Inaddition,PACUEguessesoptimalcomputedeviceforeachapplicationaccordingtouserpreference.
WeimplementedthedynamiccomputedeviceredirectorofPACUEincludingOpenCLAPIhookinganddevicecamouagingfeatures.
WealsoimplementedtheframeoftheresourcemanagerofPACUE.
WedemonstratePACUEachievesdynamiccomputedeviceredirectingononeoutoftworealapplicationsandonallof20samplecodes.
Keywords:Resourcemanagement,OpenCL,binarycompatibility,GPU,GPGPU,PC,userexperience.
1IntroductionGraphicsProcessingUnit(GPU)usehasbeenextendedtoawiderrangeofcomput-ingpurposesonthePCplatform.
GPUutilizationpurposesonPCscanbeclassiedintofourpurposes.
Therstis3Dgraphicscomputation,suchas3Dgamesand3D-graphics-basedGUIshell(e.
g.
,WindowsAero).
Thesecondis2Dgraphicsaccelera-tion,suchasfontrenderinginmodernwebbrowsers.
Thethirdisvideodecodingandencodingacceleration.
VideoplayerapplicationsusethevideodecodingaccelerationfunctionoftheGPUtoreduceCPUloadandtoincreasethevideoquality.
Also,someofGPUshavevideoencodingaccelerationunitsonthedieoftheGPU.
Thelastpurposeisgeneral-purposecomputing,calledGeneral-PurposecomputingonGPU(GPGPU).
OnPCs,GPGPUisoftenusedbyvideoencodingapplicationsandphysicssimulationapplicationsincluding3Dgames.
11Some3DgamesutilizeGPUforgeneral-purposecomputingbesides3Dgraphicsrendering.
M.
Alexanderetal.
(Eds.
):Euro-Par2011Workshops,PartII,LNCS7156,pp.
335–344,2012.
cSpringer-VerlagBerlinHeidelberg2012336T.
Horikawaetal.
Intoday'sPCsGPUsareutilizedefciently,becauseonlyafewoftheapplicationsareacceleratedatthesametime;theseapplicationsdonotcompeteeachotheronthesameGPU.
Applicationsthuschoosecomputedevicesstatically,suchasbyuserselec-tionintheapplicationcongurationmenuoftheGUIinterface.
However,weenvisagethatmoreandmoreapplicationsutilizeGPUs.
Forexample,OpenComputingLanguage(OpenCL)[2]allowsapplicationstoselectthecomputedeviceexplicitlytoexecutesomepartsoftheapplication.
Therefore,efcientloadbal-ancingbetweencomputedevicesconsistingofCPUsandGPUsisessentialforfutureconsumerPCs.
TherearethreetechnicalchallengestoachieveefcientcomputedeviceassignmentofheterogeneousprocessorsinPCs.
First,GPUaccelerationisutilizedforvariouspur-poses,whileGPUsareutilizedmainlyforgeneral-purposecomputinginsupercomput-ers.
Inaddition,someoftasksrunninginPCsstronglyrequirespecicprocessors.
Forexample,3DrenderingisnormallyprocessedbyGPUs,andsomeof3Dgraphicstrans-actionscannotbeprocessedbyCPUs,whereassomeapplicationscanbeprocessedbybothCPUsandGPUs.
WhentheGPUloadishigh,wecouldrunthelatterapplicationsexplicitlyonCPUs.
Second,wemustnotmodifyapplications.
Typically,mostofapplicationsinstalledinmajorOSessuchasWindowsandMacOScannotbemodiedbyathirdperson,duetotheirsoftwaredistributionpolicies.
Applicationvendorsmaynotbewillingtomodifytheirapplicationseither,becauseitwillnotbenetthemstraightforwardly.
Forthesereasons,existingruntimelibrariesorlibrariestodistributetasksbetweencomputedevices[6,10,7]proposedforHPCarenotdeployableonconsumerPCs.
Third,performancemetricforconsumerPCsiscomplicated,becauseuserpreferenceisoneofthemostimportantmetricsforassigningcomputedevicestoapplications.
ItisclearlydifferentfromgeneralHPC'smetricswhosetaskdistributingpolicyisusuallystatic,suchasmaximizingtasktransactionspeedormaximizingperformanceperwatt.
InPCs,taskdistributingpoliciesandmeritseasilychangedependingontheuse.
Forexample,whentheuserwouldliketoplaythe3Dgamesmoothly,theotherGPGPUtasksshouldnotbeassignedtotheGPU.
Ontheotherhand,sometimestheusermightbewillingtotranscodevideosquicklyratherthanplayingthetriinggamesmoothly.
Thecomputedeviceselectingmethodmustrecognizeuserpreferencestodecidethepropercomputedevicetoassign.
Howeverthisishard,thususerpreferencerecognizingcannotautomate.
Therefore,theresourcemanagementhastoinferPCutilizationandtheusershavetobeabletotellhowtheyareusingPCatthattime.
Inthispaper,weproposePACUEwhichallocatescomputedevicestoapplicationsefciently.
PACUEhastwofeatures,oneisdynamiccomputedeviceredirectingfeatureandtheotherissystem-wideoptimaldeviceselectingfeature.
Westronglyfocusonsolvingrealproblemswhichwilloccurwhenwedistributeoursystemovertheworldviaweb.
Therefore,wepreferchoosingpoliticallysafermethodratherthantechnicallybettermethod.
Thus,rstadvantageofPACUEisthepossibilityofthedeployment.
ThesecondadvantageofPACUEisdesignedtomaximizePCusers'experience.
Thus,webringanewmetricforusingaccelerators,anditwillbealsobenecialforothercomputerssuchassmartphonesorgameconsoles.
PACUE:ProcessorAllocatorConsideringUserExperience337OurexperimentalresultsshowthatPACUEcanswitchcomputedevicesin1outof2applications,andallof20samplecodesbuiltwithOpenCL.
Thereminderofthispaperisorganizedasfollows:InSec.
2,wedescribethedesignofPACUEconsistingofthedynamiccomputedeviceredirectingandthesystemresourcemanager.
InSec.
3,weevaluateourprototypeimplementation.
ThepaperconcludeswithSec.
4.
2DesigningPACUEPACUEisconstructedbytwocomponents;DynamicComputeDeviceRedirectorandResourceManager.
WefocusonapplicationsbuiltwithOpenCL,awidelyusedframe-workwhichsupportsmanytypesofcomputedevicessuchasCPUsandGPUs.
2.
1DynamicComputeDeviceRedirectionWedesigntheDynamicComputeDeviceRedirection(DCDR)methodtomeetthe"noapplicationmodication"requirement.
DCDRimplementsOpenCLAPIhookingthatconcealsactualcomputedevicesfromapplications,andavoidserrorcausedbyinconsistentinformationofdevices.
OpenCLAPIHooking.
OpenCLabstractscomputedevicesandmemoryhierarchytoutilizeheterogeneousprocessorswithinitsprogrammingmodel.
Toutilizeacom-putedevice,applicationscallOpenCLAPIsandspecifyacomputedevice.
Assigningprocessarefollowing:Secondly,selectpossibledevicesandcreateanOpenCLcon-text.
Thirdly,selectonedevicetouseandcreateacommandqueue.
Lastly,puttaskstothequeuecreatedabove.
Inthesecondandthethirdsteps,theapplicationspeciesaconcretedevicebecauseOpenCLAPIsneedsdeviceIDasitsparameter,whichmakessystem-wideoptimaldeviceselectionimpossible.
Foroptimaldeviceselection,were-movetherestrictionthattheapplicationsneedtochoosethedevicebyitselfbecausethedecisionishardforapplicationsandusers.
However,decisionsbyapplicationsorusersarerarelyoptimal(SeeSec.
2.
2).
PACUEhooksapartofOpenCLAPIswhichconcerndeviceselecting,andimplementsaskingfunctionthataskswhichdevicetoutilize.
ThereareseveralmethodstohookAPIsinWindows7wherePACUEisimple-mented.
TherstpossibilityismakingathreadinthetargetapplicationbycallingaWindowsAPICreateRemoteThread()[12].
Withthismethodweimplementanapplica-tionwhichmakeathreadinotherapplicationsandmapexternalDLLcontainingover-riddentargetAPIs.
However,theseapplicationsandDLLsarehardtoimplementduetocomplicatedprocedures.
Ithasariskbeingtreatedasmalwarebytheanti-malwaresoft-ware.
ThesecondpossibilityisGlobalHook,theuserapplicationhooksspecicAPIsofallapplicationbycallingWindowsAPISetWindowsHookEx()[13].
Thismethodisunsafe,becauseithasariskofhookingunknownapplicationsandcausingunexpectedaffecttothem.
ThethirdpossibilityismakingWrapperDLL,whichisaDLLwiththesamelenameoforiginalDLLandhasallAPIsoforiginalDLL.
WrapperDLLisalmostshelloforiginalDLL,becausemostAPIsaresimplycallsoriginalDLLAPIsexceptAPIswhichactuallyneedtododifferenttransactionfromoriginal.
ThismethodhasthemostchanceofhookingAPIs,becausewrapperDLLlocatedintheapplica-tiondirectoryisalwaysloadedpriortotheotherones,suchasDLLslocatedinsystem338T.
Horikawaetal.
Fig.
1.
DynamicComputeDeviceSwitchingbyOpenCLAPIHookingdirectoriesbydefault.
Inaddition,whenlocatingwrapperDLLinthedirectorywhichtargetEXElocated,onlyaffectsapplicationswhosebinaryislocatedinthesamedi-rectory.
Therefore,thisisreallysafewaytohookAPIs.
ThelastpossibilityistheuseofAPIhooklibraries,suchas[14].
Theselibrariesareeasytouse,howeverithaslessprobabilitytosuccesstohookAPIsthanWrapperDLL.
Italsohasarisktobetreatedasmalware.
Fromthiscomparison,weadopttheWrapperDLLmethod.
Fig.
1illus-tratesthearchitecturetohookOpenCLAPIswiththismethod.
OthermajorPCOSessuchasMacOSorLinuxdonotprovideanyfunctionlikewrapperDLLs,stillwecanimplementasimilarsystembyusingAPIhookingfunctionsofferedbyotherOSes.
Anothermethodtoswitchdevicesismakingavirtualdevice.
[5]Onthismethod,ap-plicationswillassignthevirtualdeviceandtheresourcemanagementsystemchoosearealdevice.
Thismethodhasasignicantadvantagethatitcanswitchrealdevicesatanytime,howeveritmayconictwithInstallableClientDriver(ICD)systemofOpenCL.
InstallerofOpenCLruntimelibrariesdistributedbyhardwarevendorssometimesover-write"OpenCL.
dll"le,thusinstallingavirtualdeviceorshowingapplicationsonlythevirtualdeviceisdifcultonPCs.
DeviceInformationCamouaging.
Whenapartofapplications'tasksareassignedtoPACUEselectedOpenCLdevice,someapplicationsshowerrors.
Thisisbecausedeviceinformationisdifferentfromtheapplication'sintendedone,thussomeapplicationsrecognizeitasanunusualevent.
Toavoidtheseerrors,PACUEcamouagesOpenCLdevicedetailswhenthedesiredOpenCLdevicehasbeenchangeddynamically.
However,camouagingOpenCLdevicedetailsisrisky,becausedeviceshavediffer-entspecicationsinthelowerlevel.
Therstriskisapplicationstability.
Thememorysizeofeachhierarchyisdevicedependent,hencetheunexpectedmemorysizecanre-sultinapplicationcrashorerror.
Thesecondriskisexecutionspeed.
Ifanapplicationimplementsper-deviceoptimization,mismatchbetweentheintendeddeviceandtheas-signeddevicecanresultinunexpectedperformancedegradation.
Fromthesereasons,weshouldcamouagesdevicedetailsonlywhenitisnecessary.
Tominimizetherisks,PACUEcamouagesdevicesinfollowinglevels.
1.
DevicetypelevelcamouageWhenanapplicationtriestoacquireanOpenCLdevicelist,PACUEwillover-writethecldevicetypevalue.
Asfaraspossible,PACUEwillchangethisvalueforCLDEVICETYPEALL.
Showingalldevicesinsteadofthespecictypede-vicesisareasonablechoice,becauseitavoidsforcingapplicationusingunknownPACUE:ProcessorAllocatorConsideringUserExperience339Table1.
ComparisonofDeviceCamouagingMethodsOverriddendevicetype/IDSpeciedTypewhengettingdevicelistSpeciedIDwhencreatingaContextSpeciedIDwhencreatingaCommandqueuecreationCrash/ErrorRiskCompatibilityA.
DevicetypelevelCPUsorGPUsAllCPUsorallGPUs\LowMostapplica-tionsB.
Contextlevel\CPUsorGPUsLowLowC.
Commandqueuelevel\ALLOneCPUoroneGPUHighMostapplica-tionsD.
A+CALLALLOneCPUoroneGPUNormalHighdevice.
Occasionally,applicationscannotexecutetheirOpenCLcodeonsomede-vicetypes.
Inthiscase,PACUEsetsthecldevicetypevaluetothedesiredtype,suchasCLDEVICETYPECPUorCLDEVICETYPEGPU.
2.
ContextlevelcamouageWhencreatinganOpenCLcontext,PACUEoverridesthecldeviceidvalueandforceOpenCLframeworktobuildOpenCLbinariesforeachcomputedevice.
IfPACUErecognizethatthetargetapplicationsupportonlyspecictypeofcomputedevices,PACUEwilloverwritethecldeviceidvalueandlimitdevicetypesforcontext.
Inaddition,PACUEoverridesthecldeviceidvaluewhenapplicationsrequestsdetaileddeviceinformation.
Therefore,applicationwillseeinformationofthedevicePACUEselected.
Thiscontributestoapplication'sstability,becauseacquireddeviceinformation,suchasthememorysizecorrespondstothatofthedeviceactuallywillbeused.
3.
CommandqueuelevelcamouageWhentheapplicationcallsclCreateCommandQueue()API,thisisthelastchancetochangethedevice.
Becauseofthestabilityissuedescribedabove,PACUEtriesnottochangedevicethistiming,butifnecessary,PACUEchangescldeviceidinargumentsofthisAPI.
Inthissituation,thedeviceiscamouagedcompletely,thustheapplicationrecognizesthecamouageddeviceasthedeviceapplicationspeci-ed.
Thisisaterriblydangerouswaytochangedevice,stillitimprovesapplicationcompatibility.
Thisisriskyintermsofdevicedependentcharacteristics,suchasthememorysize,however,wecanswitchtheprocessorinmoreapplicationswiththismethod.
Hence,thismethodisaceinthehole.
AsshowninTable1,thereareseveraldeviceassignmentoverridingwaysbythecom-binationofthesesteps.
Becausetheyhaveatrade-offbetweenapplicationcompatibilityandapplicationstability,wehavetomakearuleforapplyingthesemethods,andsomehintsareguredoutinSec.
3.
2.
2SystemResourceManagementWeneedasystem-wideresourcemanagerforheterogeneousprocessors,becauseav-eragePCuserscannotchoosepropercomputedeviceforeachapplication,anditis340T.
Horikawaetal.
inconvenientthattheyselectcomputedeviceeverytimetheapplicationruns.
Somead-vancedPCuserscanchoosepropercomputedevicemanually,howeveritisterriblyinconvenient.
Besides,manyPCusersdonotknowdetailedconstructionofthePCtheyareusing.
Theseuserscannotchoosethepropercomputedevicewhichsatisestheirpreferenceaccurately,eveniftheapplicationallowstheusertoselectthecomputede-viceonitsGUIcongurationmenu.
Forachievinghighuser-experience,theresourcemanagershouldselectacomputedeviceautomaticallyaccordingtouser'spreferences.
TherearemanystudiesinHPCareathatbuildaresourcemanagertoselectcomputedeviceautomatically[7,8].
Theyshowtaskdistributingalgorithmforheterogeneousprocessorsenvironmentthatoptimizedforsomespecicpurposes,suchasmaximizingperformanceormaximizingperformance-per-watt.
However,theycannotbeappliedtoresourcemanagementonPCbecausetherequirementsaredifferentbetweenPCandHPC.
Theotherapproachtodifferentiatetasks,suchasdevice-driverlevelapproach[9]wouldbeapossibilityforourgoal.
However,westillneedasystemwideresourcemanagertoconsiderheterogeneousprocessorsandapplications.
Thesearethreere-quirementsoftheresourcemanagerespeciallyforPCs.
–ConsideringuserpreferenceAPCuser'spreferenceoftenchangesandtheyarenotsimpleobjectssuchasmax-imizingperformance.
Inaddition,itisdifculttorecognizewhichapplicationisreallyimportant,becausewerarelyspecifypriorityoftheprocessexplicitly.
There-fore,wehavetobuildaresourcemanager,whichinfersuser'spreferencebycol-lectingPCutilizationstatusandchoosescomputedevicesforeachapplicationtoachieveuserpreferenceaccurately.
–SupportingvarioushardwarecongurationsThereareplentyofPChardwarecomponentsandapplications.
Becauseofthisreason,combinationofhardwarecomponentsandapplicationsareinnumerable.
Inaddition,thespecicationsofcomponentsdependontechnologytrends.
Forinstance,somenewGPUvirtualizationtechnologiesforPCsuchasVirtuGPUvirtualization[11]seamlesslyusediscreteGPUwhenspecicAPIscalled.
Thus,wehavetobuildresourcemanagerthatsupportsvarioushardwarecongurations.
–SupportingvariousruntimeversionsInstalledruntimelibrariesforparallelcomputingmayvaryinPCs.
Applicationexecutionspeedsarenotonlydependsonhardware,butalsodependsonruntimelibrarieslikeOpenCLframeworks.
Thus,acomputedeviceselectingalgorithmop-timizedforspecicruntimeversion,suchasdesignedforHPC,maynotshowgoodresultsonthenewerversionruntimelibraries.
Wehavetobuildcomputedevicese-lectingalgorithmsthatdonotdependonaspecicruntimeversion.
Thisresourcemanagerhasthreefeaturesforsatisfyingtherequirementsexplainedabove.
Therstfeatureisinformationgathering.
PACUEcollectsinformationabouthowPCisutilized,suchaswhetheranACadapterisconnected,temperaturesandvolt-agesofcomponents,andprocessorutilizationlevelsuchasprocessorloadsandtherunningapplicationslist.
Thesecondfeatureistheuserpreferenceinferringfeature.
Theuserdescribestheirrequirementsbycreatingseveralrequirementpatterns.
PACUEinferswhichpatternisthebestforthepresentsituationbyusinginformationacquiredinPACUE:ProcessorAllocatorConsideringUserExperience341therststep.
Thethirdfeatureiscomputedeviceselection,whichdecidestheOpenCLdevicetobeassignedtoeachapplication.
Weplantoimplementafewcomputedeviceselectingalgorithmsforseveraluserpreferencepatterns.
PACUEwillassigncomputedevicestoeachapplicationbasedonthealgorithmwhichmatchestheinferredpatternofuserpreference.
Theresourcemanagerworksascyclesofthesesteps:1.
CollectPCutilizationinformation.
2.
Guesswhichproleisthebestforthepresentcondition.
3.
Waitaninquiryofapplicationandanswerwhichdeviceshouldbeused.
Forevaluationpurpose,webuiltabasicresourcemanagerwhichhascommunicationfunctiontoorderapplicationstoutilizespeciccomputedevice.
Becauseoflackofuserpreferencebasedcomputedeviceselectingalgorithms,recentPACUEcanonlyselectcomputedevicebymanualselectionintheresourcemanagerGUI.
Still,itcanreceiveaninquiryofcomputedeviceselectionandansweracomputedevicetoutilize.
3EvaluationInthissectionweconrmPACUEprovidescomputedevicesredirectioncapabilityforapplicationswithoutmodicationonwidelyusedapplications.
Werststatethepolicyoftheevaluation,thenshowandanalyzetheresults.
3.
1EvaluationPolicyWeevaluatePACUEinaPCwithIntelCorei7-920CPUandAMDRADEONHD4850GPU.
AsOpenCLframework,weadoptx86binaryofATIStreamSDK2.
2[4].
ThisframeworksupportsbothCPUsandAMDRADEONGPUsasOpenCLdevices.
Astestingapplications,wechosethefollowings.
Theyarepubliclyreleasedandwidelyusedforbenchmarking,thussuitesourpurpose.
–DirectCompute&OpenCLBenchmark[1]–SiSoftwareSandra2011[15]–Samplecodeof"OpenCLIntrodouction"book[3]Weswitchthedevicetoutilizefortheseapplications,andcomparethemethodsfordeviceswitchingforeachoftheseapplications.
3.
2ResultsDirectCompute&OpenCLBenchmark.
Table2showstheresults.
PACUEcanredirectcomputedeviceperfectlyonDirectCompute&OpenCLBenchmark,butonlywithmethodD.
SiSoftwareSandra2011.
Deviceswitchingfailed.
WhenPACUEtriedtoswitchthedevice,Sandra2011exhibitedstrangebehavior,suchasshowingthesamedevicetwiceintheGUI.
BecauseSandra2011isaninformation&diagnosticutilityforPC,itgathersdeviceinformationbyvariousAPIs.
Thus,thefailuremaybecausedbythelackofintegritybetweendeviceinformationgatheredbyPACUEhookedOpenCLAPIandinformationgatheredbyotherAPIs.
However,PACUEdonotmakeSandracrashed.
342T.
Horikawaetal.
Table2.
ResultofDirectCompute&OpenCLBenchmarkOverrideMethodA-1A-2B-1B-2C-1C-2D-1D-2SpeciedDeviceTypeCPUGPU\\\\ALLALLSpeciedDeviceIDforContext\\CPUsGPUsALLALLALLALLSp.
Dev.
IDforCommandQueue\\\\CPUGPUCPUGPUApplicationRecognizedDevicesCPU*2GPU*2CPU*1GPU*1CPU*1CPU*1CPU*1+GPU*1CPU*1+GPU*1DynamicDeviceSwitchingImpossibleImpossibleStaticStaticStaticStaticDynamicDynamicSampleCodesof"OpenCLIntroduction"Book.
Thesecodesareasetof20sampleapplicationsofOpenCLAPIs.
Thedeviceswitchingsucceededforallapplicationsinthem.
However,1sampleusesdevicememoryinformationfortheoptimizedarraysize,thustheresultmightdependonthedevice.
Thecompletecamouagingdeviceinfor-mationmightthusbeincompatiblewiththeinformationexpectedbythesample.
Thiscancausetheapplicationcrashingorerrors,howeveritseemedtobeworkingcorrectlywhiletheexperiment.
3.
3AnalysisTheresultsshowthatPACUEcanswitchthecomputedevicesonrealapplications.
However,itfailsfordevicedependentapplications.
Theyusedetailedinformationoftheparticulardevice,suchasdevicememorysize.
Thus,theymaycrashorbehavestrangelybecauseoftheinformationcamouagedbyPACUE.
Amongcombinationsofthedeviceinformationoverriding,wefoundtheproperor-dertoapplyonapplications.
ShowninTable1,thesemethodshaveatrade-offbetweenapplicationstabilityandapplicationcompatibility.
Inourevaluation,wefoundthatthecompletecamouagingmethodsignicantlyincreaseapplicationcompatibilityforrealapplications,suchasDirectCompute&OpenCLBenchmark.
However,itisrealizedbygivingapplicationstheinformationofthedevicetheapplicationspecied,insteadofgivingthedeviceinformationactuallyusing.
Originalapplicationcreatoristheonlyonewhoknowsiftheapplicationworkscorrectlywhenusingthecompletecamouag-ingmethod,thusweshouldavoidusingthisriskymethodifpossible.
Ingeneral,wesuggestthefollowingmethodapplyingorder;1.
OverridedevicetypeALLandoverridedeviceidwhencreatingcontext.
(Table1B)2.
OverridedevicetypeALLandoverridedeviceidwhencreatingcommandqueue.
(Table1D)3.
Keeporiginaldevicetypeandoverridedeviceidwhencreatingcommandqueue.
(Table1C)4.
OverridedevicetypeCPUorGPUwhenapplicationrequestslistofavailablede-vices.
(Table1A)Thersttothethirdmethodssimilarlyrealizedynamicdeviceselection.
Theupperissafer,thelowerhasmorecompatibility.
Applicationsthatcannotswitchdeviceswiththerstmethodshouldusethesecondorthethirdmethod.
Thelastonehasthehigh-estcompatibilitybutitonlyprovidesstaticandrestrictivedeviceswitching.
Thus,thismethodshouldbeappliedwhenallothermethodsfail.
PACUE:ProcessorAllocatorConsideringUserExperience3434ConclusionsandFutureWorkInthispaperwepresentedPACUE.
First,PACUEswitchesthecomputedevicesdynam-icallyforapplicationsonPCswithheterogeneousprocessors.
Second,PACUEchoosescomputedevicesassignedtoapplicationstomeettheuser'srequirement.
Weconductedexperimentsofourimplementation,anddemonstratedthat1outof2realOpenCLap-plications,andallof20sampleprogramscanchangethecomputedevicedynamicallywiththedynamiccomputedeviceredirector.
Inaddition,weshowedthatafewde-viceinformationcamouagingmethodssignicantlyincreaseapplicationcompatibil-ity.
Fromabovework,wedemonstratedpotentialavailabilityofthedynamiccomputedeviceredirectingwithoutapplicationmodied.
However,thereare2technicaldisad-vantagesinPACUE.
TherstdisadvantageisthatPACUEcanswitchdevicesonlywhencreatingcommandqueue.
Thisisbecausethereisnosupportfordynamicdeviceswitch-inginOpenCL,thusthechancesforswitchingdevicesarelimited.
Wewillinvestigateothermethodstoexpandthechancesforswitchingdevices,alsowewillinvestigatethefrequenciesofthedeviceswitchingtimingonotherAPIs.
TheseconddisadvantageisOpenCLkerneloptimization.
Becauseofdeviceinformationcamouaging,thereisapossibilityofexecutingkernelsdesignedforotherdevices.
Thismaydecreasetheper-formancesignicantly,thusweshouldavoidmakingsituationslikethat.
OneansweriscachingeverytypeofkernelsourcecodesbyAPIhooking,andswitchitaccordingtothedeviceactuallyusing.
Anotheranswerisapplyingjust-in-timeOpenCLcodeopti-mizationtechniquetoimproveperformance.
However,bothofthemcaninterferethecopyrightlaworlicensesoftheapplications.
Therefore,itmaybedifculttoapplyitforPCapplications.
Becauseofthisreason,wecontinueimprovingcamouagemethodsandwewillavoidshowingdifferentdevicesinformationaspossibleaswecan.
Forourresearchgoals,wehavetheseongoingworks:IncreaseCompatibilityforApplications.
WewilladdresstheproblemthatPACUEcannotswitchcomputedevicesinsomeapplications.
Alsowewillexperimentapplica-tionstabilitytestsonapplications.
EvaluateinManyHardwareEnvironment.
WewillconductexperimentsonmorehardwarecongurationsuchasVirtu,andimprovehardwaresupportofPACUE.
ImplementtheUserPreferencesHandlerintheResourceManager.
Weassumethatthereareseveralpatternsdescribinguserpredenedrequirements(e.
g.
,playingimportantgamewiththeACadaptor,andhastylecompressionwithunremarkablevideoencoding).
PACUEinfersmatchingpatternfromtheuser'sactivityandresourceutilization.
ImplementComputeDeviceSelectingAlgorithm.
Withuserrequirementrecogni-tion,weselectcomputedevicestofollowuserpreferenceaccurately.
Wewillimple-mentsomealgorithmsandparametersetsforeachuserrequirementpattern.
Also,wewillexploreperformanceimpactwhileredirectingcomputedeviceinrealapplicationsandtakemeasureagainstheavyperformancedegradation.
ShowingapplicationsnoOpenCLdevicebyoverridingOpenCLAPIscanbeoneoftheanswers.
Inthiscase,344T.
Horikawaetal.
applicationswilluseinternaloptimizedassemblytoexecuteitstransactionanditisoftenmuchfasterthanexecutingOpenCLcodeonCPUs.
However,ithasadisadvan-tagethatcomputedevicecannotchangeuntilrestartingtheapplication,becausetheapplicationwillnevercallOpenCLAPIsagain.
Therefore,wewillinvestigateeachapplication'sbehaviorconcretelytodecidehowtoletapplicationtouseCPUs.
SupportforOtherParallelComputingFrameworks.
Weplantoimplementmod-ulesforotherAPIssuchasFusionSystemArchitectureIntermediateLayerLanguage(FSAIL).
References1.
DirectCompute&OpenCLBenchmark,http://www.
ngohq.
com/graphic-cards/16920-directcompute-and-opencl-benchmark.
html(accessedonAugust21,2011)2.
OpenCL1.
1Specication,http://www.
khronos.
org/registry/cl/specs/opencl-1.
1.
pdf3.
FixtarsCorporation:OpenCLIntroduction-ParallelProgrammingforMulticoreCPUsandGPUs.
ImpressJapan(January2010)(inJapanese)4.
AMD.
ATIStreamTechnology,http://www.
amd.
com/US/PRODUCTS/TECHNOLOGIES/STREAM-TECHNOLOGY/Pages/stream-technology.
aspx(accessedonAu-gust21,2011)5.
Aoki,R.
,Oikawa,S.
,Tsuchiyama,R.
,Nakamura,T.
:Hybridopencl:Connectingdifferentopenclimplementationsovernetwork.
In:Proc.
IEEECIT2010,pp.
2729–2735(2010)6.
Brodman,J.
C.
,Fraguela,B.
B.
,Garzaran,M.
J.
,Padua,D.
:Newabstractionsfordataparallelprogramming.
In:Proc.
USENIXHotPar,p.
16(2009)7.
Diamos,G.
F.
,Yalamanchili,S.
:Harmony:anexecutionmodelandruntimeforheteroge-neousmanycoresystems.
In:Proc.
ACMHPDC,pp.
197–200(2008)8.
Gupta,V.
,Schwan,K.
,Tolia,N.
,Talwar,V.
,Ranganathan,P.
:Pegasus:CoordinatedSchedul-ingforVirtualizedAccelerator-basedSystems.
In:Proc.
USENIXATC,pp.
31–44(2011)9.
Kato,S.
,Lakshmanan,K.
,Rajkumar,R.
,Ishikawa,Y.
:TimeGraph:GPUSchedulingforReal-TimeMulti-TaskingEnvironments.
In:Proc.
USENIXATC,pp.
17–30(2011)10.
Liu,W.
,Lewis,B.
,Zhou,X.
,Chen,H.
,Gao,Y.
,Yan,S.
,Luo,S.
,Saha,B.
:Abalancedpro-grammingmodelforemergingheterogeneousmulticoresystems.
In:Proc.
USENIXHotPar,p.
3(2010)11.
Lucidlogix.
Lucidlogixvirtu,http://www.
lucidlogix.
com/product-virtu.
html(accessedonAugust21,2011)12.
Microsoft.
CreateRemoteThreadFunction(Windows),http://msdn.
microsoft.
com/en-us/library/ms682437.
aspx(accessedonAugust21,2011)13.
Microsoft.
SetWindowsHookExFunction(Windows),http://msdn.
microsoft.
com/en-us/library/ms644990.
aspx(accessedonAugust21,2011)14.
MicrosoftResearch.
Detours-microsoftresearch,http://research.
microsoft.
com/en-us/projects/detours/(accessedonAugust21,2011)15.
SiSoftware.
Sisoftwarezone,http://www.
sisoftware.
net/(accessedonAugust21,2011)

妮妮云,美国cera CN2线路,VPS享3折优惠

近期联通CUVIP的线路(AS4837线路)非常火热,妮妮云也推出了这类线路的套餐以及优惠,目前到国内优质线路排行大致如下:电信CN2 GIA>联通AS9929>联通AS4837>电信CN2 GT>普通线路,AS4837线路比起前两的优势就是带宽比较大,相对便宜一些,所以大家才能看到这个线路的带宽都非常高。妮妮云互联目前云服务器开放抽奖活动,每天开通前10台享3折优惠,另外...

Hostwinds:免费更换IP/优惠码美元VPS免费更换IP4.99,7月最新优惠码西雅图直连VPS

hostwinds怎么样?2021年7月最新 hostwinds 优惠码整理,Hostwinds 优惠套餐整理,Hostwinds 西雅图机房直连线路 VPS 推荐,目前最低仅需 $4.99 月付,并且可以免费更换 IP 地址。本文分享整理一下最新的 Hostwinds 优惠套餐,包括托管型 VPS、无托管型 VPS、Linux VPS、Windows VPS 等多种套餐。目前 Hostwinds...

打开海外主机域名商出现"Attention Required"原因和解决

最近发现一个比较怪异的事情,在访问和登录大部分国外主机商和域名商的时候都需要二次验证。常见的就是需要我们勾选判断是不是真人。以及比如在刚才要访问Namecheap检查前几天送给网友域名的账户域名是否转出的,再次登录网站的时候又需要人机验证。这里有看到"Attention Required"的提示。我们只能手工选择按钮,然后根据验证码进行选择合适的标记。这次我要选择的是船的标识,每次需要选择三个,一...

sisoftwaresandra为你推荐
云爆发养兵千日用兵千日这个说法对不对sonicchat苹果手机微信显示WeChat地图应用手机地图软件那么多,都不知道用哪个好了?www.kkk.com谁有免费的电影网站,越多越好?lunwenjiancepaperfree论文检测怎样算合格冯媛甑夏如芝是康熙来了的第几期?rawtools照片上面的RAW是什么意思,为什么不能到PS中去编辑www.522av.com我的IE浏览器一打开就是这个网站http://www.522dh.com/?mu怎么改成百度啊 怎么用注册表改啊www.15job.com广州天河区的南方人才市场百度关键字在百度 输入任何关键词,可以搜出想要的内容,但是 搜索工具栏里面的字,却始终是同一个关键词, 如图
国外vps 美国加州vps 个人域名备案流程 n点虚拟主机管理系统 华为云服务 香港vps99idc 主机点评 加勒比群岛 韩国空间 softbank官网 godaddy支付宝 http500内部服务器错误 hnyd 空间出租 howfile 免费吧 阿里校园 cn3 免费智能解析 流媒体加速 更多