PACUE: Processor Allocator Considering User Experience

Tetsuro Horikawa¹, Michio Honda¹, Jin Nakazawa², Kazunori Takashio², and Hideyuki Tokuda²,³

¹ Graduate School of Media and Governance, Keio University
² Faculty of Environment and Information Studies, Keio University, 5322, Endo, Fujisawa, Kanagawa 252-8520, Japan
³ JST-CREST, Japan
{techi,jin,kaz,hxt}@ht.sfc.keio.ac.jp, micchie@sfc.wide.ad.jp

Abstract. GPU-accelerated applications, including GPGPU ones, are commonly seen on modern PCs. If many applications compete for the same GPU, performance decreases significantly. Some applications have a large impact on user experience; for such applications, we have to limit GPU utilization by the other applications. It might be straightforward to modify applications so that they switch compute devices dynamically for intelligent resource allocation. Unfortunately, we cannot do so because of software distribution policies and other reasons. In this paper, we propose PACUE, which allows the end system to allocate compute devices to applications arbitrarily. In addition, PACUE guesses the optimal compute device for each application according to user preference. We implemented the dynamic compute device redirector of PACUE, including OpenCL API hooking and device camouflaging features. We also implemented a skeleton of the resource manager of PACUE. We demonstrate that PACUE achieves dynamic compute device redirection on one out of two real applications and on all of 20 sample codes.

Keywords: Resource management, OpenCL, binary compatibility, GPU, GPGPU, PC, user experience.

1 Introduction

Graphics Processing Unit (GPU) use has been extended to a wider range of computing purposes on the PC platform. GPU utilization on PCs can be classified into four purposes. The first is 3D graphics computation, such as 3D games and 3D-graphics-based GUI shells (e.g., Windows Aero). The second is 2D graphics acceleration, such as font rendering in modern web browsers. The third is video decoding and encoding acceleration. Video player applications use the video decoding acceleration function of the GPU to reduce CPU load and to increase video quality. Also, some GPUs have video encoding acceleration units on the GPU die. The last purpose is general-purpose computing, called General-Purpose computing on GPU (GPGPU). On PCs, GPGPU is often used by video encoding applications and physics simulation applications, including 3D games.¹

¹ Some 3D games utilize the GPU for general-purpose computing besides 3D graphics rendering.

M. Alexander et al. (Eds.): Euro-Par 2011 Workshops, Part II, LNCS 7156, pp. 335–344, 2012. © Springer-Verlag Berlin Heidelberg 2012

In today's PCs, GPUs are utilized efficiently because only a few applications are accelerated at the same time; these applications do not compete with each other for the same GPU. Applications thus choose compute devices statically, for example by user selection in the application's GUI configuration menu. However, we envisage that more and more applications will utilize GPUs. For example, Open Computing Language (OpenCL) [2] allows applications to select the compute device explicitly to execute some parts of the application. Therefore, efficient load balancing between compute devices consisting of CPUs and GPUs is essential for future consumer PCs.

There are three technical challenges to achieving efficient compute device assignment for heterogeneous processors in PCs. First, GPU acceleration is utilized for various purposes on PCs, whereas GPUs are utilized mainly for general-purpose computing in supercomputers. In addition, some tasks running on PCs strongly require specific processors. For example, 3D rendering is normally processed by GPUs, and some 3D graphics transactions cannot be processed by CPUs, whereas some applications can be processed by both CPUs and GPUs. When the GPU load is high, we could run the latter applications explicitly on CPUs.

Second, we must not modify applications. Typically, most applications installed on major OSes such as Windows and Mac OS cannot be modified by a third party because of their software distribution policies. Application vendors may not be willing to modify their applications either, because doing so does not benefit them directly. For these reasons, existing runtime libraries and task distribution libraries [6,10,7] proposed for HPC are not deployable on consumer PCs.

Third, the performance metric for consumer PCs is complicated, because user preference is one of the most important metrics for assigning compute devices to applications. This clearly differs from general HPC metrics, whose task distribution policy is usually static, such as maximizing task transaction speed or maximizing performance per watt. On PCs, task distribution policies and merits easily change depending on the use. For example, when the user would like to play a 3D game smoothly, the other GPGPU tasks should not be assigned to the GPU. On the other hand, the user might sometimes prefer to transcode videos quickly rather than play a trifling game smoothly. The compute device selection method must recognize user preferences to decide the proper compute device to assign. However, this is hard, and user preference recognition cannot be fully automated. Therefore, the resource manager has to infer how the PC is being utilized, and users have to be able to tell it how they are using the PC at that time.

In this paper, we propose PACUE, which allocates compute devices to applications efficiently. PACUE has two features: dynamic compute device redirection and system-wide optimal device selection. We strongly focus on solving real problems that will occur when we distribute our system worldwide via the web. Therefore, we prefer a politically safer method over a technically better one. Thus, the first advantage of PACUE is its deployability. The second advantage is that PACUE is designed to maximize PC users' experience. We thereby bring a new metric for using accelerators, which will also be beneficial for other computers such as smartphones and game consoles.

Our experimental results show that PACUE can switch compute devices in 1 out of 2 applications and in all of 20 sample codes built with OpenCL.

The remainder of this paper is organized as follows. In Sec. 2, we describe the design of PACUE, which consists of the dynamic compute device redirection and the system resource manager. In Sec. 3, we evaluate our prototype implementation. The paper concludes with Sec. 4.

2 Designing PACUE

PACUE consists of two components: the Dynamic Compute Device Redirector and the Resource Manager. We focus on applications built with OpenCL, a widely used framework that supports many types of compute devices such as CPUs and GPUs.

2.1 Dynamic Compute Device Redirection

We designed the Dynamic Compute Device Redirection (DCDR) method to meet the "no application modification" requirement. DCDR implements OpenCL API hooking, which conceals the actual compute devices from applications, and avoids errors caused by inconsistent device information.

OpenCL API Hooking. OpenCL abstracts compute devices and the memory hierarchy to utilize heterogeneous processors within its programming model. To utilize a compute device, applications call OpenCL APIs and specify a compute device. The assignment process is as follows. First, obtain the list of available devices. Second, select possible devices and create an OpenCL context. Third, select one device to use and create a command queue. Last, put tasks into the queue created above. In the second and third steps, the application specifies a concrete device because the OpenCL APIs take a device ID as a parameter, which makes system-wide optimal device selection impossible. For optimal device selection, we remove the restriction that applications must choose the device by themselves, because the decision is hard for applications and users, and decisions made by applications or users are rarely optimal (see Sec. 2.2). PACUE hooks the OpenCL APIs that concern device selection and implements an asking function that asks the resource manager which device to utilize.
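
The control flow of such a hook can be sketched with a small Python model. This is not OpenCL code and not the implementation described in this paper: `Device`, `make_hooked_create_queue`, and `ask_resource_manager` are hypothetical names introduced only for illustration. The sketch assumes that the hooked queue-creation call asks the resource manager for a device and then forwards to the original call.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for an OpenCL device handle."""
    device_id: int
    kind: str  # "CPU" or "GPU"


Queue = Tuple[str, Device]


def make_hooked_create_queue(
    original_create_queue: Callable[[Sequence[Device], Device], Queue],
    ask_resource_manager: Callable[[Sequence[Device], Device], Device],
) -> Callable[[Sequence[Device], Device], Queue]:
    """Return a replacement for the queue-creation call.

    The replacement asks the resource manager which device to use and then
    forwards to the original call with that device.
    """
    def hooked(context_devices: Sequence[Device], requested: Device) -> Queue:
        chosen = ask_resource_manager(context_devices, requested)
        # Only redirect to a device that belongs to the context; otherwise
        # keep the application's own choice.
        if chosen not in context_devices:
            chosen = requested
        return original_create_queue(context_devices, chosen)
    return hooked


# Example: the application asks for the GPU, the manager answers "CPU".
cpu, gpu = Device(0, "CPU"), Device(1, "GPU")
create_queue = make_hooked_create_queue(
    original_create_queue=lambda devices, device: ("queue", device),
    ask_resource_manager=lambda devices, requested: cpu,
)
queue = create_queue([cpu, gpu], gpu)
```

In this model the application code is unchanged: it still passes the GPU, and only the function it ends up calling differs.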

There are several methods to hook APIs in Windows 7, where PACUE is implemented. The first possibility is creating a thread in the target application by calling the Windows API CreateRemoteThread() [12]. With this method, we implement an application that creates a thread in other applications and maps an external DLL containing the overridden target APIs. However, such applications and DLLs are hard to implement because of the complicated procedures involved, and they risk being treated as malware by anti-malware software. The second possibility is a global hook, in which a user application hooks specific APIs of all applications by calling the Windows API SetWindowsHookEx() [13]. This method is unsafe, because it risks hooking unknown applications and causing unexpected effects on them. The third possibility is making a wrapper DLL, which is a DLL that has the same file name as the original DLL and exports all APIs of the original DLL. A wrapper DLL is almost a shell of the original DLL, because most of its APIs simply call the original DLL's APIs, except those that actually need to behave differently from the original. This method has the best chance of hooking APIs, because by default a wrapper DLL located in the application directory is always loaded before other ones, such as DLLs located in system directories. In addition, a wrapper DLL placed in the directory of the target EXE only affects applications whose binaries are located in the same directory. Therefore, this is a really safe way to hook APIs. The last possibility is the use of API hook libraries, such as [14]. These libraries are easy to use; however, they are less likely to succeed in hooking APIs than a wrapper DLL, and they also risk being treated as malware. From this comparison, we adopt the wrapper DLL method. Fig. 1 illustrates the architecture for hooking OpenCL APIs with this method. Other major PC OSes such as Mac OS and Linux do not provide a mechanism like wrapper DLLs; still, we can implement a similar system by using the API hooking functions offered by those OSes.

Fig. 1. Dynamic Compute Device Switching by OpenCL API Hooking
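
The "shell" behavior of a wrapper DLL, where most entry points are forwarded unchanged and only a few are replaced, can be modeled in a few lines of Python. This is only an analogy under stated assumptions: `WrapperLibrary` is a hypothetical class, and the mock functions borrow the names `clGetDeviceIDs` and `clFinish` but not their real C signatures or return values.

```python
from types import SimpleNamespace


class WrapperLibrary:
    """Forwards every call to the original library except the overridden ones."""

    def __init__(self, original, overrides):
        self._original = original
        self._overrides = dict(overrides)

    def __getattr__(self, name):
        # Called only for attributes not found on the wrapper itself,
        # i.e. for every API name the application looks up.
        if name in self._overrides:
            return self._overrides[name]
        return getattr(self._original, name)


# Mock "original DLL" with two entry points (return values are placeholders).
original = SimpleNamespace(
    clGetDeviceIDs=lambda: ["GPU0"],
    clFinish=lambda: "finished",
)

# The wrapper changes only the device-listing call and passes the rest through.
wrapper = WrapperLibrary(original, {"clGetDeviceIDs": lambda: ["CPU0", "GPU0"]})
```

An application that loads `wrapper` in place of `original` sees the same set of entry points, which mirrors why the wrapper DLL must export every API of the original DLL.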

Another method to switch devices is making a virtual device [5]. With this method, applications select the virtual device and the resource management system chooses a real device. This method has the significant advantage that it can switch real devices at any time; however, it may conflict with the Installable Client Driver (ICD) system of OpenCL. Installers of OpenCL runtime libraries distributed by hardware vendors sometimes overwrite the "OpenCL.dll" file, so installing a virtual device, or showing applications only the virtual device, is difficult on PCs.

Device Information Camouflaging. When part of an application's tasks is assigned to the OpenCL device selected by PACUE, some applications show errors. This is because the device information differs from what the application intended, so some applications treat it as an unusual event. To avoid these errors, PACUE camouflages OpenCL device details when the desired OpenCL device has been changed dynamically.

However, camouflaging OpenCL device details is risky, because devices have different specifications at the lower level. The first risk is application stability. The memory size at each level of the hierarchy is device dependent; hence, an unexpected memory size can result in an application crash or error. The second risk is execution speed. If an application implements per-device optimization, a mismatch between the intended device and the assigned device can result in unexpected performance degradation. For these reasons, we should camouflage device details only when necessary.

To minimize the risks, PACUE camouflages devices at the following levels.

1. Device type level camouflage
When an application tries to acquire an OpenCL device list, PACUE overwrites the cl_device_type value. As far as possible, PACUE changes this value to CL_DEVICE_TYPE_ALL. Showing all devices instead of only devices of the specified type is a reasonable choice, because it avoids forcing the application to use an unknown device. Occasionally, applications cannot execute their OpenCL code on some device types. In this case, PACUE sets the cl_device_type value to the desired type, such as CL_DEVICE_TYPE_CPU or CL_DEVICE_TYPE_GPU.

2. Context level camouflage
When an OpenCL context is created, PACUE overrides the cl_device_id value and forces the OpenCL framework to build OpenCL binaries for each compute device. If PACUE recognizes that the target application supports only a specific type of compute device, PACUE overwrites the cl_device_id value and limits the device types for the context. In addition, PACUE overrides the cl_device_id value when applications request detailed device information. Therefore, the application sees information about the device PACUE selected. This contributes to application stability, because the acquired device information, such as the memory size, corresponds to that of the device that will actually be used.

3. Command queue level camouflage
When the application calls the clCreateCommandQueue() API, this is the last chance to change the device. Because of the stability issue described above, PACUE tries not to change the device at this point, but if necessary, PACUE changes the cl_device_id in the arguments of this API. In this situation, the device is camouflaged completely, so the application recognizes the camouflaged device as the device it specified. This is a very dangerous way to change the device, but it improves application compatibility. It is risky in terms of device-dependent characteristics such as the memory size; however, we can switch the processor in more applications with this method. Hence, this method is our ace in the hole.

Table 1. Comparison of Device Camouflaging Methods

| Overridden device type/ID | Specified type when getting device list | Specified ID when creating a context | Specified ID when creating a command queue | Crash/error risk | Compatibility |
|---|---|---|---|---|---|
| A. Device type level | CPUs or GPUs | All CPUs or all GPUs | \ | Low | Most applications |
| B. Context level | \ | CPUs or GPUs | \ | Low | Low |
| C. Command queue level | \ | ALL | One CPU or one GPU | High | Most applications |
| D. A + C | ALL | ALL | One CPU or one GPU | Normal | High |
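
The decision rules at the device type level and the command queue level can be sketched as two small Python functions. This is an illustrative model, not PACUE's code: the function names and the `restricted_to` and `allow_queue_level` parameters are assumptions made for the sketch. The three constants use the bit-field values defined in the OpenCL headers.

```python
# Bit-field values as defined in the OpenCL headers.
CL_DEVICE_TYPE_CPU = 1 << 1
CL_DEVICE_TYPE_GPU = 1 << 2
CL_DEVICE_TYPE_ALL = 0xFFFFFFFF


def camouflage_device_type(requested_type, restricted_to=None):
    """Device type level: value to pass on when the application lists devices.

    The application's requested_type is overwritten. restricted_to is a
    hypothetical per-application setting: None if the application can run on
    any device type, otherwise the only device type it supports.
    """
    if restricted_to is None:
        return CL_DEVICE_TYPE_ALL
    return restricted_to


def camouflage_queue_device(requested_id, selected_id, allow_queue_level):
    """Command queue level: device ID to pass to queue creation.

    The swap is applied only when explicitly allowed, because afterwards the
    application still believes it is running on requested_id.
    """
    if allow_queue_level and selected_id != requested_id:
        return selected_id
    return requested_id
```

Keeping the command queue level behind an explicit flag reflects the text's advice to use complete camouflaging only when necessary.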

As shown in Table 1, there are several ways to override device assignment by combining these steps. Because they involve a trade-off between application compatibility and application stability, we have to make a rule for applying these methods; some hints are worked out in Sec. 3.

2.2 System Resource Management

We need a system-wide resource manager for heterogeneous processors, because average PC users cannot choose the proper compute device for each application, and it is inconvenient for them to select a compute device every time an application runs. Some advanced PC users can choose the proper compute device manually; however, doing so is very inconvenient. Besides, many PC users do not know the detailed configuration of the PC they are using. These users cannot accurately choose the compute device that satisfies their preference, even if the application allows the user to select the compute device in its GUI configuration menu. To achieve a good user experience, the resource manager should select a compute device automatically according to the user's preferences.

There are many studies in the HPC area that build a resource manager to select compute devices automatically [7,8]. They present task distribution algorithms for heterogeneous processor environments that are optimized for specific purposes, such as maximizing performance or maximizing performance per watt. However, they cannot be applied to resource management on PCs because the requirements of PCs and HPC differ. Other approaches that differentiate tasks, such as the device-driver-level approach [9], would be a possibility for our goal. However, we still need a system-wide resource manager that considers heterogeneous processors and applications. The following are three requirements of the resource manager that are specific to PCs.

– Considering user preference
A PC user's preferences often change, and they are not simple objectives such as maximizing performance. In addition, it is difficult to recognize which application is really important, because users rarely specify the priority of a process explicitly. Therefore, we have to build a resource manager that infers the user's preference by collecting PC utilization status and chooses compute devices for each application to satisfy the user's preference accurately.

– Supporting various hardware configurations
There are plenty of PC hardware components and applications, so the combinations of hardware components and applications are innumerable. In addition, the specifications of components depend on technology trends. For instance, some new GPU virtualization technologies for PCs, such as Virtu GPU virtualization [11], seamlessly use a discrete GPU when specific APIs are called. Thus, we have to build a resource manager that supports various hardware configurations.

– Supporting various runtime versions
The runtime libraries for parallel computing installed on PCs may vary. Application execution speed depends not only on hardware but also on runtime libraries such as OpenCL frameworks. Thus, a compute device selection algorithm optimized for a specific runtime version, such as one designed for HPC, may not show good results on newer versions of the runtime libraries. We have to build compute device selection algorithms that do not depend on a specific runtime version.

This resource manager has three features for satisfying the requirements explained above. The first feature is information gathering. PACUE collects information about how the PC is being utilized, such as whether an AC adapter is connected, the temperatures and voltages of components, and processor utilization, such as processor loads and the list of running applications. The second feature is user preference inference. The user describes their requirements by creating several requirement patterns. PACUE infers which pattern is the best for the present situation by using the information acquired in the first step. The third feature is compute device selection, which decides the OpenCL device to be assigned to each application. We plan to implement a few compute device selection algorithms for several user preference patterns. PACUE will assign compute devices to each application based on the algorithm that matches the inferred pattern of user preference. The resource manager works in cycles of these steps:

1. Collect PC utilization information.
2. Guess which profile is the best for the present condition.
3. Wait for an inquiry from an application and answer which device should be used.

For evaluation purposes, we built a basic resource manager that has a communication function for ordering applications to utilize a specific compute device. Because user-preference-based compute device selection algorithms are not yet implemented, the current PACUE can only select a compute device by manual selection in the resource manager GUI. Still, it can receive a compute device selection inquiry and answer with a compute device to utilize.
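
One possible shape of the three-step cycle is sketched below in Python. Because the paper states that the selection algorithms are not implemented yet, everything here is a hypothetical illustration: the profile format, the `infer_profile` and `answer_inquiry` functions, the application names, and the sample status values are all assumptions.

```python
def infer_profile(status, profiles, default):
    """Step 2: return the first profile whose condition matches the status."""
    for profile in profiles:
        if profile["matches"](status):
            return profile
    return default


def answer_inquiry(app_name, profile):
    """Step 3: tell an inquiring application which device type to use."""
    return profile["devices"].get(app_name, profile["fallback"])


# Hypothetical requirement patterns written by the user.
gaming = {
    "matches": lambda s: "game.exe" in s["running_apps"] and s["ac_connected"],
    "devices": {"game.exe": "GPU"},
    "fallback": "CPU",  # keep other OpenCL work off the GPU while gaming
}
default = {"matches": lambda s: True, "devices": {}, "fallback": "GPU"}

# Step 1 would collect this from the OS; here it is a fixed sample.
status = {
    "ac_connected": True,
    "gpu_load": 0.9,
    "running_apps": ["game.exe", "encoder.exe"],
}

profile = infer_profile(status, [gaming], default)
device = answer_inquiry("encoder.exe", profile)
```

With the sample status, the gaming pattern matches, so the encoder's inquiry is answered with the CPU while the game keeps the GPU.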

3 Evaluation

In this section, we confirm that PACUE provides compute device redirection for widely used applications without modifying them. We first state the policy of the evaluation, then show and analyze the results.

3.1 Evaluation Policy

We evaluate PACUE on a PC with an Intel Core i7-920 CPU and an AMD RADEON HD 4850 GPU. As the OpenCL framework, we adopt the x86 binary of ATI Stream SDK 2.2 [4]. This framework supports both CPUs and AMD RADEON GPUs as OpenCL devices. As test applications, we chose the following. They are publicly released and widely used for benchmarking, and thus suit our purpose.

– DirectCompute & OpenCL Benchmark [1]
– SiSoftware Sandra 2011 [15]
– Sample code of the "OpenCL Introduction" book [3]

We switch the device used by these applications and compare the device switching methods for each of them.

3.2 Results

DirectCompute & OpenCL Benchmark. Table 2 shows the results. PACUE can redirect the compute device perfectly on DirectCompute & OpenCL Benchmark, but only with method D.

SiSoftware Sandra 2011. Device switching failed. When PACUE tried to switch the device, Sandra 2011 exhibited strange behavior, such as showing the same device twice in the GUI. Because Sandra 2011 is an information and diagnostic utility for PCs, it gathers device information through various APIs. Thus, the failure may be caused by inconsistency between the device information gathered through the OpenCL APIs hooked by PACUE and the information gathered through other APIs. However, PACUE did not make Sandra crash.

Table 2. Result of DirectCompute & OpenCL Benchmark

| Override method | A-1 | A-2 | B-1 | B-2 | C-1 | C-2 | D-1 | D-2 |
|---|---|---|---|---|---|---|---|---|
| Specified device type | CPU | GPU | \ | \ | \ | \ | ALL | ALL |
| Specified device ID for context | \ | \ | CPUs | GPUs | ALL | ALL | ALL | ALL |
| Specified device ID for command queue | \ | \ | \ | \ | CPU | GPU | CPU | GPU |
| Application recognized devices | CPU*2 | GPU*2 | CPU*1 | GPU*1 | CPU*1 | CPU*1 | CPU*1 + GPU*1 | CPU*1 + GPU*1 |
| Dynamic device switching | Impossible | Impossible | Static | Static | Static | Static | Dynamic | Dynamic |

Sample Codes of the "OpenCL Introduction" Book. These codes are a set of 20 sample applications of the OpenCL APIs. Device switching succeeded for all of them. However, one sample uses device memory information to determine an optimized array size, so its result might depend on the device. The completely camouflaged device information might thus be incompatible with the information expected by the sample. This can cause the application to crash or produce errors; however, it appeared to work correctly during the experiment.

3.3 Analysis

The results show that PACUE can switch compute devices on real applications. However, it fails for device-dependent applications. They use detailed information about a particular device, such as the device memory size. Thus, they may crash or behave strangely because of the information camouflaged by PACUE.

Among the combinations of device information overriding, we found a proper order in which to apply them to applications. As shown in Table 1, these methods involve a trade-off between application stability and application compatibility. In our evaluation, we found that the complete camouflaging method significantly increases application compatibility for real applications, such as DirectCompute & OpenCL Benchmark. However, it is realized by giving applications the information of the device the application specified, instead of the information of the device actually in use. The original application creator is the only one who knows whether the application works correctly under the complete camouflaging method, so we should avoid this risky method if possible. In general, we suggest applying the methods in the following order:

1. Override the device type with ALL and override the device ID when creating the context. (Table 1 B)
2. Override the device type with ALL and override the device ID when creating the command queue. (Table 1 D)
3. Keep the original device type and override the device ID when creating the command queue. (Table 1 C)
4. Override the device type with CPU or GPU when the application requests the list of available devices. (Table 1 A)

The first to third methods all realize dynamic device selection. The upper ones are safer; the lower ones have more compatibility. Applications that cannot switch devices with the first method should use the second or third method. The last one has the highest compatibility, but it only provides static and restrictive device switching. Thus, this method should be applied when all other methods fail.
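
The suggested order amounts to a simple fallback loop, sketched here in Python. The `choose_method` function and its `switch_succeeds` callback are hypothetical: the sketch assumes that some per-application test, not specified in this paper, reports whether device switching worked under a given method from Table 1.

```python
# Order suggested in the text: safest first, most compatible last.
APPLY_ORDER = ["B", "D", "C", "A"]


def choose_method(switch_succeeds):
    """Return the first method (Table 1 label) for which switching works.

    switch_succeeds is a hypothetical callback that reports whether an
    application switched devices correctly under the given method.
    Returns None if no method works.
    """
    for method in APPLY_ORDER:
        if switch_succeeds(method):
            return method
    return None


# Example: an application for which only method D switches devices.
selected = choose_method(lambda method: method == "D")
```

For an application like the benchmark in Table 2, where only method D yields dynamic switching, the loop skips B and settles on D without ever trying the riskier C.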

4 Conclusions and Future Work

In this paper, we presented PACUE. First, PACUE switches compute devices dynamically for applications on PCs with heterogeneous processors. Second, PACUE chooses the compute devices assigned to applications to meet the user's requirements. We conducted experiments with our implementation and demonstrated that 1 out of 2 real OpenCL applications and all of 20 sample programs can change the compute device dynamically with the dynamic compute device redirector. In addition, we showed that a few device information camouflaging methods significantly increase application compatibility. From the above work, we demonstrated the potential of dynamic compute device redirection without application modification.

However, there are two technical disadvantages in PACUE. The first is that PACUE can switch devices only when a command queue is created. This is because OpenCL has no support for dynamic device switching, so the chances for switching devices are limited. We will investigate other methods to expand the chances for switching devices, and we will also investigate how frequently device switching opportunities arise on other APIs. The second disadvantage concerns OpenCL kernel optimization. Because of device information camouflaging, kernels designed for other devices may be executed. This may decrease performance significantly, so we should avoid such situations. One answer is caching every type of kernel source code through API hooking and switching among them according to the device actually in use. Another answer is applying a just-in-time OpenCL code optimization technique to improve performance. However, both of them may infringe copyright law or the licenses of the applications. Therefore, they may be difficult to apply to PC applications. For this reason, we will continue improving the camouflage methods and will avoid showing information of a different device as far as we can.

For our research goals, we have the following ongoing work:

Increase Compatibility for Applications. We will address the problem that PACUE cannot switch compute devices in some applications. We will also run application stability tests.

Evaluate in Many Hardware Environments. We will conduct experiments on more hardware configurations, such as Virtu, and improve the hardware support of PACUE.

Implement the User Preferences Handler in the Resource Manager. We assume that there are several patterns describing user-predefined requirements (e.g., playing an important game with the AC adaptor, and hasty file compression with unremarkable video encoding). PACUE infers the matching pattern from the user's activity and resource utilization.

Implement Compute Device Selecting Algorithm. With user requirement recognition, we select compute devices to follow user preference accurately. We will implement some algorithms and parameter sets for each user requirement pattern. Also, we will explore the performance impact of redirecting compute devices in real applications and take measures against heavy performance degradation. Showing applications no OpenCL device by overriding OpenCL APIs can be one of the answers. In this case, applications will use internal optimized assembly code to execute their transactions, which is often much faster than executing OpenCL code on CPUs. However, this has the disadvantage that the compute device cannot be changed until the application is restarted, because the application will never call the OpenCL APIs again. Therefore, we will investigate each application's behavior concretely to decide how to let applications use CPUs.

Support for Other Parallel Computing Frameworks. We plan to implement modules for other APIs such as Fusion System Architecture Intermediate Layer Language (FSAIL).

References

1. DirectCompute & OpenCL Benchmark, http://www.ngohq.com/graphic-cards/16920-directcompute-and-opencl-benchmark.html (accessed on August 21, 2011)
2. OpenCL 1.1 Specification, http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf
3. Fixtars Corporation: OpenCL Introduction - Parallel Programming for Multicore CPUs and GPUs. Impress Japan (January 2010) (in Japanese)
4. AMD. ATI Stream Technology, http://www.amd.com/US/PRODUCTS/TECHNOLOGIES/STREAM-TECHNOLOGY/Pages/stream-technology.aspx (accessed on August 21, 2011)
5. Aoki, R., Oikawa, S., Tsuchiyama, R., Nakamura, T.: Hybrid OpenCL: Connecting different OpenCL implementations over network. In: Proc. IEEE CIT 2010, pp. 2729–2735 (2010)
6. Brodman, J.C., Fraguela, B.B., Garzaran, M.J., Padua, D.: New abstractions for data parallel programming. In: Proc. USENIX HotPar, p. 16 (2009)
7. Diamos, G.F., Yalamanchili, S.: Harmony: an execution model and runtime for heterogeneous many core systems. In: Proc. ACM HPDC, pp. 197–200 (2008)
8. Gupta, V., Schwan, K., Tolia, N., Talwar, V., Ranganathan, P.: Pegasus: Coordinated Scheduling for Virtualized Accelerator-based Systems. In: Proc. USENIX ATC, pp. 31–44 (2011)
9. Kato, S., Lakshmanan, K., Rajkumar, R., Ishikawa, Y.: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments. In: Proc. USENIX ATC, pp. 17–30 (2011)
10. Liu, W., Lewis, B., Zhou, X., Chen, H., Gao, Y., Yan, S., Luo, S., Saha, B.: A balanced programming model for emerging heterogeneous multicore systems. In: Proc. USENIX HotPar, p. 3 (2010)
11. Lucidlogix. Lucidlogix virtu, http://www.lucidlogix.com/product-virtu.html (accessed on August 21, 2011)
12. Microsoft. CreateRemoteThread Function (Windows), http://msdn.microsoft.com/en-us/library/ms682437.aspx (accessed on August 21, 2011)
13. Microsoft. SetWindowsHookEx Function (Windows), http://msdn.microsoft.com/en-us/library/ms644990.aspx (accessed on August 21, 2011)
14. Microsoft Research. Detours - microsoft research, http://research.microsoft.com/en-us/projects/detours/ (accessed on August 21, 2011)
15. SiSoftware. Sisoftware zone, http://www.sisoftware.net/ (accessed on August 21, 2011)
