Integration and Advanced Usage
Bitfusion Guide

WHITE PAPER – OCTOBER 2019

Bitfusion: Integration and Advanced Usage

Table of contents

Introduction
Starting FlexDirect Daemons as Servers via CLI
Run Client Applications with FlexDirect via flexdirect client
Configuring IP Addresses as Part of Client Configuration
Advanced: Cluster Communications
Advanced: Flexible Dynamic GPU Configurations
    Exposing One GPU out of the Four GPUs for Apps
    Exposing Two GPUs out of the Four GPUs for Apps
    Partial GPU Configurations

Introduction

This guide is for engineers integrating Bitfusion technology into their own resource scheduler, and for advanced users who need more control over how GPUs are resourced. It shows how to start and invoke both server and client processes with low-level access. You will start a server daemon for a particular GPU configuration (e.g., partial memory) and write a client-side configuration file, 'adaptor.conf', as shown in the examples below. We have done integrations for several job schedulers and resource managers, so contact us if you're looking for help.

Starting FlexDirect Daemons as Servers via CLI

The drawing below shows the four processes that are running on a client (or CPU) node and on a server (or GPU) node when you are interacting with the FlexDirect Server (Dispatcher). It should help you understand the concepts, commands and usage that this manual discusses. Only two processes are directly launched by the user. These are the ones shown in a fixed font as you would type them in a command shell. The drawing also shows the TCP ports used by the GPU server processes.

[Figure: processes on the compute server and the GPU server. Compute server: the user runs flexdirect client -- Cool App, which is linked to the VMware Bitfusion CUDA lib; setup: clients.conf for GPU server:55001. GPU server (GPUs marked as allocated or in use): flexdirect server (Dispatcher), listening on port 55001 (default); CUDA Server, listening on ports 45201+ for datapath messages.]

You must start FlexDirect as a server (which is called Dispatcher) on all the instances that have GPUs which you'd like to make available to your client nodes and applications.

Shell

    flexdirect server [-p port]

You can also start a FlexDirect server (Dispatcher process) from the client machine with the request_gpus command. However, this requires that the GPU server is already running the resource scheduler. Advantages include:

- Prevents multiple users from trying to serve the same GPUs
- Creates the adaptor.conf file for you
- Does not automatically deallocate the GPUs after a client application has finished, so you can run several applications sequentially

However, this document covers manual launches of the FlexDirect server.

Run Client Applications with FlexDirect via flexdirect client

Once the FlexDirect Servers are running, run applications using flexdirect client. Pass the -l parameter a list of the IP addresses of the nodes on which you have FlexDirect Server running. Use semicolons to separate the addresses. Replace <application> with the application you would like to run. Use a double dash (--) before the application if it requires its own arguments.
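
The semicolon-separated list passed to -l can be assembled from individual addresses with a small helper. The following is a minimal sketch, not part of FlexDirect: the function name bf_server_list is hypothetical, and the script only prints the resulting command line instead of running it.

```shell
#!/bin/sh
# Join server addresses ("host" or "host:port") with semicolons, the
# separator the -l option expects according to the text above.
# bf_server_list is a hypothetical helper name, not a FlexDirect command.
bf_server_list() {
    list=""
    for addr in "$@"; do
        if [ -z "$list" ]; then
            list="$addr"
        else
            list="$list;$addr"
        fi
    done
    printf '%s\n' "$list"
}

# Print (rather than run) the resulting command line for inspection.
servers=$(bf_server_list 172.31.51.20 172.31.51.26)
echo "flexdirect client -l \"$servers\" nvidia-smi"
```

The list is quoted on the command line because an unquoted semicolon would be treated by the shell as a command separator.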
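
Shell

    flexdirect client -l "172.31.51.20;172.31.51.26" [--] <application>

You may specify a port number with the standard colon notation:

CPU Server Command Line

    flexdirect client -l 172.31.51.20:55002 nvidia-smi

GPU Server Command Line

    nvidia-smi
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla K80           Off  | 0000:01:00.0     Off |                  N/A |
    | N/A   53C    P8    29W / 149W |      0MiB / 11439MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

[Figure: a user's compute server running flexdirect client, connected to two GPU servers. Each GPU server runs flexdirect server and hosts four GPUs.]

Configuring IP Addresses as Part of Client Configuration

If you want to simplify the flexdirect client command, you can put your Bitfusion server IP addresses into the /etc/bitfusionio/adaptor.conf file. Override the default port by adding :<port> to an address.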
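
CPU Server Command Line

    cat /etc/bitfusionio/adaptor.conf
    172.31.51.20
    172.31.51.26:57001

After writing adaptor.conf, simply run flexdirect client with a GPU application. For example, if you run flexdirect client with nvidia-smi, it will list the GPUs configured.

CPU Server Command Line

    flexdirect client nvidia-smi
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla K80           Off  | 0000:01:00.0     Off |                  N/A |
    | N/A   53C    P8    29W / 149W |      0MiB / 11479MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+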
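
A client-side adaptor.conf like the one above can also be generated by a script. The sketch below assumes the layout inferred from the sample file, one server address per line with an optional :port suffix; the helper name bf_write_adaptor_conf is hypothetical, and the script writes to a scratch file rather than to /etc/bitfusionio.

```shell
#!/bin/sh
# Write one server address per line to the given file.
# Assumption: adaptor.conf holds one "host" or "host:port" entry per line,
# as the sample file in this guide suggests.
# bf_write_adaptor_conf is a hypothetical helper name.
bf_write_adaptor_conf() {
    conf_file=$1
    shift
    : > "$conf_file"                    # create or truncate the file
    for addr in "$@"; do
        printf '%s\n' "$addr" >> "$conf_file"
    done
}

# Write to a scratch location first; copy the result to
# /etc/bitfusionio/adaptor.conf once the contents look right.
conf=$(mktemp)
bf_write_adaptor_conf "$conf" 172.31.51.20 172.31.51.26:57001
cat "$conf"
rm -f "$conf"
```

Generating the file in a temporary location keeps the example free of root privileges and leaves any existing configuration untouched.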

Type flexdirect help or flexdirect help [command] for additional helpful commands and information.

Sample Output

    $ flexdirect help
    NAME:
       flexdirect - Run application with Bitfusion FlexDirect

    USAGE:
       flexdirect "application"
       flexdirect -- [application]
       flexdirect help [command]

    For more information, system requirements, and advanced usage please visit
    https://www-review.vmware.com/solutions/business-critical-apps/hardwareaccelerators-virtualization.html

    COMMANDS:
       init, i             Initialize configuration. Requires root priviledges.
       version, v          Display full FlexDirect version.
       localhealth, LH     Run health check on current node only.
       upgrade, U          Upgrade version. Requires root priviledges.
       uninstall           Uninstall FlexDirect. Requires root priviledges.
       dealloc             Deallocate license certificate. Requires root priviledges.
       crashreport         Send crash report to Bitfusion.
       license             Check license status.
       list_gpus           List the available GPUs in a shared pool.
       help, h             Shows a list of commands or help for one command.

    Client Commands:
       client, c           Run application.
       health, H           Run health check on all specified servers and current node.
       request_gpus        Request GPUs from a shared pool.
       release_gpus        Release GPUs back into a shared pool. Options must match a previous request_gpus command.
       run                 Request GPUs from a shared pool, run a client command, then release the GPUs.
       stats               Gather stats from all servers.
       smi                 Display smi-like info for all servers.
       local               Run a CUDA application locally.
       net_perf            Gather network performance data from all SRS servers.

    Server Commands:
       server, s           Run server.
       resource_scheduler  Run FlexDirect resource scheduler (SRS) on GPU server

    EXAMPLES:
       $ sudo flexdirect init -l
       $ flexdirect resource_scheduler --srs_port 50001
       $ flexdirect run -n 4 --

Here are some flexdirect examples with explanatory comments.

Text

    Initialize flexdirect license before the first run of server on a system
    $ sudo flexdirect init -l

    Run a flexdirect server with default port 55001
    $ flexdirect server

    Run a flexdirect server with a different port
    $ flexdirect server -p 55010

    Run an application with a server running locally with default port 55001
    $ flexdirect client -l "localhost"

    Run an application with multiple servers, local or remote
    $ flexdirect client -l "192.168.0.2:55010;192.168.0.6:51234"

    Run an application with servers specified in one of the default config files
    (~/.bitfusionio/adaptor.conf and /etc/bitfusionio/adaptor.conf in priority order)
    $ flexdirect client

    Run an application with servers specified in a config file
    $ flexdirect client -f

    Run a server with a resource scheduler on a custom port
    $ flexdirect resource_scheduler --srs_port 50001 --port 55010

    Run an application with 4 shared GPUs
    $ flexdirect run -n 4

    Run an application with 2 shared GPUs, using half the available memory, and a custom servers.conf
    $ flexdirect run -n 2 -p 0.5 -s servers.conf

    Run an application with 4 shared GPUs with InfiniBand
    $ flexdirect run -n 4

    Run an application locally, restricted to only half the physical GPU memory
    $ flexdirect local -p 0.5

    Request 8 remote GPUs
    $ flexdirect request_gpus -s servers.conf -f adaptor_8gpu.conf -n 8

    Run an application with the generated config file
    $ flexdirect client -f adaptor_8gpu.conf

    Release the 8 remote GPUs after the application has finished
    $ flexdirect release_gpus -f adaptor_8gpu.conf

    Get help on a specific command (the client command in this example)
    $ flexdirect help client

Advanced: Cluster Communications

If you are unable to open up the default 45201-46225 port range for in-cluster communication, you can override this range by exporting these environment variables on your GPU servers before running the FlexDirect Server (also called Dispatcher):

GPU Server Command Line

    $ export BF_SERVER_PORT_MIN=
    $ export BF_SERVER_PORT_MAX=

Advanced: Flexible Dynamic GPU Configurations

The examples below assume that you have a four-GPU server at IP address 123.45.67.89. We will use this one GPU node for three different client applications with slightly different resource configurations, all sharing the same GPU node.

NOTE

Note how, as we progress through the examples, we use different ports so that each server process uses unique ports for communication. BF_VISIBLE_DEVICES refers to the ID number of each GPU device, which starts at 0. If you have a 4-GPU instance, the IDs would be 0, 1, 2, and 3 respectively. You can see the devices and their specific IDs by running nvidia-smi.
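
To illustrate the pairing of device IDs with unique ports, the sketch below prints one server command per GPU ID, incrementing the port for each. The helper name bf_per_gpu_commands is hypothetical; the script only prints the commands, so each line can be reviewed before it is run on the GPU server.

```shell
#!/bin/sh
# Print one server command per GPU ID, giving each server its own port.
# Arguments: the first port, then the GPU IDs to expose individually.
# bf_per_gpu_commands is a hypothetical helper name.
bf_per_gpu_commands() {
    port=$1
    shift
    for gpu_id in "$@"; do
        echo "BF_VISIBLE_DEVICES=$gpu_id flexdirect server -p $port"
        port=$((port + 1))
    done
}

# Four-GPU server: IDs 0 through 3 on ports 55001 through 55004.
bf_per_gpu_commands 55001 0 1 2 3
```

Starting from a base port and counting upward is only one possible convention; any assignment works as long as no two server processes on the same node share a port.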
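
Exposing One GPU out of the Four GPUs for Apps

Start the FlexDirect Server (also called Dispatcher) on the first GPU device (out of the four we are assuming for these examples) with the following command:

GPU Server Command Line

    BF_VISIBLE_DEVICES=0 flexdirect server -p 55001
    Dispatcher listening...
    Listening on 0.0.0.0:55001

Now run the FlexDirect Client on your CPU node. In this example, we'll do it with the application "nvidia-smi", but you could replace this with the application you would like to run using FlexDirect virtualization.

CPU Server Command Line

    flexdirect client -l 123.45.67.89:55001 nvidia-smi
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla K80           Off  | 0000:01:00.0     Off |                  N/A |
    | N/A   53C    P8    29W / 149W |      0MiB / 11479MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

Exposing Two GPUs out of the Four GPUs for Apps

Start the FlexDirect Server (also called Dispatcher) on the four-GPU node with the following command:

GPU Server Command Line

    BF_VISIBLE_DEVICES=0,1 flexdirect server -p 55002

Run the FlexDirect Client. In this example, we'll do it with the application "nvidia-smi", but you could replace this with the application you would like to run using FlexDirect virtualization.

CPU Server Command Line

    flexdirect client -l 123.45.67.89:55002 nvidia-smi
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 375.26                 Driver Version: 375.26                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  Tesla K80           Off  | 0000:01:00.0     Off |                  N/A |
    | N/A   53C    P8    29W / 149W |      0MiB / 11479MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
    |   0  Tesla K80           Off  | 0000:01:00.0     Off |                  N/A |
    | N/A   53C    P8    29W / 149W |      0MiB / 11479MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

Partial GPU Configurations

1/2-GPU available on port 55001

This is done by setting the environment variable BF_GPU_DEVICE_MEMORY_LIMIT to half of the GPU's memory.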
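
The limit used in this guide's examples, 6291456000, equals 6000 MiB if the variable is expressed in bytes. The guide does not state the unit, so the byte interpretation is an assumption here. Under that assumption, a value for any GPU size and fraction can be computed as sketched below; the helper name bf_mem_limit_bytes is hypothetical.

```shell
#!/bin/sh
# Convert a size in MiB to bytes, optionally scaled by a fraction num/den.
# Assumption: BF_GPU_DEVICE_MEMORY_LIMIT is expressed in bytes.
# bf_mem_limit_bytes is a hypothetical helper name.
bf_mem_limit_bytes() {
    mib=$1
    num=${2:-1}     # numerator of the fraction, defaults to 1
    den=${3:-1}     # denominator of the fraction, defaults to 1
    echo $(( mib * 1048576 * num / den ))
}

bf_mem_limit_bytes 6000          # prints 6291456000
bf_mem_limit_bytes 11439 1 2     # prints 5997330432, half of 11439 MiB
```

The arithmetic relies on the shell supporting 64-bit integers, which holds for common shells on 64-bit systems.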

NVIDIA GPU SETTING TO ALLOW SHARING

When you partition a GPU, presumably you want to be able to use both partitions simultaneously. NVIDIA GPUs have a compute mode that should be set to "Default" (not "Exclusive") so that multiple applications can share access. Use the nvidia-smi -a command to see the current compute mode setting, and set the mode to "Default" with the command sudo nvidia-smi -c 0.
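
The compute mode check can be scripted before any partitioned servers are started. The sketch below reads a report on standard input and assumes it contains lines of the form "Compute Mode : Default", which is how nvidia-smi query output is commonly laid out; verify the format on your driver version. The function name compute_mode_all_default is hypothetical, and the example feeds it a sample line instead of calling nvidia-smi.

```shell
#!/bin/sh
# Read nvidia-smi style text on stdin and succeed only if at least one
# "Compute Mode" line is present and every such line reports Default.
# Assumption: the report contains lines like "Compute Mode : Default".
# compute_mode_all_default is a hypothetical helper name.
compute_mode_all_default() {
    awk -F: '
        /Compute Mode/ {
            seen = 1
            gsub(/[ \t]/, "", $2)          # strip padding around the value
            if ($2 != "Default") bad = 1
        }
        END {
            if (seen && !bad) exit 0
            exit 1
        }
    '
}

# Usage on a GPU server (not executed here):
#   nvidia-smi -a | compute_mode_all_default || sudo nvidia-smi -c 0
printf '    Compute Mode                    : Default\n' | compute_mode_all_default \
    && echo "sharing allowed"
```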

Server-side commands are shown; see above for how to invoke the client.

GPU Server Command Line

    BF_VISIBLE_DEVICES=0 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55001
    Dispatcher listening...
    Listening on 0.0.0.0:55001

Two GPUs Available on Port 55001

GPU Server Command Line

    BF_VISIBLE_DEVICES=0,1 flexdirect server -p 55001
    Dispatcher listening...
    Listening on 0.0.0.0:55001

Two 1/2-GPUs Available on Port 55001

For two half-sized GPUs:

GPU Server Command Line

    BF_VISIBLE_DEVICES=0,1 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55001
    Dispatcher listening...
    Listening on 0.0.0.0:55001

Sixteen 1/2-GPUs Assigned to Two Different Clients (Across Two Four-GPU Nodes)

Each client sees eight partial GPUs. Use two different port numbers, one for each client. Comments are interlaced with commands:

GPU Server Command Lines

    # server 1:
    $ BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55001 &
    $ BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55002
    # server 2:
    $ BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55001 &
    $ BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55002

Sample Output

    # server 1:
    Dispatcher listening...
    Listening on 0.0.0.0:55001
    Dispatcher listening...
    Listening on 0.0.0.0:55002
    # server 2:
    Dispatcher listening...
    Listening on 0.0.0.0:55001
    Dispatcher listening...
    Listening on 0.0.0.0:55002

Configuration Files

    # client 1 adaptor.conf:
    <server 1 IP>:55001
    <server 2 IP>:55001
    # client 2 adaptor.conf:
    <server 1 IP>:55002
    <server 2 IP>:55002

Sixteen 1/2-GPUs (Across Two Four-GPU Nodes) Available

Two different clients each allocate one partial GPU.

GPU Server Command Lines

    # Server 1:
    BF_VISIBLE_DEVICES=0 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55001 &
    BF_VISIBLE_DEVICES=1 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55002 &
    BF_VISIBLE_DEVICES=2 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55003 &
    BF_VISIBLE_DEVICES=3 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55004 &
    BF_VISIBLE_DEVICES=4 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55005 &
    BF_VISIBLE_DEVICES=5 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55006 &
    BF_VISIBLE_DEVICES=6 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55007 &
    BF_VISIBLE_DEVICES=7 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55008
    # Server 2:
    BF_VISIBLE_DEVICES=0 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55001 &
    BF_VISIBLE_DEVICES=1 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55002 &
    BF_VISIBLE_DEVICES=2 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55003 &
    BF_VISIBLE_DEVICES=3 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55004 &
    BF_VISIBLE_DEVICES=4 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55005 &
    BF_VISIBLE_DEVICES=5 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55006 &
    BF_VISIBLE_DEVICES=6 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55007 &
    BF_VISIBLE_DEVICES=7 BF_GPU_DEVICE_MEMORY_LIMIT=6291456000 flexdirect server -p 55008 &

Sample Output

    # Server 1:
    Dispatcher listening...
    Listening on 0.0.0.0:55001
    Dispatcher listening...
    Listening on 0.0.0.0:55002
    Dispatcher listening...
    Listening on 0.0.0.0:55003
    Dispatcher listening...
    Listening on 0.0.0.0:55004
    Dispatcher listening...
    Listening on 0.0.0.0:55005
    Dispatcher listening...
    Listening on 0.0.0.0:55006
    Dispatcher listening...
    Listening on 0.0.0.0:55007
    Dispatcher listening...
    Listening on 0.0.0.0:55008
    # Server 2:
    Dispatcher listening...
    Listening on 0.0.0.0:55001
    Dispatcher listening...
    Listening on 0.0.0.0:55002
    Dispatcher listening...
    Listening on 0.0.0.0:55003
    Dispatcher listening...
    Listening on 0.0.0.0:55004
    Dispatcher listening...
    Listening on 0.0.0.0:55005
    Dispatcher listening...
    Listening on 0.0.0.0:55006
    Dispatcher listening...
    Listening on 0.0.0.0:55007
    Dispatcher listening...
    Listening on 0.0.0.0:55008

Configuration Files

    # Client 1 adaptor.conf (first partial GPU of server 1):
    <server 1 IP>:55001
    # Client 2 adaptor.conf (second partial GPU of server 1):
    <server 1 IP>:55002

VMware, Inc. 3401 Hillview Avenue Palo Alto CA 94304 USA Tel 877-486-9273 Fax 650-427-5001 vmware.com
Copyright 2019 VMware, Inc. All rights reserved. This product is protected by U.S. and international copyright and intellectual property laws. VMware products are covered by one or more patents listed at vmware.com/go/patents. VMware is a registered trademark or trademark of VMware, Inc. and its subsidiaries in the United States and other jurisdictions. All other marks and names mentioned herein may be trademarks of their respective companies.
Item No: VMW-0518-1843_VMW_CPBU Technical White Papers_Bitfusion Docs_08 Integration and Advanced Usage_1.5_YC 8/19