IntegrationandAdvancedUsageBitfusionGuideWHITEPAPER–OCTOBER2019WHITEPAPER|2Bitfusion:IntegrationandAdvancedUsageTableofcontentsIntroduction3StartingFlexDirectDaemonsasServersviaCLI3RunClientApplicationswithFlexDirectviaflexdirectclient4ConfiguringIPAddressesasPartofClientConfiguration.
5Advanced:ClusterCommunications7Advanced:FlexibleDynamicGPUConfigurations7ExposingOneGPUoutoftheFourGPUsforApps8ExposingTwoGPUsoutoftheFourGPUsforApps8PartialGPUConfigurations9WHITEPAPER|3IntroductionForengineersintegratingBitfusiontechnologyintotheirownresourceschedulerorperhapsforadvancedusersneedingmorecontroloverGPUsareresourcing,thisguideshowshowtostartandinvokebothserverandclientprocesseswithlow-levelaccess.
YouwillstartaserverdaemonforaparticularGPUconfiguration(e.
g.
,partialmemory)andwriteaclient-sideconfigurationfile'adaptor.
conf'asshownintheexamplesbelow.
Wehavedoneintegrationsforseveraljobschedulersandresourcemanagers,socontactusifyou'relookingforhelp.
StartingFlexDirectDaemonsasServersviaCLIThedrawingbelowshowsthefourprocessesthatarerunningonaclient(orCPU)nodeandonaserver(orGPU)nodewhenyouareinteractingwiththeFlexDirectServer(Dispatcher).
Itshouldhelpyouunderstandtheconcepts,commandsandusagethatthismanualdiscusses.
Onlytwoprocessesaredirectlylaunchedbytheuser.
Thesearetheonesshowninafixedfontasyouwouldtypetheminacommandshell.
ThedrawingalsoshowstheTCPportsusedbytheGPUserverprocesses.
USERGPUGPUGPUGPUCOMPUTESERVERGPUSERVERALLOCATEINUSEflexdirectserver(Dispatcher)setupclients.
confforGPUserver:55001flexdirectclient--listeningonport55001(default)listeningonports45201+fordatapathmessageslinkedtoVMwareBitfusionCUDAlibCUDAServerCoolAppYoumuststartFlexDirectasaserver(whichiscalledDispatcher)onalltheinstancesthathaveGPUswhichyou'dliketomakeavailabletoyourclientnodesandapplications.
Shellflexdirectserver[-pport]WHITEPAPER|4YoucanalsostartaFlexDirectserver(Dispatcherprocess)fromtheclientmachinewiththerequest_gpuscommand.
However,thisrequiresthattheGPUserverisalreadyrunningtheresourcescheduler.
Advantagesinclude:PreventsmultipleusersfromtryingtoservethesameGPUsCreatesadaptors.
conffileforyouDoesnotautomaticallydeallocatetheGPUsafteraclientapplicationhasfinishedsoyoucanrunseveralapplicationssequentiallyHowever,thisdocumentcoversmanuallaunchesoftheFlexDirectserver.
RunClientApplicationswithFlexDirectviaflexdirectclientOncetheFlexDirectServersarerunning,runapplicationsusingflexdirectclient.
Passthe-lparameterasalistoftheIPaddressesofthenodesonwhichyouhaveFlexDirectServerrunning.
Usesemicolonstoseparatetheaddresses.
Replacewiththeapplicationyouwouldliketorun.
Useadoubledash--beforetheapplicationifitrequiresitsownarguments.
Shellflexdirectclient-l"172.
31.
51.
20;172.
31.
51.
26"[--]CPUServerCommandLineflexdirectclient-l172.
31.
51.
20:55002nvidia-smiGPUServerCommandLinenvidia-smi|NVIDIA-SMI375.
26DriverVersion:375.
26||GPUNamePersistence-M|Bus-IdDisp.
A|VolatileUncorr.
ECC||FanTempPerfPwr:Usage/Cap|Memory-Usage|GPU-UtilComputeM.
||0TeslaK80Off|0000:01:00.
0Off|N/A||N/A%53CP829W/149W|0MiB/11439MiB|0%Default||Processes:GPUMemory||GPUPIDTypeProcessnameUsage||Norunningprocessesfound|Youmayspecifyaportnumberwiththestandardcolonnotation:WHITEPAPER|5USERGPUGPUGPUGPUGPUGPUGPUGPUCOMPUTESERVERGPUSERVERGPUSERVERflexdirectclientflexdirectserverflexdirectserverConfiguringIPAddressesasPartofClientConfigurationIfyouwanttosimplifytheflexdirectclientcommand,youcanputyourBitfusionserverIPaddressesintothe/etc/bitfusionio/adaptor.
conffile.
Overridethedefaultportbyadding:.
CPUServerCommandLineCPUServerCommandLinecat/etc/bitfusionio/adaptor.
conf172.
31.
51.
20172.
31.
51.
26:57001flexdirectclientnvidia-smi|NVIDIA-SMI375.
26DriverVersion:375.
26||GPUNamePersistence-M|Bus-IdDisp.
A|VolatileUncorr.
ECC||FanTempPerfPwr:Usage/Cap|Memory-Usage|GPU-UtilComputeM.
||0TeslaK80Off|0000:01:00.
0Off|N/A||N/A%53CP829W/149W|0MiB/11479MiB|0%Default||Processes:GPUMemory||GPUPIDTypeProcessnameUsage||Norunningprocessesfound|Afterwritingadaptor.
conf,simplyrunflexdirectclientwithaGPUapplication.
Forexample,ifyourunflexdirectclientwithnvidia-smiitwilllisttheGPUsconfigured.
Typeflexdirecthelporflexdirecthelp[command]foradditionalhelpfulcommandsandinformation.
WHITEPAPER|6SampleOutput$flexdirecthelpNAME:flexdirect-RunapplicationwithBitfusionFlexDirectUSAGE:flexdirect"application"flexdirect--[application]flexdirecthelp[command]Formoreinformation,systemrequirements,andadvancedusagepleasevisithttps://www-review.
vmware.
com/solutions/business-critical-apps/hardwareaccelerators-virtualization.
htmlCOMMANDS:init,iInitializeconfiguration.
Requiresrootpriviledges.
version,vDisplayfullFlexDirectversion.
localhealth,LHRunhealthcheckoncurrentnodeonly.
upgrade,UUpgradeversion.
Requiresrootpriviledges.
uninstallUninstallFlexDirect.
Requiresrootpriviledges.
deallocDeallocatelicensecertificate.
Requiresrootpriviledges.
crashreportSendcrashreporttoBitfusion.
licenseChecklicensestatus.
list_gpusListtheavailableGPUsinasharedpool.
help,hShowsalistofcommandsorhelpforonecommand.
ClientCommands:client,cRunapplication.
health,HRunhealthcheckonallspecifiedserversandcurrentnode.
request_gpusRequestGPUsfromasharedpool.
release_gpusReleaseGPUsbackintoasharedpool.
Optionsmustmatchapreviousrequest_gpuscommand.
runRequestGPUsfromasharedpool,runaclientcommand,thenreleasetheGPUs.
statsGatherstatsfromallservers.
smiDisplaysmi-likeinfoforallservers.
localRunaCUDAapplicationlocally.
net_perfGathernetworkperformancedatafromallSRSservers.
ServerCommands:server,sRunserver.
resource_schedulerRunFlexDirectresourcescheduler(SRS)onGPUserverEXAMPLES:$sudoflexdirectinit-l$flexdirectresource_scheduler--srs_port50001$flexdirectrun-n4--Herearesomeflexdirectexampleswithexplanatorycomments.
WHITEPAPER|7TextInitializeflexdirectlicensebeforethefirstrunofserveronasystem$sudoflexdirectinit-lRunaflexdirectserverwithdefaultport55001$flexdirectserverRunaflexdirectserverwithadifferentport$flexdirectserver-p55010Runanapplicationwithaserverrunninglocalwithdefaultport55001$flexdirectclient-l"localhost"Runanapplicationwithmultipleservers,localorremote$flexdirectclient-l"192.
168.
0.
2:55010;192.
168.
0.
6:51234"Runanapplicationwithserversspecifiedinoneofthedefaultconfigfiles(~/.
bitfusionio/adaptor.
confand/etc/bitfusionio/adaptor.
confinpriorityorder)$flexdirectclientRunanapplicationwithserversspecifiedinaconfigfile$flexdirectclient-fRunaserverwitharesourcescheduleronacustomport$flexdirectresource_scheduler--srs_port50001--port55010Runanapplicationwith4sharedGPUs$flexdirectrun-n4Runanapplicationwith2sharedGPUs,usinghalftheavailablememory,andacustomservers.
conf$flexdirectrun-n2-p0.
5-sservers.
confRunanapplicationwith4sharedGPUswithInfiniBand$flexdirectrun-n4Runanapplicationlocally,restrictedtoonlyhalfthephysicalGPUmemory$flexdirectlocal-p0.
5Request8remoteGPUs$flexdirectrequest_gpus-sservers.
conf-fadaptor_8gpu.
conf-n8Runanapplicationwiththegeneratedconfigfile$flexdirectclient-fadaptor_8gpu.
confReleasethe8remoteGPUsaftertheapplicationhasfinished$flexdirectrelease_gpus-fadaptor_8gpu.
confGethelponaspecificcommand(theclientcommandinthisexample)$flexdirecthelpclientAdvanced:ClusterCommunicationsIfyouareunabletoopenupthedefault45201-46225portrangeforin-clustercommunication,youcanoverridethisrangebyexportingtheseenvironmentvariablesonyourGPUserversbeforerunningtheFlexDirectServer(alsocalledDispatcher):GPUServerCommandLine$exportBF_SERVER_PORT_MIN=$exportBF_SERVER_PORT_MAX=Advanced:FlexibleDynamicGPUConfigurationsTheexamplesbelowassumethatyouhaveafour-GPUserveratIPaddress123.
45.
67.
890.
WewillusethisoneGPUnodeforthreedifferentclientapplicationswithslightlydifferentresourceconfigurations,allsharingthesameGPUnode.
WHITEPAPER|8NOTENotehowasweprogressthroughtheexamples,weusedifferentportssothateachserverprocessisutilizinguniqueportsforcommunication.
BF_VISIBLE_DEVICESreferstotheIDnumberofeachGPUdevice,whichstartsat0.
Ifyouhavea4GPUinstance,theIDswouldbe0,1,2,and3respectively.
YoucanseethedevicesandtheirspecificIDsbyrunningnvidia-smi.
ExposingOneGPUoutoftheFourGPUsforAppsStarttheFlexDirectServer(alsocalledDispatcher)onthefirstGPUdevice(outofthefourweareassumingfortheseexamples)withthefollowingcommand:NowruntheFlexDirectClientonyourCPUnode.
Inthisexample,we'lldoitwithapplication"nvidia-smi",butyoucouldreplacethiswiththeapplicationyouwouldliketorunusingFlexDirectvirtualization.
ExposingTwoGPUsoutoftheFourGPUsforAppsStarttheFlexDirectServer(alsocalledDispatcher)onthefour-GPUnodewiththefollowingcommand:GPUServerCommandLineGPUServerCommandLineGPUServerCommandLineBF_VISIBLE_DEVICES=0flexdirectserver-p55001Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001flexdirectclient-l123.
45.
67.
89:55001nvidia-smi|NVIDIA-SMI375.
26DriverVersion:375.
26||GPUNamePersistence-M|Bus-IdDisp.
A|VolatileUncorr.
ECC||FanTempPerfPwr:Usage/Cap|Memory-Usage|GPU-UtilComputeM.
||0TeslaK80Off|0000:01:00.
0Off|N/A||N/A%53CP829W/149W|0MiB/11479MiB|0%Default||Processes:GPUMemory||GPUPIDTypeProcessnameUsage||Norunningprocessesfound|BF_VISIBLE_DEVICES=0,1flexdirectserver-p55002WHITEPAPER|9RuntheFlexDirectClient.
Inthisexample,we'lldoitwithapplication"nvidia-smi:,butyoucouldreplacethiswiththeapplicationyouwouldliketorunusingFlexDirectvirtualization.
GPUServerCommandLineflexdirectclient-l123.
45.
67.
89:55002nvidia-smi|NVIDIA-SMI375.
26DriverVersion:375.
26||GPUNamePersistence-M|Bus-IdDisp.
A|VolatileUncorr.
ECC||FanTempPerfPwr:Usage/Cap|Memory-Usage|GPU-UtilComputeM.
||0TeslaK80Off|0000:01:00.
0Off|N/A||N/A%53CP829W/149W|0MiB/11479MiB|0%Default||||0TeslaK80Off|0000:01:00.
0Off|N/A||N/A%53CP829W/149W|0MiB/11479MiB|0%Default||Processes:GPUMemory||GPUPIDTypeProcessnameUsage||Norunningprocessesfound|PartialGPUConfigurations1/2-GPUavailableonport55001ThisisdonebysettingenvironmentalvariableBF_GPU_DEVICE_MEMORY_LIMITtohalfoftheGPUsmemory.
NVIDIAGPUSETTINGTOALLOWSHARINGWhenyoupartitionaGPU,presumablyyouwanttobeabletousebothpartitionssimultaneously.
NVIDIAGPUshaveacomputemodethatshouldbesetto"Default"(not"Exclusive")sothatmultipleapplicationscanshareaccess.
Usethenvidia-smi-acommandtoseethecurrentcomputemodesetting.
Andsetthemodeto"Default"withthecommandsudonvidia-smi-c0.
Server-sidecommandsshown,seeaboveonhowtoinvoketheclient.
GPUServerCommandLineBF_VISIBLE_DEVICES=0BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55001Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001WHITEPAPER|10TwoGPUsAvailableonPort5500121/2-GPUsAvailableonPort55001Fortwohalf-sizedGPUs:161/2GPUsAssignedtoTwoDifferentClients(Acrosstwofour-GPUnodes).
EachclientseeseightpartialGPUs.
Usetwodifferentportnumbers,oneforeachclient.
Commentsareinterlacedwithcommands:GPUServerCommandLineGPUServerCommandLineBF_VISIBLE_DEVICES=0,1flexdirectserver-p55001Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001BF_VISIBLE_DEVICES=0,1flexdirectserver-p55001Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001GPUServerCommandLines#server1:$BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55001&$BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55002#server2:$BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55001&$BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55002SampleOutput#server1:Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55002#server2:Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001ConfigurationFiles#client1adaptor.
conf:5500155001#client2adaptor.
conf:5500255002WHITEPAPER|11161/2GPUs(acrosstwofour-GPUnodes)available.
TwodifferentclientseachallocateonepartialGPU.
GPUServerCommandLines#Server1:BF_VISIBLE_DEVICES=0BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55001&BF_VISIBLE_DEVICES=1BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55002&BF_VISIBLE_DEVICES=2BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55003&BF_VISIBLE_DEVICES=3BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55004&BF_VISIBLE_DEVICES=4BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55005&BF_VISIBLE_DEVICES=5BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55006&BF_VISIBLE_DEVICES=6BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55007&BF_VISIBLE_DEVICES=7BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55008Server2:BF_VISIBLE_DEVICES=0BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55001&BF_VISIBLE_DEVICES=1BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55002&BF_VISIBLE_DEVICES=2BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55003&BF_VISIBLE_DEVICES=3BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55004&BF_VISIBLE_DEVICES=4BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55005&BF_VISIBLE_DEVICES=5BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55006&BF_VISIBLE_DEVICES=6BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55007&BF_VISIBLE_DEVICES=7BF_GPU_DEVICE_MEMORY_LIMIT=6291456000flexdirectserver-p55008&SampleOutput#Server1:Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55002Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55003Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55004Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55005Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55006Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55007Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55008#Server2:Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55001Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55002Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55003Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55004Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55005Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55006Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55007Dispatcherlistening.
.
.
Listeningon0.
0.
0.
0:55008SampleOutput#Client1adaptor.
conf(firstpartialGPUofserver1):55001#Client2adaptor.
conf(secondpartialGPUofserver1):55002VMware,Inc.
3401HillviewAvenuePaloAltoCA94304USATel877-486-9273Fax650-427-5001vmware.
comCopyright2019VMware,Inc.
Allrightsreserved.
ThisproductisprotectedbyU.
S.
andinternationalcopyrightandintellectualpropertylaws.
VMwareproductsarecoveredbyoneormorepatentslistedatvmware.
com/go/patents.
VMwareisaregisteredtrademarkortrademarkofVMware,Inc.
anditssubsidiariesintheUnitedStatesandotherjurisdictions.
Allothermarksandnamesmentionedhereinmaybetrademarksoftheirrespectivecompanies.
ItemNo:VMW-0518-1843_VMW_CPBUTechnicalWhitePapers_BitfusionDocs_08IntegrationandAdvancedUsage_1.
5_YC8/19
部落分享过多次G-core(gcorelabs)的产品及评测信息,以VPS主机为主,距离上一次分享商家的独立服务器还在2年多前,本月初商家针对迈阿密机房限定E5-2623v4 CPU的独立服务器推出75折优惠码,活动将在9月30日到期,这里再分享下。G-core(gcorelabs)是一家总部位于卢森堡的国外主机商,主要提供基于KVM架构的VPS主机和独立服务器租用等,数据中心包括俄罗斯、美国、日...
野草云服务商在前面的文章中也有多次提到,算是一个国内的小众服务商。促销活动也不是很多,比较专注个人云服务用户业务,之前和站长聊到不少网友选择他们家是用来做网站的。这不看到商家有提供香港云服务器的优惠促销,可选CN2、BGP线路、支持Linux与windows系统,支持故障自动迁移,使用NVMe优化的Ceph集群存储,比较适合建站用户选择使用,最低年付138元 。野草云(原野草主机),公司成立于20...
pacificrack又追加了3款特价便宜vps搞促销,而且是直接7折优惠(一次性),低至年付7.2美元。这是本月第3波便宜vps了。熟悉pacificrack的知道机房是QN的洛杉矶,接入1Gbps带宽,KVM虚拟,纯SSD RAID10,自带一个IPv4。官方网站:https://pacificrack.com支持PayPal、支付宝等方式付款7折秒杀优惠码:R3UWUYF01T内存CPUSS...
WWW YC8 COM为你推荐
bbsxpdvbbs bbsxp LeadBBS 对比最新qq空间代码QQ空间代码云播怎么看片手机云播怎么用?天天酷跑刷积分教程葫芦侠3楼几十万的积分怎么刷天天酷跑积分怎么刷flash导航条谁来帮我看看这样的flash导航条 下面的页面该怎么设计ps抠图技巧ps抠图多种技巧,越详细越好,急~~~~~~~如何建立一个网站如何建立一个网站iphone越狱后怎么恢复苹果手机越狱后怎么恢复ios7固件下载iOS7如何升级固件?ejb开发什么是EJB?它是干什么的?和JAVA,JSP有关系吗?他们各有什么特点和用途?
长春域名注册 科迈动态域名 flashfxp怎么用 服务器怎么绑定域名 169邮箱 免费高速空间 免费dns解析 免费网页空间 如何建立邮箱 联通网站 cxz 免费的域名 xuni 创速 学生机 winds cdn加速 phpwind论坛 美国达拉斯 电信测速器在线测网速 更多