Mellanox Technologies Inc.
2900 Stender Way, Santa Clara, CA 95054
Tel: 408-970-3400 Fax: 408-970-3403
http://www.mellanox.com

Real Application Performance and Beyond

White Paper: Real Application Performance and Beyond
2006 Mellanox Technologies Inc.

Scientists, engineers and analysts in virtually every field are turning to high performance computing to solve today's vital and complex problems.
Simulations are increasingly replacing expensive physical testing, as more complex environments can be modeled and, in some cases, fully simulated.
High-performance computing encompasses advanced computation over parallel processing, enabling faster execution of highly compute-intensive tasks such as climate research, molecular modeling, physical simulations, cryptanalysis, geophysical modeling, automotive and aerospace design, financial modeling, data mining and more.
HPC clusters have become the most common building blocks for high-performance computing, not only because they are affordable, but because they provide the needed flexibility and deliver superior price/performance compared to proprietary symmetric multiprocessing (SMP) systems, with the simplicity and value of industry standard computing.

[Figure: Maui High Performance Computing Center. 1280 servers, Mellanox InfiniBand interconnect, 42.3 TFlops]

Real-world application performance depends on the performance of the cluster's key elements: the processor, the memory, and the interconnect.
The interconnect controls the data transfer between servers, and has a high influence on the CPU efficiency and memory utilization.
Transport offload interconnect architectures, unlike the "on-loading" ones, eliminate the need to handle protocol processing within the CPU and therefore increase the number of cycles available for computational tasks.
If the CPU is busy moving data and handling network protocol processing, it is unable to perform computational work, and the overall productivity of the system is severely degraded.
The memory copy overhead includes the resources required to copy data buffers from the network device to the kernel memory and then from the kernel memory to the application memory.
This approach requires multiple memory accesses before the data is placed in its final destination.
While it is not a major problem for small data transfers, it is a big problem for larger data transfers.
This is where the interconnect's zero-copy capabilities eliminate the memory bandwidth bottleneck without involving the CPU in the network data transfer.
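
The copy overhead described above can be made concrete with a rough count of memory traffic. The sketch below is a simplified model rather than a measurement: it assumes the network device writes each message into host memory once, and that every additional CPU copy reads the message and writes it again. The function names are illustrative and do not come from any Mellanox software.

```python
def memory_traffic_bytes(message_bytes: int, cpu_copies: int) -> int:
    """Bytes moved through host memory to deliver one message.

    Simplified model (an assumption, not a measurement):
    - the network device writes the message into memory once (DMA);
    - every additional CPU copy reads the message and writes it again.
    """
    if message_bytes < 0 or cpu_copies < 0:
        raise ValueError("arguments must be non-negative")
    dma_write = message_bytes
    copy_traffic = 2 * message_bytes * cpu_copies
    return dma_write + copy_traffic


def buffered_traffic(message_bytes: int) -> int:
    """Device -> kernel memory -> application memory: one DMA write, one CPU copy."""
    return memory_traffic_bytes(message_bytes, cpu_copies=1)


def zero_copy_traffic(message_bytes: int) -> int:
    """Device places the data directly in application memory: no CPU copy."""
    return memory_traffic_bytes(message_bytes, cpu_copies=0)


if __name__ == "__main__":
    # One small and one large example message size.
    for size in (64, 4 * 1024 * 1024):
        print(size, buffered_traffic(size), zero_copy_traffic(size))
```

Under this model the buffered path moves three times as many bytes through memory as the zero-copy path, whatever the message size. The absolute cost therefore grows with the message, which is consistent with the observation that the problem mainly affects large transfers.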

[Figure: Sandia National Lab. 4500 servers, Mellanox InfiniBand interconnect, 53 TFlops, 84.66% Linpack efficiency]

The interconnect bandwidth and latency have traditionally been used as two metrics for assessing the performance of the system's interconnect fabric.
However, these two metrics are typically not sufficient to determine the performance of real world applications.
Typical real-world applications send messages ranging from 64 bytes to 4 megabytes using not only point-to-point communication but a diverse mixture of communication patterns, including collective and reduction patterns in the case of MPI.
In some cases, interconnect vendors create artificial benchmarks, such as message rate, and apply bombastic marketing slogans to these benchmarks, such as "Hypermessaging".
Message rate is yet another single point in the point-to-point bandwidth graph.
If the traditional interconnect bandwidth indicates the maximum available bandwidth (single point), message rate indicates the bandwidth for a message size of zero or 2 bytes.
These single points of data give some indication of the interconnect performance, but are far from describing real world application performance.
The interactive combination of those points, together with others (CPU overhead, zero copy etc.), will determine the overall ability of the connectivity solution.
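
The claim that message rate is just one point on the bandwidth curve follows from simple arithmetic: bandwidth equals messages per second multiplied by message size. The sketch below illustrates this. The message rate used in the example is a hypothetical figure chosen for illustration, not a number from this paper or from any vendor.

```python
def bandwidth_mb_per_s(messages_per_s: float, message_bytes: int) -> float:
    """Payload bandwidth implied by a message rate at a given message size."""
    if messages_per_s < 0 or message_bytes < 0:
        raise ValueError("arguments must be non-negative")
    return messages_per_s * message_bytes / 1e6


if __name__ == "__main__":
    hypothetical_rate = 10_000_000  # messages per second (illustrative only)
    for size in (0, 2):  # the message sizes named in the text
        print(size, bandwidth_mb_per_s(hypothetical_rate, size))
```

Even a very high message rate corresponds to a small payload bandwidth when messages are 0 or 2 bytes long, so the figure describes only the extreme left end of the bandwidth graph.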
The difference between theoretical power and what is actually delivered is measured as processor efficiency.
The more CPU cycles used to get the data out the door by "filling the wire" due to protocol and data transfer inefficiencies, the fewer cycles are available for the application.
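
As a worked example of the efficiency measure, the sketch below uses the Sandia figures quoted in this paper (53 TFlops at 84.66% Linpack efficiency). It assumes the usual Linpack definition, where efficiency is delivered performance divided by theoretical peak. The paper does not state the peak, so the value computed here is an inference under that assumption.

```python
def processor_efficiency(delivered_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak that is actually delivered."""
    if peak_tflops <= 0:
        raise ValueError("peak_tflops must be positive")
    return delivered_tflops / peak_tflops


def implied_peak_tflops(delivered_tflops: float, efficiency: float) -> float:
    """Theoretical peak implied by a delivered figure and an efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return delivered_tflops / efficiency


if __name__ == "__main__":
    # Sandia figures from the paper: 53 TFlops, 84.66% Linpack efficiency.
    peak = implied_peak_tflops(53.0, 0.8466)
    print(round(peak, 1))
```

Under that assumption, roughly 15% of the theoretical capacity is not delivered to the benchmark, which is the gap the text attributes in part to protocol and data transfer overhead.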
When comparing latencies of different interconnects, one needs to pay attention to the interconnect architecture.
Choosing between a 1 usec latency "on-loading" interconnect and a 2 usec latency "off-load" solution is similar to deciding between two cars that show the same horsepower (i.e. CPU).
Both engines are capable of 200 miles per hour, but the first car, due to "on-loading", limits the actual engine power to 75 miles per hour (the engine power must be used for other tasks).
The second car has no limitations on the engine, but its wheels can tolerate only 150 miles per hour.
Knowledge of the wheels' tolerance (i.e. latency), as a single point of data, is definitely misleading.
There are attempts to provide real world application performance while comparing different interconnects, but in most cases the "comparison" is biased by the use of different systems and/or conditions, which makes a true comparison difficult.
There have been recent cases comparing 10-Gigabit Ethernet to InfiniBand.
While the InfiniBand adapters were tested with PCIe x4, which is limited to ~700 MByte/sec bandwidth (due to limitations in currently available systems), the 10 Gigabit Ethernet cards were PCI-X, which is capable of higher bandwidth (~850-900 MByte/s).
Other cases compare InfiniBand PCIe x4 to other interconnects with a PCIe x8 host interface (the only valid conclusion one can make is that PCIe x8 has more lanes than PCIe x4).
Another paper compared QLogic InfiniPath on an Intel 3 GHz CPU based system to Mellanox InfiniBand on a 2.2 GHz Opteron based system.
Any attempt to compare different interconnects in those manners is deceptive.
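
The host-interface mismatch in these comparisons can be checked with nominal bus arithmetic. The sketch below assumes first-generation PCI Express (2.5 GT/s per lane with 8b/10b encoding) and a 64-bit, 133 MHz PCI-X slot. The paper does not state either bus variant, so both are assumptions, and the results are nominal ceilings rather than the measured figures quoted above.

```python
PCIE_GEN1_TRANSFERS_PER_S = 2_500_000_000  # 2.5 GT/s per lane (assumed PCIe 1.x)


def pcie_gen1_mb_per_s(lanes: int) -> float:
    """Nominal per-direction payload bandwidth of a PCIe 1.x link in MB/s."""
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    # 8b/10b encoding carries 8 payload bits in every 10 transferred bits.
    payload_bits_per_s = lanes * PCIE_GEN1_TRANSFERS_PER_S * 8 / 10
    return payload_bits_per_s / 8 / 1e6


def parallel_bus_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Nominal bandwidth of a parallel bus such as PCI-X in MB/s."""
    if width_bits <= 0 or clock_mhz <= 0:
        raise ValueError("arguments must be positive")
    return width_bits / 8 * clock_mhz


if __name__ == "__main__":
    print("PCIe x4 :", pcie_gen1_mb_per_s(4))
    print("PCIe x8 :", pcie_gen1_mb_per_s(8))
    print("PCI-X   :", parallel_bus_mb_per_s(64, 133))
```

With these assumptions an x8 link has twice the nominal ceiling of an x4 link, so a benchmark that pairs one interconnect with x4 and another with x8 partly measures the slot rather than the interconnect.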

Real application performance

InfiniBand is a proven interconnect for clustered server solutions, and one of the leading connectivity solutions for high-performance computing.
InfiniBand was designed as a general I/O interconnect and in practice provides low latency and the highest link speed.
Computational Fluid Dynamics (CFD) is one of the branches of fluid mechanics that uses numerical methods and algorithms to solve and analyze problems that involve fluid flows.
ANSYS/FLUENT is a leading commercial software provider for solving fluid flow problems.
The broad physical modeling capabilities of FLUENT have been applied to industrial applications ranging from air flow over an aircraft wing to combustion in a furnace, from bubble columns to glass production, from blood flow to semiconductor manufacturing, from clean room design to wastewater treatment plants.
The ability of the software to model in-cylinder engines, aeroacoustics, turbomachinery, and multiphase systems has served to broaden its reach.
At the core of any CFD calculation is a computational grid, used to divide the solution domain into thousands or millions of elements where the problem variables are computed and stored.
In FLUENT, unstructured grid technology is used, which means that the grid can consist of elements in a variety of shapes: quadrilaterals and triangles for 2D simulations, and hexahedra, tetrahedra, prisms, and pyramids for 3D simulations.
These elements form an interlocking network throughout the volume where the fluid flow analysis takes place.
The performance of a CFD code depends on several factors, including size and topology of the mesh, physical models, numerics and parallelization, compilers and optimization, in addition to performance characteristics of the hardware where the simulation is performed.
FLUENT provides a set of benchmark problems which represent typical current usage and cover a wide range of mesh sizes and physical models.
The problems selected represent a range of simulations typical of those which might be found in industry.
The principal objective of this benchmark suite is to provide comprehensive and fair comparative information on the performance of FLUENT on available hardware platforms.
The following charts compare Mellanox InfiniBand and QLogic InfiniPath interconnects on the same platform: dual-core, dual-socket Intel Xeon 3 GHz 5100 series (code name Woodcrest) servers, using FLUENT benchmarks.
When testing real world applications, the entire architecture makes the difference.
The Mellanox architecture is a full transport-offload one, with hardware RDMA capabilities, while QLogic's is a full "on-loading" architecture.
In the FLUENT FL5L3 benchmark, a turbulent flow of air through a duct is computed.
The cross-sectional planes of the duct transition from a circle at the inlet to a rectangle at the outflow boundary.
The Reynolds-Stress Model is used for computing turbulence (number of cells: 9,792,512; cell type: hexahedral; models: RSM turbulence; solver: segregated implicit).

[Chart: Fluent 6.3, FL5L3 case. Rating (performance) versus CPU cores, QLogic and Mellanox]

The FLUENT FL5L2 benchmark represents the computation of the exterior flow field around a simplified model of a passenger sedan.
The simulation geometry was used for the Japan External Aerodynamics competition.
A viscous-hybrid grid with prismatic cells is used to adequately model the boundary layer regions (number of cells: 3,618,080; cell type: hybrid; models: k-epsilon turbulence; solver: segregated implicit).

[Chart: Fluent 6.3, FL5L2 case. Rating (performance) versus CPU cores, QLogic and Mellanox]

Choosing the right interconnect

In both FLUENT benchmark cases, Mellanox InfiniBand shows higher performance and better super-linear scaling compared to QLogic InfiniPath.
FLUENT's CFD application is latency-sensitive, and the results shown here are good examples of how pure latency benchmarks can be misleading when choosing the right interconnect.
In order to determine the system's performance, one should take into consideration the entire interconnect architecture (such as off-loading versus on-loading) and the ability to scale, rather than just single points of data.
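
Scaling claims of this kind can be checked from rating-versus-cores data with a short calculation. The sketch below treats the rating as a throughput-style score where higher is better, as plotted in the charts. It defines scaling efficiency as the rating gain divided by the core-count gain, so a value above 1.0 indicates super-linear scaling. The sample numbers are hypothetical and are not read from the paper's charts.

```python
def scaling_efficiency(base_cores: int, base_rating: float,
                       cores: int, rating: float) -> float:
    """Rating gain divided by core-count gain, relative to a baseline run."""
    if base_cores <= 0 or cores <= 0 or base_rating <= 0:
        raise ValueError("core counts and baseline rating must be positive")
    speedup = rating / base_rating
    ideal_speedup = cores / base_cores
    return speedup / ideal_speedup


def is_super_linear(base_cores: int, base_rating: float,
                    cores: int, rating: float) -> bool:
    """True when performance grows faster than the number of cores."""
    return scaling_efficiency(base_cores, base_rating, cores, rating) > 1.0


if __name__ == "__main__":
    # Hypothetical ratings for illustration only.
    print(scaling_efficiency(4, 100.0, 16, 440.0))
    print(is_super_linear(4, 100.0, 16, 440.0))
    print(is_super_linear(4, 100.0, 16, 360.0))
```

Comparing this efficiency across interconnects at the same core counts reflects the ability to scale, which a single latency or bandwidth number does not.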
In order to provide better application insight, Mellanox has created the Mellanox Cluster Center.
The Mellanox Cluster Center offers an environment for developing, testing, benchmarking and optimizing products based on InfiniBand technology.
The center, located in Santa Clara, California, provides on-site technical support and enables secure sessions on site or remotely.
More details are available on the Mellanox website.