Mellanox Technologies Inc.
2900 Stender Way, Santa Clara, CA 95054
Tel: 408-970-3400 Fax: 408-970-3403
http://www.mellanox.com

Real Application Performance and Beyond

White Paper: Real Application Performance and Beyond. 2006 Mellanox Technologies Inc.

Scientists, engineers and analysts in virtually every field are turning to high-performance computing to solve today's vital and complex problems.
Simulations are increasingly replacing expensive physical testing, as more complex environments can be modeled and, in some cases, fully simulated.
High-performance computing encompasses advanced computation over parallel processing, enabling faster execution of highly compute-intensive tasks such as climate research, molecular modeling, physical simulations, cryptanalysis, geophysical modeling, automotive and aerospace design, financial modeling, data mining and more.
HPC clusters have become the most common building blocks for high-performance computing, not only because they are affordable, but because they provide the needed flexibility and deliver superior price/performance compared to proprietary symmetric multiprocessing (SMP) systems, with the simplicity and value of industry-standard computing.

Maui High Performance Computing Center: 1280 servers, Mellanox InfiniBand interconnect, 42.3 TFlops

Real-world application performance depends on the performance of the cluster's key elements: the processor, the memory, and the interconnect.
The interconnect controls the data transfer between servers and has a strong influence on CPU efficiency and memory utilization.
Transport-offload interconnect architectures, unlike "on-loading" ones, eliminate the need to handle protocol processing within the CPU and therefore increase the number of cycles available for computational tasks.
If the CPU is busy moving data and handling network protocol processing, it is unable to perform computational work, and the overall productivity of the system is severely degraded.
The memory copy overhead includes the resources required to copy data buffers from the network device to kernel memory and then from kernel memory to application memory.
This approach requires multiple memory accesses before the data is placed in its final destination.
While this is not a major problem for small data transfers, it is a big problem for larger data transfers.
This is where the interconnect's zero-copy capabilities eliminate the memory bandwidth bottleneck without involving the CPU in the network data transfer.

Sandia National Lab: 4500 servers, Mellanox InfiniBand interconnect, 53 TFlops, 84.66% Linpack efficiency

The interconnect bandwidth and latency have traditionally been used as two metrics for assessing the performance of the system's interconnect fabric.
However, these two metrics are typically not sufficient to determine the performance of real-world applications.
Typical real-world applications send messages ranging from 64 bytes to 4 megabytes using not only point-to-point communication but a diverse mixture of communication patterns, including collective and reduction patterns in the case of MPI.
In some cases, interconnect vendors create artificial benchmarks, such as message rate, and apply bombastic marketing slogans to these benchmarks, such as "Hypermessaging".
Message rate is yet another single point in the point-to-point bandwidth graph.
If the traditional interconnect bandwidth indicates the maximum available bandwidth (a single point), message rate indicates the bandwidth for a message size of zero or 2 bytes.
These single points of data give some indication of the interconnect performance, but are far from describing real-world application performance.
The interactive combination of those points, together with others (CPU overhead, zero copy, etc.), will determine the overall ability of the connectivity solution.
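
The relationship between message rate and bandwidth can be made concrete with a little arithmetic. The following Python sketch is illustrative only: the message rate and the peak bandwidth are hypothetical values chosen for the example, not measurements from this paper, and the function name is our own.

```python
def bandwidth_from_message_rate(messages_per_sec, message_bytes):
    """Bandwidth in bytes/sec implied by a message rate at one message size."""
    return messages_per_sec * message_bytes


# Hypothetical figures, for illustration only (not measured results).
rate = 10_000_000              # assumed: 10 million messages per second
peak_bandwidth = 900 * 10**6   # assumed: 900 MByte/s peak link bandwidth

# A message-rate benchmark uses tiny messages, here 2 bytes each.
small = bandwidth_from_message_rate(rate, 2)
fraction_of_peak = small / peak_bandwidth

print(f"{small / 1e6:.0f} MByte/s with 2-byte messages")
print(f"{fraction_of_peak:.1%} of the assumed peak bandwidth")
```

Under these assumed numbers, even a very high message rate corresponds to a small fraction of the peak bandwidth, which is why it is just one more point on the bandwidth curve rather than a description of the whole curve.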
The difference between theoretical power and what is actually delivered is measured as processor efficiency.
The more CPU cycles are used to get the data out the door by "filling the wire" due to protocol and data transfer inefficiencies, the fewer cycles are available for the application.
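
As a rough sketch of how such an efficiency figure is read, the Python snippet below treats efficiency as delivered performance divided by theoretical peak. It uses the Sandia National Lab figures quoted in this paper (53 TFlops, 84.66% Linpack efficiency) and assumes that 53 TFlops is the delivered Linpack result; the paper does not state the theoretical peak, so the value computed here is only what those two numbers would imply.

```python
def efficiency(delivered, theoretical):
    """Fraction of the theoretical performance that is actually delivered."""
    return delivered / theoretical


def implied_theoretical(delivered, eff):
    """Theoretical performance implied by a delivered figure and an efficiency."""
    return delivered / eff


# Figures quoted in the paper; reading 53 TFlops as the delivered
# Linpack result is an assumption made for this example.
delivered_tflops = 53.0
linpack_efficiency = 0.8466

peak_tflops = implied_theoretical(delivered_tflops, linpack_efficiency)
print(f"Implied theoretical peak: about {peak_tflops:.1f} TFlops")
```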
When comparing latencies of different interconnects, one needs to pay attention to the interconnect architecture.
Choosing between a 1 usec latency "on-loading" interconnect and a 2 usec latency "off-load" solution is similar to deciding between two cars that show the same horsepower (i.e., CPU).
Both engines are capable of 200 miles per hour, but the first car, due to "on-loading", limits the actual engine power to 75 miles per hour (the engine power must be used for other tasks).
The second car has no limitations on the engine, but its wheels can tolerate only 150 miles per hour.
Knowledge of the wheels' tolerance (i.e., latency), as a single point of data, is definitely misleading.
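
The car analogy amounts to taking the minimum of two limits. A minimal Python sketch, using the speeds from the analogy: the class and field names are invented for illustration, and because the text gives no wheel limit for the first car, it is assumed here not to be the limiting factor.

```python
from dataclasses import dataclass


@dataclass
class Car:
    name: str
    engine_mph: float       # nominal engine capability (the CPU)
    engine_share: float     # fraction of engine power left for driving
    wheel_limit_mph: float  # speed the wheels tolerate (the latency figure)

    def top_speed(self):
        # The achievable speed is capped by whichever limit is lower.
        return min(self.engine_mph * self.engine_share, self.wheel_limit_mph)


# "On-loading": the engine is throttled to 75 of its 200 mph.
# Assumption: its wheels are not the limiting factor.
on_loading = Car("on-loading", 200.0, 75.0 / 200.0, float("inf"))

# "Off-load": full engine power, but the wheels tolerate only 150 mph.
off_load = Car("off-load", 200.0, 1.0, 150.0)

for car in (on_loading, off_load):
    print(f"{car.name}: {car.top_speed():.0f} mph")
```

Looking at the wheel limit alone would favor the first car, yet the second car is the faster one once the engine limit is taken into account.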
There are attempts to present real-world application performance while comparing different interconnects, but in most cases the "comparison" is biased by the use of different systems and/or conditions, which makes a true comparison difficult.
There have been recent cases comparing 10-Gigabit Ethernet to InfiniBand.
While the InfiniBand adapters were tested with PCIe x4 (which is limited to ~700 MByte/sec bandwidth due to limitations in currently available systems), the 10-Gigabit Ethernet cards were PCI-X, which is capable of higher bandwidth (~850-900 MByte/s).
Other cases compare InfiniBand PCIe x4 to other interconnects with a PCIe x8 host interface (the only valid conclusion one can make is that PCIe x8 has more lanes than PCIe x4).
Another paper compared QLogic InfiniPath on an Intel 3 GHz CPU based system to Mellanox InfiniBand on a 2.2 GHz Opteron based system.
Any attempt to compare different interconnects in those manners is deceptive.

Real application performance

InfiniBand is a proven interconnect for clustered server solutions, and one of the leading connectivity solutions for high-performance computing.
InfiniBand was designed as a general I/O interconnect and in practice provides low latency and the highest link speed.
Computational Fluid Dynamics (CFD) is one of the branches of fluid mechanics that uses numerical methods and algorithms to solve and analyze problems that involve fluid flows.
ANSYS/FLUENT is a leading commercial software provider for solving fluid flow problems.
The broad physical modeling capabilities of FLUENT have been applied to industrial applications ranging from air flow over an aircraft wing to combustion in a furnace, from bubble columns to glass production, from blood flow to semiconductor manufacturing, from clean room design to wastewater treatment plants.
The ability of the software to model in-cylinder engines, aeroacoustics, turbomachinery, and multiphase systems has served to broaden its reach.
At the core of any CFD calculation is a computational grid, used to divide the solution domain into thousands or millions of elements where the problem variables are computed and stored.
In FLUENT, unstructured grid technology is used, which means that the grid can consist of elements in a variety of shapes: quadrilaterals and triangles for 2D simulations, and hexahedra, tetrahedra, prisms, and pyramids for 3D simulations.
These elements form an interlocking network throughout the volume where the fluid flow analysis takes place.
The performance of a CFD code depends on several factors, including size and topology of the mesh, physical models, numerics and parallelization, compilers and optimization, in addition to the performance characteristics of the hardware where the simulation is performed.
FLUENT provides a set of benchmark problems which represent typical current usage and cover a wide range of mesh sizes and physical models.
The problems selected represent a range of simulations typical of those which might be found in industry.
The principal objective of this benchmark suite is to provide comprehensive and fair comparative information on the performance of FLUENT on available hardware platforms.
The following charts compare Mellanox InfiniBand and QLogic InfiniPath interconnects on the same platform (dual-core, dual-socket Intel Xeon 3 GHz 5100 series servers, code name Woodcrest), using FLUENT benchmarks.
When testing real-world applications, the entire architecture makes the difference.
The Mellanox architecture is a full transport-offload one, with hardware RDMA capabilities, while QLogic is a full "on-loading" architecture.
In the FLUENT FL5L3 benchmark, a turbulent flow of air through a duct is computed.
The cross-sectional planes of the duct transition from a circle at the inlet to a rectangle at the outflow boundary.
The Reynolds-Stress Model is used for computing turbulence (number of cells: 9,792,512; cell type: hexahedral; models: RSM turbulence; solver: segregated implicit).
The FLUENT FL5L2 benchmark represents the computation of the exterior flow field around a simplified model of a passenger sedan.
The simulation geometry was used for the Japan External Aerodynamics competition.
A viscous-hybrid grid with prismatic cells is used to adequately model the boundary layer regions (number of cells: 3,618,080; cell type: hybrid; models: k-epsilon turbulence; solver: segregated implicit).

[Chart: Fluent 6.3, FL5L3 case. Rating (performance) versus CPU cores, QLogic and Mellanox.]

[Chart: Fluent 6.3, FL5L2 case. Rating (performance) versus CPU cores, QLogic and Mellanox.]

Choosing the right interconnect

In both FLUENT benchmarks, Mellanox InfiniBand shows higher performance and better super-linear scaling compared to QLogic InfiniPath.
FLUENT's CFD application is latency-sensitive, and the results shown here are good examples of how pure latency benchmarks can be misleading when choosing the right interconnect.
In order to determine the system's performance, one should take into consideration the entire interconnect architecture (such as off-loading versus on-loading) and the ability to scale, rather than just single points of data.
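
Scaling is super-linear when the gain in rating exceeds the increase in CPU cores. The Python sketch below shows one way to check this from two points on a rating-versus-cores chart. The ratings used are hypothetical values chosen for illustration and are not read from the charts in this paper; the function name is also our own.

```python
def scaling_efficiency(base_cores, base_rating, cores, rating):
    """Measured speedup divided by the ideal (linear) speedup.

    A result above 1.0 indicates super-linear scaling, exactly 1.0
    is linear scaling, and below 1.0 is sub-linear scaling.
    """
    speedup = rating / base_rating
    ideal_speedup = cores / base_cores
    return speedup / ideal_speedup


# Hypothetical ratings, for illustration only (not measured results).
eff = scaling_efficiency(base_cores=4, base_rating=100.0, cores=16, rating=440.0)
print(f"Scaling efficiency from 4 to 16 cores: {eff:.2f}")
print("super-linear" if eff > 1.0 else "linear or sub-linear")
```

Comparing this ratio across interconnects on the same platform captures the scaling behavior that a single latency or bandwidth number cannot.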
In order to provide better application insight, Mellanox has created the Mellanox Cluster Center.
The Mellanox Cluster Center offers an environment for developing, testing, benchmarking and optimizing products based on InfiniBand technology.
The center, located in Santa Clara, California, provides on-site technical support and enables secure sessions on site or remotely.
More details are available on the Mellanox website.