
ivybridge  Date: 2021-03-28
LS-DYNA Performance Benchmark and Profiling
October 2017

Note

The following research was performed under the HPC Advisory Council activities
  – Participating vendors: LSTC, Huawei, Mellanox
  – Compute resource: HPC Advisory Council Cluster Center
The following was done to provide best practices
  – LS-DYNA performance overview
  – Understanding LS-DYNA communication patterns
  – Ways to increase LS-DYNA productivity
  – MPI libraries comparisons
For more info please refer to
  – http://www.lstc.com
  – http://www.huawei.com
  – http://www.mellanox.com

LS-DYNA

LS-DYNA
  – A general-purpose structural and fluid analysis simulation software package capable of simulating complex real-world problems
  – Developed by the Livermore Software Technology Corporation (LSTC)
LS-DYNA used by
  – Automobile
  – Aerospace
  – Construction
  – Military
  – Manufacturing
  – Bioengineering

Objectives

The presented research was done to provide best practices
  – LS-DYNA performance benchmarking
      MPI library performance comparison
      Interconnect performance comparison
      Compilers comparison
      Optimization tuning
The presented results will demonstrate
  – The scalability of the compute environment/application
  – Considerations for higher productivity and efficiency

Test Cluster Configuration

Huawei FusionServer E9000 with FusionServer CH121 V5 16-node (640-core) "Skylake" cluster
  – Dual-socket 20-core Intel Xeon Gold 6138 @ 2.00GHz CPUs
  – Memory: 192GB, DDR4 2666MHz RDIMMs per node
  – OS: RHEL 7.2, MLNX_OFED_LINUX-4.1-1.0.2.0 InfiniBand SW stack
Mellanox ConnectX-5 EDR 100Gb/s InfiniBand adapters
Mellanox Switch-IB SB7800 36-port EDR 100Gb/s InfiniBand switch
Compilers: Intel Parallel Studio XE 2018
MPI: Intel MPI 2018, Mellanox HPC-X MPI Toolkit v1.9.7, Platform MPI 9.1.4.3
Application: MPP LS-DYNA R9.1.0, build 113698, single precision
MPI profiler: IPM (from Mellanox HPC-X)
Benchmarks: TopCrunch benchmarks
  – Neon Refined Revised (neon_refined_revised), Three Vehicle Collision (3cars), NCAC Minivan Model (Caravan2m-ver10), odb10m (NCAC Taurus model)

Introducing Huawei FusionServer E9000 (CH121) V5

High-Performance 2-Socket Blade Unlocks Supreme Computing Power
Full-series Intel Xeon Scalable processors, 24 DDR4 DIMMs, AEP memory supported, 1 PCIe slot, 2 SFF / 2 NVMe SSDs / 4 M.2 SSDs high-performance storage, multi-plane network, LOM supported

LS-DYNA Performance – CPU SKUs and Generation

LS-DYNA gains performance from larger core counts and better memory throughput
  – The "Gold 6140" demonstrates a 50% performance gain (29% more cores) vs E5-2680 v4
  – The "Gold 6148" demonstrates a 61% performance gain (42% more cores) vs E5-2680 v4
  – Base clocks are the same on the E5-2680 v4 and Gold 6148, while the Gold 6140 runs slightly slower
  – Skylake supports 6 memory channels and faster DIMMs, both of which affect memory performance
[Chart: single-node performance; higher is better; labeled gains: 61%, 50%]

LS-DYNA Performance – Memory Speed

Memory speed provides some benefit to LS-DYNA performance
  – The Skylake platform supports DIMM speeds up to 2666MHz
  – 2666MHz DIMMs are theoretically ~11% faster than 2400MHz DIMMs
  – LS-DYNA shows only about 2-3% improvement on a single node
  – It appears that only part of the speed difference translates into LS-DYNA performance gain
[Chart: 40 MPI processes per node; higher is better]

LS-DYNA Performance – Sub-NUMA Clustering

Enabling SNC provides some benefit for LS-DYNA
  – Sub-NUMA Clustering (SNC) is similar to cluster-on-die (COD) in the Haswell/Broadwell generation
  – CPU cores and memory are split into 2 separate NUMA domains when SNC is enabled
  – SNC should generally benefit applications that require good NUMA locality
  – SNC demonstrates a performance gain of ~2-3% on a single-node basis
[Chart: 40 MPI processes per node; higher is better]

LS-DYNA Performance – CPU Instructions

AVX2 outperforms both AVX-512 and SSE2 executables on Skylake CPUs
  – Performance gain of 17% by using AVX2 over AVX-512 executables
  – AVX-512 performs worse than AVX2, despite improved vectorization
  – AVX-512 instructions run at a reduced clock frequency compared with AVX2 and normal clocks
  – The benefit of AVX2 appears to be larger on bigger datasets (such as car2car)
[Chart: 40 MPI processes per node; higher is better; labeled gains: 17%, 8%, 4%, 3%]

LS-DYNA Performance – CPU Instruction Sets

Some variance in performance among different LS-DYNA versions/executables
  – AVX2 executables perform better than SSE2 LS-DYNA executables
  – Small variance in performance among different LS-DYNA releases
  – R7.1.3 appeared to perform better on larger datasets
[Chart: 40 MPI processes per node; higher is better; labeled gain: 20%]

LS-DYNA Performance – MPI Libraries

All three MPI implementations show decent performance at scale
  – Platform MPI and HPC-X perform similarly, while Intel MPI shows a drop on the small dataset at scale
[Chart: 40 MPI processes per node; higher is better]

LS-DYNA Performance – System Generations

The current Skylake system configuration outperforms prior system generations
  – The Skylake platform outperformed Broadwell by 21%, Haswell by 51%, Ivy Bridge by 89%, Sandy Bridge by 132%, Westmere by 222%, Nehalem by 425%
  – Skylake performs 41% better than Broadwell for the 3cars model on a single-node basis
  – System components used:
      Skylake: 2-socket 20-core Xeon Gold 6138 2.0GHz, 2666MHz DIMMs, ConnectX-5 EDR InfiniBand
      Broadwell: 2-socket 14-core Xeon E5-2690 v4 2.6GHz, 2400MHz DIMMs, ConnectX-4 EDR InfiniBand
      Haswell: 2-socket 14-core Xeon E5-2697 v3 2.6GHz, 2133MHz DIMMs, ConnectX-4 EDR InfiniBand
      Ivy Bridge: 2-socket 10-core Xeon E5-2680 v2 2.8GHz, 1600MHz DIMMs, Connect-IB FDR InfiniBand
      Sandy Bridge: 2-socket 8-core Xeon E5-2680 2.7GHz, 1600MHz DIMMs, ConnectX-3 FDR InfiniBand
      Westmere: 2-socket 6-core Xeon X5670 2.93GHz, 1333MHz DIMMs, ConnectX-2 QDR InfiniBand
      Nehalem: 2-socket 4-core Xeon X5570 2.93GHz, 1333MHz DIMMs, ConnectX-2 QDR InfiniBand
[Chart: best results shown; higher is better; labeled gain: 41%]

LS-DYNA Summary

LS-DYNA is a multi-purpose explicit and implicit finite element program
  – Utilizes compute, memory, and network communications for performance
Effect of MPI on performance
  – Platform MPI and HPC-X perform similarly; Intel MPI shows a drop on the small dataset
Effect of Skylake generation on performance
  – Provides a substantial performance gain due to the larger core count and support for 6 memory channels
  – Faster 2666MHz DIMMs (compared to 2400MHz) translate to a 2-3% performance increase
Effect of CPU instructions on performance
  – AVX-512 performs worse than AVX2, despite the improved vectorization
  – AVX-512 instructions run at a reduced clock frequency compared with AVX2 and normal clocks
Effect of SNC on performance
  – Enabling Sub-NUMA Clustering provides a small advantage (~2-3%) on a single node
Effect of LS-DYNA version on performance
  – Small variance in performance among different LS-DYNA releases; the best appeared to be R7.1.3

Thank You

HPC Advisory Council
All trademarks are property of their respective owners.
All information is provided "As-Is" without any kind of warranty.
The HPC Advisory Council makes no representation to the accuracy and completeness of the information contained herein.
HPC Advisory Council undertakes no duty and assumes no obligation to update or correct any information presented herein.
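
The ratios quoted on the CPU SKU and memory speed slides can be checked with a few lines of arithmetic. The sketch below is illustrative only: the per-socket core counts (18 for the Gold 6140, 20 for the Gold 6148, 14 for the E5-2680 v4) do not appear in the slides and are assumed from Intel's published specifications, and the helper name `percent_gain` is hypothetical.

```python
def percent_gain(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0

# DIMM transfer rates quoted in the slides (MHz).
dimm_gain = percent_gain(2666, 2400)

# Core counts per socket: assumed from Intel's specifications, not from the slides.
cores_6140_gain = percent_gain(18, 14)
cores_6148_gain = percent_gain(20, 14)

print(f"2666 vs 2400 MHz DIMMs: {dimm_gain:.1f}% faster in theory")
print(f"Gold 6140 vs E5-2680 v4: {cores_6140_gain:.1f}% more cores")
print(f"Gold 6148 vs E5-2680 v4: {cores_6148_gain:.1f}% more cores")
```

Under these assumptions the DIMM ratio comes out near the ~11% stated in the slides, and the core-count ratios are close to the quoted 29% and 42% (the second rounds to 43% rather than 42%). In both CPU comparisons the measured gains (50% and 61%) exceed the core-count increase alone, which is consistent with the slides attributing part of the improvement to memory throughput.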
