Intel: Accelerating the Path to Exascale
Kirk Skaugen
Vice President, Intel Architecture Group
General Manager, Data Center Group

An Insatiable Need For Computing
Exascale Problems Cannot Be Solved Using the Computing Power Available Today
[Chart: computing performance from 100 MFlops to 1 ZFlops, 1993 to 2029, including a forecast; application labels: Weather Prediction, Medical Imaging, Genomics Research. Source: www.top500.org]

Exascale Answers Mankind's Challenges In…
Weather/Climate
Healthcare
New Forms of Energy

We've Helped Transform Industries
[Charts: Performance, $/GFLOP, and Annual Server Processor Shipments, 1995 to 2015, comparing Supercomputing in 1997 with Supercomputing in 2010; labeled values: ~1 TFLOP, ~$55K/GFLOP, 500 TFLOPS]

Intel Commitment To Exascale
Programming Parallelism
Efficient Performance
Extreme Scalability
Intel Exascale Commitment: >100X Performance Of Today At Only 2X The Power of Today's #1 System
Scaling Today's Software Model

Exascale Requirements
Petascale Machine of 2010: TFLOP of Compute
Estimation based on Petascale machine requirements circa 2010.
Compute: 40x
Memory: 75x
Comms: 20x
Disk/Storage: 33x
Other: 900x
Visceral Focus on System Power
Efficiency Improvement

Scaling Programmability
One Programming Model Democratizes Usage … Avoid Costly Detours

Process Technology Leadership
The foundation for all computing
STRAINED SILICON
2003, 90 nm: Invented SiGe Strained Silicon
2005, 65 nm: 2nd Gen. SiGe Strained Silicon
HIGH-k METAL GATE
2007, 45 nm: Invented Gate-Last High-k Metal Gate
2009, 32 nm: 2nd Gen. Gate-Last High-k Metal Gate
TRI-GATE
2011, 22 nm: First to Implement Tri-Gate
22 nm: A Revolutionary Leap in Process Technology
37% Performance Gain at Low Voltage*
>50% Active Power Reduction at Constant Performance*
Source: Intel
*Compared to Intel 32 nm Technology

Intel Labs & HPC
Strong Research Partnerships: Universities, Government, Industry
World Class Research in HPC
*Other names, logos and brands may be claimed as the property of others.

Delivering Breakthrough Technologies to Fuel Innovation
Powerful. Intelligent. Efficient
I/O: Integrated PCIe reduces latency and power
Growing Performance: Up to 8 cores per socket; 2X FLOPS with Intel Advanced Vector Extensions
Continuing The Journey: Next Intel Xeon Processor Codenamed Sandy Bridge-EP
The Foundation of the Innovation in Science and Technology

Highly Parallel Performance
Intel Many Integrated Core (Intel MIC) Architecture
Launching on 22 nm with >50 cores to provide outstanding performance for HPC users

A Step Forward In Dealing With Efficient Performance & Programmability
Programmability: The many benefits of broad Intel CPU programming models, techniques, and familiar x86 developer tools
Delivered Performance
Performance Density: The compute density associated with specialty accelerators for parallel workloads

Evaluating the Intel MIC Architecture
Arndt Bode
Leibniz Supercomputing Centre, Germany
with input from Iris Christadler, Alexander Heinecke and Volker Weinberg
June 2011, ISC, Hamburg
Evaluating the Intel MIC Architecture, Prof. A. Bode, LRZ, June 2011

Preface
Programming models are the key to harnessing the computational power of massively parallel devices.
Obviously, Intel has realized this trend; it substantially supports open standards and invests in innovative programming models.
LRZ and TUM have been using Intel hardware and software for many years and know the toolchain by heart.
We expect: A hardware product that delivers good performance (and energy efficiency) without losing programmability.

Advantages of the MIC Architecture
Is a standard x86 architecture!
Allows many different parallel programming models like OpenMP, MPI and Intel Cilk!
Offers standard math libraries like Intel MKL!
Supports whole Intel toolchain, e.g. Compiler & Debugger!
Writing MIC-accelerated code with minimal effort and great performance

Workloads under Investigation
Euroben Kernels (7 dwarfs of HPC)
Data Mining
TifaMMy – Matrix Operations (Demo here at ISC'11!)
Further Linear Algebra and Simulation Codes

Euroben Kernels
Selected micro-benchmarks used in PRACE for the evaluation of accelerator hardware & new languages: http://www.prace-project.eu/documents/public-deliverables/d6-6.pdf
– Example: mod2am: dense matrix-matrix multiplication (MxM)
Performance evaluation of mod2am on KNF with 30 cores @ 1050 MHz using Intel's Offload Compiler, single precision, data transfer times excluded

Data Mining with Adaptive Sparse Grids
Machine learning algorithm
Learning function from a training dataset
Important workload for classification and regression of huge datasets
MIC-Execution: Straightforward
First version within a few hours
Optimized version took 2 days
[Bar chart, GFlops/s, axis 0 to 450: WSM-EP X5670: 150; KNF 32/1200 (incl. offload): 420]
Test workload: Learning 5d checkerboard with 262144 instances and classification accuracy of 92%

TifaMMy – Idea and Application
TifaMMy: self-adaptive and cache-oblivious framework for matrix operations optimized on fat x86 cores
This is done by nested recursions and vectorized kernels
– On MIC only the kernels were changed, MIC's x86 cores are able to tackle nested recursions!
parallelization scheme employing OpenMP can be reused
having SSE kernels, bringing code to MIC is nearly for free

TifaMMy – Performance Matrix Multiplication
[Plot: GFLOPS (axis 0 to 700) versus Matrix Size (32 to 7648 in steps of 224); series: Max]
Test workload: TifaMMy executed on KNF with 32 cores @ 1200 MHz
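
The "nested recursions and vectorized kernels" idea on the TifaMMy slide can be illustrated with a generic cache-oblivious matrix multiplication. The following is a minimal sketch under stated assumptions, not TifaMMy's code: the names `matmul_rec`, `matmul_square`, `kernel` and the cutoff `BASE_DIM` are hypothetical, matrices are assumed row-major with a shared leading dimension, and the base-case kernel is a plain scalar loop standing in for the place where an architecture-specific vectorized kernel would go.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cutoff: blocks at or below this size go to the kernel. */
#define BASE_DIM 16

/* Base-case kernel: C += A * B for an (m x k) by (k x n) block.
   All matrices are row-major with leading dimension ld.
   A tuned framework would substitute a vectorized kernel here. */
static void kernel(const double *A, const double *B, double *C,
                   size_t m, size_t n, size_t k, size_t ld)
{
    for (size_t i = 0; i < m; i++)
        for (size_t p = 0; p < k; p++) {
            double a = A[i * ld + p];
            for (size_t j = 0; j < n; j++)
                C[i * ld + j] += a * B[p * ld + j];
        }
}

/* Recursive C += A * B: halve the largest dimension until the block
   is small, so blocks eventually fit in every cache level without the
   code knowing any cache size. */
void matmul_rec(const double *A, const double *B, double *C,
                size_t m, size_t n, size_t k, size_t ld)
{
    if (m <= BASE_DIM && n <= BASE_DIM && k <= BASE_DIM) {
        kernel(A, B, C, m, n, k, ld);
    } else if (m >= n && m >= k) {      /* split the rows of A and C */
        size_t h = m / 2;
        matmul_rec(A, B, C, h, n, k, ld);
        matmul_rec(A + h * ld, B, C + h * ld, m - h, n, k, ld);
    } else if (n >= k) {                /* split the columns of B and C */
        size_t h = n / 2;
        matmul_rec(A, B, C, m, h, k, ld);
        matmul_rec(A, B + h, C + h, m, n - h, k, ld);
    } else {                            /* split the inner dimension */
        size_t h = k / 2;
        matmul_rec(A, B, C, m, n, h, ld);
        matmul_rec(A + h, B + h * ld, C, m, n, k - h, ld);
    }
}

/* Convenience wrapper for contiguous square dim x dim matrices. */
void matmul_square(const double *A, const double *B, double *C, size_t dim)
{
    assert(A != NULL && B != NULL && C != NULL);
    if (dim > 0)
        matmul_rec(A, B, C, dim, dim, dim, dim);
}
```

In a structure like this, porting to a new core type would mean replacing only `kernel`, which is consistent with the slide's remark that only the kernels were changed on MIC. The row and column splits touch disjoint parts of C, so they are the natural candidates for parallel execution; the inner-dimension split is not, because both halves update the same block.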

Advantages of the MIC Architecture
Is a standard x86 architecture!
Allows many different parallel programming models like OpenMP, MPI and Intel Cilk!
Offers standard math libraries like Intel MKL!
Supports whole Intel toolchain, e.g. Compiler & Debugger!
Pre-release MIC-accelerated code for a typical scientific workload (e.g. Data Mining, TifaMMy) can reach up to 50% of peak performance!
Visit demo here at ISC'11!
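
To put the "up to 50% of peak performance" claim in numbers, theoretical peak can be estimated as cores × clock × SIMD lanes × flops per lane per cycle. The sketch below is illustrative only: the function names are hypothetical, and the figures of 16 single-precision lanes per core and 2 flops per lane per cycle (a fused multiply-add) are assumptions about the KNF card that the slides do not state. Only the 32 cores and 1200 MHz come from the slides.

```c
#include <assert.h>

/* Theoretical peak in GFLOPS:
   cores * clock (GHz) * SIMD lanes * flops per lane per cycle. */
double peak_gflops(int cores, double ghz, int simd_lanes, int flops_per_lane)
{
    assert(cores > 0 && ghz > 0.0 && simd_lanes > 0 && flops_per_lane > 0);
    return cores * ghz * simd_lanes * flops_per_lane;
}

/* Fraction of theoretical peak reached by a measured rate. */
double fraction_of_peak(double measured_gflops, double peak)
{
    assert(peak > 0.0);
    return measured_gflops / peak;
}
```

Under these assumptions, `peak_gflops(32, 1.2, 16, 2)` gives 1228.8 GFLOPS in single precision, so 50% of peak would correspond to roughly 614 GFLOPS.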

"SGI understands the significance of inter-processor communications, power, density and usability when architecting for exascale. Intel has made the leap towards exaflop computing with the introduction of Intel Many Integrated Core (MIC) architecture. Future Intel MIC products will satisfy all four of these priorities, especially with their expected ten times increase in compute density coupled with their familiar X86 programming environment."
Dr. Eng Lim Goh, SGI CTO

Intel MIC Architecture: Needed for Exascale
Exaflop by 2018
125x compute power
25x: Moore's Law
5x: remains

Intel MIC Architecture: Familiar x86 Programming
    #include [...]
    #include [...]
    #define N 1000000000LL
    main() {
        double pi = 0.0f; long i;
        #pragma offload target (mic)
        #pragma omp parallel for reduction(+:pi)
        for (i = 0; i [...]

>100X Performance Of Today At Only 2X The Power Of Today's #1
Scaling Today's Software Model

System Configuration: 7 TFLOPS SGEMM in a node
HW specifications: 8x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
Host: Colfax CXT8000: 2 socket platform with 2 Intel Xeon processor X5690 (3.46 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, Dual Intel 5520 IOH, OS RHEL 6.0
KNF SW Stack:
Larrabee kernel driver ver. 1.6.197
Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
Offload compiler (w/ data xfer): Composer XE for MIC 0.043
Native compiler (w/o data xfer): Version Alpha Build 20110518
– Colfax Model: CXT8000 Server w/ Intel 5520 chipset and 4 PLX PEX8647 Gen2 PCIe switches
– Intel Alpha level software: Intel Compilers, drivers etc.
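
The listing on the "Familiar x86 Programming" slide breaks off at the loop condition, so its loop body cannot be recovered from the text. As a hedged illustration of the pattern it shows (an ordinary OpenMP reduction loop that the Intel compiler of that era could send to the coprocessor with one extra pragma), here is a minimal sketch that estimates pi by midpoint-rule integration of 4/(1+x²) over [0, 1]. The integration formula, the function name `estimate_pi` and the parameter `n` are assumptions, not the slide's code. The offload pragma appears only in a comment so that the block builds with any C compiler.

```c
#include <assert.h>

/* Midpoint-rule estimate of pi as the integral of 4/(1+x^2) on [0,1].
   ASSUMPTION: this loop body is illustrative; the slide's own body is
   missing from the source text. */
double estimate_pi(long long n)
{
    assert(n > 0);
    double sum = 0.0;
    long long i;

    /* The slide places "#pragma offload target (mic)" at this point to
       run the following loop on the MIC card (Intel compiler specific). */
#ifdef _OPENMP
#pragma omp parallel for reduction(+:sum)
#endif
    for (i = 0; i < n; i++) {
        double x = ((double)i + 0.5) / (double)n;  /* midpoint of slice i */
        sum += 4.0 / (1.0 + x * x);
    }
    return sum / (double)n;
}
```

Built without OpenMP support, the loop simply runs serially and returns the same estimate up to rounding; the reduction clause gives each thread a private partial sum and combines them at the end, which is why the loop needs no explicit locking.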

System Configuration: Hybrid Computing with Intel MKL
HW specifications: 1x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
Host: Shady Cove 2 socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
Larrabee kernel driver ver. 1.6.197
Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
Native compiler (w/o data xfer): Version Alpha Build 20110518
– Knights Ferry Software Development Platform (Shady Cove)
– Intel Alpha level software: Intel Compilers, Intel MKL, drivers etc.
SW specifications:
MKL4KNF: MKLKNF.b2 build 20110518
MKL: 10.3.3

System Configuration: Hybrid Computing LU Factorization
HW specifications: 1x KNF D0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.6 GT/s
Host: Shady Cove 2 socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
Larrabee kernel driver ver. 1.6.197
Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
Native compiler (w/o data xfer): Version Alpha Build 20110518
– Knights Ferry Software Development Platform (Shady Cove)
– Intel Alpha level software: Intel Compilers, drivers etc.

System Configuration: KISTI Molecular Dynamics
HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
Host: Dell Precision Workstation 1 socket platform with 1 Intel Xeon processor X5620 (4 cores, 2.4 GHz, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
Larrabee kernel driver ver. 1.6.197
Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
Native compiler (w/o data xfer): Version Alpha Build 20110518
– Dell Precision Workstation
– Intel Alpha level software: Intel Compilers, drivers etc.

System Configuration: CERN openlab: Core Scaling of Intel MIC Architecture
HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
Host: SGI H4002 2 socket platform with 2 Intel Xeon processor X5690 (6 cores, 3.46 GHz, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
Larrabee kernel driver ver. 1.6.197
Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
Native compiler (w/o data xfer): Version Alpha Build 20110518
– SGI H4002 System
– Intel Alpha level software: Intel Compilers, drivers etc.

System Configuration: LRZ: TifaMMy Matrix Multiplication
HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
Host: Shady Cove 2 socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
Larrabee kernel driver ver. 1.6.197
Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
Native compiler (w/o data xfer): Version Alpha Build 20110518
– Knights Ferry Software Development Platform (Shady Cove)
– Intel Alpha level software: Intel Compilers, drivers etc.

System Configuration: FZ Jülich: SMMP Protein Folding
HW specifications: 1x KNF C0 Si @ 1.2 GHz, 2 GB GDDR5 @ 3.0 GT/s
Host: Shady Cove 2 socket platform with 2 Intel Xeon processor X5680 (3.33 GHz, 6 cores, 12 MB L3 cache) with 24 GB DDR3 @ 1333 MHz, single Intel 5520 IOH, OS: RHEL 6.0
KNF SW Stack:
Larrabee kernel driver ver. 1.6.197
Flash Image/uOS: 1.0.0.1137/1.0.0.1137-EXT-HPC
Offload compiler (w/ data xfer): Intel Composer XE for MIC 0.043
Native compiler (w/o data xfer): Version Alpha Build 20110518
– Knights Ferry Software Development Platform (Shady Cove)
– Intel Alpha level software: Intel Compilers, drivers etc.