1Copyright2015EMCCorporation.
Allrightsreserved.
RAIDShield:Characterizing,Monitoring,andProactivelyProtectingAgainstDiskFailuresPresenter:AoMaAjointworkwithFredDouglis,GuanlinLu,DarrenSawyerSurendarChandra,WindsorHsu2Copyright2015EMCCorporation.
Allrightsreserved.
DiskfailuresarecommonplaceWhole-diskfailurePartialfailureRAIDiswidelydeployedProtectdataagainstfailureswithredundancyPervasiveRAIDProtection3Copyright2015EMCCorporation.
Allrightsreserved.
StoragesystemisevolvingEscalateduseoflessreliabledrivescausesmorewhole-diskfailuresIncreasingdiskcapacityresultsinmoresectorerrorsSolutionAddextraredundancy(RAID5,RAID6,…)–EnsuredatareliabilityatthecostofstorageefficiencyRAIDOverviewIsaddingextraredundancyanefficientsolution4Copyright2015EMCCorporation.
Allrightsreserved.
Analyzed1millionSATAdisksandrevealedFailuremodesdegradingRAIDreliabilityReallocatedsectorsreflectdiskreliabilitydeteriorationDiskfailureispredictableBuiltRAIDSHIELD,anactivedefensemechanismReconstructfailingdiskbeforeit'stoolate!
PLATE:single-diskproactiveprotection–Deploymenteliminates70%ofRAIDfailuresARMOR:diskgroupproactiveprotection–RecognizevulnerableRAIDgroupsWhatWeDid5Copyright2015EMCCorporation.
Allrightsreserved.
BackgroundDiskfailureanalysisRAIDSHIELD:IdentifyfailureindicatorReallocatedSector(RS)characterizationSinglediskproactiveprotectionDiskgroupproactiveprotection5Outline6Copyright2015EMCCorporation.
Allrightsreserved.
Diskfailuredoesnotfollowafail-stopmodelTheproductionsystemsstudieddefinefailureasConnectionislostAnoperationexceedsthetimeoutthresholdWritefailsWhole-diskFailureDefinition7Copyright2015EMCCorporation.
Allrightsreserved.
EachdiskdrivemodelisdenotedasRelativesizeswithinafamilyareorderedbythecapacitynumber–E.
g.
A-2islargerthanA-1DiskModelPopulation(Thousands)FirstDeploymentLogLength(Months)A-13406/200860A-216511/200860B-110006/200848C-19310/201036C-225312/201036D-138409/201121DiskDataCollection8Copyright2015EMCCorporation.
Allrightsreserved.
WhatDoRealDiskFailuresLookLike9Copyright2015EMCCorporation.
Allrightsreserved.
0001424462031010203040500-612-1824-3036-4248-54A-2MonthDistributionofLifetimeofFailedDrives001241534291130102030400-612-1824-3036-4248-54A-1MonthAlargefractionoffaileddrivesarefoundatasimilaragePercentage(%)Percentage(%)10Copyright2015EMCCorporation.
Allrightsreserved.
ThenumberofaffecteddiskskeepgrowingAbout10%ofdisksgetsectorerrorsatthe3rdyearSectorerrornumbersincreasescontinuouslyAverageerrorcountincreases25%to300%yearoveryearIncreasingFrequencyofSectorErrors11Copyright2015EMCCorporation.
Allrightsreserved.
DrivefailingatasimilarageFailurerateisnotconstantAhighriskofmultiplesimultaneousfailuresIncreasingfrequencyofsectorerrorsExacerbateriskofreconstructionfailuresPassiveRedundancyisInefficientEnsuringreliabilityintheworstcaserequiresaddingconsiderableextraredundancy,makingitunattractivefromacostperspective12Copyright2015EMCCorporation.
Allrightsreserved.
MotivationEnsuredatasafetywithminimalredundancyProactivelyrecognizeimpendingfailuresandmigratevulnerabledatainadvanceMethodologyIdentifyindicatorofimpendingfailureIndicatorcharacterizationProactiveprotectionRAIDSHIELD,TheProactiveProtection13Copyright2015EMCCorporation.
Allrightsreserved.
PotentialindicatorsVariousdiskerrorsCriteriaofagoodindicatorIthappensmuchmorefrequentlyonfaileddisksratherthanworkingdisksApproachQuantifythediscriminationbetweenerrorvalueonfaileddisksandworkingones–DecilescomparisonisusedIdentifyFailureIndicator14Copyright2015EMCCorporation.
Allrightsreserved.
FaileddiskshavemoremediaerrorsthanworkingonesThediscriminationisnotsignificantenoughMediaErrorComparison2359152232478611123471330020406080100123456789faileddiskworkingdiskDecilesA-2MediaErrorCount15Copyright2015EMCCorporation.
Allrightsreserved.
2238718732752281212422025000001262905001000150020002500123456789faileddiskworkingdiskA-2DecilesReallocatedSectorCountA-2DecilesRSisstronglycorrelatedwithdiskfailuresReallocatedSector(RS)Comparison16Copyright2015EMCCorporation.
Allrightsreserved.
MostfaileddrivestendtohavealargernumberofRSthanworkingonesRSisstronglycorrelatedwithwhole-diskfailures,followedbymediaerrors,pendingsectorerrorsanduncorrectablesectorerrorsCorrelationBetweenSectorErrorsAndWhole-diskFailureRSisastrongindicatorofimpendingdiskfailure17Copyright2015EMCCorporation.
Allrightsreserved.
LargerRScountimplieshigherfailurerateintwo-monthwindowDiskFailureRateGivenDifferentRSCount1.
767758083858689909091929393949502040608010004080120160200240280320360400440480520560600DiskFailureRate(%)RScountA-2RSCharacterization(1)18Copyright2015EMCCorporation.
Allrightsreserved.
LargerRScount,fastertofail10%25%median75%90%RSCountTimeMargin(Days)DiskFailureTimeGivenDifferentRSCountRSCharacterization(2)19Copyright2015EMCCorporation.
Allrightsreserved.
RScountindicatesthedegreeofdiskreliabilitydeteriorationUsetheRScounttopredictimpendingdiskfailureinadvancePLATE:SingleDiskProactiveProtection20Copyright2015EMCCorporation.
Allrightsreserved.
70.
166.
66461.
859.
952.
14742.
63936.
94.
52.
82.
11.
71.
40.
80.
70.
40.
30.
27010203040506070809010020406080100200300400500600failurespredictedfalsepositivePercentage(%)BoththepredictedfailureandfalsepositiveratesdecreaseasthethresholdincreasesSimulationResult:FailuresCapturedRateGivenDifferentRSThresholdRSthreshold21Copyright2015EMCCorporation.
Allrightsreserved.
551515801070020406080100WithoutProactiveProtectionWithProactiveProtectionHardwareFailuresOthersTripleFailuresEliminatedTripleFailuresSingleproactiveprotectionreducesabout70%ofRAIDfailures,equivalentto88%ofthetriple-diskfailuresPLATEDeploymentResult:CausesofRecoveryIncidentsPercentage(%)22Copyright2015EMCCorporation.
Allrightsreserved.
10%remainingtriplefailuresPLATEmissesRAIDfailurescausedbymultiplelessreliabledrives,whoseRScountshaven'texceedthethresholdTriagePrioritizediskgroupswithhighestriskMotivationofARMOR:TheRAIDGroupProactiveProtection23Copyright2015EMCCorporation.
Allrightsreserved.
1-11-21-31-42-12-22-32-43-13-23-33-44-14-24-34-4XXXXThreatofFailureImminentFailureGoodDiskHealthyDG1ImminentFailureofDG2ProtectedDG3PossibleFailureofDG4Singlediskprotection:Replace2-3,2-4,3-4(PLATE)Can'tidentifyDG4northedifferencebetweenDG2andDG3Groupprotection:ReplaceDG4orincreaseredundancy(ARMOR)ProtectDG4andrecognizethedifferencebetweenDG2andDG3DiskGroupProtectionExample24Copyright2015EMCCorporation.
Allrightsreserved.
CalculatethesinglediskfailureprobabilityConditionalprobabilitythroughBayesTheoremCalculatetheprobabilityofavulnerableRAIDCombinationofthosesinglediskprobabilitiesthroughjointprobabilityARMORMethodology25Copyright2015EMCCorporation.
Allrightsreserved.
ThediscriminationshowsARMORiseffectivetorecognizeendangeredDGsInpractice,itidentifiesmostDGfailuresthatarenotpredictedbyPLATEProbabilityDecilesdistributionEvaluation0.
250.
330.
390.
440.
460.
50.
630.
730.
930.
150.
20.
230.
250.
270.
280.
30.
310.
3200.
20.
40.
60.
81123456789GroupswithmorethanonefailureGroupswithoutfailure26Copyright2015EMCCorporation.
Allrightsreserved.
GooglereportsSMARTmetricssuchasreallocatedsectorstronglysuggestanimpendingfailure,buttheyalsodeterminethathalfofthefaileddisksshownosucherrors[Pinheiro'07]DifferentworkloadandRAIDrewriteDiskfailurepredictionAveragemaximumlatency[Goldszmidt'12]SMARTfailureprediction[Murray'05,Hughes'02]RelatedWork27Copyright2015EMCCorporation.
Allrightsreserved.
Weanalyzed1millionSATAdrivesObservefailuremodesdegradingRAIDreliabilityRevealRScountreflectsthediskreliabilitydeteriorationDiskfailureispredictableWebuiltRAIDSHIELD,anactivedefensemechanismPLATE:singlediskproactiveprotection–Deploymenteliminates70%ofRAIDfailuresARMOR:diskgroupproactiveprotection–RecognizevulnerableRAIDgroups–HopetodeployinfutureIsaddingextraredundancyanefficientsolutionUseasmuchredundancyasneededtoensureavailabilityProactivereplacementshoulddecreasethelevelneededSummary28Copyright2015EMCCorporation.
Allrightsreserved.
RAIDShield:Characterizing,Monitoring,andProactivelyProtectingAgainstDiskFailuresQuestionsAcknowledgementAndreaArpaci-DusseauandRemziArpaci-DusseauDataDomainengineerteam,membersofADandCTOoffice,StephenManley29Copyright2015EMCCorporation.
Allrightsreserved.
CalculatethesinglediskfailureprobabilityCalculatetheprobabilityofavulnerableRAID
青果网络怎么样?青果网络隶属于泉州市青果网络科技有限公司,青果网络商家成立于2015年4月1日,拥有工信部颁发的全网IDC/ISP/IP-VPN资质,是国内为数不多具有IDC/ISP双资质的综合型云计算服务商。青果网络是APNIC和CNNIC地址分配联盟成员,泉州市互联网协会会员单位,信誉非常有保障。目前,青果网络商家正式开启了618云特惠活动,针对国内外机房都有相应的优惠。点击进入:青果网络官方...
Boomer.Host是一家比较新的国外主机商,虽然LEB自述 we’re now more than 2 year old,商家提供虚拟主机和VPS,其中VPS主机基于OpenVZ架构,数据中心为美国得克萨斯州休斯敦。目前,商家在LET发了两款特别促销套餐,年付最低3.5美元起,特别提醒:低价低配,且必须年付,请务必自行斟酌确定需求再入手。下面列出几款促销套餐的配置信息。CPU:1core内存:...
轻云互联成立于2018年的国人商家,广州轻云互联网络科技有限公司旗下品牌,主要从事VPS、虚拟主机等云计算产品业务,适合建站、新手上车的值得选择,香港三网直连(电信CN2GIA联通移动CN2直连);美国圣何塞(回程三网CN2GIA)线路,所有产品均采用KVM虚拟技术架构,高效售后保障,稳定多年,高性能可用,网络优质,为您的业务保驾护航。官方网站:点击进入广州轻云网络科技有限公司活动规则:用户购买任...
阵列卡为你推荐
1头牛168万人民币一个人650元买头牛?00元卖出;850元又买回来;900元卖出;问是亏还是赚funnymudpee京东的显卡什么时候能降回正常价格啊,想买个1060广东GDP破10万亿在已披露的2017年GDP经济数据中,以下哪个省份GDP总量排名第一?嘉兴商标注册个人如何申请商标注册www.niuav.com给我个看电影的网站百度指数词什么是百度指数抓站工具抓鸡要什么工具?partnersonline电脑内一切浏览器无法打开广告法中国的广告法有哪些。66smsm.comffff66com手机可以观看视频吗?
万网域名注册 二级域名申请 securitycenter 韩国俄罗斯 便宜域名 koss 天猫双十一抢红包 国内php空间 申请个人网站 lol台服官网 php空间购买 闪讯官网 申请网站 外贸空间 vul 数据库空间 ledlamp 网页加速 tracker服务器 privatetracker 更多