Hard drives: Shenwu failed to connect to server

Shenwu failed to connect to server  Date: 2021-04-14
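RAID, firmware upgrade, SW_Bundle, AHS
Zhou Feng, published 2017-06-27
Case study: array failure and data loss on an H3C FlexServer R390 server at a customer site

At one site, an H3C FlexServer R390 server had seven hard drives installed: six configured as RAID 10 and one configured as a hot spare.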
The array failed with data loss, and the system could not boot normally.
The following warnings appeared during POST:

1792-Slot 0 Drive Array - Valid Data Found in Write-Back Cache.
Data will automatically be written to drive array.
1779-Slot 0 Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational:
    Port 1I: Box 2: Bay 2
    Port 2I: Box 2: Bay 5
Logical drive(s) disabled due to possible data loss.
Select "F1" to continue with logical drive(s) disabled
Select "F2" to accept data loss and to re-enable logical drive(s)
(RESUME = "F1" OR "F2" KEY) [default = "F1" in 45 seconds] ** TIMED OUT **
1716-Slot 0 Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan.
Errors will be fixed automatically when the sector(s) are overwritten.
Backup and Restore recommended.
Log analysis revealed the following problems:
1.
The IML records a large number of media errors, as follows:

Critical,1192,29197,0x0013,Drive Array,,,05/30/2017 09:10:00,4:Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)
Critical,1192,29231,0x0013,Drive Array,,,05/30/2017 09:10:00,5:Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)
Repaired,1192,29234,0x0013,Drive Array,,,05/30/2017 09:10:00,4:Internal Storage Enclosure Device Failure (Bay 5, Box 2, Port 2I, Slot 0)
Repaired,1192,29274,0x0013,Drive Array,,,05/30/2017 09:10:00,5:Internal Storage Enclosure Device Failure (Bay 2, Box 2, Port 1I, Slot 0)
Caution,1193,933,0x000A,POST Message,,,05/30/2017 11:03:00,6:POST Error: 1792-Slot X Drive Array - Valid Data Found in Cache Module. Data will automatically be written to drive array.
Caution,1193,934,0x000A,POST Message,,,05/30/2017 11:03:00,7:POST Error: 1779-Slot X Drive Array - Replacement drive(s) detected OR previously failed drive(s) now appear to be operational.
Caution,1193,935,0x000A,POST Message,,,05/30/2017 11:03:00,8:POST Error: 1716-Slot X Drive Array - Unrecoverable Media Errors Detected on Drives during previous Rebuild or Background Surface Analysis (ARM) scan. Errors will be fixed automatically when the sector(s) are overwritten.
2.
Analysis of the ADU log shows the current array configuration: the P420i array controller combines the drives in bay 1 through bay 6 into a RAID 10 set, forming Array A, logical drive 1. Bay 1 and bay 4, bay 2 and bay 5, and bay 3 and bay 6 each form a mirrored RAID 1 group, and the three RAID 1 groups are then striped as RAID 0.
The bay 7 drive is the hot spare. The bay 2 and bay 5 drives that reported errors above happen to sit in the same RAID 1 group. Details:

Big Drive Assignment Map   0x3f 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

Position  Device                               Status
0         Physical Drive (500 GB SAS) 1I:2:1   Informational
1         Physical Drive (500 GB SAS) 1I:2:2   Informational
2         Physical Drive (500 GB SAS) 1I:2:3   Informational
3         Physical Drive (500 GB SAS) 1I:2:4   Informational
4         Physical Drive (500 GB SAS) 2I:2:5   Informational
5         Physical Drive (500 GB SAS) 2I:2:6   Informational

Fault Tolerance Mode   10 (0x0002)

Smart Array P420i in Embedded Slot : SAS Array A : Logical Drive 1 : Mirror/Parity Group Information

Paired Drive   0x0003 0x0004 0x0005 0x0000 0x0001 0x0002, followed by 0x0006 through 0x00ff in ascending order

Position  Device                               Association                          Status
0         Physical Drive (500 GB SAS) 1I:2:1   Physical Drive (500 GB SAS) 1I:2:4   Informational
1         Physical Drive (500 GB SAS) 1I:2:2   Physical Drive (500 GB SAS) 2I:2:5   Informational
2         Physical Drive (500 GB SAS) 1I:2:3   Physical Drive (500 GB SAS) 2I:2:6   Informational
3         Physical Drive (500 GB SAS) 1I:2:4   Physical Drive (500 GB SAS) 1I:2:1   Informational
4         Physical Drive (500 GB SAS) 2I:2:5   Physical Drive (500 GB SAS) 1I:2:2   Informational
5         Physical Drive (500 GB SAS) 2I:2:6   Physical Drive (500 GB SAS) 1I:2:3   Informational
6         Physical Drive (500 GB SAS) 2I:2:7   Physical Drive (500 GB SAS) 2I:2:7   Informational
3.
The array failed in the following sequence. The bay 5 drive was detected as removed, which degraded the logical drive. Shortly afterwards the bay 2 drive was also logged as removed. Because bay 2 and bay 5 belong to the same RAID 1 group within the RAID 10 set, losing both caused the array and the logical drive to fail. The bay 7 hot spare was then also logged as removed. Details:

Critical,1192,29211,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Hot-plug drive removed, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH
Critical,1192,29212,Smart Array,Physical drive failure,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] Physical drive failure, Port=2I Box=2 Bay=5 reason=0x14
Caution,1192,29213,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:21] State change, logical drive 0, new state=DEGRADED
Caution,1192,29214,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:26] State change, logical drive 0, new state=NEEDS_REBUILD
Caution,1192,29215,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:26] State change, logical drive 0, new state=REBUILDING
Caution,1192,29216,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive inserted, Port=2I Box=2 Bay=5 SN=9XF2L38300009411DFVH
Caution,1192,29217,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=NEEDS_REBUILD
Critical,1192,29218,Smart Array,Physical drive removed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Hot-plug drive removed, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37
Critical,1192,29219,Smart Array,Physical drive failure,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Physical drive failure, Port=1I Box=2 Bay=2 reason=0x14
Caution,1192,29220,Smart Array,Logical drive exchanged media,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] Media exchanged detected, logical drive 0
Caution,1192,29221,Smart Array,Logical drive status changed,,0x00,05/30/2017 09:10:03,[05/30 10:45:43] State change, logical drive 0, new state=FAILED
Caution,1192,29222,Smart Array,Rebuild complete despite uncorrectable media errors,,0x00,05/30/2017 09:10:03,[05/30 10:45:45] Rebuild URE, LDrv=0 LBA=0x0005E3800-0x0005E4FFF
Caution,1192,29239,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:10:08,[05/30 10:45:57] Hot-plug drive inserted, Port=1I Box=2 Bay=2 SN=9XF2L2JE000094141M37
Critical,1192,29314,Smart Array,Physical drive removed,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Hot-plug drive removed, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD
Critical,1192,29315,Smart Array,Physical drive failure,,0x00,05/30/2017 09:11:18,[05/30 10:46:36] Physical drive failure, Port=2I Box=2 Bay=7 reason=0x14
Caution,1192,29316,Smart Array,Physical drive inserted,,0x00,05/30/2017 09:11:18,[05/30 10:46:57] Hot-plug drive inserted, Port=2I Box=2 Bay=7 SN=9XF2L2BM00009413GJFD
4.
Analysis of each drive's M&P (Monitor and Performance) records shows that two drives (bay 2, bay 7) have read/write or recovery errors together with bus faults that point to the drive backplane, while one drive (bay 5) has no errors of its own, only bus faults. The statistics below are from Smart Array P420i in Embedded Slot, Monitor and Performance Statistics (Since Factory), for Physical Drive (500 GB SAS) 1I:2:2 in the Internal Drive Cage at Port 1I : Box 2, and for Physical Drives (500 GB SAS) 2I:2:5 and 2I:2:7 in the Internal Drive Cage at Port 2I : Box 2:

Field                          1I:2:2 (bay 2)          2I:2:5 (bay 5)          2I:2:7 (bay 7)
Serial Number                  9XF2L2JE000094141M37    9XF2L38300009411DFVH    9XF2L2BM00009413GJFD
Firmware Revision              HPD8                    HPD8                    HPD8
Product Revision               HP MM0500FBFVQ          HP MM0500FBFVQ          HP MM0500FBFVQ
Reference Time                 0x00156e40              0x00156e40              0x00156e40
Sectors Read                   0x0000002195fb69f4      0x0000002193dd9f06      0x000000000004056f
Read Errors Hard               0x00000000              0x00000000              0x00000001
Read Errors Retry Recovered    0x00000000              0x00000000              0x00000000
Read Errors ECC Corrected      0x0000000000000000      0x0000000000000000      0x0000000000000000
Sectors Written                0x0000000078debd2b      0x0000000078deb745      0x0000000000234999
Write Errors Hard              0x00000000              0x00000000              0x00000000
Write Errors Retry Recovered   0x00000000              0x00000000              0x00000000
Seek Count                     0xffffffffffffffff      0xffffffffffffffff      0xffffffffffffffff
Seek Errors                    0xffffffffffffffff      0xffffffffffffffff      0xffffffffffffffff
Spin Cycles                    0x00000000              0x00000000              0x00000000
Spin Up Time                   0x0000                  0x0000                  0x0000
Performance Test 1             0x0000                  0x0000                  0x0000
Performance Test 2             0xffff                  0xffff                  0xffff
Performance Test 3             0xffff                  0xffff                  0xffff
Performance Test 4             0xffff                  0xffff                  0xffff
Reallocation Sectors           0xffffffff              0xffffffff              0xffffffff
Reallocated Sectors            0xffffffff              0xffffffff              0xffffffff
DRQ Time Outs                  0xffff                  0xffff                  0xffff
Other Time Outs                0x0000                  0x0000                  0x0000
Drive Rebuild Count            0 (0x0000)              0 (0x0000)              0 (0x0000)
Spin Retries                   65535 (0xffff)          65535 (0xffff)          65535 (0xffff)
Recovers Failed Read           0x0002                  0x0000                  0x0000
Recovers Failed Write          0x0000                  0x0000                  0x0000
Format Errors                  0x0000                  0x0000                  0x0000
Self Test Failures             0xffff                  0xffff                  0xffff
Not Ready Failures             0x00000000              0x00000000              0x00000000
Remap Abort Failures           0xffffffff              0xffffffff              0xffffffff
IRQ Deglitch Count             4294967295 (0xffffffff) 4294967295 (0xffffffff) 4294967295 (0xffffffff)
Bus Faults                     0x00000016              0x00000016              0x00000016
Hot Plug Count                 1 (0x00000001)          1 (0x00000001)          1 (0x00000001)
Track Rewrite Errors           0xffff                  0xffff                  0xffff
Write Errors After Remap       0x0000                  0x0000                  0x0000
Background Firmware Revision   0x00 (x8)               0x00 (x8)               0x00 (x8)
Media Failures                 0x0000                  0x0000                  0x0000
Hardware Errors                0x0000                  0x0000                  0x0000
Aborted Command Failures       0x0000                  0x0000                  0x0000
Spin Up Failures               0x0000                  0x0000                  0x0000
Bad Target Count               0 (0x0000)              0 (0x0000)              0 (0x0000)
Predictive Failure Errors      0x00000000              0x00000000              0x00000000
5.
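In addition, the array controller firmware, the BIOS, and the iLO 4 firmware are all at old versions, as follows:

iLO (iLO Advanced License)   iLO 4 v2.00 p67 built on Jul 30 2014
System ROM                   02/10/2014

Slot  Controller  Serial#          Version  Version  Version   Revision  Revision
0     P420i       001438030013160  6.00     1.90     01.90.00  2.00      2140

To summarize the log analysis: if manual drive removal is ruled out, the array failure can be attributed mainly to the drive backplane. Two drives (bay 2 and bay 7) are confirmed to have problems. The bay 5 drive, which is in the same RAID 1 group as bay 2, has no hardware errors, and bay 7 is only the hot spare. Therefore, once the backplane is replaced and the connection is stable again, the array data is not lost.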
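The reasoning about which drive losses a RAID 10 set can tolerate can be checked with a short sketch. It uses the first six entries of the Paired Drive map and the position-to-bay table from the ADU report above; the function names `mirror_groups` and `raid10_survives` are hypothetical helpers written for this illustration, not part of any HPE or H3C tool.

```python
# Paired Drive map from the ADU report: position i is mirrored by position PAIRED[i].
PAIRED = [3, 4, 5, 0, 1, 2]

# Position -> bay number, from the ADU device table (positions 0-5 hold bays 1-6).
POSITION_TO_BAY = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6}


def mirror_groups(paired):
    """Return each mirror pair once, as a sorted tuple of bay numbers."""
    groups = set()
    for pos, mate in enumerate(paired):
        groups.add(tuple(sorted((POSITION_TO_BAY[pos], POSITION_TO_BAY[mate]))))
    return sorted(groups)


def raid10_survives(paired, failed_bays):
    """A RAID 10 set stays online while every mirror pair keeps at least one member."""
    failed = set(failed_bays)
    return all(not set(group) <= failed for group in mirror_groups(paired))


print(mirror_groups(PAIRED))            # the three RAID 1 groups
print(raid10_survives(PAIRED, [5]))     # bay 5 alone: degraded but online
print(raid10_survives(PAIRED, [5, 2]))  # bay 5 and bay 2: both halves of one mirror
```

In this case bay 5 and bay 2 dropped out within about 20 seconds of each other, which removes both members of one mirror pair. The same two-drive loss spread across different pairs (for example bay 2 and bay 4) would have left the logical drive degraded rather than failed.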
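The problem was resolved as follows:
1. Replace the drive backplane, then pull the faulty bay 2 and bay 7 drives first (removing these two drives does not affect the integrity of the array data);
2. Reboot the server, re-enable the array so that the system can boot, and back up the data;
3. Replace the faulty bay 2 and bay 7 drives, then update the server firmware with the latest SW Bundle.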
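Lessons learned:
1. Finding the point in time at which the array failed, and exactly how the drives are combined into the array, in the logs is very helpful for analyzing the problem;
2. For array, storage, and drive problems, collect the complete AHS and ADU logs;
3. The drive M&P records are very useful for judging whether a drive has a hardware problem and whether the drive backplane is working properly.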
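The way the M&P counters separate drive problems from backplane problems can be sketched as follows. The counter values are copied from the three drives in this case. The choice of which fields count as drive-level errors, the verdict labels, and the treatment of all-ones values such as 0xffff as "not reported" are assumptions made for this illustration, not documented controller behavior.

```python
# Selected M&P counters for the three drives in this case (hex strings as logged).
MP_COUNTERS = {
    "1I:2:2": {"Read Errors Hard": "0x00000000", "Write Errors Hard": "0x00000000",
               "Recovers Failed Read": "0x0002", "Recovers Failed Write": "0x0000",
               "Self Test Failures": "0xffff", "Bus Faults": "0x00000016"},
    "2I:2:5": {"Read Errors Hard": "0x00000000", "Write Errors Hard": "0x00000000",
               "Recovers Failed Read": "0x0000", "Recovers Failed Write": "0x0000",
               "Self Test Failures": "0xffff", "Bus Faults": "0x00000016"},
    "2I:2:7": {"Read Errors Hard": "0x00000001", "Write Errors Hard": "0x00000000",
               "Recovers Failed Read": "0x0000", "Recovers Failed Write": "0x0000",
               "Self Test Failures": "0xffff", "Bus Faults": "0x00000016"},
}

# Assumption: these fields indicate problems inside the drive itself.
DRIVE_ERROR_FIELDS = ("Read Errors Hard", "Write Errors Hard",
                      "Recovers Failed Read", "Recovers Failed Write",
                      "Self Test Failures")


def counter_value(raw):
    """Parse a hex counter; all-ones values are assumed to mean 'not reported'."""
    digits = raw.lower()[2:]
    if set(digits) == {"f"}:
        return None
    return int(digits, 16)


def classify(counters):
    """Return (verdict, nonzero drive-level errors, bus fault count)."""
    drive_errors = {}
    for name in DRIVE_ERROR_FIELDS:
        value = counter_value(counters[name])
        if value:  # skips both None (not reported) and 0
            drive_errors[name] = value
    bus_faults = counter_value(counters["Bus Faults"]) or 0
    if drive_errors and bus_faults:
        verdict = "drive errors and bus faults"
    elif bus_faults:
        verdict = "bus faults only"
    elif drive_errors:
        verdict = "drive errors only"
    else:
        verdict = "clean"
    return verdict, drive_errors, bus_faults


for drive, counters in MP_COUNTERS.items():
    print(drive, classify(counters))
```

With these inputs the bay 2 and bay 7 drives show drive-level errors in addition to bus faults, while the bay 5 drive shows bus faults only, which matches the conclusion that the backplane is the common factor and that bay 5 still holds a healthy copy of the mirror.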
