Today we found that NFS on the x4500 stopped responding for about 1 minute, even though the server itself was idle. After a little looking around, we found this:
Apr 16 09:16:00 x4500-01.unix fmd: [ID 441519 daemon.error] SUNW-MSG-ID: DISK-8000-0X, TYPE: Fault, VER: 1, SEVERITY: Major
Apr 16 09:16:00 x4500-01.unix EVENT-TIME: Wed Apr 16 09:16:00 JST 2008
Apr 16 09:16:00 x4500-01.unix PLATFORM: Sun Fire X4500, CSN: 0738AMT047 , HOSTNAME: x4500-01.unix
Apr 16 09:16:00 x4500-01.unix SOURCE: eft, REV: 1.16
Apr 16 09:16:00 x4500-01.unix EVENT-ID: 949038eb-f474-421b-bb79-981810c48f25
Apr 16 09:16:00 x4500-01.unix DESC: SMART health-monitoring firmware reported that a disk
Apr 16 09:16:00 x4500-01.unix failure is imminent.
Apr 16 09:16:00 x4500-01.unix Refer to http://sun.com/msg/DISK-8000-0X for more information.
Apr 16 09:16:00 x4500-01.unix AUTO-RESPONSE: None.
Apr 16 09:16:00 x4500-01.unix IMPACT: It is likely that the continued operation of
Apr 16 09:16:00 x4500-01.unix this disk will result in data loss.
Apr 16 09:16:00 x4500-01.unix REC-ACTION: Schedule a repair procedure to replace the affected disk.
Apr 16 09:16:00 x4500-01.unix Use fmdump -v -u <EVENT_ID> to identify the disk.

# fmdump -v -u 949038eb-f474-421b-bb79-981810c48f25
TIME                 UUID                                 SUNW-MSG-ID
Apr 16 09:16:00.6628 949038eb-f474-421b-bb79-981810c48f25 DISK-8000-0X
  100%  fault.io.disk.predictive-failure

        Problem in: hc://:product-id=Sun-Fire-X4500:chassis-id=0738AMT047:server-id=x4500-01.unix:serial=KRVN67ZBHDPX3H:part=HITACHI-HDS7250SASUN500G-0726KDPX3H:revision=K2AOAJ0A/bay=9/disk=0
           Affects: dev:///:devid=id1,[EMAIL PROTECTED]//[EMAIL PROTECTED],0/pci1022,[EMAIL PROTECTED]/pci11ab,[EMAIL PROTECTED]/[EMAIL PROTECTED],0
               FRU: hc://:product-id=Sun-Fire-X4500:chassis-id=0738AMT047:server-id=x4500-01.unix:serial=KRVN67ZBHDPX3H:part=HITACHI-HDS7250SASUN500G-0726KDPX3H:revision=K2AOAJ0A/bay=9/disk=0
          Location: HD_ID_9

The server is back and operating again, so we are happy with how it handled the fault. It would have been nice if the URL above had linked onward to instructions on "how to remove a HDD from x4500/other platforms".
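As an aside, the bay number and drive serial can be pulled out of the "Problem in:" string mechanically, which helps when checking the label on the physical drive. The following is only a rough sketch: the variable names are made up for illustration, the FMRI is pasted in by hand from the output above, and it assumes the `serial=` and `bay=` fields always appear in this form. It does not map the bay to a device name.

```shell
#!/bin/sh
# Illustrative only: variable names are hypothetical, and the FMRI is
# copied by hand from the "Problem in:" line of the fmdump output above.
fmri='hc://:product-id=Sun-Fire-X4500:chassis-id=0738AMT047:server-id=x4500-01.unix:serial=KRVN67ZBHDPX3H:part=HITACHI-HDS7250SASUN500G-0726KDPX3H:revision=K2AOAJ0A/bay=9/disk=0'

# Bay number: the digits that follow "/bay=".
bay=$(printf '%s\n' "$fmri" | sed -n 's|.*/bay=\([0-9][0-9]*\).*|\1|p')

# Drive serial: the text after ":serial=" up to the next ":" or "/".
serial=$(printf '%s\n' "$fmri" | sed -n 's|.*:serial=\([^:/]*\).*|\1|p')

echo "bay=$bay serial=$serial"
```

For this event it gives bay 9, which agrees with the "Location: HD_ID_9" line.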
Can I tell ZFS to just not use the HDD for now, until we can pull it out and replace it? We have 14 TB spare, so there is no risk of running out of space right now.

Lund

-- 
Jorgen Lundman       | <[EMAIL PROTECTED]>
Unix Administrator   | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo    | +81 (0)90-5578-8500          (cell)
Japan                | +81 (0)3 -3375-1767          (home)