After massive problems with a Supermicro X7DBE box using AOC-SAT2-MV8 Marvell controllers and OpenSolaris snv79 (the same as described here: http://sunsolve.sun.com/search/document.do?assetkey=1-66-233341-1), we started over with new hardware and OpenSolaris 2008.05 upgraded to snv94. We again used a Supermicro X7DBE, but now with two LSI SAS3081E SAS controllers. And guess what? Now we get these error messages in /var/adm/messages:

Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd11):
Aug 11 18:20:52 thumper2        Error for Command: read(10)    Error Level: Retryable
Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice]  Requested Block: 1423173120    Error Block: 1423173120
Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice]  Vendor: ATA    Serial Number: WD-WCAP
Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice]  Sense Key: Unit_Attention
Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0

Along with these messages there are a lot of messages like this:

Aug 11 18:20:51 thumper2 scsi: [ID 365881 kern.info] /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt1):
Aug 11 18:20:51 thumper2        Log info 0x31123000 received for target 5.
Aug 11 18:20:51 thumper2        scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc

I could believe that one disk is faulty, but not two:

Aug 11 17:47:47 thumper2 scsi: [ID 365881 kern.info] /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt1):
Aug 11 17:47:47 thumper2        Log info 0x31123000 received for target 4.
Aug 11 17:47:47 thumper2        scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd10):
Aug 11 17:47:48 thumper2        Error for Command: read(10)    Error Level: Retryable
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice]  Requested Block: 252165120    Error Block: 252165120
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice]  Vendor: ATA    Serial Number:
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice]  Sense Key: Unit_Attention
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Aug 11 17:48:34 thumper2 scsi: [ID 243001 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):

Does anybody know what is going on here?

I have checked the disks with iostat -En:

-bash-3.2# iostat -En
...
c4t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: FUJITSU  Product: MBA3073RC        Revision: 0103 Serial No:
Size: 73.54GB <73543163904 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t5d0           Soft Errors: 4 Hard Errors: 24 Transport Errors: 179
Vendor: ATA      Product: ST3750330NS      Revision: SN04 Serial No:
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 22 Recoverable: 4
Illegal Request: 0 Predictive Failure Analysis: 0
c4t6d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: WDC WD7500AYYS-0 Revision: 4G30 Serial No:
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c6t4d0           Soft Errors: 6 Hard Errors: 17 Transport Errors: 466
Vendor: ATA      Product: ST3750640NS      Revision: G Serial No:
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 17 Recoverable: 6
Illegal Request: 0 Predictive Failure Analysis: 0
c6t5d0           Soft Errors: 2 Hard Errors: 23 Transport Errors: 539
Vendor: ATA      Product: WDC WD7500AYYS-0 Revision: 4G30 Serial No:
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 23 Recoverable: 2
Illegal Request: 0 Predictive Failure Analysis: 0

I have also checked the drives with smartctl:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   075   006    Pre-fail  Always       -       94384069
  3 Spin_Up_Time            0x0003   093   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       263091894
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       4050
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       22
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   062   045    Old_age   Always       -       32 (Lifetime Min/Max 30/34)
194 Temperature_Celsius     0x0022   032   040   000    Old_age   Always       -       32 (0 25 0 0)
195 Hardware_ECC_Recovered  0x001a   065   056   000    Old_age   Always       -       173161329
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

Since UDMA_CRC_Error_Count is 0, I believe the disks themselves are fine.

Message was edited by: a0040

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss