After massive problems with a Supermicro X7DBE box using AOC-SAT2-MV8 
Marvell controllers and OpenSolaris snv79 (the same problems as described here: 
http://sunsolve.sun.com/search/document.do?assetkey=1-66-233341-1), we 
started over with new hardware and OpenSolaris 2008.05 upgraded to snv94. We 
again used a Supermicro X7DBE, but now with two LSI SAS3081E SAS controllers. 
And guess what? Now we get these error messages in /var/adm/messages:

Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd11):
Aug 11 18:20:52 thumper2        Error for Command: read(10)                Error Level: Retryable
Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice]  Requested Block: 1423173120                Error Block: 1423173120
Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice]  Vendor: ATA                                Serial Number:      WD-WCAP
Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice]  Sense Key: Unit_Attention
Aug 11 18:20:52 thumper2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0

Along with these messages there are a lot of messages like this:

Aug 11 18:20:51 thumper2 scsi: [ID 365881 kern.info] /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt1):
Aug 11 18:20:51 thumper2        Log info 0x31123000 received for target 5.
Aug 11 18:20:51 thumper2        scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
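
For anyone who wants to pick the mpt values apart, here is a minimal Python sketch that splits ioc_status and the log info word into bit fields. The field layout (top bit of ioc_status as a "log info available" flag; log info as bus type, originator, code, sub code) is my understanding of the LSI MPI headers and should be treated as an assumption, not as something confirmed for this driver. The function name decode_mpt_status is made up for this example.

```python
def decode_mpt_status(ioc_status, log_info):
    """Split raw mpt values into bit fields (layout is an assumption)."""
    return {
        # Assumed: bit 15 of ioc_status flags that log info is attached.
        "log_info_available": bool(ioc_status & 0x8000),
        # Assumed: the remaining 15 bits carry the actual status code.
        "ioc_status_code": ioc_status & 0x7FFF,
        # Assumed log info layout: 4 bits bus type, 4 bits originator,
        # 8 bits code, 16 bits sub code.
        "bus_type": (log_info >> 28) & 0xF,
        "originator": (log_info >> 24) & 0xF,
        "code": (log_info >> 16) & 0xFF,
        "sub_code": log_info & 0xFFFF,
    }


# Values taken from the messages above.
fields = decode_mpt_status(0x804B, 0x31123000)
for name, value in fields.items():
    print(name, value if isinstance(value, bool) else hex(value))
```

The resulting codes still have to be looked up in the controller documentation; the sketch only separates the fields.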


I could believe that one disk is faulty, but not two:

Aug 11 17:47:47 thumper2 scsi: [ID 365881 kern.info] /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt1):
Aug 11 17:47:47 thumper2        Log info 0x31123000 received for target 4.
Aug 11 17:47:47 thumper2        scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED]/[EMAIL PROTECTED],0 (sd10):
Aug 11 17:47:48 thumper2        Error for Command: read(10)                Error Level: Retryable
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice]  Requested Block: 252165120                 Error Block: 252165120
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice]  Vendor: ATA                                Serial Number:
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice]  Sense Key: Unit_Attention
Aug 11 17:47:48 thumper2 scsi: [ID 107833 kern.notice]  ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Aug 11 17:48:34 thumper2 scsi: [ID 243001 kern.warning] WARNING: /[EMAIL PROTECTED],0/pci8086,[EMAIL PROTECTED]/pci1000,[EMAIL PROTECTED] (mpt0):


Does anybody know what is going on here?
I have checked the disks with iostat -En:

-bash-3.2# iostat -En
...
c4t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: FUJITSU  Product: MBA3073RC        Revision: 0103 Serial No:  
Size: 73.54GB <73543163904 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c4t5d0           Soft Errors: 4 Hard Errors: 24 Transport Errors: 179 
Vendor: ATA      Product: ST3750330NS      Revision: SN04 Serial No:  
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 22 Recoverable: 4 
Illegal Request: 0 Predictive Failure Analysis: 0 
c4t6d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0 
Vendor: ATA      Product: WDC WD7500AYYS-0 Revision: 4G30 Serial No:  
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0 
Illegal Request: 0 Predictive Failure Analysis: 0 
c6t4d0           Soft Errors: 6 Hard Errors: 17 Transport Errors: 466 
Vendor: ATA      Product: ST3750640NS      Revision: G    Serial No:  
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 17 Recoverable: 6 
Illegal Request: 0 Predictive Failure Analysis: 0 
c6t5d0           Soft Errors: 2 Hard Errors: 23 Transport Errors: 539 
Vendor: ATA      Product: WDC WD7500AYYS-0 Revision: 4G30 Serial No:  
Size: 750.16GB <750156374016 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 23 Recoverable: 2 
Illegal Request: 0 Predictive Failure Analysis: 0 
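
To see at a glance which drives report errors and how they are spread over the controllers, the iostat -En output can be summarised with a small script. This is only a sketch: the regular expression assumes the device line looks exactly like the lines above, and the function names error_counts and failing_by_controller are made up for this example.

```python
import re

# Matches the first line of each device record in `iostat -En` output,
# assuming the layout shown in the listing above.
DEVICE_LINE = re.compile(
    r"^(?P<dev>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+Transport Errors:\s*(?P<transport>\d+)"
)


def error_counts(iostat_text):
    """Return {device: (soft, hard, transport)} for every device line."""
    counts = {}
    for line in iostat_text.splitlines():
        m = DEVICE_LINE.match(line)
        if m:
            counts[m["dev"]] = (
                int(m["soft"]), int(m["hard"]), int(m["transport"])
            )
    return counts


def failing_by_controller(counts):
    """Group devices with any non-zero counter by controller (cN prefix)."""
    grouped = {}
    for dev, (soft, hard, transport) in counts.items():
        if soft or hard or transport:
            controller = dev.split("t", 1)[0]  # "c6t4d0" -> "c6"
            grouped.setdefault(controller, []).append(dev)
    return grouped


# Device lines copied from the listing above.
SAMPLE = """\
c4t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
c4t5d0           Soft Errors: 4 Hard Errors: 24 Transport Errors: 179
c4t6d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
c6t4d0           Soft Errors: 6 Hard Errors: 17 Transport Errors: 466
c6t5d0           Soft Errors: 2 Hard Errors: 23 Transport Errors: 539
"""

counts = error_counts(SAMPLE)
print(failing_by_controller(counts))
```

For the listing above this reports one affected drive on c4 and two on c6, so the errors are not confined to a single controller.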

I have also checked the drives with smartctl:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   075   006    Pre-fail  Always       -       94384069
  3 Spin_Up_Time            0x0003   093   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       263091894
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       4050
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       22
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   062   045    Old_age   Always       -       32 (Lifetime Min/Max 30/34)
194 Temperature_Celsius     0x0022   032   040   000    Old_age   Always       -       32 (0 25 0 0)
195 Hardware_ECC_Recovered  0x001a   065   056   000    Old_age   Always       -       173161329
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

But since UDMA_CRC_Error_Count is 0, I believe the disks are fine.
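
The same check can be done for all attributes at once. Below is a minimal sketch that parses the smartctl attribute table and lists every attribute whose normalised VALUE has dropped to its THRESH or whose WHEN_FAILED column is not "-". It assumes each attribute is on one unwrapped line, as smartctl prints it; the function names parse_smart_attributes and suspicious are made up for this example.

```python
def parse_smart_attributes(text):
    """Parse smartctl attribute rows into dictionaries."""
    attrs = []
    for line in text.splitlines():
        # Ten columns; the last one (RAW_VALUE) may contain spaces.
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # skip the header and anything that is not a row
        attrs.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "when_failed": parts[8],
            "raw": parts[9].strip(),
        })
    return attrs


def suspicious(attrs):
    """Names of attributes at or below threshold, or marked as failed."""
    return [
        a["name"] for a in attrs
        if a["when_failed"] != "-" or a["value"] <= a["thresh"]
    ]


# A few rows copied from the table above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   115   075   006    Pre-fail  Always       -       94384069
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
190 Airflow_Temperature_Cel 0x0022   068   062   045    Old_age   Always       -       32 (Lifetime Min/Max 30/34)
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

attrs = parse_smart_attributes(SAMPLE)
print(suspicious(attrs))
```

An empty list only means that no attribute has crossed its threshold; it says nothing about cabling, backplane or controller problems.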

Message was edited by: 
        a0040
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
