> Allow me to clarify a little further why I care about this so much.  I have
> a Solaris file server, with all the company jewels on it.  I had a pair of
> Intel X25 SSD mirrored log devices.  One of them failed.  The replacement
> device came with a newer version of firmware on it.  Now, instead of
> appearing as 29.802 GB, it appears as 29.801 GB.  I cannot zpool attach.
> New device is too small.
> 
> So apparently I'm the first guy this happened to.  Oracle is caught totally
> off guard.  They're pulling their inventory of X25's from dispatch
> warehouses, and inventorying all the firmware versions, and trying to figure
> it all out.  Meanwhile, I'm still degraded.  Or at least, I think I am.

This isn't the only problem that SnOracle has had with the X25s. We managed to 
reproduce a problem with the SSDs used as ZIL devices on an x4250: an I/O error 
of some sort triggered a retryable write error, which dropped throughput to 
zero, as if a PCI bus reset had occurred. 

Here's a sample of our log output. You might want to check whether you're 
getting similar errors. 

Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@4/pci111d,8...@0/pci111d,8...@4/pci1000,3...@0 (mpt1):
Jan 10 21:36:52 tips-fs1.tamu.edu       Log info 31126000 received for target 15.
Jan 10 21:36:52 tips-fs1.tamu.edu       scsi_status=0, ioc_status=804b, scsi_state=c
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@4/pci111d,8...@0/pci111d,8...@4/pci1000,3...@0 (mpt1):
Jan 10 21:36:52 tips-fs1.tamu.edu       Log info 31126000 received for target 15.
Jan 10 21:36:52 tips-fs1.tamu.edu       scsi_status=0, ioc_status=804b, scsi_state=c
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@4/pci111d,8...@0/pci111d,8...@4/pci1000,3...@0/s...@f,0 (sd28):
Jan 10 21:36:52 tips-fs1.tamu.edu       Error for Command: write Error Level: Retryable
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.notice] Requested Block: 8448                      Error Block: 8448
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.notice] Vendor: ATA                                Serial Number: CVEM902401BA
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
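
If you want to check your own logs for the same signature, something along these lines may help. This is only a rough sketch, not anything Sun gave us: the function name summarize and the three patterns are my own, lifted from the wording of the excerpt above, and other driver versions may phrase these messages differently. It assumes you feed it unwrapped syslog lines, for example from /var/adm/messages.

```python
import re
from collections import Counter

# Patterns copied from the wording of the log excerpt above. They are
# assumptions about the message format, not a documented interface.
PATTERNS = {
    "mpt_log_info": re.compile(r"Log info ([0-9a-fA-F]+) received for target\s+(\d+)"),
    "retryable": re.compile(r"Error Level:\s*Retryable"),
    "bus_reset": re.compile(r"ASC:\s*0x29\b"),
}


def summarize(lines):
    """Count matching messages and tally mpt 'Log info' hits per target."""
    counts = Counter()
    targets = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            counts[name] += 1
            if name == "mpt_log_info":
                # Second capture group is the SCSI target number.
                targets[int(match.group(2))] += 1
    return counts, targets


# Usage (path is an assumption; adjust for your system):
#   with open("/var/adm/messages", errors="replace") as fh:
#       counts, targets = summarize(fh)
#   print(dict(counts), dict(targets))
```

A single target collecting most of the "Log info" hits would point at one device (or its slot) rather than the controller as a whole.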


We were lucky to catch the problem before we went live. There was an 
exceptionally large number of I/O errors. 

Sun has not gotten back to me with a resolution for this problem yet, but they 
were able to reproduce the issue. 

-K 

Karl Katzke
Systems Analyst II
TAMU / DRGS

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
