> Allow me to clarify a little further, why I care about this so much. I have
> a solaris file server, with all the company jewels on it. I had a pair of
> intel X.25 SSD mirrored log devices. One of them failed. The replacement
> device came with a newer version of firmware on it. Now, instead of
> appearing as 29.802 Gb, it appears at 29.801 Gb. I cannot zpool attach.
> New device is too small.
>
> So apparently I'm the first guy this happened to. Oracle is caught totally
> off guard. They're pulling their inventory of X25's from dispatch
> warehouses, and inventorying all the firmware versions, and trying to figure
> it all out. Meanwhile, I'm still degraded. Or at least, I think I am.

This isn't the only problem that SnOracle has had with the X25s. We managed
to reproduce a problem with the SSDs as ZIL on an x4250. An I/O error of some
sort caused a retryable write error, which brought throughput to 0 as if a
PCI bus reset had occurred. Here's a sample of our output; you might want to
check whether you're getting similar errors.

Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@4/pci111d,8...@0/pci111d,8...@4/pci1000,3...@0 (mpt1):
Jan 10 21:36:52 tips-fs1.tamu.edu Log info 31126000 received for target 15.
Jan 10 21:36:52 tips-fs1.tamu.edu scsi_status=0, ioc_status=804b, scsi_state=c
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 365881 kern.info] /p...@0,0/pci8086,2...@4/pci111d,8...@0/pci111d,8...@4/pci1000,3...@0 (mpt1):
Jan 10 21:36:52 tips-fs1.tamu.edu Log info 31126000 received for target 15.
Jan 10 21:36:52 tips-fs1.tamu.edu scsi_status=0, ioc_status=804b, scsi_state=c
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,2...@4/pci111d,8...@0/pci111d,8...@4/pci1000,3...@0/s...@f,0 (sd28):
Jan 10 21:36:52 tips-fs1.tamu.edu Error for Command: write Error Level: Retryable
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.notice] Requested Block: 8448 Error Block: 8448
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.notice] Vendor: ATA Serial Number: CVEM902401BA
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.notice] Sense Key: Unit Attention
Jan 10 21:36:52 tips-fs1.tamu.edu scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0

We were lucky to catch the problem before we went live. There were an
exceptionally large number of I/O errors. Sun has not gotten back to me with
a resolution for this problem yet, but they were able to reproduce the issue.

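If you want to check your own logs for the same pattern, something like the
following might help. It is only a rough sketch: the regular expressions are
modelled on the excerpt above, the function names summarize and report are my
own, and I'm assuming your messages use the same wording (other mpt/sd driver
versions may format them differently).

```python
import re
from collections import Counter

# Patterns modelled on the log excerpt above. They are assumptions about the
# message wording, not an official list of mpt/sd driver messages.
RETRYABLE = re.compile(r"Error for Command:\s*(\S+)\s+Error Level:\s*Retryable")
BUS_RESET = re.compile(r"ASC:\s*0x29\b")
MPT_LOG_INFO = re.compile(r"Log info\s+([0-9a-fA-F]+)\s+received for target\s+(\d+)")


def summarize(lines):
    """Count the three kinds of suspicious lines in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        if RETRYABLE.search(line):
            counts["retryable"] += 1
        if BUS_RESET.search(line):
            counts["asc_0x29"] += 1
        match = MPT_LOG_INFO.search(line)
        if match:
            # Keep the log-info code and target number so repeat offenders stand out.
            counts[f"log_info {match.group(1)} target {match.group(2)}"] += 1
    return counts


def report(path):
    """Print one line per category found in the log file at `path`."""
    with open(path, errors="replace") as handle:
        for key, count in sorted(summarize(handle).items()):
            print(f"{count:6d}  {key}")
```

Calling report("/var/adm/messages") on the file server would be the obvious
way to use it, assuming syslog is still writing to the usual Solaris location.
A handful of hits may not mean much; what we saw was a very large number of
them in a short window.
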
-K

Karl Katzke
Systems Analyst II
TAMU / DRGS