more below...
On Oct 24, 2009, at 2:49 AM, Adam Cheal wrote:
The iostat I posted previously was from a system where we had already
tuned zfs:zfs_vdev_max_pending down to 10 (visible as the cap of about
10 in the actv column for each disk).
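For reference, the tuning itself is a one-line change; a minimal sketch of
the tunable plus the iostat invocation used to watch its effect (the exact
value and flags here are illustrative):

    # /etc/system -- cap the per-vdev I/O queue depth (takes effect after a reboot)
    set zfs:zfs_vdev_max_pending = 10

    # watch per-disk queue depth (actv), %b and error counters while the scrub runs
    iostat -xne 5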
I reset this value in /etc/system to 7, rebooted, and started a scrub.
iostat showed busier disks (%b was higher, which seemed odd) but a cap
of about 7 queued I/Os per disk, proving the tuning had taken effect.
iostat at a high-water mark during the test looked like this:
extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c8
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c8t0d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c8t1d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c8t2d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c8t3d0
8344.5 0.0 359640.4 0.0 0.1 300.5 0.0 36.0 0 4362 c9
190.0 0.0 6800.4 0.0 0.0 6.6 0.0 34.8 0 99 c9t8d0
185.0 0.0 6917.1 0.0 0.0 6.1 0.0 32.9 0 94 c9t9d0
187.0 0.0 6640.9 0.0 0.0 6.5 0.0 34.6 0 98 c9t10d0
186.5 0.0 6543.4 0.0 0.0 7.0 0.0 37.5 0 100 c9t11d0
180.5 0.0 7203.1 0.0 0.0 6.7 0.0 37.2 0 100 c9t12d0
195.5 0.0 7352.4 0.0 0.0 7.0 0.0 35.8 0 100 c9t13d0
188.0 0.0 6884.9 0.0 0.0 6.6 0.0 35.2 0 99 c9t14d0
204.0 0.0 6990.1 0.0 0.0 7.0 0.0 34.3 0 100 c9t15d0
199.0 0.0 7336.7 0.0 0.0 7.0 0.0 35.2 0 100 c9t16d0
180.5 0.0 6837.9 0.0 0.0 7.0 0.0 38.8 0 100 c9t17d0
198.0 0.0 7668.9 0.0 0.0 7.0 0.0 35.3 0 100 c9t18d0
203.0 0.0 7983.2 0.0 0.0 7.0 0.0 34.5 0 100 c9t19d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c9t20d0
195.5 0.0 7096.4 0.0 0.0 6.7 0.0 34.1 0 98 c9t21d0
189.5 0.0 7757.2 0.0 0.0 6.4 0.0 33.9 0 97 c9t22d0
195.5 0.0 7645.9 0.0 0.0 6.6 0.0 33.8 0 99 c9t23d0
194.5 0.0 7925.9 0.0 0.0 7.0 0.0 36.0 0 100 c9t24d0
188.5 0.0 6725.6 0.0 0.0 6.2 0.0 32.8 0 94 c9t25d0
188.5 0.0 7199.6 0.0 0.0 6.5 0.0 34.6 0 98 c9t26d0
196.0 0.0 6666.9 0.0 0.0 6.3 0.0 32.1 0 95 c9t27d0
193.5 0.0 7455.4 0.0 0.0 6.2 0.0 32.0 0 95 c9t28d0
189.0 0.0 7400.9 0.0 0.0 6.3 0.0 33.2 0 96 c9t29d0
182.5 0.0 9397.0 0.0 0.0 7.0 0.0 38.3 0 100 c9t30d0
192.5 0.0 9179.5 0.0 0.0 7.0 0.0 36.3 0 100 c9t31d0
189.5 0.0 9431.8 0.0 0.0 7.0 0.0 36.9 0 100 c9t32d0
187.5 0.0 9082.0 0.0 0.0 7.0 0.0 37.3 0 100 c9t33d0
188.5 0.0 9368.8 0.0 0.0 7.0 0.0 37.1 0 100 c9t34d0
180.5 0.0 9332.8 0.0 0.0 7.0 0.0 38.8 0 100 c9t35d0
183.0 0.0 9690.3 0.0 0.0 7.0 0.0 38.2 0 100 c9t36d0
186.0 0.0 9193.8 0.0 0.0 7.0 0.0 37.6 0 100 c9t37d0
180.5 0.0 8233.4 0.0 0.0 7.0 0.0 38.8 0 100 c9t38d0
175.5 0.0 9085.2 0.0 0.0 7.0 0.0 39.9 0 100 c9t39d0
177.0 0.0 9340.0 0.0 0.0 7.0 0.0 39.5 0 100 c9t40d0
175.5 0.0 8831.0 0.0 0.0 7.0 0.0 39.9 0 100 c9t41d0
190.5 0.0 9177.8 0.0 0.0 7.0 0.0 36.7 0 100 c9t42d0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0 c9t43d0
196.0 0.0 9180.5 0.0 0.0 7.0 0.0 35.7 0 100 c9t44d0
193.5 0.0 9496.8 0.0 0.0 7.0 0.0 36.2 0 100 c9t45d0
187.0 0.0 8699.5 0.0 0.0 7.0 0.0 37.4 0 100 c9t46d0
198.5 0.0 9277.0 0.0 0.0 7.0 0.0 35.2 0 100 c9t47d0
185.5 0.0 9778.3 0.0 0.0 7.0 0.0 37.7 0 100 c9t48d0
192.0 0.0 8384.2 0.0 0.0 7.0 0.0 36.4 0 100 c9t49d0
198.5 0.0 8864.7 0.0 0.0 7.0 0.0 35.2 0 100 c9t50d0
192.0 0.0 9369.8 0.0 0.0 7.0 0.0 36.4 0 100 c9t51d0
182.5 0.0 8825.7 0.0 0.0 7.0 0.0 38.3 0 100 c9t52d0
202.0 0.0 7387.9 0.0 0.0 7.0 0.0 34.6 0 100 c9t55d0
...and sure enough about 20 minutes into it I get this (bus reset?):
scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@34,0 (sd49):
        incomplete read- retrying
scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@21,0 (sd30):
        incomplete read- retrying
scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@1e,0 (sd27):
        incomplete read- retrying
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
        Rev. 8 LSI, Inc. 1068E found.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
        mpt0 supports power management.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0):
        mpt0: IOC Operational.
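These warnings land in /var/adm/messages, so an easy way to watch for them
in real time while a scrub runs is something like:

    # follow the system log and flag mpt/SCSI retry and reset noise
    tail -f /var/adm/messages | egrep -i 'mpt|incomplete|reset'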
During the "bus reset", iostat output looked like this:
                    extended device statistics       ---- errors ---
  r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w   %b s/w h/w trn tot device
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c8
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c8t0d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c8t1d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c8t2d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c8t3d0
  0.0    0.0    0.0    0.0  0.0 88.0    0.0    0.0   0 2200   0   3   0   3 c9
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t8d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t9d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t10d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t11d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t12d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t13d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t14d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t15d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t16d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t17d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t18d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t19d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t20d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t21d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t22d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t23d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t24d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t25d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t26d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t27d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t28d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t29d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   1   0   1 c9t30d0
OK, here we see 4 I/Os pending outside of the host. The host has
sent them on and is waiting for them to return. This means they are
getting dropped either at the disk or somewhere between the disk
and the controller.
When this happens, the sd driver will time them out, try to clear
the fault by reset, and retry. In other words, the resets you see
are when the system tries to recover.
Since there are many disks with 4 stuck I/Os, I would lean towards
a common cause. What do these disks have in common? Firmware?
Do they share a SAS expander?
-- richard
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t31d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t32d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   1   0   1 c9t33d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t34d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t35d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t36d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t37d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t38d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t39d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t40d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t41d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t42d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t43d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t44d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t45d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t46d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t47d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t48d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t49d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t50d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   0   0   0 c9t51d0
  0.0    0.0    0.0    0.0  0.0  4.0    0.0    0.0   0  100   0   1   0   1 c9t52d0
  0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0    0   0   0   0   0 c9t55d0
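To see what the affected disks have in common, a reasonable starting point is
to compare firmware revisions and error counters across all of them and to
look at what FMA logged around the reset; a sketch, with nothing
vendor-specific assumed:

    # vendor, product, firmware revision and soft/hard/transport error
    # counters for every disk the sd driver knows about
    iostat -En

    # FMA error telemetry (timeouts, resets and transport errors show up
    # here as ereports)
    fmdump -eV | less

    # attachment points as the system currently sees them
    cfgadm -al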
During our previous testing we had even tried setting this max_pending
value down to 1, but we still hit the problem (although it took a little
longer to appear), and I couldn't find anything else I could set to
throttle I/O to the disks, hence the frustration.
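(For what it's worth, zfs_vdev_max_pending can also be changed on a live
system with mdb, which makes this kind of experiment quicker to iterate on;
a sketch:)

    # drop the per-vdev queue depth to 1 on the running kernel (0t = decimal)
    echo 'zfs_vdev_max_pending/W0t1' | mdb -kw

    # read the current value back to confirm
    echo 'zfs_vdev_max_pending/D' | mdb -k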
If you hadn't seen this output, would you say that 7 is a "reasonable"
value for the max_pending queue on our architecture, one that should give
the LSI controller enough breathing room to operate? If so, I *should* be
able to scrub the disks successfully (i.e. ZFS isn't to blame), which
points the finger at the mpt driver, LSI firmware, or disk firmware
instead.
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss