What bug# is this filed under? I'm having what I believe is the same problem.
Is it possible to just take the mpt driver from a prior build in the
meantime?
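(To be clear about what I mean by "prior build": roughly the steps below,
assuming an older boot environment is still lying around. The BE name is just
an example, and I don't know how safe an older mpt module is against a newer
kernel, so treat this as a sketch only.)

    # mount an older boot environment (name is an example)
    beadm list
    beadm mount snv_111 /a
    # keep a copy of the current 64-bit mpt module, then swap in the older one
    cp /kernel/drv/amd64/mpt /kernel/drv/amd64/mpt.current
    cp /a/kernel/drv/amd64/mpt /kernel/drv/amd64/mpt
    # rebuild the boot archive and reboot
    bootadm update-archive
    init 6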
The log below is from the load a zpool scrub creates. This is on a Dell T7400
workstation with a Dell-OEMed LSI 1068E. I updated the firmware to the newest
available from Dell. The errors follow whichever of the 4 drives has the
highest load.
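(In case anyone wants to compare notes, I'm just watching per-disk load and
error counts during the scrub with the usual tools; the pool name below is an
example.)

    # per-device service times / queue depths, 5-second intervals
    iostat -xnz 5
    # scrub progress and per-vdev read/write/checksum error counts
    zpool status -v rpool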

Streaming doesn't seem to trigger it; I can push 60 MiB/s to a mirrored rpool
all day. It only shows up when there are a lot of metadata operations.
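(By "a lot of metadata operations" I mean things like mass small-file
creates/deletes. Purely as an illustration, a crude way to generate that kind
of churn on a scratch dataset would be something like this; the dataset path
is made up.)

    # scratch dataset just for generating metadata traffic (path is made up)
    zfs create rpool/scratch
    cd /rpool/scratch
    i=0
    while [ $i -lt 100000 ]; do
        touch f$i
        i=$((i+1))
    done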


Oct 23 06:25:44 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:25:44 systurbo5       Disconnected command timeout for Target 1
Oct 23 06:27:15 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:27:15 systurbo5       Disconnected command timeout for Target 1
Oct 23 06:28:26 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:28:26 systurbo5       Disconnected command timeout for Target 1
Oct 23 06:29:47 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:29:47 systurbo5       Disconnected command timeout for Target 1
Oct 23 06:30:58 systurbo5 scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:30:58 systurbo5       Disconnected command timeout for Target 1
Oct 23 06:31:28 systurbo5 scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:28 systurbo5       mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31123000
Oct 23 06:31:28 systurbo5 scsi: [ID 243001 kern.warning] WARNING: /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:28 systurbo5       mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31123000
Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:29 systurbo5       Log info 0x31123000 received for target 1.
Oct 23 06:31:29 systurbo5       scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:29 systurbo5       Log info 0x31123000 received for target 1.
Oct 23 06:31:29 systurbo5       scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:29 systurbo5       Log info 0x31123000 received for target 1.
Oct 23 06:31:29 systurbo5       scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
Oct 23 06:31:29 systurbo5 scsi: [ID 365881 kern.info] /p...@0,0/pci8086,4...@9/pci8086,3...@0/pci8086,3...@0/pci1028,2...@0 (mpt0):
Oct 23 06:31:29 systurbo5       Log info 0x31123000 received for target 1.
Oct 23 06:31:29 systurbo5       scsi_status=0x0, ioc_status=0x804b, scsi_state=0xc
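(After one of these episodes I also look at the FMA ereports and the per-drive
error counters, roughly like this, to see whether it's being charged to the
disk or to the transport.)

    # FMA error reports (scsi/transport ereports land here)
    fmdump -eV | less
    # per-drive soft/hard/transport error counters and vendor/model info
    iostat -En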


On Fri, Oct 23, 2009 at 7:13 AM, Adam Cheal <ach...@pnimedia.com> wrote:

> Our config is:
> OpenSolaris snv_118 x64
> 1 x LSISAS3801E controller
> 2 x 23-disk JBOD (fully populated, 1TB 7.2k SATA drives)
> Each of the two external ports on the LSI connects to a 23-disk JBOD.
> ZFS-wise we use 1 zpool with 2 x 22-disk raidz2 vdevs (1 vdev per JBOD).
> The zpool has one ZFS filesystem containing millions of files/directories.
> This data is served up via CIFS (kernel), which is why we went with snv_118
> (first release post-2009.06 that had a stable CIFS server). Like I mentioned
> to James, we know that the server won't be a star performance-wise,
> especially because of the wide vdevs, but it shouldn't hiccup under load
> either. A guaranteed way for us to cause these IO errors is to load up the
> zpool with about 30 TB of data (90% full) and then scrub it. Within 30
> minutes we start to see the errors, which usually evolve into "failing"
> disks (because of excessive retry errors), which just makes things worse.
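(For anyone wanting to mock up the layout Adam describes, 2 x 22-disk raidz2
with one vdev per JBOD, the shape of it is roughly the following; the device
names are invented.)

    # build the two 22-disk lists; controller/target names are invented
    jbod1=""; jbod2=""
    t=0
    while [ $t -lt 22 ]; do
        jbod1="$jbod1 c3t${t}d0"
        jbod2="$jbod2 c4t${t}d0"
        t=$((t+1))
    done
    zpool create tank raidz2 $jbod1 raidz2 $jbod2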