Re: [zfs-discuss] mpt errors on snv 127

2009-12-01 Thread Adam Cheal
> What's the earliest build someone has seen this problem? i.e. if we binary chop, has anyone seen it in b118?

We have used every "stable" build from b118 up, as b118 was the first reliable one that could be used in a CIFS-heavy environment. The problem occurs on all of them. - Adam

Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-30 Thread Adam Cheal
> Can folks confirm/deny each of these?
>
> o The problems are not seen with Sun's version of this card

On the Thumper x4540 (which uses 6 of the same LSI 1068E controller chips), we do not see this problem. Then again, it uses a one-to-one mapping of controller PHY ports to internal disks;

Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-29 Thread Adam Cheal
> I thought you had just set
>
> set xpv_psm:xen_support_msi = -1
>
> which is different, because that sets the xen_support_msi variable which lives inside the xpv_psm module.
>
> Setting mptsas:* will have no effect on your system if you do not have an mptsas card installed. The mpts
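For anyone unsure which of the two drivers is actually attached on their system, one quick check (a minimal sketch) is to look at the device tree bindings and the loaded kernel modules:

  # prtconf -D | grep -i mpt      (device nodes with the driver bound to them)
  # modinfo | grep -i mpt         (kernel modules currently loaded)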

Re: [zfs-discuss] Workaround for mpt timeouts in snv_127

2009-11-29 Thread Adam Cheal
> Hi Adam,
> thanks for this info. I've talked with my colleagues in Beijing (since I'm in Beijing this week) and we'd like you to try disabling MSI/MSI-X for your mpt instances. In /etc/system, add
>
> set mpt:mpt_enable_msi = 0
>
> then regen your boot archive and reboot.

I had alre
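For reference, the full sequence being suggested there would look roughly like this (a sketch assuming a standard OpenSolaris install; bootadm update-archive is the usual way to regenerate the boot archive):

  # echo 'set mpt:mpt_enable_msi = 0' >> /etc/system
  # bootadm update-archive
  # init 6                        (reboot for the /etc/system change to take effect)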

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-25 Thread Adam Cheal
So, while we are working on resolving this issue with Sun, let me approach this from another perspective: what kind of controller/drive ratio would be the minimum recommended to support a functional OpenSolaris-based archival solution? Given the following:
- the vast majority of IO to the s

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Adam Cheal
The controller connects to two disk shelves (expanders), one per port on the card. If you look back in the thread, you'll see our zpool config has one vdev per shelf. All of the disks are Western Digital (model WD1002FBYS-18A6B0) 1TB 7.2K, firmware rev. 03.00C06. Without actually matching up the
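For anyone wanting to collect the same drive details without pulling disks, iostat can report vendor, model, and firmware revision per device (a minimal sketch; the real output also includes serial numbers and error counters):

  # iostat -En | grep -i vendor   (one Vendor/Product/Revision line per disk)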

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-24 Thread Adam Cheal
The iostat I posted previously was from a system on which we had already tuned the zfs:zfs_vdev_max_pending queue depth down to 10 (visible as a max of about 10 in actv per disk). I reset this value in /etc/system to 7, rebooted, and started a scrub. iostat output showed busier disks (%b is higher, which
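For anyone repeating the experiment, the sequence was roughly the following (a sketch; pool002 is the pool name from earlier in the thread, and the /etc/system change only takes effect after a reboot):

  # echo 'set zfs:zfs_vdev_max_pending = 7' >> /etc/system
  # init 6
  (after reboot)
  # zpool scrub pool002
  # iostat -xn 5                  (watch actv, the per-disk queue depth, and %b)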

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Here is an example of the pool config we use:

# zpool status
  pool: pool002
 state: ONLINE
 scrub: scrub stopped after 0h1m with 0 errors on Fri Oct 23 23:07:52 2009
config:

        NAME        STATE     READ WRITE CKSUM
        pool002     ONLINE       0     0     0
          raidz2    ONLINE

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
And therein lies the issue. The excessive load that causes the IO issues is almost always generated locally from a scrub or a local recursive "ls" used to warm up the SSD-based zpool cache with metadata. The regular network IO to the box is minimal and is very read-centric; once we load the box
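For context, the metadata warm-up mentioned above is nothing more than a recursive directory walk that stats every file, e.g. (the mountpoint is a placeholder):

  # ls -lR /pool002 > /dev/null   (touches all metadata so it lands in the SSD cache)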

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
LSI's sales literature on that card specs "128 devices", which I take with a few hearty grains of salt. I agree that with all 46 drives pumping out streamed data the controller would be overworked, BUT the drives will only deliver data as fast as the OS tells them to. Just because the speedometer

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
I don't think there was any intention on Sun's part to ignore the problem...obviously their target market wants a performance-oriented box and the x4540 delivers that. Each 1068E controller chip supports 8 SAS PHY channels = 1 channel per drive = no contention for channels. The x4540 is a monste

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Just submitted the bug yesterday, under advice of James, so I don't have a number you can refer to...the "change request" number is 6894775 if that helps or is directly related to the future bugid. From what I've seen/read, this problem has been around for a while but only rears its ugly head

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-23 Thread Adam Cheal
Our config is:

OpenSolaris snv_118 x64
1 x LSISAS3801E controller
2 x 23-disk JBOD (fully populated, 1TB 7.2k SATA drives)

Each of the two external ports on the LSI connects to a 23-disk JBOD. ZFS-wise we use 1 zpool with 2 x 22-disk raidz2 vdevs (1 vdev per JBOD). Each zpool has one ZFS filesyst
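Purely as an illustration of that shape, a pool like this could be created along these lines (a hypothetical sketch; the cXtYd0 device names are placeholders, not our actual layout):

  # zpool create pool002 \
      raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
             c1t8d0 c1t9d0 c1t10d0 c1t11d0 c1t12d0 c1t13d0 c1t14d0 c1t15d0 \
             c1t16d0 c1t17d0 c1t18d0 c1t19d0 c1t20d0 c1t21d0 \
      raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0 \
             c2t8d0 c2t9d0 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0 c2t15d0 \
             c2t16d0 c2t17d0 c2t18d0 c2t19d0 c2t20d0 c2t21d0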

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
I've filed the bug, but was unable to include the "prtconf -v" output as the comments field only accepted 15000 chars total. Let me know if there is anything else I can provide/do to help figure this problem out as it is essentially preventing us from doing any kind of heavy IO to these pools,

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
James: We are running Phase 16 on our LSISAS3801E's, and have also tried the recently released Phase 17 but it didn't help. All firmware NVRAM settings are default. Basically, when we put the disks behind this controller under load (e.g. scrubbing, recursive ls on large ZFS filesystem) we get th

Re: [zfs-discuss] SNV_125 MPT warning in logfile

2009-10-22 Thread Adam Cheal
Cindy: How can I view the bug report you referenced? Standard methods show me that the bug number is valid (6694909), but no content or notes. We are having similar messages appear with snv_118 with a busy LSI controller, especially during scrubbing, and I'd be interested to see what they mentioned in