Comments below…

On Dec 18, 2011, at 6:53 AM, Jan-Aage Frydenbø-Bruvoll wrote:
> Dear List,
>
> I have a storage server running OpenIndiana with a number of storage
> pools on it. All the pools' disks come off the same controller, and
> all pools are backed by SSD-based L2ARC and ZIL. Performance is
> excellent on all pools but one, and I am struggling greatly to figure
> out what is wrong.
>
> A very basic test shows the following - pretty much typical
> performance at the moment:
>
> root@stor:/# for a in pool1 pool2 pool3; do dd if=/dev/zero of=$a/file
> bs=1M count=10; done
> 10+0 records in
> 10+0 records out
> 10485760 bytes (10 MB) copied, 0.00772965 s, 1.4 GB/s
> 10+0 records in
> 10+0 records out
> 10485760 bytes (10 MB) copied, 0.00996472 s, 1.1 GB/s
> 10+0 records in
> 10+0 records out
> 10485760 bytes (10 MB) copied, 71.8995 s, 146 kB/s

Enable compression and they should all go fast :-) (first sketch below)

But seriously, you could be getting tripped up by the allocator. There
are several different allocator algorithms, and they all begin to
thrash at high pool utilization; some handle that case better than
others. One troubleshooting tip is to observe the utilization of the
metaslabs (second sketch below):

    zdb -m pool3

If there are metaslabs that are more than about 96% full, then look
more closely at the allocator algorithms.

> The zpool status of the affected pool is:
>
> root@stor:/# zpool status pool3
>   pool: pool3
>  state: ONLINE
>   scan: resilvered 222G in 24h2m with 0 errors on Wed Dec 14 15:20:11 2011
> config:
>
>         NAME           STATE     READ WRITE CKSUM
>         pool3          ONLINE       0     0     0
>           c1t0d0       ONLINE       0     0     0
>           c1t1d0       ONLINE       0     0     0
>           c1t2d0       ONLINE       0     0     0
>           c1t3d0       ONLINE       0     0     0
>           c1t4d0       ONLINE       0     0     0
>           c1t5d0       ONLINE       0     0     0
>           c1t6d0       ONLINE       0     0     0
>           c1t7d0       ONLINE       0     0     0
>           c1t8d0       ONLINE       0     0     0
>           c1t9d0       ONLINE       0     0     0
>           c1t10d0      ONLINE       0     0     0
>           mirror-12    ONLINE       0     0     0
>             c1t26d0    ONLINE       0     0     0
>             c1t27d0    ONLINE       0     0     0
>           mirror-13    ONLINE       0     0     0
>             c1t28d0    ONLINE       0     0     0
>             c1t29d0    ONLINE       0     0     0
>           mirror-14    ONLINE       0     0     0
>             c1t34d0    ONLINE       0     0     0
>             c1t35d0    ONLINE       0     0     0
>         logs
>           mirror-11    ONLINE       0     0     0
>             c2t2d0p8   ONLINE       0     0     0
>             c2t3d0p8   ONLINE       0     0     0
>         cache
>           c2t2d0p12    ONLINE       0     0     0
>           c2t3d0p12    ONLINE       0     0     0
>
> errors: No known data errors
>
> Ditto for the disk controller - MegaCli reports zero errors, be that
> on the controller itself, on this pool's disks or on any of the other
> attached disks.
>
> I am pretty sure I am dealing with a disk-based problem here, i.e. a
> flaky disk that is "just" slow without exhibiting any actual data
> errors, holding the rest of the pool back, but I am at a loss as to
> how to pinpoint what is going on.

"iostat -x" shows the average service time of each disk. If one disk or
set of disks is markedly slower while also busy, that should be clearly
visible in the iostat output. Personally, I often use something like
"iostat -zxCn 10" for 10-second samples (third sketch below).
 -- richard

> Would anybody on the list be able to give me any pointers as to how I
> can dig up more detailed information about the pool's/hardware's
> performance?
>
> Thank you in advance for your kind assistance.
>
> Best regards
> Jan

-- 
ZFS and performance consulting
http://www.RichardElling.com
LISA '11, Boston, MA, December 4-9
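Three rough sketches, appended for illustration.

First, the compression quip. dd from /dev/zero stops being a meaningful
benchmark once compression is on, because runs of zeroes compress away
before they ever reach the allocator or the disks. A minimal
demonstration, assuming a hypothetical, disposable dataset name
(pool3/scratch):

    # Zeroes compress to (nearly) nothing, so this dd finishes fast
    # without exercising -- or fixing -- the slow pool.
    # pool3/scratch is an assumed scratch dataset name.
    zfs create -o compression=on pool3/scratch
    dd if=/dev/zero of=/pool3/scratch/file bs=1M count=10
    zfs get compressratio pool3/scratch
    zfs destroy pool3/scratch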
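Second, the metaslab check. To avoid reading zdb's per-metaslab output
by hand, a filter along these lines can surface the fullest metaslabs.
It is a sketch under two assumptions: the circa-2011 "zdb -m" line
format, where each metaslab prints one line whose last field is its
free space, and a GNU sort for the -h (human-readable sizes) flag, e.g.
/usr/gnu/bin/sort on OpenIndiana. Check both before trusting the output:

    # List metaslabs with the least free space first; the ones at the
    # top are the candidates for the "> 96% full" condition.
    # ASSUMED line format:  metaslab N   offset X   spacemap Y   free Z
    zdb -m pool3 | awk '$1 == "metaslab" { print $NF, $0 }' \
        | sort -h | head -20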
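Third, the iostat suggestion. This sketch flags busy disks with high
average service time; it assumes the illumos "iostat -xn" column order
(asvc_t in field 8, %b in field 10, device name last) and a 50 ms
threshold chosen arbitrarily. The signal is one spindle standing well
apart from equally busy peers, not the absolute number:

    # Six 10-second samples; print any device that is at least 20% busy
    # yet averaging over 50 ms per I/O (both thresholds are guesses).
    iostat -xn 10 6 | awk '$NF ~ /^c[0-9]/ && $10 > 20 && $8 > 50 {
        printf "%s: asvc_t=%s ms, %%b=%s\n", $NF, $8, $10
    }'

A disk that shows up here on every sample, while its pool-mates do not,
is the "slow but not failing" suspect Jan describes.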