I know this topic has been discussed many times... but what the hell makes zpool resilvering so slow? I'm running OpenSolaris 2009.06.

I have had a large number of problematic disks due to a bad production batch, so I have had to resilver quite a few times, replacing each disk as it dies (and now preemptively removing disks). My complaint is that resilvering ends up taking... days! The average write rate to the disk being resilvered is 1 to 3 MB/sec.

You can see zpool status and iostat -v output here:
http://pastebin.com/mcbb8dfd

When I read files off the zpool, I get quite a few MB/sec even in a degraded state. In this case the zpool is otherwise idle while resilvering: no snapshots or anything else happening on it. The system has 3 GB of RAM and a 2.8 GHz dual-core CPU, which is always >90% idle while resilvering. The number of I/O operations per second is nowhere near the disks' limits. Scrubbing takes 3-4 hours at the most, so it's clearly not a read bottleneck. Even in a configuration where only one disk is being replaced (and all others are OK), I never get past 1-3 MB/sec.

What is going on? I have had to resilver 4 times so far, and I have to resilver at least once more. Each resilvering takes a day or two, and I can't see why... it's not CPU, it's not sustained read throughput, it's not IOPS, so what is it??
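
As a rough sanity check on the figures above, here is a minimal sketch of how much data a steady 1-3 MB/sec write rate moves over one to two days. It assumes the rate is constant and uses decimal units (1 GB = 1000 MB); the helper name gb_written is purely illustrative and not part of any ZFS tool.

```python
# Back-of-the-envelope: data written at a steady rate over a given time.
SECONDS_PER_DAY = 24 * 60 * 60


def gb_written(rate_mb_per_s: float, days: float) -> float:
    """Return GB written (decimal, 1 GB = 1000 MB) at a constant rate."""
    return rate_mb_per_s * SECONDS_PER_DAY * days / 1000


if __name__ == "__main__":
    for rate in (1, 3):        # MB/sec, the range quoted in the post
        for days in (1, 2):    # resilver duration quoted in the post
            print(f"{rate} MB/s for {days} day(s): "
                  f"{gb_written(rate, days):.1f} GB")
```

So the observed rate and duration correspond to somewhere between roughly 86 GB and 518 GB written to the replacement disk, depending on which end of each range applies.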

Galen
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
