I know this topic has been discussed many times... but what the hell
makes zpool resilvering so slow? I'm running OpenSolaris 2009.06.
I have had a large number of problematic disks due to a bad production
batch, leading me to resilver quite a few times, progressively
replacing each disk as it dies (and now preemptively removing disks).
My complaint is that resilvering ends up taking... days! The average
write rate to the disk being resilvered is 1 to 3 MB/sec.
You can see zpool status and iostat -v output here:
http://pastebin.com/mcbb8dfd
When I read files off the zpool, I get quite a few MB/sec even in a
degraded state. In this case the zpool is otherwise idle while
resilvering - no snapshots or anything else happening on it. The system has 3 GB
of RAM and a 2.8 GHz dual core CPU which is always >90% idle while
resilvering. The number of I/O operations per second is nowhere near
the disk's limits. Scrubbing takes 3-4 hours at the most, so it's
clearly not a read bottleneck. Even in a configuration where only one
disk is being replaced (and all others are OK), the write rate never
gets past 1-3 MB/sec.
What is going on? I have had to resilver 4 times so far, and I have to
resilver at least once more. Each resilvering takes a day or two, and
I can't see why... it's not CPU, it's not sustained read throughput,
it's not IOPS, so what is it??
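
For a sense of scale, here is a rough sketch of how much data those write rates add up to over a day or two. The rates and durations are the ones quoted above; the function name gb_written and the convention of 1 MB = 10**6 bytes are my own assumptions, not anything zpool reports.

```python
# Back-of-the-envelope: data written to the replacement disk at the
# observed rates. Assumes 1 MB = 10**6 bytes and 1 GB = 1000 MB.

SECONDS_PER_DAY = 24 * 60 * 60


def gb_written(rate_mb_per_s, days):
    """Return GB written at a sustained rate (MB/sec) over a number of days."""
    return rate_mb_per_s * SECONDS_PER_DAY * days / 1000


if __name__ == "__main__":
    for rate in (1, 3):
        for days in (1, 2):
            total = gb_written(rate, days)
            print(f"{rate} MB/sec for {days} day(s): ~{total:.0f} GB")
```

This only turns the observed rate into a total; it says nothing about why the rate is so low.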
Galen
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss