Hi Folks,

We have put together a 25 TB ZFS raidz2 zpool (16 x 2 TB 5900 RPM SATA
3.0 Gb/s drives with 32 MB cache, attached to 2x LSI SAS3081E-R SAS RAID
controllers that present the drives as JBOD straight through to the
backplane) with 2 hot spares on OpenSolaris snv_133. The pool contains
roughly 800 million files, all very small (map tiles of ~10-200 KB
each). We had a hiccup with one of the drives and resilvering was
initiated. The problem is that zpool status currently estimates
something like 650 hours to completion. The estimate has varied from
400 to 1800 hours over the last couple of days, but it seems to have
settled around 650 now. That is just WAY too long. We fear that if the
end user of this device ever has to replace a drive in the pool, the
rebuild will take this long again.
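
For a rough sense of scale, here is a back-of-envelope sketch in Python
of the average rate those estimates imply. It assumes the resilver has
to cover each of the ~800 million files exactly once, which is a
simplification (metadata and the actual block layout are ignored), and
the function name is ours, purely for illustration.

```python
def implied_files_per_second(total_files: int, eta_hours: float) -> float:
    """Average number of files that must be covered per second
    for a resilver to finish within eta_hours."""
    return total_files / (eta_hours * 3600)


# "Roughly 800 million" files, as stated above (an approximate count).
TOTAL_FILES = 800_000_000

# Low, settled, and high estimates reported by zpool status, in hours.
for hours in (400, 650, 1800):
    rate = implied_files_per_second(TOTAL_FILES, hours)
    print(f"{hours:>5} h -> about {rate:,.0f} files/s")
```

Under those assumptions the settled 650-hour estimate works out to a few
hundred files per second across the whole pool.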

So, we are wondering whether a) there is some way we can optimize or
tune the pool to deal better with this number of small files and speed
up the resilvering process, or b) there is some way we can tweak the
resilvering code to handle this type of situation better.

One of our engineers is looking at setting up a VM on another machine
and using dtrace to find out where the bottleneck is, but we thought
we might have more luck on this list.

Thanks,

Jeff
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
