On Tue, Jul 13, 2010 at 2:44 PM, Brent Jones <br...@servuhome.net> wrote:
> I have been running a pair of X4540's for almost 2 years now, the
> usual spec (Quad core, 64GB RAM, 48x 1TB).
> I have a pair of mirrored drives for rpool, and a Raidz set with 5-6
> disks in each vdev for the rest of the disks.
> I am running snv_132 on both systems.
>
> I noticed an oddity on one particular system, that when running a
> scrub, or a zfs list -t snapshot, the results take forever.
> Mind you, these are identical systems in hardware, and software. The
> primary system replicates all data sets to the secondary nightly, so
> there isn't much of a discrepancy of space used.
>
> Primary system:
> # time zfs list -t snapshot | wc -l
> 979
>
> real    1m23.995s
> user    0m0.360s
> sys     0m4.911s
>
> Secondary system:
> # time zfs list -t snapshot | wc -l
> 979
>
> real    0m1.534s
> user    0m0.223s
> sys     0m0.663s
>
>
> At the time of running both of those, no other activity was happening,
> load average of .05 or so. Subsequent runs also take just as long on
> the primary, no matter how many times I run it, it will take about 1
> minute and 25 seconds each time, very little drift (+- 1 second if
> that)
>
> Both systems are at about 77% used space on the storage pool, no other
> distinguishing factors that I can discern.
> Upon a reboot, performance is respectable for a little while, but
> within days, it will sink back to those levels. I suspect a memory
> leak, but both systems run the same software versions and packages, so
> I can't envision that.
>
> Would anyone have any ideas what may cause this?

It could be a disk failing and dragging I/O down with it.

Check for high asvc_t with `iostat -XCn 1` and for errors in `iostat -En`.

Are there any timeouts or retries in /var/adm/messages?
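
If it helps, here is a rough sketch of how the asvc_t check could be filtered so a slow disk stands out among 48 of them. The function name `slow_disks` and the 100 ms default threshold are my own assumptions, not anything standard; pick a threshold that suits your disks. It locates the asvc_t and device columns from the header line rather than hard-coding their positions.

```shell
# Hypothetical helper (not a standard tool): reads `iostat -xn`-style output
# on stdin and prints "device asvc_t" for every row whose asvc_t is above
# the given threshold in milliseconds (default 100, an arbitrary assumption).
slow_disks() {
    awk -v limit="${1:-100}" '
        # Header row: remember which columns hold asvc_t and the device name.
        /asvc_t/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "asvc_t") a = i
                if ($i == "device") d = i
            }
            next
        }
        # Data rows: only lines wide enough to contain both columns.
        a && d && NF >= d && ($a + 0) > limit { print $d, $a }
    '
}
```

Usage would be something like `iostat -xn 5 | slow_disks 100`. On Solaris the default awk may not accept `-v`, so you may need to swap in nawk or /usr/xpg4/bin/awk. A disk that keeps showing up here while its neighbours in the same vdev do not is the one to look at first.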

-- 
Giovanni Tirloni
gtirl...@sysdroid.com
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
