Hello Robert,

Monday, April 23, 2007, 11:12:39 PM, you wrote:
RM> Hello Robert,
RM> Monday, April 23, 2007, 10:44:00 PM, you wrote:

RM>> Hello Peter,
RM>> Monday, April 23, 2007, 9:27:56 PM, you wrote:

PT>>> On 4/23/07, Robert Milkowski <[EMAIL PROTECTED]> wrote:
>>>>
>>>> Relatively low traffic to the pool, but sync takes too long to complete
>>>> and other operations are also not that fast.
>>>>
>>>> Disks are on a 3510 array. zil_disable=1.
>>>>
>>>> bash-3.00# ptime sync
>>>>
>>>> real     1:21.569
>>>> user     0.001
>>>> sys      0.027

PT>>> Hey, that is *quick*!

PT>>> On Friday I typed sync mid-afternoon. Nothing had happened a couple of
PT>>> hours later when I went home. It looked as though it had finished by
PT>>> 11pm, when I checked in from home.

PT>>> This was on a Thumper running S10U3. As far as I could tell, all writes
PT>>> to the pool stopped completely. There were applications trying to write,
PT>>> but they had simply stopped (and picked up later in the evening). A fairly
PT>>> consistent few hundred KB per second of reads; no writes; and pretty low
PT>>> system load.

PT>>> It did recover, but write latencies of a few hours are rather undesirable.
PT>>> What on earth was it doing?

RM>> I've seen it too :(

RM>> Other than that, I can see that even while reads and writes are going on,
RM>> ZFS is issuing write cache flush commands only once every few minutes
RM>> instead of every 5s (the default). And nfsd goes crazy then.

RM>> Then ZFS commands like zpool status, zfs list, etc. can hang for
RM>> hours... nothing unusual in iostat.

RM> Also, stopping nfsd can take dozens of minutes to complete.
RM> I've never observed this with nfsd/ufs.

Run on the server itself. ZFS:

bash-3.00# dtrace -n fbt::fop_*:entry'{self->t=timestamp;}' -n fbt::fop_*:return'/self->t/[EMAIL PROTECTED]((timestamp-self->t)/1000000000);self->t=0;}' -n tick-10s'{printa(@);}'

[after some time]
[only longer ops]

  fop_readdir
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 35895
               1 |                                         81
               2 |                                         4
               4 |                                         0

  fop_mkdir
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  864
               1 |                                         9
               2 |                                         5
               4 |                                         0
               8 |                                         0
              16 |                                         1
              32 |                                         2
              64 |                                         2
             128 |                                         0

  fop_space
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 426
               1 |                                         0
               2 |                                         0
               4 |                                         0
               8 |                                         0
              16 |                                         0
              32 |                                         0
              64 |                                         0
             128 |                                         3
             256 |                                         0

  fop_lookup
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1181242
               1 |                                         311
               2 |                                         47
               4 |                                         3
               8 |                                         0

  fop_read
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 100799
               1 |                                         26
               2 |                                         1
               4 |                                         3
               8 |                                         5
              16 |                                         5
              32 |                                         9
              64 |                                         3
             128 |                                         3
             256 |                                         3
             512 |                                         0

  fop_remove
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  16085
               1 |                                         43
               2 |                                         6
               4 |                                         0
               8 |                                         0
              16 |                                         1
              32 |                                         29
              64 |                                         54
             128 |                                         75
             256 |                                         0

  fop_create
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   21883
               1 |@                                        300
               2 |                                         243
               4 |                                         118
               8 |                                         31
              16 |                                         15
              32 |                                         69
              64 |                                         228
             128 |@                                        359
             256 |                                         1
             512 |                                         0

  fop_symlink
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       8067
               1 |@                                        215
               2 |@                                        183
               4 |                                         114
               8 |                                         47
              16 |                                         6
              32 |                                         35
              64 |@                                        180
             128 |@@@                                      689
             256 |                                         2
             512 |                                         0

  fop_write
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 134052
               1 |                                         174
               2 |                                         20
               4 |                                         1
               8 |                                         3
              16 |                                         179
              32 |                                         148
              64 |                                         412
             128 |                                         632
             256 |                                         0
^C
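For readability, here is roughly what that one-liner looks like as a standalone
D script. The list archiver munged the aggregation in the command above
("[EMAIL PROTECTED]..."), so the @[probefunc] = quantize(...) body below is my
assumption, inferred from the per-fop histograms it produced:

  /* fop_times.d -- hypothetical reconstruction of the one-liner above.
   * Times every fop_* (VOP) call and buckets the duration in whole seconds,
   * printing per-function histograms every 10 seconds. */

  fbt::fop_*:entry
  {
          self->t = timestamp;
  }

  fbt::fop_*:return
  /self->t/
  {
          /* timestamp is in nanoseconds; divide to get seconds */
          @[probefunc] = quantize((timestamp - self->t) / 1000000000);
          self->t = 0;
  }

  tick-10s
  {
          printa(@);
  }

Run it with something like: dtrace -s fop_times.d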
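On the cache-flush observation above (flushes arriving minutes apart instead of
every 5s): one could also time each transaction group sync directly with
something along these lines. This is an untested sketch, not part of the run
above, and it assumes the fbt provider exposes spa_sync on this build:

  /* spa_sync_times.d -- hypothetical sketch: how long does each txg sync take? */

  fbt::spa_sync:entry
  {
          self->ts = timestamp;
  }

  fbt::spa_sync:return
  /self->ts/
  {
          /* seconds spent syncing each transaction group */
          @["spa_sync duration (s)"] = quantize((timestamp - self->ts) / 1000000000);
          self->ts = 0;
  }

  tick-30s
  {
          printa(@);
  }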
And the same environment, but on UFS (both are NFS servers, the same HW):

bash-3.00# dtrace -n fbt::fop_*:entry'{self->t=timestamp;}' -n fbt::fop_*:return'/self->t/[EMAIL PROTECTED]((timestamp-self->t)/1000000000);self->t=0;}' -n tick-10s'{printa(@);}'
bash-3.00#

[after some time]
[only ops over 1s]

  fop_putpage
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 540731
               1 |                                         1
               2 |                                         0

  fop_read
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 122344
               1 |                                         4
               2 |                                         6
               4 |                                         0
               8 |                                         0
              16 |                                         0
              32 |                                         0
              64 |                                         0
             128 |                                         0
             256 |                                         1
             512 |                                         0
^C

Well, it looks much better with ufs/nfsd than with zfs/nfsd. The hardware is the
same, the workload is the same, and the measurements were taken at the same
time. The ZFS server is running with zil_disable=1. Under smaller load ZFS
rocks; under higher load it "suxx" :( At least in an nfsd environment.

--
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss