In kernel threadlist (guds), I see txg_sync_thread is doing spa_sync()-> dsl_pool_sync() - responsible for writing all dirty datasets. Due to NFS operation, it may be large number of synchronous writes resulting in frequent spa_sync(). Customer noticed that even disabling zfs zil did n't show any improvement. System is almost unusable considering nfs operations such as: rfs3_write(), rfs3_create(), rfs3_setattr() are taking in the order of 30 seconds to complete.
We requested customer to provide a core dump when problem is occurring. If you have additional comments on how to best debug this issue, please let us know. 000002a102203ca0 185b240 0 0 60 3001b1062c8 PC: cv_wait+0x38 THREAD: txg_sync_thread() stack pointer for thread 2a102203ca0: 2a102202e91 [ 000002a102202e91 cv_wait+0x38() ] zio_wait+0x30() dsl_pool_sync+0xb4() spa_sync+0x208() txg_sync_thread+0x1e4() thread_start+4() On 11/23/08 06:29 PM, Prabahar Jeyaram wrote: > The reason rfs3_{write|create} waiting longer in txg_wait_open is because > there is a syncing txg taking longer to complete. > > You may want to trace and track the syncing txg to get the reason for the > delay. > > -- > Prabahar. > > On Sun, Nov 23, 2008 at 05:51:44PM -0800, Amer Ather wrote: >> IHAC who is seeing very slow NFS transactions over ZFS. rfs3_write(), >> rfs3_create() and others are taking in the order of 17-20 seconds to >> complete. Profiling these transactions showing most of the time is spent >> in txg_wait_open() - waiting for transaction group to open. >> >> We tried "zfs_nocacheflush: but it did n't help. iostat showing good >> service time (<10ms). >> >> System is running Solaris 10 - 137137-09 >> >> 00000301b616ec00 3000813e168 3017b9e3798 2 59 6002296f966 >> PC: cv_wait+0x38 CMD: >> mv camshot_081118_000029.jpg camshot_081118_000059.jpg >> camshot_081118_000129.j >> p >> stack pointer for thread 301b616ec00: 2a104a9ceb1 >> [ 000002a104a9ceb1 cv_wait+0x38() ] >> txg_wait_open+0x54() >> zfs_write+0x34c() >> fop_write+0x20() >> write+0x268() >> syscall_trap32+0xcc() >> >> Timing txg_wait_open() shows: >> >> txg_wait_open delay >> DELAY >> value ------------- Distribution ------------- count >> 4294967296 | 0 >> 8589934592 |@ 1 >> 17179869184 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 33 >> 34359738368 | 0 >> >> txg_wait_open delay >> DELAY >> value ------------- Distribution ------------- count >> 134217728 | 0 >> 268435456 |@@ 2 >> 536870912 | 0 >> 1073741824 | 0 >> 2147483648 |@ 1 >> 4294967296 | 0 >> 8589934592 |@@@@@@@ 7 >> 17179869184 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 29 >> 34359738368 | 0 >> >> DTrace script: >> #!/usr/sbin/dtrace -qs >> fbt::txg_wait_open:entry >> { >> self->t = timestamp; >> } >> fbt::txg_wait_open:return >> /self->t/ >> { >> @a["DELAY"] = quantize(timestamp - self->t); >> } >> tick-5sec { >> printf("\ntxg_wait_open delay"); >> printa(@a); >> trunc(@a); >> } >> >> >> I have also profiled rfs3_write(), rfs3_create() and other using DTrace >> and taken time delta and stack at each frame on all functions called by >> rfs3_* and see txg_wait_open() is the one taking majority of the time. >> >> Thanks, >> Amer. >> _______________________________________________ >> zfs-discuss mailing list >> zfs-discuss@opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss _______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss