It looks like the txg_sync_thread for this pool has become blocked and
never returns, which in turn leaves many other threads blocked. I tried
changing the zfs_vdev_max_pending value from 10 to 35 and retested the
workload several times; with that setting the issue does not happen. But if
I change it back to 10, it happens very easily. Is there a known bug for
this, or any suggestion on how to resolve it?
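(In case anyone wants to reproduce or work around this: as far as I know the
tunable can be set persistently with a line like
"set zfs:zfs_vdev_max_pending = 35" in /etc/system, or changed on a live
kernel with mdb -kw; I am quoting that syntax from memory, so please
double-check it before relying on it.)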

> ffffff0502c3378c::wchaninfo -v
ADDR             TYPE NWAITERS   THREAD           PROC
ffffff0502c3378c cond     1730:  ffffff051cc6b500 go_filebench
                                 ffffff051ce61020 go_filebench
                                 ffffff051cc4e4e0 go_filebench
                                 ffffff051d115120 go_filebench
                                 ffffff051e9ed000 go_filebench
                                 ffffff051bf644c0 go_filebench
                                 ffffff051c65b000 go_filebench
                                 ffffff051c728500 go_filebench
                                 ffffff050d83a8c0 go_filebench
                                 ffffff051c528c00 go_filebench
                                 ffffff051b750800 go_filebench
                                 ffffff051cdd7520 go_filebench
                                 ffffff051ce71bc0 go_filebench
                                 ffffff051cb5e840 go_filebench
                                 ffffff051cbdec60 go_filebench
                                 ffffff0516473c60 go_filebench
                                 ffffff051d132820 go_filebench
                                 ffffff051d13a400 go_filebench
                                 ffffff050fbf0b40 go_filebench
                                 ffffff051ce7a400 go_filebench
                                 ffffff051b781820 go_filebench
                                 ffffff051ce603e0 go_filebench
                                 ffffff051d1bf840 go_filebench
                                 ffffff051c6c24c0 go_filebench
                                 ffffff051d204100 go_filebench
                                 ffffff051cbdf160 go_filebench
                                 ffffff051ce52c00 go_filebench
                                 .......
> ffffff051cc6b500::findstack -v
stack pointer for thread ffffff051cc6b500: ffffff0020a76ac0
[ ffffff0020a76ac0 _resume_from_idle+0xf1() ]
  ffffff0020a76af0 swtch+0x145()
  ffffff0020a76b20 cv_wait+0x61(ffffff0502c3378c, ffffff0502c33700)
  ffffff0020a76b70 zil_commit+0x67(ffffff0502c33700, 6b255, 14)
  ffffff0020a76d80 zfs_write+0xaaf(ffffff050b5c9140, ffffff0020a76e40, 40, ffffff0502dab258, 0)
  ffffff0020a76df0 fop_write+0x6b(ffffff050b5c9140, ffffff0020a76e40, 40, ffffff0502dab258, 0)
  ffffff0020a76ec0 pwrite64+0x244(1a, b6f2a000, 800, b841a800, 0)
  ffffff0020a76f10 sys_syscall32+0xff()

From the zil_commit code, I tried to find the thread whose stack contains a
call to zil_commit_writer. That thread has not returned from
zil_commit_writer, so it never reaches the cv_broadcast that would wake up
the waiting threads.
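For reference, here is my rough understanding of the single-writer handshake
inside zil_commit; this is a simplified paraphrase from memory, not the exact
zil.c source, and the names zl_lock, zl_writer and zl_cv_writer plus the
exact control flow should be treated as approximations:

    /*
     * Simplified paraphrase of the zil_commit() single-writer pattern.
     * Approximate only; see zil.c for the real code, which also handles
     * sequence numbers, error cases and lock dropping inside the writer.
     */
    void
    zil_commit_sketch(zilog_t *zilog, uint64_t seq, uint64_t foid)
    {
            mutex_enter(&zilog->zl_lock);
            while (zilog->zl_writer) {
                    /*
                     * Another thread is already acting as the log writer;
                     * wait for its broadcast.  This is the cv_wait() that
                     * the 1730 go_filebench threads above are parked in.
                     */
                    cv_wait(&zilog->zl_cv_writer, &zilog->zl_lock);
            }
            zilog->zl_writer = B_TRUE;
            zil_commit_writer(zilog, seq, foid);    /* builds lwbs, zio_wait()s */
            zilog->zl_writer = B_FALSE;
            cv_broadcast(&zilog->zl_cv_writer);     /* wake all the waiters */
            mutex_exit(&zilog->zl_lock);
    }

If zil_commit_writer() never returns (here it is stuck in zio_wait(), see the
next stack), the cv_broadcast() never runs and every other thread calling
zil_commit() stays blocked in cv_wait().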

> ffffff051d10fba0::findstack -v
stack pointer for thread ffffff051d10fba0: ffffff0021ab9a10
[ ffffff0021ab9a10 _resume_from_idle+0xf1() ]
  ffffff0021ab9a40 swtch+0x145()
  ffffff0021ab9a70 cv_wait+0x61(ffffff051ae1b988, ffffff051ae1b980)
  ffffff0021ab9ab0 zio_wait+0x5d(ffffff051ae1b680)
  ffffff0021ab9b20 zil_commit_writer+0x249(ffffff0502c33700, 6b250, e)
  ffffff0021ab9b70 zil_commit+0x91(ffffff0502c33700, 6b250, e)
  ffffff0021ab9d80 zfs_write+0xaaf(ffffff050b5c9540, ffffff0021ab9e40, 40, ffffff0502dab258, 0)
  ffffff0021ab9df0 fop_write+0x6b(ffffff050b5c9540, ffffff0021ab9e40, 40, ffffff0502dab258, 0)
  ffffff0021ab9ec0 pwrite64+0x244(14, bfbfb800, 800, 88f3f000, 0)
  ffffff0021ab9f10 sys_syscall32+0xff()

> ffffff051ae1b680::zio -r
ADDRESS                                  TYPE  STAGE            WAITER
ffffff051ae1b680                         NULL  CHECKSUM_VERIFY  ffffff051d10fba0
 ffffff051a9c1978                        WRITE VDEV_IO_START    -
  ffffff052454d348                       WRITE VDEV_IO_START    -
 ffffff051572b960                        WRITE VDEV_IO_START    -
  ffffff050accb330                       WRITE VDEV_IO_START    -
 ffffff0514453c80                        WRITE VDEV_IO_START    -
  ffffff0524537648                       WRITE VDEV_IO_START    -
 ffffff05090e9660                        WRITE VDEV_IO_START    -
  ffffff05151cb698                       WRITE VDEV_IO_START    -
 ffffff0514668658                        WRITE VDEV_IO_START    -
  ffffff0514835690                       WRITE VDEV_IO_START    -
 ffffff05198979a0                        WRITE VDEV_IO_START    -
  ffffff0507e1d038                       WRITE VDEV_IO_START    -
 ffffff0510727028                        WRITE VDEV_IO_START    -
  ffffff0523a25018                       WRITE VDEV_IO_START    -
 ffffff0523d729c0                        WRITE VDEV_IO_START    -
  ffffff052465b990                       WRITE VDEV_IO_START    -
 ffffff052395f008                        WRITE DONE             -
 ffffff0514cbc350                        WRITE VDEV_IO_START    -
  ffffff05146f2688                       WRITE VDEV_IO_START    -
 ffffff0509454048                        WRITE VDEV_IO_START    -
  ffffff0524186038                       WRITE VDEV_IO_START    -
 ffffff051166e9a0                        WRITE DONE             -
 ffffff0515256960                        WRITE VDEV_IO_START    -
  ffffff0518edf010                       WRITE VDEV_IO_START    -
 ffffff0514b2f688                        WRITE VDEV_IO_START    -
  ffffff05158b4040                       WRITE VDEV_IO_START    -
 ffffff052448d648                        WRITE DONE             -
 ffffff0512354380                        WRITE VDEV_IO_START    -
  ffffff051aafe6a0                       WRITE VDEV_IO_START    -
 ffffff051524e350                        WRITE VDEV_IO_START    -
  ffffff051a707058                       WRITE VDEV_IO_START    -
 ffffff0524679c88                        WRITE DONE             -
 ffffff051acef058                        WRITE DONE             -

> ffffff051acef058::print zio_t io_executor
io_executor = 0xffffff002089ac40
> 0xffffff002089ac40::findstack -v
stack pointer for thread ffffff002089ac40: ffffff002089a720
[ ffffff002089a720 _resume_from_idle+0xf1() ]
  ffffff002089a750 swtch+0x145()
  ffffff002089a800 turnstile_block+0x760(ffffff051d186418, 0, ffffff051fcf0340, fffffffffbc07db8, 0, 0)
  ffffff002089a860 mutex_vector_enter+0x261(ffffff051fcf0340)
  ffffff002089a890 txg_rele_to_sync+0x2a(ffffff05121bece8)
  ffffff002089a8c0 dmu_tx_commit+0xee(ffffff05121bec98)
  ffffff002089a8f0 zil_lwb_write_done+0x5f(ffffff051acef058)
  ffffff002089a960 zio_done+0x383(ffffff051acef058)
  ffffff002089a990 zio_execute+0x8d(ffffff051acef058)
  ffffff002089a9f0 zio_notify_parent+0xa6(ffffff051acef058, ffffff052391b9b8, 1)
  ffffff002089aa60 zio_done+0x3e2(ffffff052391b9b8)
  ffffff002089aa90 zio_execute+0x8d(ffffff052391b9b8)
  ffffff002089ab30 taskq_thread+0x248(ffffff050c418910)
  ffffff002089ab40 thread_start+8()
> ffffff05121bece8::print -t txg_handle_t
txg_handle_t {
    tx_cpu_t *th_cpu = 0xffffff051fcf0340
    uint64_t th_txg = 0xf36
}


> ffffff051fcf0340::mutex
            ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
ffffff051fcf0340 adapt ffffff050dc5d3a0      -      -     yes

> ffffff050dc5d3a0::findstack -v
stack pointer for thread ffffff050dc5d3a0: ffffff0023589970
[ ffffff0023589970 _resume_from_idle+0xf1() ]
  ffffff00235899a0 swtch+0x145()
  ffffff0023589a50 turnstile_block+0x760(ffffff051ce0c948, 0, ffffff05083403c8, fffffffffbc07db8, 0, 0)
  ffffff0023589ab0 mutex_vector_enter+0x261(ffffff05083403c8)
  ffffff0023589b30 dmu_tx_try_assign+0xab(ffffff0514395018, 2)
  ffffff0023589b70 dmu_tx_assign+0x2a(ffffff0514395018, 2)
  ffffff0023589d80 zfs_write+0x65f(ffffff050b5c9640, ffffff0023589e40, 40, ffffff0502dab258, 0)
  ffffff0023589df0 fop_write+0x6b(ffffff050b5c9640, ffffff0023589e40, 40, ffffff0502dab258, 0)
  ffffff0023589ec0 pwrite64+0x244(16, b6f7c000, 800, a7ef7800, 0)
  ffffff0023589f10 sys_syscall32+0xff()
> ffffff0514395018::print dmu_tx_t
{
    tx_holds = {
        list_size = 0x50
        list_offset = 0x8
        list_head = {
            list_next = 0xffffff0508054840
            list_prev = 0xffffff050da3b1f8
        }
    }
    tx_objset = 0xffffff05028c8940
    tx_dir = 0xffffff04e7785400
    tx_pool = 0xffffff0502ceac00
    tx_txg = 0xf36
    tx_lastsnap_txg = 0x1
    tx_lasttried_txg = 0
    tx_txgh = {
        th_cpu = 0xffffff051fcf0340
        th_txg = 0xf36
    }
    tx_tempreserve_cookie = 0
    tx_needassign_txh = 0
    tx_callbacks = {
        list_size = 0x20
        list_offset = 0
        list_head = {
            list_next = 0xffffff0514395098
            list_prev = 0xffffff0514395098
        }
    }
    tx_anyobj = 0
    tx_err = 0
}
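So this transaction holds the open txg: tx_txgh.th_cpu (0xffffff051fcf0340)
is the same address as the mutex that the dmu_tx_commit thread above is
blocked on. My understanding of why a thread that is still inside
dmu_tx_try_assign() owns that lock is roughly the following; this is a
paraphrase of the txg.c hold/release path from memory, so the details are
approximate:

    /*
     * Rough paraphrase of the txg hold/release path.  Approximate only;
     * see txg.c for the real code.
     */
    uint64_t
    txg_hold_open_sketch(dsl_pool_t *dp, txg_handle_t *th)
    {
            tx_cpu_t *tc = &dp->dp_tx.tx_cpu[CPU_SEQID];

            mutex_enter(&tc->tc_lock);              /* taken here ... */
            th->th_cpu = tc;
            th->th_txg = dp->dp_tx.tx_open_txg;
            tc->tc_count[th->th_txg & TXG_MASK]++;
            return (th->th_txg);                    /* ... still held on return */
    }

    void
    txg_rele_to_quiesce_sketch(txg_handle_t *th)
    {
            /* Dropped only here, once dmu_tx_assign()'s attempt is finished. */
            mutex_exit(&th->th_cpu->tc_lock);
    }

    void
    txg_rele_to_sync_sketch(txg_handle_t *th)
    {
            tx_cpu_t *tc = th->th_cpu;
            int g = th->th_txg & TXG_MASK;

            /* This is the mutex_enter() that dmu_tx_commit() is stuck in above. */
            mutex_enter(&tc->tc_lock);
            if (--tc->tc_count[g] == 0)
                    cv_broadcast(&tc->tc_cv[g]);
            mutex_exit(&tc->tc_lock);
            th->th_cpu = NULL;
    }

If my reading is right, dmu_tx_try_assign() runs between txg_hold_open() and
txg_rele_to_quiesce(), so while thread ffffff050dc5d3a0 is blocked on mutex
ffffff05083403c8 inside dmu_tx_try_assign(), it still owns the tx_cpu lock
(which from the addresses above sits at the start of the tx_cpu_t) that
txg_rele_to_sync() needs.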
> ffffff05083403c8::mutex
            ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
ffffff05083403c8 adapt ffffff002035cc40      -      -     yes

> ffffff002035cc40::findstack -v
stack pointer for thread ffffff002035cc40: ffffff002035c590
[ ffffff002035c590 _resume_from_idle+0xf1() ]
  ffffff002035c5c0 swtch+0x145()
  ffffff002035c5f0 cv_wait+0x61(ffffff05123ce350, ffffff05123ce348)
  ffffff002035c630 zio_wait+0x5d(ffffff05123ce048)
  ffffff002035c690 dbuf_read+0x1e8(ffffff0509c758e0, 0, a)
  ffffff002035c710 dmu_buf_hold+0xac(ffffff05028c8940, ffffffffffffffff, 0, 0, ffffff002035c748, 1)
  ffffff002035c7b0 zap_lockdir+0x6d(ffffff05028c8940, ffffffffffffffff, 0, 1, 1, 0, ffffff002035c7d8)
  ffffff002035c840 zap_lookup_norm+0x55(ffffff05028c8940, ffffffffffffffff, ffffff002035c920, 8, 1, ffffff002035c8b8, 0, 0, 0, 0)
  ffffff002035c8a0 zap_lookup+0x2d(ffffff05028c8940, ffffffffffffffff, ffffff002035c920, 8, 1, ffffff002035c8b8)
  ffffff002035c910 zap_increment+0x64(ffffff05028c8940, ffffffffffffffff, ffffff002035c920, fffffffeffef7e00, ffffff0511d9bc80)
  ffffff002035c990 zap_increment_int+0x68(ffffff05028c8940, ffffffffffffffff, 0, fffffffeffef7e00, ffffff0511d9bc80)
  ffffff002035c9f0 do_userquota_update+0x69(ffffff05028c8940, 100108000, 3, 0, 0, 1, ffffff0511d9bc80)
  ffffff002035ca50 dmu_objset_do_userquota_updates+0xde(ffffff05028c8940, ffffff0511d9bc80)
  ffffff002035cad0 dsl_pool_sync+0x112(ffffff0502ceac00, f34)
  ffffff002035cb80 spa_sync+0x37b(ffffff0501269580, f34)
  ffffff002035cc20 txg_sync_thread+0x247(ffffff0502ceac00)
  ffffff002035cc30 thread_start+8()
> ffffff05123ce048::zio -r
ADDRESS                                  TYPE  STAGE            WAITER
ffffff05123ce048                         NULL  CHECKSUM_VERIFY  ffffff002035cc40
 ffffff051a9a9338                        READ  VDEV_IO_START    -
  ffffff050e3a4050                       READ  VDEV_IO_DONE     -
   ffffff0519173c90                      READ  VDEV_IO_START    -

> ffffff0519173c90::print zio_t io_done
io_done = vdev_cache_fill

The zio ffffff0519173c90 is a vdev cache read request that never completes,
so txg_sync_thread is blocked. I don't know why this zio cannot be satisfied
and move on to the DONE stage. I tried dd'ing the raw device that backs the
pool while ZFS was in this hung state, and that works fine.
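For context, here is my rough understanding of where that fill zio comes
from; this is a paraphrase of the vdev cache miss path from memory, so the
function names (zio_vdev_delegated_io, zio_add_child, zio_vdev_io_bypass),
flags and fields below are approximations rather than the exact vdev_cache.c
source:

    /*
     * Rough paraphrase of the vdev cache miss path.  Approximate only;
     * see vdev_cache.c for the real code.
     */
    static void
    vdev_cache_miss_sketch(zio_t *zio, vdev_cache_entry_t *ve)
    {
            zio_t *fio;

            /*
             * On a miss the cache issues its own aligned "fill" read whose
             * io_done callback is vdev_cache_fill(), and the original read
             * is made to wait on it as a child I/O.  The original read only
             * completes once vdev_cache_fill() has run, so if the fill I/O
             * never finishes, the parent read (and here the whole
             * txg_sync_thread) stays blocked.
             */
            fio = zio_vdev_delegated_io(zio->io_vd, ve->ve_offset,
                ve->ve_data, VCBS, ZIO_TYPE_READ, ZIO_PRIORITY_CACHE_FILL,
                ZIO_FLAG_DONT_CACHE, vdev_cache_fill, ve);

            ve->ve_fill_io = fio;
            zio_vdev_io_bypass(zio);
            zio_add_child(zio, fio);
            zio_nowait(fio);
    }

That would match the ::zio -r output above, where the stuck READ at
VDEV_IO_DONE has a child READ whose io_done is vdev_cache_fill.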

Thanks
Zhihui

On Mon, Jul 5, 2010 at 7:56 PM, zhihui Chen <zhch...@gmail.com> wrote:
> I tried to run "zfs list" on my system, but it looks like the command
> hangs. It does not return even if I press "Ctrl+C", as shown below:
> r...@intel7:/export/bench/io/filebench/results# zfs list
> ^C^C^C^C
>
> ^C^C^C^C
>
>
>
>
> ..
> When this happens, I am running the filebench benchmark with the oltp
> workload. But "zpool status" shows that all pools are in good status,
> as shown below:
> r...@intel7:~# zpool status
>  pool: rpool
>  state: ONLINE
> status: The pool is formatted using an older on-disk format.  The pool can
>        still be used, but some features are unavailable.
> action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
>        pool will no longer be accessible on older software versions.
>  scan: none requested
> config:
>
>        NAME        STATE     READ WRITE CKSUM
>        rpool       ONLINE       0     0     0
>          c8t0d0s0  ONLINE       0     0     0
>
> errors: No known data errors
>
>  pool: tpool
>  state: ONLINE
>  scan: none requested
> config:
>
>        NAME        STATE     READ WRITE CKSUM
>        tpool       ONLINE       0     0     0
>          c10t1d0   ONLINE       0     0     0
>
> errors: No known data errors
>
>
> My system is running B141, and tpool is using the latest pool version (26).
> I tried the command "truss -p `pgrep zfs`", but it fails as shown below:
>
> r...@intel7:~# truss -p `pgrep zfs`
> truss: unanticipated system error: 5060
>
> It looks like ZFS is in a deadlock state, but I don't know the cause. I
> have run the filebench oltp workload several times, and each time it ends
> up in this state. But if I run filebench with another workload such as
> fileserver or webserver, the issue does not happen.
>
> Thanks
> Zhihui
>
