On 11/13/2016 11:06, Andriy Gapon wrote:
On 12/11/2016 14:40, Henri Hennebert wrote:
I attatch it

Thank you!
So, these two threads are trying to get the lock in the exclusive mode:
Thread 687 (Thread 101243):
#0  sched_switch (td=0xfffff800b642b500, newtd=0xfffff8000285ea00, flags=<value
optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
#1  0xffffffff80561ae2 in mi_switch (flags=<value optimized out>, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:455
#2  0xffffffff805ae8da in sleepq_wait (wchan=0x0, pri=0) at
/usr/src/sys/kern/subr_sleepqueue.c:646
#3  0xffffffff8052f854 in sleeplk (lk=<value optimized out>, flags=<value
optimized out>, ilk=<value optimized out>, wmesg=0xffffffff813be535 "zfs",
pri=<value optimized out>, timo=51) at /usr/src/sys/kern/kern_lock.c:222
#4  0xffffffff8052f39d in __lockmgr_args (lk=<value optimized out>, flags=<value
optimized out>, ilk=<value optimized out>, wmesg=<value optimized out>,
pri=<value optimized out>, timo=<value optimized out>, file=<value optimized
out>, line=<value optimized out>) at /usr/src/sys/kern/kern_lock.c:958
#5  0xffffffff80616a8c in vop_stdlock (ap=<value optimized out>) at lockmgr.h:98
#6  0xffffffff8093784d in VOP_LOCK1_APV (vop=<value optimized out>, a=<value
optimized out>) at vnode_if.c:2087
#7  0xffffffff8063c5b3 in _vn_lock (vp=<value optimized out>, flags=548864,
file=<value optimized out>, line=<value optimized out>) at vnode_if.h:859
#8  0xffffffff8062a5f7 in vget (vp=0xfffff80049c2c000, flags=548864,
td=0xfffff800b642b500) at /usr/src/sys/kern/vfs_subr.c:2523
#9  0xffffffff806118b9 in cache_lookup (dvp=<value optimized out>, vpp=<value
optimized out>, cnp=<value optimized out>, tsp=<value optimized out>,
ticksp=<value optimized out>) at /usr/src/sys/kern/vfs_cache.c:686
#10 0xffffffff806133dc in vfs_cache_lookup (ap=<value optimized out>) at
/usr/src/sys/kern/vfs_cache.c:1081
#11 0xffffffff80935777 in VOP_LOOKUP_APV (vop=<value optimized out>, a=<value
optimized out>) at vnode_if.c:127
#12 0xffffffff8061cdf1 in lookup (ndp=<value optimized out>) at vnode_if.h:54
#13 0xffffffff8061c492 in namei (ndp=<value optimized out>) at
/usr/src/sys/kern/vfs_lookup.c:306
#14 0xffffffff80509395 in kern_execve (td=<value optimized out>, args=<value
optimized out>, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:443
#15 0xffffffff80508ccc in sys_execve (td=0xfffff800b642b500,
uap=0xfffffe010182cb80) at /usr/src/sys/kern/kern_exec.c:218
#16 0xffffffff808d449e in amd64_syscall (td=<value optimized out>, traced=0) at
subr_syscall.c:135
#17 0xffffffff808b7ddb in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:396

Thread 681 (Thread 101147):
#0  sched_switch (td=0xfffff80065f4e500, newtd=0xfffff8000285f000, flags=<value
optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
#1  0xffffffff80561ae2 in mi_switch (flags=<value optimized out>, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:455
#2  0xffffffff805ae8da in sleepq_wait (wchan=0x0, pri=0) at
/usr/src/sys/kern/subr_sleepqueue.c:646
#3  0xffffffff8052f854 in sleeplk (lk=<value optimized out>, flags=<value
optimized out>, ilk=<value optimized out>, wmesg=0xffffffff813be535 "zfs",
pri=<value optimized out>, timo=51) at /usr/src/sys/kern/kern_lock.c:222
#4  0xffffffff8052f39d in __lockmgr_args (lk=<value optimized out>, flags=<value
optimized out>, ilk=<value optimized out>, wmesg=<value optimized out>,
pri=<value optimized out>, timo=<value optimized out>, file=<value optimized
out>, line=<value optimized out>) at /usr/src/sys/kern/kern_lock.c:958
#5  0xffffffff80616a8c in vop_stdlock (ap=<value optimized out>) at lockmgr.h:98
#6  0xffffffff8093784d in VOP_LOCK1_APV (vop=<value optimized out>, a=<value
optimized out>) at vnode_if.c:2087
#7  0xffffffff8063c5b3 in _vn_lock (vp=<value optimized out>, flags=548864,
file=<value optimized out>, line=<value optimized out>) at vnode_if.h:859
#8  0xffffffff8062a5f7 in vget (vp=0xfffff80049c2c000, flags=548864,
td=0xfffff80065f4e500) at /usr/src/sys/kern/vfs_subr.c:2523
#9  0xffffffff806118b9 in cache_lookup (dvp=<value optimized out>, vpp=<value
optimized out>, cnp=<value optimized out>, tsp=<value optimized out>,
ticksp=<value optimized out>) at /usr/src/sys/kern/vfs_cache.c:686
#10 0xffffffff806133dc in vfs_cache_lookup (ap=<value optimized out>) at
/usr/src/sys/kern/vfs_cache.c:1081
#11 0xffffffff80935777 in VOP_LOOKUP_APV (vop=<value optimized out>, a=<value
optimized out>) at vnode_if.c:127
#12 0xffffffff8061cdf1 in lookup (ndp=<value optimized out>) at vnode_if.h:54
#13 0xffffffff8061c492 in namei (ndp=<value optimized out>) at
/usr/src/sys/kern/vfs_lookup.c:306
#14 0xffffffff80509395 in kern_execve (td=<value optimized out>, args=<value
optimized out>, mac_p=0x0) at /usr/src/sys/kern/kern_exec.c:443
#15 0xffffffff80508ccc in sys_execve (td=0xfffff80065f4e500,
uap=0xfffffe01016b8b80) at /usr/src/sys/kern/kern_exec.c:218
#16 0xffffffff808d449e in amd64_syscall (td=<value optimized out>, traced=0) at
subr_syscall.c:135
#17 0xffffffff808b7ddb in Xfast_syscall () at
/usr/src/sys/amd64/amd64/exception.S:396

This 2 threads are innd processes. In core.txt.4:

8 14789 29165 0 24 4 40040 6612 zfs DN - 0:00.00 [innd] 8 29165 1 0 20 0 42496 6888 select Ds - 0:01.33 [innd] 8 49778 29165 0 24 4 40040 6900 zfs DN - 0:00.00 [innd] 8 82034 29165 0 24 4 132 0 zfs DN - 0:00.00 [innd]

the corresponding info treads are:

687 Thread 101243 (PID=49778: innd) sched_switch (td=0xfffff800b642b500, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973 681 Thread 101147 (PID=14789: innd) sched_switch (td=0xfffff80065f4e500, newtd=0xfffff8000285f000, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973 669 Thread 101250 (PID=82034: innd) sched_switch (td=0xfffff800b6429000, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973 665 Thread 101262 (PID=29165: innd) sched_switch (td=0xfffff800b6b54a00, newtd=0xfffff8000285ea00, flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973

So your missing tread must be 101250:

(kgdb) tid 101250
[Switching to thread 669 (Thread 101250)]#0 sched_switch (td=0xfffff800b6429000, newtd=0xfffff8000285ea00,
    flags=<value optimized out>) at /usr/src/sys/kern/sched_ule.c:1973
1973                    cpuid = PCPU_GET(cpuid);
Current language:  auto; currently minimal
(kgdb) bt
#0 sched_switch (td=0xfffff800b6429000, newtd=0xfffff8000285ea00, flags=<value optimized out>)
    at /usr/src/sys/kern/sched_ule.c:1973
#1 0xffffffff80561ae2 in mi_switch (flags=<value optimized out>, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:455 #2 0xffffffff805ae8da in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:646 #3 0xffffffff8052f854 in sleeplk (lk=<value optimized out>, flags=<value optimized out>, ilk=<value optimized out>, wmesg=0xffffffff813be535 "zfs", pri=<value optimized out>, timo=51) at /usr/src/sys/kern/kern_lock.c:222 #4 0xffffffff8052f39d in __lockmgr_args (lk=<value optimized out>, flags=<value optimized out>, ilk=<value optimized out>, wmesg=<value optimized out>, pri=<value optimized out>, timo=<value optimized out>, file=<value optimized out>,
    line=<value optimized out>) at /usr/src/sys/kern/kern_lock.c:958
#5 0xffffffff80616a8c in vop_stdlock (ap=<value optimized out>) at lockmgr.h:98 #6 0xffffffff8093784d in VOP_LOCK1_APV (vop=<value optimized out>, a=<value optimized out>) at vnode_if.c:2087 #7 0xffffffff8063c5b3 in _vn_lock (vp=<value optimized out>, flags=525312, file=<value optimized out>, line=<value optimized out>)
    at vnode_if.h:859
#8 0xffffffff804e16cf in exec_elf64_imgact (imgp=<value optimized out>) at /usr/src/sys/kern/imgact_elf.c:899 #9 0xffffffff8050983d in kern_execve (td=<value optimized out>, args=<value optimized out>, mac_p=0x0)
    at /usr/src/sys/kern/kern_exec.c:602
#10 0xffffffff80508ccc in sys_execve (td=0xfffff800b6429000, uap=0xfffffe010184fb80) at /usr/src/sys/kern/kern_exec.c:218 #11 0xffffffff808d449e in amd64_syscall (td=<value optimized out>, traced=0) at subr_syscall.c:135 #12 0xffffffff808b7ddb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:396
#13 0x000000080217edaa in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)


And the original stuck thread wants to get the lock in the shared mode.
And there should be another thread that already holds the lock in the shared
mode.  But I am not able to identify it.  I wonder if the original thread could
be trying to get the lock recursively...

It would be interesting to get more details from thread 101112.
You can switch to it using tid command, you can use 'fr' to select frames, 'info
local' and 'info args' to see what variables are available (not optimized out)
and the you can print any that look interesting.  It would be nice to get a file
path and a directory vnode where the lookup is called.

Thank you.

I find nothing interresting:

(kgdb) fr 15
#15 0xffffffff806369cc in sys_fstatat (td=0x0, uap=0xfffffe010161db80) at /usr/src/sys/kern/vfs_syscalls.c:2136
2136            error = kern_statat(td, uap->flag, uap->fd, uap->path,
(kgdb) print *uap
$1 = {fd_l_ = 0xfffffe010161db80 "\234ÿÿÿ", fd = -100, fd_r_ = 0xfffffe010161db84 "", path_l_ = 0xfffffe010161db88 "\210£â", path = 0x800e2a388 <Address 0x800e2a388 out of bounds>, path_r_ = 0xfffffe010161db90 "\020£â", buf_l_ = 0xfffffe010161db90 "\020£â", buf = 0x800e2a310, buf_r_ = 0xfffffe010161db98 "", flag_l_ = 0xfffffe010161db98 "",
  flag = 512, flag_r_ = 0xfffffe010161db9c ""}
(kgdb) print $1->path
$2 = 0x800e2a388 <Address 0x800e2a388 out of bounds>
(kgdb) fr 8
#8 0xffffffff8062a5f7 in vget (vp=0xfffff80049c2c000, flags=2121728, td=0xfffff80009ba0500) at /usr/src/sys/kern/vfs_subr.c:2523
2523            if ((error = vn_lock(vp, flags)) != 0) {
(kgdb) print *vp
$3 = {v_tag = 0xffffffff813be535 "zfs", v_op = 0xffffffff813d0f70, v_data = 0xfffff80049c1f420, v_mount = 0xfffff800093aa660, v_nmntvnodes = {tqe_next = 0xfffff80049c2c938, tqe_prev = 0xfffff80049c2bb30}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_cache_src = {lh_first = 0x0}, v_cache_dst = { tqh_first = 0xfffff800bfc8e3f0, tqh_last = 0xfffff800bfc8e410}, v_cache_dd = 0x0, v_lock = {lock_object = { lo_name = 0xffffffff813be535 "zfs", lo_flags = 117112832, lo_data = 0, lo_witness = 0x0}, lk_lock = 23, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96}, v_interlock = {lock_object = {lo_name = 0xffffffff8099e9e0 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xfffff80049c2c068, v_actfreelist = { tqe_next = 0xfffff80049c2c938, tqe_prev = 0xfffff80049ae9bd0}, v_bufobj = {bo_lock = {lock_object = { lo_name = 0xffffffff8099e9f0 "bufobj interlock", lo_flags = 86179840, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, bo_ops = 0xffffffff80c4bf70, bo_object = 0xfffff800b62e9c60, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xfffff80049c2c000, __bo_vnode = 0xfffff80049c2c000, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80049c2c120}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80049c2c140}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_bsize = 131072}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0x0, tqh_last = 0xfffff80049c2c188}, rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 10, v_usecount = 6, v_iflag = 512,
  v_vflag = 32, v_writecount = 0, v_hash = 4833984, v_type = VREG}
(kgdb)


I also try to get information from the execve of the other treads:

for tid 101250:
(kgdb) fr 10
#10 0xffffffff80508ccc in sys_execve (td=0xfffff800b6429000, uap=0xfffffe010184fb80) at /usr/src/sys/kern/kern_exec.c:218
218                     error = kern_execve(td, &args, NULL);
(kgdb) print *uap
$4 = {fname_l_ = 0xfffffe010184fb80 "`\220\217\002\b", fname = 0x8028f9060 <Address 0x8028f9060 out of bounds>, fname_r_ = 0xfffffe010184fb88 "`¶ÿÿÿ\177", argv_l_ = 0xfffffe010184fb88 "`¶ÿÿÿ\177", argv = 0x7fffffffb660, argv_r_ = 0xfffffe010184fb90 "\bÜÿÿÿ\177", envv_l_ = 0xfffffe010184fb90 "\bÜÿÿÿ\177", envv = 0x7fffffffdc08,
  envv_r_ = 0xfffffe010184fb98 ""}
(kgdb)

for tid 101243:

(kgdb) f 15
#15 0xffffffff80508ccc in sys_execve (td=0xfffff800b642b500, uap=0xfffffe010182cb80) at /usr/src/sys/kern/kern_exec.c:218
218                     error = kern_execve(td, &args, NULL);
(kgdb) print *uap
$5 = {fname_l_ = 0xfffffe010182cb80 "ÀÏ\205\002\b", fname = 0x80285cfc0 <Address 0x80285cfc0 out of bounds>, fname_r_ = 0xfffffe010182cb88 "`¶ÿÿÿ\177", argv_l_ = 0xfffffe010182cb88 "`¶ÿÿÿ\177", argv = 0x7fffffffb660, argv_r_ = 0xfffffe010182cb90 "\bÜÿÿÿ\177", envv_l_ = 0xfffffe010182cb90 "\bÜÿÿÿ\177", envv = 0x7fffffffdc08,
  envv_r_ = 0xfffffe010182cb98 ""}
(kgdb)

Henri
_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Reply via email to