Hi Slawa, On 9/26/16 7:22 PM, Slawa Olhovchenkov wrote: > On Mon, Sep 26, 2016 at 11:33:12AM +0200, Julien Charbon wrote: >> On 9/25/16 2:46 PM, Slawa Olhovchenkov wrote: >>> On Fri, Sep 23, 2016 at 11:01:43PM +0300, Slawa Olhovchenkov wrote: >>>>> On 9/21/16 9:51 PM, Slawa Olhovchenkov wrote: >>>>>> On Wed, Sep 21, 2016 at 09:11:24AM +0200, Julien Charbon wrote: >>>>>>> You can also use Dtrace and lockstat (especially with the lockstat -s >>>>>>> option): >>>>>>> >>>>>>> https://wiki.freebsd.org/DTrace/One-Liners#Kernel_Locks >>>>>>> https://www.freebsd.org/cgi/man.cgi?query=lockstat&manpath=FreeBSD+11.0-RELEASE >>>>>>> >>>>>>> But I am less familiar with Dtrace/lockstat tools. >>>>>> >>>>>> I am still use old kernel and got lockdown again. >>>>>> Try using lockstat (I am save more output), interesting may be next: >>>>>> >>>>>> R/W writer spin on writer: 190019 events in 1.070 seconds (177571 >>>>>> events/sec) >>>>>> >>>>>> ------------------------------------------------------------------------------- >>>>>> Count indv cuml rcnt nsec Lock Caller >>>>>> >>>>>> 140839 74% 74% 0.00 24659 tcpinp >>>>>> tcp_tw_2msl_scan+0xc6 >>>>>> >>>>>> nsec ------ Time Distribution ------ count Stack >>>>>> >>>>>> 4096 | 913 tcp_twstart+0xa3 >>>>>> >>>>>> 8192 |@@@@@@@@@@@@ 58191 >>>>>> tcp_do_segment+0x201f >>>>>> 16384 |@@@@@@ 29594 tcp_input+0xe1c >>>>>> >>>>>> 32768 |@@@@ 23447 ip_input+0x15f >>>>>> >>>>>> 65536 |@@@ 16197 >>>>>> 131072 |@ 8674 >>>>>> 262144 | 3358 >>>>>> 524288 | 456 >>>>>> 1048576 | 9 >>>>>> ------------------------------------------------------------------------------- >>>>>> Count indv cuml rcnt nsec Lock Caller >>>>>> >>>>>> 49180 26% 100% 0.00 15929 tcpinp >>>>>> tcp_tw_2msl_scan+0xc6 >>>>>> >>>>>> nsec ------ Time Distribution ------ count Stack >>>>>> >>>>>> 4096 | 157 pfslowtimo+0x54 >>>>>> >>>>>> 8192 |@@@@@@@@@@@@@@@ 24796 >>>>>> softclock_call_cc+0x179 >>>>>> 16384 |@@@@@@ 11223 softclock+0x44 >>>>>> >>>>>> 32768 |@@@@ 7426 >>>>>> 
intr_event_execute_handlers+0x95 >>>>>> 65536 |@@ 3918 >>>>>> 131072 | 1363 >>>>>> 262144 | 278 >>>>>> 524288 | 19 >>>>>> ------------------------------------------------------------------------------- >>>>> >>>>> This is interesting, it seems that you have two call paths competing >>>>> for INP locks here: >>>>> >>>>> - pfslowtimo()/tcp_tw_2msl_scan(reuse=0) and >>>>> >>>>> - tcp_input()/tcp_twstart()/tcp_tw_2msl_scan(reuse=1) >>>> >>>> My current hypothesis: >>>> >>>> nginx do write() (or may be close()?) to socket, kernel lock >>>> first inp in V_twq_2msl, happen callout for pfslowtimo() on the same >>>> CPU core and tcp_tw_2msl_scan infinity locked on same inp. >>>> >>>> In this case you modification can't help, before next try we need some >>>> like yeld(). >>> >>> Or may be locks leaks. >>> Or both. >> >> You are totally right, pfslowtimo()/tcp_tw_2msl_scan(reuse=0) is >> infinitely blocked on INP_WLOCK() by "something" (that could be related >> to write()). >> >> As I reached my limit of debugging without WITNESS, could you share >> your /etc/sysctl.conf, /boot/loader.conf files? And any specific >> configuration you have (like having a Nginx workers affinity, Nginx >> special options, etc.). Like that I can try to reproduce it on releng/11.0. 
> > Some more traces from ddb: > > Tracing command intr pid 12 tid 100103 td 0xfffff8012508ea00 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe2020ea8330 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020ea8360 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe2020ea83a0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe2020ea8420 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe2020ea8460 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe2020ea8490 > tcp_do_segment() at 0xffffffff80610226 = tcp_do_segment+0x1666/frame > 0xfffffe2020ea8590 > tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020ea86e0 > ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020ea8740 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020ea87a0 > ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame > 0xfffffe2020ea87d0 > ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame > 0xfffffe2020ea8830 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020ea8890 > ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame > 0xfffffe2020ea88b0 > ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame > 0xfffffe2020ea8990 > ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame > 0xfffffe2020ea89e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe2020ea8a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe2020ea8a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020ea8ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe2020ea8ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100105 td 0xfffff8012508e000 > sched_switch() at 
0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe2020eb2330 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020eb2360 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe2020eb23a0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe2020eb2420 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe2020eb2460 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe2020eb2490 > tcp_do_segment() at 0xffffffff80610226 = tcp_do_segment+0x1666/frame > 0xfffffe2020eb2590 > tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020eb26e0 > ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020eb2740 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020eb27a0 > ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame > 0xfffffe2020eb27d0 > ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame > 0xfffffe2020eb2830 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020eb2890 > ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame > 0xfffffe2020eb28b0 > ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame > 0xfffffe2020eb2990 > ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame > 0xfffffe2020eb29e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe2020eb2a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe2020eb2a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020eb2ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe2020eb2ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100107 td 0xfffff8012508d500 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe2020ebc2b0 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 
0xfffffe2020ebc2e0 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe2020ebc320 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe2020ebc3a0 > soalloc() at 0xffffffff8051b914 = soalloc+0x1b4/frame 0xfffffe2020ebc3f0 > sonewconn() at 0xffffffff8051bb9f = sonewconn+0xbf/frame 0xfffffe2020ebc430 > syncache_expand() at 0xffffffff8061b85b = syncache_expand+0x78b/frame > 0xfffffe2020ebc590 > tcp_input() at 0xffffffff8060e10e = tcp_input+0xdae/frame 0xfffffe2020ebc6e0 > ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020ebc740 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020ebc7a0 > ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame > 0xfffffe2020ebc7d0 > ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame > 0xfffffe2020ebc830 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020ebc890 > ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame > 0xfffffe2020ebc8b0 > ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame > 0xfffffe2020ebc990 > ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame > 0xfffffe2020ebc9e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe2020ebca20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe2020ebca70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020ebcab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe2020ebcab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100111 td 0xfffff801250a2000 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe2020f302f0 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020f30320 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe2020f30360 > __mtx_lock_sleep() 
at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe2020f303e0 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe2020f30420 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe2020f30450 > tcp_twstart() at 0xffffffff8061f0e7 = tcp_twstart+0x2b7/frame > 0xfffffe2020f30490 > tcp_do_segment() at 0xffffffff80610bdf = tcp_do_segment+0x201f/frame > 0xfffffe2020f30590 > tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020f306e0 > ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020f30740 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020f307a0 > ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame > 0xfffffe2020f307d0 > ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame > 0xfffffe2020f30830 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020f30890 > ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame > 0xfffffe2020f308b0 > ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame > 0xfffffe2020f30990 > ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame > 0xfffffe2020f309e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe2020f30a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe2020f30a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020f30ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe2020f30ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100113 td 0xfffff801250a1500 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe2020f3a2f0 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe2020f3a320 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe2020f3a360 > __mtx_lock_sleep() at 0xffffffff80484d9d = 
__mtx_lock_sleep+0x13d/frame > 0xfffffe2020f3a3e0 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe2020f3a420 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe2020f3a450 > tcp_twstart() at 0xffffffff8061f0e7 = tcp_twstart+0x2b7/frame > 0xfffffe2020f3a490 > tcp_do_segment() at 0xffffffff80610bdf = tcp_do_segment+0x201f/frame > 0xfffffe2020f3a590 > tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020f3a6e0 > ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020f3a740 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020f3a7a0 > ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame > 0xfffffe2020f3a7d0 > ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame > 0xfffffe2020f3a830 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020f3a890 > ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame > 0xfffffe2020f3a8b0 > ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame > 0xfffffe2020f3a990 > ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame > 0xfffffe2020f3a9e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe2020f3aa20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe2020f3aa70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020f3aab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe2020f3aab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100115 td 0xfffff801250a0a00 > cpustop_handler() at 0xffffffff80775998 = cpustop_handler+0x28/frame > 0xfffffe1f9e182cf0 > ipi_nmi_handler() at 0xffffffff8077595a = ipi_nmi_handler+0x4a/frame > 0xfffffe1f9e182d10 > trap() at 0xffffffff806e2e4a = trap+0x3a/frame 0xfffffe1f9e182f20 > nmi_calltrap() at 0xffffffff806cb413 = nmi_calltrap+0x8/frame > 0xfffffe1f9e182f20 > --- 
trap 0x13, rip = 0xffffffff8059b9f9, rsp = 0xfffffe2020f44420, rbp = > 0xfffffe2020f44420 --- > in_pcbref() at 0xffffffff8059b9f9 = in_pcbref+0x9/frame 0xfffffe2020f44420 > tcp_tw_2msl_scan() at 0xffffffff8061f1a3 = tcp_tw_2msl_scan+0x73/frame > 0xfffffe2020f44450 > tcp_twstart() at 0xffffffff8061eed3 = tcp_twstart+0xa3/frame > 0xfffffe2020f44490 > tcp_do_segment() at 0xffffffff80610bdf = tcp_do_segment+0x201f/frame > 0xfffffe2020f44590 > tcp_input() at 0xffffffff8060e17c = tcp_input+0xe1c/frame 0xfffffe2020f446e0 > ip_input() at 0xffffffff805a087f = ip_input+0x15f/frame 0xfffffe2020f44740 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020f447a0 > ether_demux() at 0xffffffff80575b3a = ether_demux+0x12a/frame > 0xfffffe2020f447d0 > ether_nh_input() at 0xffffffff80576792 = ether_nh_input+0x322/frame > 0xfffffe2020f44830 > netisr_dispatch_src() at 0xffffffff80583db5 = netisr_dispatch_src+0xa5/frame > 0xfffffe2020f44890 > ether_input() at 0xffffffff80575db6 = ether_input+0x26/frame > 0xfffffe2020f448b0 > ixgbe_rxeof() at 0xffffffff813df36b = ixgbe_rxeof+0x7ab/frame > 0xfffffe2020f44990 > ixgbe_msix_que() at 0xffffffff813da57c = ixgbe_msix_que+0x8c/frame > 0xfffffe2020f449e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe2020f44a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe2020f44a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe2020f44ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe2020f44ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command ps pid 37011 tid 101992 td 0xfffff80144378000 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe20224367b0 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe20224367e0 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe2022436820 > __mtx_lock_sleep() at 
0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe20224368a0 > soalloc() at 0xffffffff8051b914 = soalloc+0x1b4/frame 0xfffffe20224368f0 > socreate() at 0xffffffff8051b617 = socreate+0xa7/frame 0xfffffe2022436940 > sys_socket() at 0xffffffff8052144d = sys_socket+0xed/frame 0xfffffe20224369a0 > amd64_syscall() at 0xffffffff806e4051 = amd64_syscall+0x2c1/frame > 0xfffffe2022436ab0 > Xfast_syscall() at 0xffffffff806cb2bb = Xfast_syscall+0xfb/frame > 0xfffffe2022436ab0 > --- syscall (97, FreeBSD ELF64, sys_socket), rip = 0x8011c413a, rsp = > 0x7fffffffc748, rbp = 0x7fffffffc770 --- > > Tracing command cron pid 37008 tid 102228 td 0xfffff801a4090000 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe20228d67b0 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe20228d67e0 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe20228d6820 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe20228d68a0 > soalloc() at 0xffffffff8051b914 = soalloc+0x1b4/frame 0xfffffe20228d68f0 > socreate() at 0xffffffff8051b617 = socreate+0xa7/frame 0xfffffe20228d6940 > sys_socket() at 0xffffffff8052144d = sys_socket+0xed/frame 0xfffffe20228d69a0 > amd64_syscall() at 0xffffffff806e4051 = amd64_syscall+0x2c1/frame > 0xfffffe20228d6ab0 > Xfast_syscall() at 0xffffffff806cb2bb = Xfast_syscall+0xfb/frame > 0xfffffe20228d6ab0 > --- syscall (97, FreeBSD ELF64, sys_socket), rip = 0x800d8c13a, rsp = > 0x7fffffffd658, rbp = 0x7fffffffd6f0 --- > > [many likes] > > Tracing command intr pid 12 tid 100015 td 0xfffff8011422b000 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1cf760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1cf790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1cf7d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1cf850 > sodealloc() at 
0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1cf890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1cf8c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe1f9e1cf8f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1cf9c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1cf9e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1cfa20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1cfa70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1cfab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1cfab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100016 td 0xfffff8011422aa00 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1d4760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1d4790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1d47d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1d4850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1d4890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1d48c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe1f9e1d48f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1d49c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1d49e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1d4a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1d4a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1d4ab0 > fork_trampoline() at 0xffffffff806cb50e = 
fork_trampoline+0xe/frame > 0xfffffe1f9e1d4ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100017 td 0xfffff8011422a500 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1d9760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1d9790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1d97d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1d9850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1d9890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1d98c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe1f9e1d98f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1d99c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1d99e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1d9a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1d9a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1d9ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1d9ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100018 td 0xfffff8011422a000 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1de760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1de790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1de7d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1de850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1de890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1de8c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 
0xfffffe1f9e1de8f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1de9c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1de9e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1dea20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1dea70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1deab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1deab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100019 td 0xfffff8011424da00 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1e3760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1e3790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1e37d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1e3850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1e3890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1e38c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe1f9e1e38f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1e39c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1e39e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1e3a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1e3a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1e3ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1e3ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid
100020 td 0xfffff8011424d500 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1e8760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1e8790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1e87d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1e8850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1e8890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1e88c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe1f9e1e88f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1e89c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1e89e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1e8a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1e8a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1e8ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1e8ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100021 td 0xfffff8011424d000 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1ed760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1ed790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1ed7d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1ed850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1ed890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1ed8c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe1f9e1ed8f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1ed9c0 > softclock() at 
0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1ed9e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1eda20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1eda70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1edab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1edab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100022 td 0xfffff8011424ca00 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1f2760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1f2790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1f27d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1f2850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1f2890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1f28c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe1f9e1f28f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1f29c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1f29e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1f2a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1f2a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1f2ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1f2ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100023 td 0xfffff8011424c500 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1f7760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1f7790 > turnstile_wait() at 0xffffffff804ef177 
= turnstile_wait+0x2a7/frame > 0xfffffe1f9e1f77d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1f7850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1f7890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1f78c0 > tcp_timer_rexmt() at 0xffffffff8061e454 = tcp_timer_rexmt+0x114/frame > 0xfffffe1f9e1f78f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1f79c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1f79e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1f7a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe1f9e1f7a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1f7ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1f7ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100024 td 0xfffff8011424c000 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe1f9e1fc760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe1f9e1fc790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe1f9e1fc7d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe1f9e1fc850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe1f9e1fc890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe1f9e1fc8c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe1f9e1fc8f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe1f9e1fc9c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe1f9e1fc9e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe1f9e1fca20 > ithread_loop() at 0xffffffff8046cc26 = 
ithread_loop+0xa6/frame > 0xfffffe1f9e1fca70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe1f9e1fcab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe1f9e1fcab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100025 td 0xfffff8011424ba00 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe0000382760 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe0000382790 > turnstile_wait() at 0xffffffff804ef177 = turnstile_wait+0x2a7/frame > 0xfffffe00003827d0 > __mtx_lock_sleep() at 0xffffffff80484d9d = __mtx_lock_sleep+0x13d/frame > 0xfffffe0000382850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe0000382890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe00003828c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe00003828f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe00003829c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe00003829e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe0000382a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe0000382a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe0000382ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe0000382ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 --- > > Tracing command intr pid 12 tid 100026 td 0xfffff8011424b500 > sched_switch() at 0xffffffff804c956d = sched_switch+0x6ad/frame > 0xfffffe00003876f0 > mi_switch() at 0xffffffff804a8d92 = mi_switch+0xd2/frame 0xfffffe0000387720 > critical_exit() at 0xffffffff804a6bee = critical_exit+0x7e/frame > 0xfffffe0000387740 > ipi_bitmap_handler() at 0xffffffff80775629 = ipi_bitmap_handler+0x79/frame > 0xfffffe0000387780 > Xipi_intr_bitmap_handler() at 0xffffffff806cc15e = > 
Xipi_intr_bitmap_handler+0x8e/frame 0xfffffe0000387780 > --- interrupt, rip = 0xffffffff80484c1f, rsp = 0xfffffe0000387850, rbp = > 0xfffffe0000387850 --- > __mtx_lock_flags() at 0xffffffff80484c1f = __mtx_lock_flags+0x2f/frame > 0xfffffe0000387850 > sodealloc() at 0xffffffff8051b992 = sodealloc+0x32/frame 0xfffffe0000387890 > tcp_close() at 0xffffffff80618150 = tcp_close+0xd0/frame 0xfffffe00003878c0 > tcp_timer_2msl() at 0xffffffff8061dda3 = tcp_timer_2msl+0x1f3/frame > 0xfffffe00003878f0 > softclock_call_cc() at 0xffffffff804b4ca9 = softclock_call_cc+0x179/frame > 0xfffffe00003879c0 > softclock() at 0xffffffff804b5034 = softclock+0x44/frame 0xfffffe00003879e0 > intr_event_execute_handlers() at 0xffffffff8046c605 = > intr_event_execute_handlers+0x95/frame 0xfffffe0000387a20 > ithread_loop() at 0xffffffff8046cc26 = ithread_loop+0xa6/frame > 0xfffffe0000387a70 > fork_exit() at 0xffffffff8046a211 = fork_exit+0x71/frame 0xfffffe0000387ab0 > fork_trampoline() at 0xffffffff806cb50e = fork_trampoline+0xe/frame > 0xfffffe0000387ab0 > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
Nice stack traces. Almost all of the threads are blocked in sodealloc() or soalloc(), and if you look at how mtx_lock(&so_global_mtx) and mtx_unlock(&so_global_mtx) are used, it is hard to think of a scenario that could lead to this state.

I am still trying to reproduce your issue, without success so far.

--
Julien