netinet

Ryota Ozaki Thu, 28 Dec 2017 00:35:04 -0800

On Thu, Dec 28, 2017 at 5:05 PM, Tom Ivar Helbekkmo
<[email protected]> wrote:
> Ryota Ozaki <[email protected]> writes:
>
>> I think the below patch fixes the above issue, but probably
>> there is a better solution.
>
> Looks like it didn't -- it just changed it a little bit.  Just like the
> last time, the hang happened while reading email over IMAP, which
> exercises disk and network at the same time, while the machine was busy
> doing a parallelized system build in the background.  This time,
> though, I got a core dump.  Here's the hang (the active process on this
> CPU is the IMAP server):


Oh, my patch failed to keep SPL at IPL_VM because mutex_exit
restores the SPL that was in effect when mutex_enter was called.
So splvm has to be called before mutex_enter. Could you try the 2nd patch:
  http://www.netbsd.org/~ozaki-r/fix-pool_catchup.diff
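
To illustrate the ordering problem, here is a rough userspace model. It is
not the patch and not kernel code: every model_* name and the numeric levels
are hypothetical, and the mutex is assumed to simply remember the level at
enter and restore it at exit. It only shows why raising the level after
mutex_enter is undone by mutex_exit, while raising it before is preserved.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are NetBSD APIs. */
enum { MODEL_IPL_NONE = 0, MODEL_IPL_VM = 4 };

/* Current interrupt priority level of the modelled CPU. */
static int model_cur_ipl = MODEL_IPL_NONE;

struct model_mutex {
	int ipl;	/* level the mutex raises to while held (assumed) */
	int saved;	/* level in effect when the mutex was entered */
	int held;
};

/* Raise to the VM level if currently lower; return the previous level. */
static int
model_splvm(void)
{
	int old = model_cur_ipl;

	if (model_cur_ipl < MODEL_IPL_VM)
		model_cur_ipl = MODEL_IPL_VM;
	return old;
}

/* Set the level back to a previously saved value. */
static void
model_splx(int ipl)
{
	model_cur_ipl = ipl;
}

/* Remember the current level, then raise to the mutex's level. */
static void
model_mutex_enter(struct model_mutex *m)
{
	assert(!m->held);
	m->saved = model_cur_ipl;
	if (model_cur_ipl < m->ipl)
		model_cur_ipl = m->ipl;
	m->held = 1;
}

/* Restore the level that was in effect at model_mutex_enter(). */
static void
model_mutex_exit(struct model_mutex *m)
{
	assert(m->held);
	m->held = 0;
	model_cur_ipl = m->saved;
}

/* Raise inside the mutex: the exit drops the level again. */
static int
model_raise_after_enter(void)
{
	struct model_mutex m = { MODEL_IPL_VM, 0, 0 };

	model_cur_ipl = MODEL_IPL_NONE;
	model_mutex_enter(&m);
	(void)model_splvm();
	model_mutex_exit(&m);
	return model_cur_ipl;	/* level seen by code running after the exit */
}

/* Raise before the mutex: the exit restores the raised level. */
static int
model_raise_before_enter(void)
{
	struct model_mutex m = { MODEL_IPL_VM, 0, 0 };
	int s, after;

	model_cur_ipl = MODEL_IPL_NONE;
	s = model_splvm();
	model_mutex_enter(&m);
	model_mutex_exit(&m);
	after = model_cur_ipl;	/* still raised here */
	model_splx(s);		/* caller lowers it explicitly when done */
	return after;
}
```

In this model, model_raise_after_enter() ends at MODEL_IPL_NONE and
model_raise_before_enter() still sees MODEL_IPL_VM after the exit.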


>
> __cpu_simple_lock_try() at __cpu_simple_lock_try+0x9
> pool_grow() at pool_grow+0x55d
> pool_catchup() at pool_catchup+0x32
> pool_get() at pool_get+0x492
> pool_cache_get_slow() at pool_cache_get_slow+0x1b4
> pool_cache_get_paddr() at pool_cache_get_paddr+0x275
> m_get() at m_get+0x2a
> m_gethdr() at m_gethdr+0x9
> wm_add_rxbuf() at wm_add_rxbuf+0x3a
> wm_rxeof() at wm_rxeof+0x146
> wm_intr_legacy() at wm_intr_legacy+0xa1
> intr_biglock_wrapper() at intr_biglock_wrapper+0x1d
> Xintr_ioapic_level2() at Xintr_ioapic_level2+0xf7
> --- interrupt ---
> Xspllower() at Xspllower+0xe
> uvm_km_kmem_alloc() at uvm_km_kmem_alloc+0x139
> pool_page_alloc() at pool_page_alloc+0x2c
> pool_grow() at pool_grow+0x24f
> pool_catchup() at pool_catchup+0x32
> pool_get() at pool_get+0x492
> pool_cache_get_slow() at pool_cache_get_slow+0x1b4
> pool_cache_get_paddr() at pool_cache_get_paddr+0x275
> m_get() at m_get+0x2a
> m_gethdr() at m_gethdr+0x9
> sosend() at sosend+0x35a
> soo_write() at soo_write+0x2c
> dofilewrite() at dofilewrite+0x97
> sys_write() at sys_write+0x5f
> syscall() at syscall+0x1d8
> --- syscall (number 4) ---
>
> The only other CPU that looks interesting has this (copied from a
> photograph of the console, as crash(8) doesn't know about CPUs):
>
> _kernel_lock()
> ip_slowtimo()
> pfslowtimo()
> callout_softclock()
> softint_dispatch()

This is correct. intr_biglock_wrapper in the first backtrace holds
KERNEL_LOCK and this _kernel_lock() waits for it to be released.

Thanks,
  ozaki-r

Re: CVS commit: src/sys/netinet
