On Tue, Dec 26, 2017 at 4:50 PM, Tom Ivar Helbekkmo <t...@hamartun.priv.no> wrote:
> Ryota Ozaki <ozak...@netbsd.org> writes:
>
>> One possible fix has been committed.
>>
>> Can you update the source code and try a new kernel?
>
> Will do.  Thanks.
>
> Meanwhile, before I got around to building a kernel with debug options
> enabled, I had another hang.  Got it into DDB successfully, but then I
> managed, while looking for a way to switch between CPUs (the man page is
> wrong in this respect), to get DDB to look at something it shouldn't
> have, so it just said "fatal pag", and that was that.
>
> I did get a backtrace of CPU 0, though, and it looked interesting:
>
> pool_catchup()
> pool_get()
> pool_cache_get_slow()
> pool_cache_get_paddr()
> m_get()
> m_gethdr()
> wm_add_rxbuf()
> wm_rxeof()
> wm_intr_legacy()
> intr_biglock_wrapper()
> Xintr_ioapic_level2()
> --- interrupt ---
> Xspllower()
> uvm_km_kmem_alloc()
> pool_page_alloc()
> pool_grow()
> pool_catchup()
> pool_get()
> pool_cache_get_slow()
> pool_cache_get_paddr()
> m_get()
> m_gethdr()
> tcp_output()
> tcp_send_wrapper()
> sosend()
> soo_write()
> dofilewrite()
> sys_write()
> syscall()
> --- syscall (number 4) ---

It looks like the infinite loop below is happening?

I think we need to summon a pool expert.

  ozaki-r

(Copied and modified the diagram from PR 52858)

[lwp #1]
|
[pool_grow with PR_NOWAIT
[set PR_GROWING and PR_GROWINGNOWAIT
[mutex_exit(&pp->pr_lock)
| (interrupted)
        [intr #1]
        |
        [pool_catchup
        [pool_grow with PR_NOWAIT
        [see PR_GROWING and PR_GROWINGNOWAIT are set
        [return ERESTART
        [repeat pool_grow in pool_catchup...
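To make the suspected loop in the diagram concrete, here is a minimal userspace sketch in C. It is a model only, not the code in the kernel: every name prefixed with model_ or MODEL_ is hypothetical, the flag and error values are stand-ins, and the max_spins cap exists only so that the model terminates. It assumes, as the diagram suggests, that the interrupt handler runs on top of the interrupted lwp, so the lwp cannot clear the flags until the handler returns.

```c
#include <assert.h>

/* Hypothetical stand-ins for PR_GROWING / PR_GROWINGNOWAIT and error codes. */
enum { MODEL_GROWING = 0x1, MODEL_GROWINGNOWAIT = 0x2 };
enum { MODEL_OK = 0, MODEL_ERESTART = 1, MODEL_EWOULDBLOCK = 2 };

struct model_pool {
	unsigned flags;	/* MODEL_GROWING and/or MODEL_GROWINGNOWAIT */
	int nfree;	/* free items; catch-up is needed while this is 0 */
};

/* lwp #1: mark the pool as growing, then (in the diagram) get interrupted. */
static void
model_lwp_begin_grow(struct model_pool *pp)
{
	pp->flags |= MODEL_GROWING | MODEL_GROWINGNOWAIT;
}

/* lwp #1: what would happen if it were ever allowed to resume. */
static void
model_lwp_finish_grow(struct model_pool *pp)
{
	pp->nfree += 8;	/* arbitrary number of items per page */
	pp->flags &= ~(unsigned)(MODEL_GROWING | MODEL_GROWINGNOWAIT);
}

/* The non-sleeping grow path as drawn in the diagram. */
static int
model_pool_grow_nowait(struct model_pool *pp)
{
	if (pp->flags & MODEL_GROWING) {
		/* Someone else is growing the pool: tell the caller to retry. */
		if (pp->flags & MODEL_GROWINGNOWAIT)
			return MODEL_ERESTART;
		return MODEL_EWOULDBLOCK;
	}
	model_lwp_begin_grow(pp);
	model_lwp_finish_grow(pp);
	return MODEL_OK;
}

/*
 * Catch-up loop: retry on MODEL_ERESTART.  Returns the number of retries;
 * hitting max_spins means no progress was made (the suspected hang).
 */
static int
model_pool_catchup(struct model_pool *pp, int max_spins)
{
	int spins = 0;

	assert(max_spins > 0);
	while (pp->nfree == 0) {
		int error = model_pool_grow_nowait(pp);

		if (error == MODEL_ERESTART) {
			if (++spins >= max_spins)
				break;	/* model-only escape hatch */
			continue;
		}
		break;
	}
	return spins;
}
```

Calling model_lwp_begin_grow() on an empty pool and then model_pool_catchup() on the same pool plays the role of [intr #1]: nothing inside the loop can clear the flags, so every call returns MODEL_ERESTART and the retry count runs up to the cap. Without the cap, which the diagram does not show, the loop would never exit.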