17.01.2013 17:03, J. Bruce Fields wrote:
On Thu, Jan 17, 2013 at 09:05:51AM +0400, Stanislav Kinsbursky wrote:
17.01.2013 02:51, Mark Lord wrote:
On 13-01-16 12:20 AM, Stanislav Kinsbursky wrote:

Mark, could you provide any call traces?

Call traces from where/what?
There's this one, posted earlier in the BUG report:

kernel BUG at net/sunrpc/svc_xprt.c:921!
Call Trace:
  [<ffffffffa016a56a>] ? svc_recv+0xcc/0x338 [sunrpc]
  [<ffffffffa0318bfc>] ? nfs_callback_authenticate+0x20/0x20 [nfsv4]
  [<ffffffffa0318c19>] ? nfs4_callback_svc+0x1d/0x3c [nfsv4]
  [<ffffffff810407e6>] ? kthread+0x81/0x89
  [<ffffffff81040765>] ? kthread_freezable_should_stop+0x36/0x36
  [<ffffffff812ea62c>] ? ret_from_fork+0x7c/0xb0
  [<ffffffff81040765>] ? kthread_freezable_should_stop+0x36/0x36


Thanks!
I haven't seen the bug report.
Could you provide the link, please?

There's no bz if that's what you're asking for.

See the first message in the thread for the original report:

        http://mid.gmane.org/<50f42f85.50...@teksavvy.com>


Thanks, Bruce.
This looks like the old issue I was trying to fix with "SUNRPC: protect service
sockets lists during per-net shutdown".
So, here is the problem as I see it: there is a transport which is being
processed by a service thread, and its processing races with per-net service
shutdown:

CPU#0:                                  CPU#1:

svc_recv                                svc_close_net
svc_get_next_xprt
  (list_del_init(xpt_ready))
                                        svc_close_list
                                          (set XPT_BUSY and XPT_CLOSE)
                                        svc_clear_pools
                                          (xprt was gained on CPU#0 already)
                                        svc_delete_xprt
                                          (set XPT_DEAD)
svc_handle_xprt
  (is XPT_CLOSE => svc_delete_xprt())
BUG()
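
To make the ordering concrete, the interleaving above can be replayed
single-threaded in a small userspace toy. This is only a sketch, not kernel
code: every toy_* name is hypothetical, and it assumes (as the diagram
suggests) that the BUG fires when the delete path runs on a transport that is
already marked dead.

```c
/* Toy replay of the race; not kernel code. All toy_* names are hypothetical. */
#include <stdbool.h>

enum {
    TOY_XPT_BUSY  = 1 << 0,
    TOY_XPT_CLOSE = 1 << 1,
    TOY_XPT_DEAD  = 1 << 2,
};

struct toy_xprt {
    unsigned flags;
    bool on_ready_list;
    int bug_hits;   /* times delete ran on an already-dead transport */
};

/* Stand-in for svc_delete_xprt(): a second delete is where BUG() would fire. */
static void toy_delete_xprt(struct toy_xprt *x)
{
    if (x->flags & TOY_XPT_DEAD) {
        x->bug_hits++;
        return;
    }
    x->flags |= TOY_XPT_DEAD;
}

/* Replays the steps in the order shown in the diagram. */
static int toy_replay_race(void)
{
    struct toy_xprt x = { .flags = 0, .on_ready_list = true, .bug_hits = 0 };

    /* CPU#0: svc_get_next_xprt() takes the transport off the ready list. */
    x.on_ready_list = false;

    /* CPU#1: svc_close_list() marks it busy and closing. */
    x.flags |= TOY_XPT_BUSY | TOY_XPT_CLOSE;

    /* CPU#1: svc_clear_pools() has nothing to dequeue, CPU#0 already has it. */

    /* CPU#1: svc_delete_xprt() marks it dead. */
    toy_delete_xprt(&x);

    /* CPU#0: svc_handle_xprt() sees the close flag and deletes again. */
    if (x.flags & TOY_XPT_CLOSE)
        toy_delete_xprt(&x);

    return x.bug_hits;
}
```

In this model toy_replay_race() reports one hit: the delete issued from the
CPU#0 side finds the transport already dead.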

So, from my POV, we need some way to:
1) Skip such in-progress transports in the svc_close_net() call (there is no
way to detect them, or at least I don't see one)
2) Delete the transport somewhere after svc_xprt_received()

But there is a problem with svc_xprt_received(): it calls svc_xprt_put() (svc_recv -> svc_handle_xprt -> svc_xprt_received -> svc_xprt_put). If we are the only user, the transport will be destroyed at that point. But the transport is dereferenced later in svc_recv(), after the svc_handle_xprt() call.
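
The reference-count hazard can be shown with a toy counter. Again this is a
sketch under assumptions, not kernel code: the toy_* names are hypothetical,
destruction is modelled as a flag rather than a free, and the
"hold_extra_ref" variant is just the usual way to keep an object alive across
such a call, not something proposed in this thread.

```c
/* Toy refcount model; not kernel code. All toy_* names are hypothetical. */
#include <stdbool.h>

struct toy_ref {
    int refcount;
    bool destroyed;   /* set instead of freeing, so the toy stays safe */
};

static void toy_get(struct toy_ref *x)
{
    x->refcount++;
}

static void toy_put(struct toy_ref *x)
{
    if (--x->refcount == 0)
        x->destroyed = true;
}

/* Stand-in for svc_xprt_received(), which ends with a put. */
static void toy_received(struct toy_ref *x)
{
    toy_put(x);
}

/*
 * Returns true if the object is still alive at the point where svc_recv()
 * would dereference it after the handler returns.
 */
static bool toy_alive_after_handle(int initial_refs, bool hold_extra_ref)
{
    struct toy_ref x = { .refcount = initial_refs, .destroyed = false };
    bool alive;

    if (hold_extra_ref)
        toy_get(&x);

    toy_received(&x);
    alive = !x.destroyed;   /* the later dereference would happen here */

    if (hold_extra_ref)
        toy_put(&x);

    return alive;
}
```

With a single reference and no extra hold, the object is already gone when
the caller looks at it again; with another holder, or an extra reference
taken first, it survives until the final put.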

What do you think, Bruce?


--b.



--
Best regards,
Stanislav Kinsbursky
