On 20/02/16 06:00, Gonglei (Arei) wrote:
> Hi,
>
> Thanks for rapid feedback :)
>
>> From: David Miller [mailto:da...@davemloft.net]
>> Sent: Saturday, February 20, 2016 12:37 PM
>>
>> From: Gonglei <arei.gong...@huawei.com>
>> Date: Sat, 20 Feb 2016 09:27:26 +0800
>>
>>> It's possible for a race condition to exist between xennet_open() and
>>> talk_to_netback(). After invoking netfront_probe() then other
>>> threads or processes invoke xennet_open (such as NetworkManager)
>>> immediately may trigger BUG_ON(). Besides, we also should reset
>>> real_num_tx_queues in xennet_destroy_queues().
>>
>> One should really never invoke register_netdev() until the device is
>> %100 fully initialized.
>>
>> This means you cannot call register_netdev() until it is completely
>> legal to invoke your ->open() method.
>>
>> And I think that is what the real problem is here.
>>
>> If you follow the correct rules for ordering wrt. register_netdev()
>> there are no "races". Because ->open() must be legally invokable
>> from the exact moment you call register_netdev().
>>
>
> Yes, I agree. Though that's the historic legacy problem. ;)
>
>> I'm not applying this, as it really sounds like the fundamental issue
>> is the order in which the xen-netfront private data is initialized
>> or setup before being registered.
>
> That means register_netdev() should be invoked after xennet_connect(), right?

No. This would mean that the network device is removed and re-added when
a guest is migrated, which at best would result in considerably more
downtime (e.g., the IP address has to be renegotiated with DHCP).

David