On Thu, Oct 10, 2013 at 8:56 PM, Jesse Gross <je...@nicira.com> wrote: > On Thu, Oct 10, 2013 at 5:47 PM, Alexei Starovoitov <a...@plumgrid.com> wrote: >> On Thu, Oct 10, 2013 at 3:38 PM, Pravin Shelar <pshe...@nicira.com> wrote: >>> On Thu, Oct 10, 2013 at 1:48 PM, Alexei Starovoitov <a...@plumgrid.com> >>> wrote: >>>> On Thu, Oct 10, 2013 at 11:21 AM, Pravin Shelar <pshe...@nicira.com> wrote: >>>>> On Wed, Oct 9, 2013 at 11:26 PM, Alexei Starovoitov <a...@plumgrid.com> >>>>> wrote: >>>>>> On Wed, Oct 9, 2013 at 11:07 PM, Pravin Shelar <pshe...@nicira.com> >>>>>> wrote: >>>>>>> On Wed, Oct 9, 2013 at 9:11 PM, Alexei Starovoitov <a...@plumgrid.com> >>>>>>> wrote: >>>>>>>> On Wed, Oct 9, 2013 at 8:02 PM, Pravin Shelar <pshe...@nicira.com> >>>>>>>> wrote: >>>>>>>>> On Tue, Oct 8, 2013 at 8:07 PM, Alexei Starovoitov >>>>>>>>> <a...@plumgrid.com> wrote: >>>>>>>>>> The combination of two commits >>>>>>>>>> >>>>>>>>>> commit 8e4e1713e4 >>>>>>>>>> ("openvswitch: Simplify datapath locking.") >>>>>>>>>> >>>>>>>>>> and >>>>>>>>>> >>>>>>>>>> commit 2537b4dd0a >>>>>>>>>> ("openvswitch:: link upper device for port devices") >>>>>>>>>> >>>>>>>>>> introduced a bug where upper_dev wasn't unlinked upon >>>>>>>>>> netdev_unregister notification >>>>>>>>>> >>>>>>>>>> The following steps: >>>>>>>>>> >>>>>>>>>> modprobe openvswitch >>>>>>>>>> ovs-dpctl add-dp test >>>>>>>>>> ip tuntap add dev tap1 mode tap >>>>>>>>>> ovs-dpctl add-if test tap1 >>>>>>>>>> ip tuntap del dev tap1 mode tap >>>>>>>>>> >>>>>>>>>> are causing multiple warnings: >>>>>>>>>> diff --git a/net/openvswitch/dp_notify.c >>>>>>>>>> b/net/openvswitch/dp_notify.c >>>>>>>>>> index c323567..e9380bd 100644 >>>>>>>>>> --- a/net/openvswitch/dp_notify.c >>>>>>>>>> +++ b/net/openvswitch/dp_notify.c >>>>>>>>>> @@ -88,6 +88,11 @@ static int dp_device_event(struct notifier_block >>>>>>>>>> *unused, unsigned long event, >>>>>>>>>> return NOTIFY_DONE; >>>>>>>>>> >>>>>>>>>> if (event == NETDEV_UNREGISTER) { >>>>>>>>>> + /* rx_handler_unregister and upper_dev_unlink >>>>>>>>>> immediately */ >>>>>>>>>> + if (dev->reg_state == NETREG_UNREGISTERING) >>>>>>>>>> + ovs_netdev_unlink_dev(vport); >>>>>>>>>> + >>>>>>>>> >>>>>>>>> Rather than doing vport destroy here, we can just unlink upper device >>>>>>>>> and let workq do rest of work. >>>>>>>> >>>>>>>> isn't it what it's doing? >>>>>>> >>>>>>> I meant just call netdev_upper_dev_unlink() here in event handler and >>>>>>> rest of vport destroy can be done in workq. >>>>>> >>>>>> netdev_upper_dev_unlink() without netdev_rx_handler_unregister() ?! >>>>>> that's dangerous. >>>>> why is it dangerous? ovs already had ref to net-device. >>>> >>>> comment from dev.c: >>>> /* Notify protocols, that we are about to destroy >>>> this device. They should clean all the things. >>>> */ >>>> call_netdevice_notifiers(NETDEV_UNREGISTER, dev); >>>> >>>> so here you're suggesting to just netdev_upper_dev_unlink() to silence >>>> the warning, >>>> but then do dev_set_promisc(-1) in workqueue? >>>> >>> promiscuity setting is different issue. If you want to have discussion >>> then you can post separate patch for same. Lets fix the warning here. >>> >>>> well, as a minimum audit of promiscuity will be wrong. >>>> ndo_change_rx_flags will be called after ndo_uninit, >>>> all sorts of other cleanups are done. >>> >>> change_rx_flags() checks for UP flag for given device. >>> >>>> I cannot track all possible scenarios, but it seems much safer to >>>> cleanup everything possible >>>> as soon as ovs received NETDEV_UNREGISTER event. >>>> >>>> May be all these risks are worth taking, then please explain what is >>>> the problem with the proposed patch? >>>> >>> Problem is that this is causing layering issues in OVS. dp_notify is >>> suppose to work at dp layer. your patch directly calls vport-netdev >>> implementation function from dp_notify. >>> I could not think of a simple approach that will do this in completely >>> clean manner. Therefore I am trying to minimize layering issues. So >>> just calling netdev_upper_dev_unlink() looks better than doing >>> anything extra. >> >> dp_notify is per net, not per dp. >> notifier can only be called for net_device and the first thing it does: >> if (!ovs_is_internal_dev(dev)) >> vport = ovs_netdev_get_vport(dev); >> where ovs_netdev_get_vport() is defined in vport-netdev.c >> >> once it gets into workq, it checks for: >> if (vport->ops->type != OVS_VPORT_TYPE_NETDEV) >> continue; >> and >> netdev_vport = netdev_vport_priv(vport); >> where netdev_vport_priv() is defined in netdev-vport.h >> >> only then it proceeds with generic ovs_dp_detach_port(). >> >> Is that the layering you talking about? >> >> So to minimize layering issues you want to call 'upper_dev_link' from >> netdev_create() in vport-netdev.c >> and 'upper_dev_unlink' directly from dp_device_event() in dp_notify.c? >> >> That will look better than calling ovs_netdev_unlink_dev() ? > > At a minimum, this function name is misleading because it does a bunch > of other things beyond unlink the device.
sure. Please propose a different name. > At this point, it basically seems like six of one, half dozen of the > other. I don't think that either solution is ideal but it doesn't > really matter all that much. > > However, the check dev->reg_state in netdev_destroy() looks racy to > me, as it could already be in NETREG_UNREGISTERED even if we already > processed this device. you mean that netdev_destroy() will see reg_state == netreg_unregistered, while dp_device_event() didn't see reg_state == netreg_unregistering yet? or dp_device_event() saw it, proceeded to do unlink and netdev_destroy() ran in parallel? well, that's why reg_state == netreg_unregistering check in netdev_destroy() is done with rtnl_lock() held. reg_state cannot go into netreg_unregistered state skipping netreg_unregistering and notifier. therefore I don't think it's racy. In ovs_dp_notify_wq() you're checking for both unregistering and unregistered and that makes sense, since workq can run after unregistering notifier called and netdev_run_todo() already changed the state to unregistered. But here it's not the case. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev