On Sun, Oct 13, 2019 at 12:09:43PM -0700, Pravin Shelar wrote:
> On Thu, Oct 10, 2019 at 12:07 PM Guillaume Nault <gna...@redhat.com> wrote:
> >
> > In rtnl_net_notifyid(), we certainly can't pass a null GFP flag to
> > rtnl_notify(). A GFP_KERNEL flag would be fine in most circumstances,
> > but there are a few paths calling rtnl_net_notifyid() from atomic
> > context or from RCU critical section. The latter also precludes the use
> > of gfp_any() as it wouldn't detect the RCU case. Also, the nlmsg_new()
> > call is wrong too, as it uses GFP_KERNEL unconditionally.
> >
> > Therefore, we need to pass the GFP flags as parameter. The problem then
> > propagates recursively to the callers until the proper flags can be
> > determined. The problematic call chains are:
> >
> > * ovs_vport_cmd_fill_info -> peernet2id_alloc -> rtnl_net_notifyid
> >
> > * rtnl_fill_ifinfo -> rtnl_fill_link_netnsid -> peernet2id_alloc
> > -> rtnl_net_notifyid
> >
> > For openvswitch, ovs_vport_cmd_get() and ovs_vport_cmd_dump() prevent
> > ovs_vport_cmd_fill_info() from using GFP_KERNEL. It'd be nice to move
> > the call out of the RCU critical sections, but struct vport doesn't
> > have a reference counter, so that'd probably require taking the ovs
> > lock. Also, I don't get why ovs_vport_cmd_build_info() used GFP_ATOMIC
> > in nlmsg_new(). I've changed it to GFP_KERNEL for consistency, as this
> > function seems to be allowed to sleep (as stated in the comment, it's
> > called from a workqueue, under the protection of a mutex).
> >
> It is safe to change GFP flags to GFP_KERNEL in ovs_vport_cmd_build_info().
> The patch looks good to me.
>
Thanks for your feedback.

The point of my RFC is to find out whether we can avoid all these gfp_t
flags by allowing ovs_vport_cmd_fill_info() to sleep (at the very least,
I'd like to figure out if it's worth spending time investigating this
path).

To do so, we'd need to move the ovs_vport_cmd_fill_info() calls of
ovs_vport_cmd_{get,dump}() out of their RCU critical sections. Since
struct vport has no reference counter, I believe we'd have to protect
these calls with ovs_lock() instead of RCU.

Is that acceptable? If not, is there any other way?
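
For concreteness, the ovs_lock() variant asked about above could be modelled roughly as follows. This is only a userspace sketch, not kernel code: a pthread mutex stands in for ovs_lock()/ovs_unlock(), malloc() stands in for a GFP_KERNEL allocation that may sleep, and every *_model name is made up for illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct vport: note there is no refcount. */
struct vport_model {
	int port_no;
	const char *name;
};

/* Stand-in for the ovs mutex taken by ovs_lock(). */
static pthread_mutex_t ovs_mutex_model = PTHREAD_MUTEX_INITIALIZER;

static struct vport_model vports_model[] = {
	{ 1, "vport1" },
	{ 2, "vport2" },
};

/* Caller must hold ovs_mutex_model so the entry can't go away. */
static struct vport_model *lookup_vport_model(int port_no)
{
	size_t i;

	for (i = 0; i < sizeof(vports_model) / sizeof(vports_model[0]); i++)
		if (vports_model[i].port_no == port_no)
			return &vports_model[i];
	return NULL;
}

/* Builds the reply. malloc() may block, like a GFP_KERNEL allocation. */
static char *fill_info_model(const struct vport_model *vport)
{
	size_t len = strlen(vport->name) + 32;
	char *reply = malloc(len);

	if (reply)
		snprintf(reply, len, "port_no=%d name=%s",
			 vport->port_no, vport->name);
	return reply;
}

/*
 * "get" handler: lookup and fill both run under the mutex instead of
 * inside a read-side critical section, so the fill step may sleep.
 * Returns a heap-allocated reply, or NULL if the port is unknown.
 */
static char *vport_cmd_get_model(int port_no)
{
	struct vport_model *vport;
	char *reply = NULL;

	pthread_mutex_lock(&ovs_mutex_model);
	vport = lookup_vport_model(port_no);
	if (vport)
		reply = fill_info_model(vport);
	pthread_mutex_unlock(&ovs_mutex_model);

	return reply;
}
```

The trade-off in this model is that readers now serialize on the same mutex as writers, which is the cost of giving up the lockless read side.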