On Mon, Oct 23, 2017 at 07:17:41AM +0000, Ophir Munk wrote: > Hi Gaetan, > Thanks for your quick reply. Please see comments inline. > > Regards, > Ophir > > > -----Original Message----- > > From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com] > > Sent: Friday, October 20, 2017 1:35 PM > > To: Ophir Munk <ophi...@mellanox.com> > > Cc: dev@dpdk.org; Thomas Monjalon <tho...@monjalon.net>; Olga Shern > > <ol...@mellanox.com>; sta...@dpdk.org > > Subject: Re: [PATCH v3] net/failsafe: fix calling device during RMV events > > > > Hi Ophir, > > > > Sorry about the delay, > > I have a few remarks, I think this patch could be simpler. > > > > First, about the commit logline: > > "calling device" is not descriptive enough. I'd suggest > > > > net/failsafe: fix device configuration during RMV events > > > > But I'm not a native speaker either, so use it if you think it is better, > > or don't, > > it's only a suggestion :). > > > > On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote: > > > This commit prevents control path operations from failing after a sub > > > device removal. > > > > > > Following are the failure steps: > > > 1. The physical device is removed due to change in one of PF > > > parameters (e.g. MTU) 2. The interrupt thread flags the device 3. > > > Within 2 seconds Interrupt thread initializes the actual device > > > removal, then every 2 seconds it tries to re-sync (plug in) the > > > device. The trials fail as long as VF parameter mismatches the PF > > parameter. > > > 4. A control thread initiates a control operation on failsafe which > > > initiates this operation on the device. > > > 5. A race condition occurs between the control thread and interrupt > > > thread when accessing the device data structures. > > > > > > This commit prevents the race condition in step 5. Before this commit > > > if a device was removed and then a control thread operation was > > > initiated on failsafe - in some cases failsafe called the sub device > > > operation instead of avoiding it. Such cases could lead to operations > > failures. > > > > > > > This is a nitpick, but as said earlier, this is not preventing the race > > condition. > > This race is still present and can still wreak havok on unsuspecting users. > > > > If an application has a weak threading model, it will be subject to this > > race > > condition still. It is possible to prevent it fully with proper care from > > the > > application standpoint, but this is not specific to fail-safe and does not > > concern us here. > > > > Anyway, it's really a nitpick, I just wanted to point it out. This is not > > too > > important for this patch. > > > > > This commit fixes failsafe criteria to determine when the device is > > > removed such that it will avoid calling the sub device operations > > > during that time and will only call them otherwise. > > > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > > Cc: sta...@dpdk.org > > > > > > Signed-off-by: Ophir Munk <ophi...@mellanox.com> > > > --- > > > v3: > > > 1. Rebase v2 > > > > > > 2. Please ignore checkpatch checks on arguments re-usage - they are > > confirmed. > > > CHECK:MACRO_ARG_REUSE: Macro argument reuse ... possible side- > > effects? > > > #217: FILE: drivers/net/failsafe/failsafe_private.h:241: > > > > > > 3. Add rationales (copy from an email which accompanied v2): > > > > > > On Monday, September 11, 2017 11:31 AM, Gaetan Rivet wrote: > > > > > > > > Hi Ophir, > > > > > > > > On Sat, Sep 09, 2017 at 07:27:11PM +0000, Ophir Munk wrote: > > > > > This commit prevents control path operations from failing after a > > > > > sub device has informed failsafe it has been removed. > > > > > > > > > > Before this commit if a device was removed and then a control path > > > > > > > > Here are the steps if I understood correctly: > > > > > > > > 0. The physical device is removed > > > > 1. The interrupt thread flags the device 2. A control lcore > > > > initiates a control operation 3. The alarm triggers, waking up the > > > > eal-intr- > > thread, > > > > initiating the actual device removal. > > > > 4. Race condition occurs between control lcore and interrupt thread. > > > > > > > > "if a device was removed" is ambiguous I think (are we speaking > > > > about the physical port? Is it only flagged? Is it after the removal of > > > > the > > device itself?). > > > > From the context I gather that you mean the device is flagged to be > > > > removed, but it won't be as clear in a few month when we revisit this > > > > bug > > :) . > > > > > > > > Could you please rephrase this so that the whole context of the > > > > issue is available? > > > > > > > > > > Done. Commit message was rephrased based on your comments > > > > > > > > operations was initiated on failsafe - in some cases failsafe > > > > > called the sub device operation instead of avoiding it. Such cases > > > > > could lead to operations failures. > > > > > > > > > > This commit fixes failsafe criteria to determine when the device > > > > > is removed such that it will avoid calling the sub device > > > > > operations during that time and will only call them otherwise. > > > > > > > > > > > > > This commit mitigates the race condition, reducing the probability > > > > for it to have an effect. It does not, however, remove this race > > > > condition, which is inherent to the DPDK architecture at the moment. > > > > > > > > A proper fix, a more detailed workaround and additional > > > > documentation warning users writing applications to mind their threads > > could be interesting. > > > > > > > > > > The race condition occurs in the last step and may lead to > > > segmentation faults (accessing data structures of the same device by 2 > > > threads) The previous steps ("the physical device is removed", etc) were > > > not > > recreated and tested but probably cannot lead to segmentation fault. > > > > > > > But let's focus on this patch for the time being. > > > > > > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > > > > Cc: sta...@dpdk.org > > > > > > > > > > Signed-off-by: Ophir Munk <ophi...@mellanox.com> > > > > > --- > > > > > drivers/net/failsafe/failsafe_ether.c | 1 + > > > > > drivers/net/failsafe/failsafe_ops.c | 52 > > > > +++++++++++++++++++++++++++++------ > > > > > 2 files changed, 45 insertions(+), 8 deletions(-) > > > > > > > > > > diff --git a/drivers/net/failsafe/failsafe_ether.c > > > > > b/drivers/net/failsafe/failsafe_ether.c > > > > > index a3a8cce..1def110 100644 > > > > > --- a/drivers/net/failsafe/failsafe_ether.c > > > > > +++ b/drivers/net/failsafe/failsafe_ether.c > > > > > @@ -378,6 +378,7 @@ > > > > > > > > Could you please generate your patches with the function name in the > > diff? > > > > > > Done > > > > > > > > > > > > i); > > > > > goto err_remove; > > > > > } > > > > > + sdev->remove = 0; > > > > > > > > You are adding this here, within failsafe_eth_dev_state_sync, and > > > > removing it from the dev_configure ops. > > > > > > > > 10 lines above, the call to dev_configure is done, meaning that the > > > > remove flag was resetted at this point. > > > > > > > > Can you explain why you prefer resetting the flag here? > > > > > > > > The position of this flag reset will be dependent upon my subsequent > > > > remarks anyway, so hold that thought :) . > > > > > > > > > > The motivation for resetting the "remove" flag within > > failsafe_eth_dev_state_sync is as follows: > > > Previously to this patch the "remove" flag was designed to signal the need > > to remove the sub device. > > > Once the sub device was removed and before being reconfigured the > > "remove" flag was reset. > > > > > > After this patch the scope of the "remove" flag was *extended* to > > > indicate the sub device status as being "plugged out" by resetting this > > > flag > > only after a successful call to failsafe_eth_dev_state_sync(). > > > The "plug out" status could last a very long time (seconds, minutes, days, > > weeks, ...). > > > > > > Previously to this patch failsafe based the "plugged out" status on > > > the sub device state as being below ACTIVE however every 2 seconds > > > dev_configure() was called where the sub device was assigned sdev- > > > >state = DEV_ACTIVE; therefore the sub device state became ACTIVE for > > some time every 2 seconds. > > > This is where the race condition occurred: failsafe considered the sub > > > device as "Plugged in" for some time every 2 seconds (based on its ACTIVE > > state) while it was actually plugged out. > > > > > > After this patch the "Plugged out" status is based on the "remove" flag. > > > > > > > Sorry, I do not agree with this semantical change on the "remove" flag. > > You are essentially adding a new device state, which could be fine per se, > > but > > should not be done here. > > > > The enum dev_state is there for this purpose. > > > > The flag dev->remove, calls for an operation to be done upon the concerned > > device. It is not meant to become a new device state. > > > > A point about the work methodoly here: if you wanted to change this > > semantic, which could be legitimate and sometimes called for, you should > > have proposed it either during a discussion in a response to my previous > > email, or introducing the change as a separate patch. This point is > > important > > enough for it to have its own patch, meaning we would have a whole thread > > dedicated to it instead of having to interleave commentaries between > > related-but-separate diffs on the code. > > > > But anyway, if you think you need to express a PLUGOUT state, I'd suggest > > adding a state between DEV_UNDEFINED and DEV_PARSED. > > DEV_UNDEFINED means that the device is in limbo and has no existence per > > se (its parsing failed for example, it is not clear whether the parameters > > are > > correct, etc...). DEV_PLUGOUT could mean then that the device has been > > successfully probed at least once, meaning that it could possibly have > > residuals from this probing still there, or specific care to be taken when > > manipulating it. > > > > However, I'm not yet convinced that this new state is necessary. I think you > > can mitigate this race condition without having to add it. If you insist in > > introducing this state, please do so in a separate patch, with proper > > definition about the meaning of this state: > > > > + When it should be valid for a device to be in this state. > > + Which operation corresponds to getting into and out of this state. > > + Why this state is interesting and what could not be expressed before > > that is thus being fixed by introducing this state. > > > > But please verify twice whether you absolutely need to complexify the > > current fail-safe internals before going all in and basing your work upon > > it :) > > > > Indeed what I am currently missing in failsafe is knowing the device is in a > PLUGOUT state. > Your suggestion to add a new state DEV_PLUGOUT cannot be used with the > current implementation > as the device states are modified during an alarm hotplug handling every 2 > seconds. > In fs_hotplug_alarm() we call failsafe_eth_dev_state_sync(dev) which > eventually calls ->dev_configure(dev) > where we assign: sdev->state = DEV_ACTIVE; > then when sync fails fs_hotplug_alarm() calls failsafe_dev_remove(dev) which > will call fs_dev_remove(sdev); where the sub devices > states are changed from ACTIVE down to DEV_UNDEFINED. > Having said that it means that during a PLUGOUT event - the states are > modified with each invocation of the fs_hotplug_alarm > every 2 seconds. So even if we added DEV_PLUGOUT state - it will not remain > fixed during the hotplug alarm handling. > I have also verified all of this with printouts. > When seeing a sub device in state "DEV_ACTIVE" - we cannot tell whether the > device is currently in "PLUGOUT situation" or "PLUGIN situation" > This allows operations such as fs_mtu_set() on sub-devices which are in > "PLUGOUT situation" while their state is > DEV_ACTIVE to be manipulated, which I think should have been avoided. > > Please note fs_mtu_set() implementation: > tatic int fs_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) > { > .. > FOREACH_SUBDEV_ACTIVE(sdev, i, dev) { > // ***** We are here while the device can be in a "PLUGOUT situation" > *** > > To summarize: > I am missing a way to know in failsafe that a sub-device is currently plugged > out
(sdev->state < DEV_ACTIVE && !sdev->remove) means that the device is plugged out. > 1. I suggested extending the "remove" flag scope for this purpose. It has > minimal changes with current failsafe implementation. You prefer not using > "remove". I prefer using it, but as a flag, not as a device state. > 2. You suggested adding a new state DEV_PLUGOUT. I don't think it will work > with current implementation (as explained above) or may require a redesign of > current implementation. > I do not suggest adding a new state DEV_PLUGOUT. I suggest using sdev->remove properly. > Can you suggest another way? > 0. In a separate commit, move the sdev->remove = 0; from fs_dev_configure, into the case DEV_UNDEFINED of the switch within fs_dev_remove. This is cleaner and more logical anyway. 1. Check that sdev->remove is not set in fs_find_next. If sdev->remove is set, then the device should be skipped. 2. In failsafe_dev_remove, do not use the FOREACH_SUBDEV_STATE iterator, but manually iterate over all sub-devices using the subs_tail and subs_head values. As the generic iterator would skip over devices which have sdev->remove set, this function would not work anymore. 3. Find the places I have missed that needs this manual iterator to be used instead of the FOREACH_SUBDEV{,_STATE} ones. I think there is at least other place that I cannot recall, there might be more. If you think this does not work, please tell me why, because then it means that I have misunderstood something about the race condition you are trying to fix. -- Gaëtan Rivet 6WIND