Hi Gaetan, Thanks for your quick reply. Please see comments inline. Regards, Ophir
> -----Original Message----- > From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com] > Sent: Friday, October 20, 2017 1:35 PM > To: Ophir Munk <ophi...@mellanox.com> > Cc: dev@dpdk.org; Thomas Monjalon <tho...@monjalon.net>; Olga Shern > <ol...@mellanox.com>; sta...@dpdk.org > Subject: Re: [PATCH v3] net/failsafe: fix calling device during RMV events > > Hi Ophir, > > Sorry about the delay, > I have a few remarks, I think this patch could be simpler. > > First, about the commit logline: > "calling device" is not descriptive enough. I'd suggest > > net/failsafe: fix device configuration during RMV events > > But I'm not a native speaker either, so use it if you think it is better, or > don't, > it's only a suggestion :). > > On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote: > > This commit prevents control path operations from failing after a sub > > device removal. > > > > Following are the failure steps: > > 1. The physical device is removed due to change in one of PF > > parameters (e.g. MTU) 2. The interrupt thread flags the device 3. > > Within 2 seconds Interrupt thread initializes the actual device > > removal, then every 2 seconds it tries to re-sync (plug in) the > > device. The trials fail as long as VF parameter mismatches the PF > parameter. > > 4. A control thread initiates a control operation on failsafe which > > initiates this operation on the device. > > 5. A race condition occurs between the control thread and interrupt > > thread when accessing the device data structures. > > > > This commit prevents the race condition in step 5. Before this commit > > if a device was removed and then a control thread operation was > > initiated on failsafe - in some cases failsafe called the sub device > > operation instead of avoiding it. Such cases could lead to operations > failures. > > > > This is a nitpick, but as said earlier, this is not preventing the race > condition. > This race is still present and can still wreak havok on unsuspecting users. > > If an application has a weak threading model, it will be subject to this race > condition still. It is possible to prevent it fully with proper care from the > application standpoint, but this is not specific to fail-safe and does not > concern us here. > > Anyway, it's really a nitpick, I just wanted to point it out. This is not too > important for this patch. > > > This commit fixes failsafe criteria to determine when the device is > > removed such that it will avoid calling the sub device operations > > during that time and will only call them otherwise. > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > Cc: sta...@dpdk.org > > > > Signed-off-by: Ophir Munk <ophi...@mellanox.com> > > --- > > v3: > > 1. Rebase v2 > > > > 2. Please ignore checkpatch checks on arguments re-usage - they are > confirmed. > > CHECK:MACRO_ARG_REUSE: Macro argument reuse ... possible side- > effects? > > #217: FILE: drivers/net/failsafe/failsafe_private.h:241: > > > > 3. Add rationales (copy from an email which accompanied v2): > > > > On Monday, September 11, 2017 11:31 AM, Gaetan Rivet wrote: > > > > > > Hi Ophir, > > > > > > On Sat, Sep 09, 2017 at 07:27:11PM +0000, Ophir Munk wrote: > > > > This commit prevents control path operations from failing after a > > > > sub device has informed failsafe it has been removed. > > > > > > > > Before this commit if a device was removed and then a control path > > > > > > Here are the steps if I understood correctly: > > > > > > 0. The physical device is removed > > > 1. The interrupt thread flags the device 2. A control lcore > > > initiates a control operation 3. The alarm triggers, waking up the > > > eal-intr- > thread, > > > initiating the actual device removal. > > > 4. Race condition occurs between control lcore and interrupt thread. > > > > > > "if a device was removed" is ambiguous I think (are we speaking > > > about the physical port? Is it only flagged? Is it after the removal of > > > the > device itself?). > > > From the context I gather that you mean the device is flagged to be > > > removed, but it won't be as clear in a few month when we revisit this bug > :) . > > > > > > Could you please rephrase this so that the whole context of the > > > issue is available? > > > > > > > Done. Commit message was rephrased based on your comments > > > > > > operations was initiated on failsafe - in some cases failsafe > > > > called the sub device operation instead of avoiding it. Such cases > > > > could lead to operations failures. > > > > > > > > This commit fixes failsafe criteria to determine when the device > > > > is removed such that it will avoid calling the sub device > > > > operations during that time and will only call them otherwise. > > > > > > > > > > This commit mitigates the race condition, reducing the probability > > > for it to have an effect. It does not, however, remove this race > > > condition, which is inherent to the DPDK architecture at the moment. > > > > > > A proper fix, a more detailed workaround and additional > > > documentation warning users writing applications to mind their threads > could be interesting. > > > > > > > The race condition occurs in the last step and may lead to > > segmentation faults (accessing data structures of the same device by 2 > > threads) The previous steps ("the physical device is removed", etc) were not > recreated and tested but probably cannot lead to segmentation fault. > > > > > But let's focus on this patch for the time being. > > > > > > > Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD") > > > > Cc: sta...@dpdk.org > > > > > > > > Signed-off-by: Ophir Munk <ophi...@mellanox.com> > > > > --- > > > > drivers/net/failsafe/failsafe_ether.c | 1 + > > > > drivers/net/failsafe/failsafe_ops.c | 52 > > > +++++++++++++++++++++++++++++------ > > > > 2 files changed, 45 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/drivers/net/failsafe/failsafe_ether.c > > > > b/drivers/net/failsafe/failsafe_ether.c > > > > index a3a8cce..1def110 100644 > > > > --- a/drivers/net/failsafe/failsafe_ether.c > > > > +++ b/drivers/net/failsafe/failsafe_ether.c > > > > @@ -378,6 +378,7 @@ > > > > > > Could you please generate your patches with the function name in the > diff? > > > > Done > > > > > > > > > i); > > > > goto err_remove; > > > > } > > > > + sdev->remove = 0; > > > > > > You are adding this here, within failsafe_eth_dev_state_sync, and > > > removing it from the dev_configure ops. > > > > > > 10 lines above, the call to dev_configure is done, meaning that the > > > remove flag was resetted at this point. > > > > > > Can you explain why you prefer resetting the flag here? > > > > > > The position of this flag reset will be dependent upon my subsequent > > > remarks anyway, so hold that thought :) . > > > > > > > The motivation for resetting the "remove" flag within > failsafe_eth_dev_state_sync is as follows: > > Previously to this patch the "remove" flag was designed to signal the need > to remove the sub device. > > Once the sub device was removed and before being reconfigured the > "remove" flag was reset. > > > > After this patch the scope of the "remove" flag was *extended* to > > indicate the sub device status as being "plugged out" by resetting this flag > only after a successful call to failsafe_eth_dev_state_sync(). > > The "plug out" status could last a very long time (seconds, minutes, days, > weeks, ...). > > > > Previously to this patch failsafe based the "plugged out" status on > > the sub device state as being below ACTIVE however every 2 seconds > > dev_configure() was called where the sub device was assigned sdev- > > >state = DEV_ACTIVE; therefore the sub device state became ACTIVE for > some time every 2 seconds. > > This is where the race condition occurred: failsafe considered the sub > > device as "Plugged in" for some time every 2 seconds (based on its ACTIVE > state) while it was actually plugged out. > > > > After this patch the "Plugged out" status is based on the "remove" flag. > > > > Sorry, I do not agree with this semantical change on the "remove" flag. > You are essentially adding a new device state, which could be fine per se, but > should not be done here. > > The enum dev_state is there for this purpose. > > The flag dev->remove, calls for an operation to be done upon the concerned > device. It is not meant to become a new device state. > > A point about the work methodoly here: if you wanted to change this > semantic, which could be legitimate and sometimes called for, you should > have proposed it either during a discussion in a response to my previous > email, or introducing the change as a separate patch. This point is important > enough for it to have its own patch, meaning we would have a whole thread > dedicated to it instead of having to interleave commentaries between > related-but-separate diffs on the code. > > But anyway, if you think you need to express a PLUGOUT state, I'd suggest > adding a state between DEV_UNDEFINED and DEV_PARSED. > DEV_UNDEFINED means that the device is in limbo and has no existence per > se (its parsing failed for example, it is not clear whether the parameters are > correct, etc...). DEV_PLUGOUT could mean then that the device has been > successfully probed at least once, meaning that it could possibly have > residuals from this probing still there, or specific care to be taken when > manipulating it. > > However, I'm not yet convinced that this new state is necessary. I think you > can mitigate this race condition without having to add it. If you insist in > introducing this state, please do so in a separate patch, with proper > definition about the meaning of this state: > > + When it should be valid for a device to be in this state. > + Which operation corresponds to getting into and out of this state. > + Why this state is interesting and what could not be expressed before > that is thus being fixed by introducing this state. > > But please verify twice whether you absolutely need to complexify the > current fail-safe internals before going all in and basing your work upon it > :) > Indeed what I am currently missing in failsafe is knowing the device is in a PLUGOUT state. Your suggestion to add a new state DEV_PLUGOUT cannot be used with the current implementation as the device states are modified during an alarm hotplug handling every 2 seconds. In fs_hotplug_alarm() we call failsafe_eth_dev_state_sync(dev) which eventually calls ->dev_configure(dev) where we assign: sdev->state = DEV_ACTIVE; then when sync fails fs_hotplug_alarm() calls failsafe_dev_remove(dev) which will call fs_dev_remove(sdev); where the sub devices states are changed from ACTIVE down to DEV_UNDEFINED. Having said that it means that during a PLUGOUT event - the states are modified with each invocation of the fs_hotplug_alarm every 2 seconds. So even if we added DEV_PLUGOUT state - it will not remain fixed during the hotplug alarm handling. I have also verified all of this with printouts. When seeing a sub device in state "DEV_ACTIVE" - we cannot tell whether the device is currently in "PLUGOUT situation" or "PLUGIN situation" This allows operations such as fs_mtu_set() on sub-devices which are in "PLUGOUT situation" while their state is DEV_ACTIVE to be manipulated, which I think should have been avoided. Please note fs_mtu_set() implementation: tatic int fs_mtu_set(struct rte_eth_dev *dev, uint16_t mtu) { .. FOREACH_SUBDEV_ACTIVE(sdev, i, dev) { // ***** We are here while the device can be in a "PLUGOUT situation" *** To summarize: I am missing a way to know in failsafe that a sub-device is currently plugged out 1. I suggested extending the "remove" flag scope for this purpose. It has minimal changes with current failsafe implementation. You prefer not using "remove". 2. You suggested adding a new state DEV_PLUGOUT. I don't think it will work with current implementation (as explained above) or may require a redesign of current implementation. Can you suggest another way? > > > > } > > > > } > > > > /* > > > > diff --git a/drivers/net/failsafe/failsafe_ops.c > > > > b/drivers/net/failsafe/failsafe_ops.c > > > > index ff9ad15..314d53d 100644 > > > > --- a/drivers/net/failsafe/failsafe_ops.c > > > > +++ b/drivers/net/failsafe/failsafe_ops.c > > > > @@ -232,7 +232,6 @@ > > > > dev->data->dev_conf.intr_conf.lsc = 0; > > > > } > > > > DEBUG("Configuring sub-device %d", i); > > > > - sdev->remove = 0; > > > > ret = rte_eth_dev_configure(PORT_ID(sdev), > > > > dev->data->nb_rx_queues, > > > > dev->data->nb_tx_queues, > > > > @@ -311,6 +310,8 @@ > > > > int ret; > > > > > > > > FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > > > + if (sdev->remove) > > > > + continue; > > > > DEBUG("Calling rte_eth_dev_set_link_up on sub_device > > > > %d", > > > i); > > > > ret = rte_eth_dev_set_link_up(PORT_ID(sdev)); > > > > if (ret) { > > > > @@ -330,6 +331,8 @@ > > > > int ret; > > > > > > > > FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > > > + if (sdev->remove) > > > > + continue; > > > > > > For this change and all the others: > > > > > > I think it might be best to have this check added to fs_find_next > > > directly. > > > > > > Most of the call to the iterators are done within dev_ops, so it > > > makes sense I think to have it there. > > > > > > But then there'd be an issue with the sub-EAL iterations done on > > > previously- removed ports, as the removed flag is precisely resetted > > > too late. The function failsafe_dev_remove would also need to have a > > > manual iteration upon the sub-devices instead of using the macro. > > > > > > I think you can actually reset this flag within fs_dev_remove, > > > instead of the next plug-in, then having this check within > > > fs_find_next > > > *should* not be a problem. > > > > > > > With the new scope of "remove" flag (remaining set to 1 as long as the sub > device is "plugged out" > > which may last for a very long time) we cannot reset it in > > fs_dev_remove which is called every 2 seconds. > > > > With the remove flag staying as it is, I think it should thus be resetted > within > fs_dev_remove. Actually I think it both helps you write you fix, and clarify > the > meaning and intended purpose of this flag. > > > > I think you should break up those changes in two: first move the > > > flag reset to fs_dev_remove instead of fs_dev_configure, then add > > > this check to the iterator. > > > > > Please, do this fix this way. I think moving the dev->remove flag can have > subtile consequences, and I'd like to have a specific commit to trace back > which one is responsible. > > > > This way, a git bisect should allow us to pinpoint more easily any > > > new bug as both changes have the potential to introduce subtle ones. > > > > > Well, like I said :). > > > > > I suggest defining a new macro > > > > FOREACH_SUBDEV_ACTIVE(sdev, i, dev) { ... > > > > that will replace all cases of: > > > > FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > if (sdev->remove) > > continue; > > > > In order to support the new macro I added a "check_remove" flag to > > fs_find_next (which is based on your idea above: "I think it might be best > > to > have this check added to fs_find_next directly"). > > > > I'd prefer avoiding multiplying the macros. I agree. Should be avoided. > There are already two iterators. You add one, which now means that there > are two ways of iterating upon active devices: using you new macro, and > using the old one. The difference between the two would be difficult to > know, without profound knowledge of the rest of the code: that in one > place the flag is checked, and in the other it is not. > > As such, I suggest you check in all cases that the flag is not set. This > simplifies the use of these macros and the conditions in which their use > is correct. > > This means that you have to manually iterate in places where this flag > should be ignored. I listed these places in my previous email, but I may > have missed some, please be careful. > I already did it in v1 then changed it in V2/3 based on reviews (probably my misunderstanding) > Thanks, > -- > Gaëtan Rivet > 6WIND