29/11/2017 20:17, Ferruh Yigit:
> >>> On Thu, Oct 05, 2017 at 10:42:08PM +0000, Ophir Munk wrote:
> >>>> This commit prevents control path operations from failing after a sub
> >>>> device removal.
> >>>>
> >>>> Following are the failure steps:
> >>>> 1. The physical device is removed due to change in one of PF
> >>>>    parameters (e.g. MTU)
> >>>> 2. The interrupt thread flags the device
> >>>> 3. Within 2 seconds Interrupt thread initializes the actual device
> >>>>    removal, then every 2 seconds it tries to re-sync (plug in) the
> >>>>    device. The trials fail as long as VF parameter mismatches the PF
> >>>>    parameter.
> >>>> 4. A control thread initiates a control operation on failsafe which
> >>>> initiates this operation on the device.
> >>>> 5. A race condition occurs between the control thread and interrupt
> >>>> thread when accessing the device data structures.
> >>>>
> >>>> This commit prevents the race condition in step 5. Before this commit
> >>>> if a device was removed and then a control thread operation was
> >>>> initiated on failsafe - in some cases failsafe called the sub device
> >>>> operation instead of avoiding it. Such cases could lead to operations
> >>>> failures.
[...]
> 
> Reminder of this patch remaining from previous release.

Gaetan, what is the decision for this possible race condition?
Can we try to fix it in 18.02?
