Gaetan, Matan, Please could you check?
25/09/2020 22:30, Long Li: > HI Matan, > > While troubleshooting a failure in DPDK on device removal when VF device > briefly disappears and comes back, I notice the failsafe driver is trying > repeatedly to start a sub device (after this sub device has been successfully > configured, but later hot removed from the kernel). This is due to repeated > alarms calling fs_dev_start(). I trace into this commit: > > ae80146 net/failsafe: fix removed device handling > > The implementation of fs_err() is interesting: > > +fs_err(struct sub_device *sdev, int err) > +{ > + /* A device removal shouldn't be reported as an error. */ > + if (sdev->remove == 1 || err == -EIO) > + return rte_errno = 0; > + return err; > +} > > If I change this function to: > @@ -497,7 +497,7 @@ int failsafe_eth_new_event_callback(uint16_t port_id > fs_err(struct sub_device *sdev, int err) > { > /* A device removal shouldn't be reported as an error. */ > - if (sdev->remove == 1 || err == -EIO) > + if (sdev->remove == 1 && err == -EIO) > return rte_errno = 0; > return err; > } > > The hung is going away. I don't know the reason why we use a || in the if(). > If a call to rte_eth_dev_start() returning EIO (as the case in fs_dev_start), > the best choice would be bail out and fail this sub device. > > Can you please take a look? > > Thanks, > Long > > ________________________________ > From: Matan Azrad <ma...@nvidia.com> > Sent: Tuesday, September 15, 2020 12:00 AM > To: Long Li <lon...@microsoft.com>; Stephen Hemminger > <step...@networkplumber.org> > Cc: ma...@mellanox.com <ma...@mellanox.com>; gr...@u246.net <gr...@u246.net>; > dev@dpdk.org <dev@dpdk.org>; Raslan Darawsheh <rasl...@nvidia.com> > Subject: RE: [dpdk-dev] [PATCH] net/vdev_netvsc: handle removal of associated > pci device > > Hi Li > > From: Long Li <lon...@microsoft.com> > > >Subject: Re: [dpdk-dev] [PATCH] net/vdev_netvsc: handle removal of > > >associated pci device > > > > > >Hi Stephen > > > > > >From: Stephen Hemminger: > > >> On Sun, 6 Sep 2020 12:38:18 +0000 > > >> Matan Azrad <ma...@nvidia.com> wrote: > > >> > > >> > Hi Stephen > > >> > > > >> > From: Stephen Hemminger: > > >> > > The vdev_netvsc was not detecting when the associated PCI device > > >> > > (SRIOV) was removed. Because of that it would keep feeding the > > >> > > same > > >> > > (removed) device to failsafe PMD which would then unsuccessfully > > >> > > try and probe for it. > > >> > > > > >> > > Change to use a mark/sweep method to detect that PCI device was > > >> > > removed, and also only tell failsafe about new PCI devices. > > >> > > Vdev_netvsc does not have to keep stuffing the pipe with the same > > >> > > already existing PCI device. > > >> > > > >> > As I know, the vdev_netvsc driver doesn't call to failsafe if the > > >> > PCI device is > > >> not detected by the readlink command(considered as removed)... > > >> > Am I missing something? > > >> > > >> The original code is broken because ctx_yield is not cleared, it > > >> keeps sending the same value. > > > > > >Looking on the code again, It looks like ctx->yield has no effect on > > >the next pipe write, It is just used for log. > > > > > >After the PCI interface matching to the netvsc interface, the pipe > > >write is triggered only if the readlink commands success to see the > > >plugged-in PCI > > >device: > > >readlink /sys/class/net/[iface]/device/subsystem shows "pci" > > >readlink /sys/class/net/[iface]/device shows the pci device ID. > > > > > >So, the assumption is when the above readlink failed on the interface > > >the device is removed(plugged-out) and the fd write will not happen. > > > > > >The code will continue to retry probe again and again until success > > >only for plugged-in pci device matched the netvsc device. > > > > Hi Matan, > > > > The original code keeps writing to pipe even it's the same PCI device. > > Yes, the vdev_netvsc writes any plugged-in device to the associated netvsc > device fd. > > > The > > new code writes to pipe for a new device, only once. See the following code: > > > > + /* Skip if this is same device already sent to failsafe */ > > + if (strcmp(addr, ctx->yield) == 0) > > + return 0; > > > > I understand you want to optimize the pipe writing to be written only after > plugged-in hot event. > > The current solution suffers from race: the PCI device may be plugged-out and > plugged-in in short time shorter than the driver alarm delay, then the PCI > device plugged-in detection will lost. > > My suggestion: > Add validation to the plugged-in device probing state and that it is owned by > failsafe(using ownership API) - don't write the pipe if so. > > Matan > > > > > This patch also saves lots of CPU since it no longer writes to pipe all the > > time. > > You are correct about the code will continue to probe on a new PCI device. > > But someone has to do it to handle hot-add. > > > > Thanks, > > Long > > > > > > > > > >> It looks like device removal and add was never tested. > > > > > >This is basic test we have to test plug-in plug-out and it passed every > > >day in the last years. > > > > > >Maybe something new and special in your setup? > > > > > >> If you test removal you will see that vdev_netvsc: > > >> 1. Sends same PCI device repeatedly to failsafe (every alarm call) > > >> This is harmless, but useless. > > >> 2. When device is removed, keeps doing #1 >