Hi

> -----Original Message-----
> From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> Sent: Thursday, December 14, 2017 3:27 PM
> To: Matan Azrad <ma...@mellanox.com>
> Cc: Adrien Mazarguil <adrien.mazarg...@6wind.com>; Thomas Monjalon
> <tho...@monjalon.net>; dev@dpdk.org; sta...@dpdk.org
> Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device handling
>
> On Thu, Dec 14, 2017 at 01:07:31PM +0000, Matan Azrad wrote:
> > Hi Gaetan
> >
> > > -----Original Message-----
> > > From: Gaëtan Rivet [mailto:gaetan.ri...@6wind.com]
> > > Sent: Thursday, December 14, 2017 12:49 PM
> > > To: Matan Azrad <ma...@mellanox.com>
> > > Cc: Adrien Mazarguil <adrien.mazarg...@6wind.com>; Thomas Monjalon
> > > <tho...@monjalon.net>; dev@dpdk.org; sta...@dpdk.org
> > > Subject: Re: [PATCH v2 4/4] net/failsafe: fix removed device
> > > handling
> > >
> > > On Thu, Dec 14, 2017 at 10:40:22AM +0000, Matan Azrad wrote:
> > > > Hi Gaetan
> > > >
> > > > > <snip>
> > > > > Ok, actually you were right here to do it this way. The "is_removed"
> > > > > check needs to happen after the operation attempt to effectively
> > > > > mitigate the possible race. Checking before attempting the call
> > > > > will be much less effective.
> > > > >
> > > > > That being said, would it be cleaner to have eth_dev ops return
> > > > > -ENODEV directly, and check against it within fail-safe?
> > > > >
> > > >
> > > > I think that according to "is_removed" semantic we must return a
> > > > Boolean value (Each value different from '0' means that the device
> > > > is removed) like other functions in c library (for example isspace()).
> > > >
> > >
> > > Sure, I wasn't discussing the interface proposed by
> > > rte_eth_dev_is_removed().
> > >
> > > What I meant was to ask whether checking rte_eth_dev_is_removed()
> > > would be more interesting in the ethdev layer, making the
> > > eth_dev_ops return -ENODEV regardless of the previous error if this
> > > check is supported by the driver and signal that the port is removed.
> > >
> > > I think this information could be interesting to other systems, not
> > > just fail-safe.
> > >
> >
> > Ok. Got you now.
> > Interesting approach - plan:
> > 1. update fs_link_update to use rte_eth* functions.
>
> I'm surprised it doesn't already.
> Either the rte_eth* function was introduced after the failsafe, or be wary of
> potential issues. I don't see a problem right now though.
>
> > 2. maybe -EIO is preferred because -ENODEV is used for no port
> > error?
>
> Good point, didn't think about it.
> Prepare yourself maybe to some arguments about the most relevant error
> code. -EIO seems fine to me, but maybe use a wrapper for all this.
>
> Something like:
>
> ---8<---
>
> static int
> eth_error(pid, int original_ret)
> {
>     int ret;
>
>     if (original_ret == 0)
>         return original_ret;
>     ret = rte_eth_is_removed(pid);
>     if (ret == 0 || ret == -ENOTSUP)
>         return original_ret;
>     return -EIO;
> }
>
> int
> rte_eth_ops_xyz(pid)
> {
>     int ret;
>     ret = eth_dev(pid).ops_xyz();
>     return eth_error(pid, ret);
> }
>
> --->8---
>
> This way you would be able to change it easily and the logic would be
> insulated.
>
Nice.

> > 3. update all relevant rte_eth* to use "is_removed" in error flows (1
> > patch for flow APIs and 1 for the others).
> > 4. Change fs checks in error flows to check rte_eth* return values.
> > 5. Remove CC stable from commit message.
> >
> > What do you think?
> >
>
> Agreed otherwise.
>
Will create V3, thanks!

> Thanks,
>
> > > --
> > > Gaëtan Rivet
> > > 6WIND
>
> --
> Gaëtan Rivet
> 6WIND
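
For reference, the wrapper idea quoted above can be written as a small self-contained C sketch. It is only an illustration under stated assumptions: the `sketch_*` names, the `struct sketch_port` layout and the port table are hypothetical stand-ins and not the DPDK ethdev API, and the removal query is assumed to return 0 when the device is present, a non-zero positive value when it is removed, and -ENOTSUP when the driver cannot tell, as in the quoted pseudocode.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 4

/* Hypothetical stand-in for a port; NOT the real DPDK device structure. */
struct sketch_port {
    bool removed;            /* the device has been unplugged */
    bool supports_removed;   /* the driver implements the removal check */
    int (*op)(uint16_t pid); /* stand-in for one eth_dev_ops callback */
};

struct sketch_port sketch_ports[SKETCH_MAX_PORTS];

/*
 * Assumed semantics, following the thread: 0 if the device is present,
 * 1 if it is removed, -ENOTSUP if the driver cannot answer.
 */
int
sketch_is_removed(uint16_t pid)
{
    assert(pid < SKETCH_MAX_PORTS);
    if (!sketch_ports[pid].supports_removed)
        return -ENOTSUP;
    return sketch_ports[pid].removed ? 1 : 0;
}

/*
 * Translate a failed operation into -EIO when the device is known to be
 * removed. The removal check runs only after the operation has failed,
 * so the success path never pays for it.
 */
int
sketch_eth_error(uint16_t pid, int original_ret)
{
    int ret;

    if (original_ret == 0)
        return 0;
    ret = sketch_is_removed(pid);
    if (ret == 0 || ret == -ENOTSUP)
        return original_ret; /* present or unknown: keep the first error */
    return -EIO;
}

/* Two hypothetical driver callbacks used to exercise the wrapper. */
int
sketch_op_fail(uint16_t pid)
{
    (void)pid;
    return -EINVAL;
}

int
sketch_op_ok(uint16_t pid)
{
    (void)pid;
    return 0;
}

/* Public-style entry point: run the driver op, then filter its error. */
int
sketch_eth_ops_xyz(uint16_t pid)
{
    int ret;

    assert(pid < SKETCH_MAX_PORTS);
    ret = sketch_ports[pid].op(pid);
    return sketch_eth_error(pid, ret);
}
```

With a port whose callback fails, the sketch returns the driver's own error while the device is present or the check is unsupported, and -EIO once the port is marked removed. Keeping the translation in one helper means the error code chosen for "removed" can be changed in a single place.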