> -----Original Message----- > From: Yigit, Ferruh > Sent: Friday, January 19, 2018 4:19 PM > To: Matan Azrad <ma...@mellanox.com>; Adrien Mazarguil > <adrien.mazarg...@6wind.com>; Gaetan Rivet <gaetan.ri...@6wind.com> > Cc: Thomas Monjalon <tho...@monjalon.net>; dev@dpdk.org; Andrew Rybchenko > <arybche...@solarflare.com>; Ananyev, Konstantin > <konstantin.anan...@intel.com>; Alejandro Lucero > <alejandro.luc...@netronome.com>; Jerin Jacob > <jerin.ja...@caviumnetworks.com>; Hemant Agrawal <hemant.agra...@nxp.com>; > Shahaf Shuler <shah...@mellanox.com>; Adrien > Mazarguil <adrien.mazarg...@6wind.com>; Olivier MATZ > <olivier.m...@6wind.com>; Zhang, Helin <helin.zh...@intel.com> > Subject: Re: [PATCH v6 4/6] ethdev: adjust APIs removal error report > > On 1/18/2018 6:10 PM, Matan Azrad wrote: > > Hi Ferruh > > > > From: Ferruh Yigit, Thursday, January 18, 2018 7:31 PM > >> On 1/18/2018 11:27 AM, Matan Azrad wrote: > >>> rte_eth_dev_is_removed API was added to detect a device removal > >>> synchronously. > >>> > >>> When a device removal occurs during control command execution, many > >>> different errors can be reported to the user. > >>> > >>> Adjust all ethdev APIs error reports to return -EIO in case of device > >>> removal using rte_eth_dev_is_removed API. > >>> > >>> Signed-off-by: Matan Azrad <ma...@mellanox.com> > >>> Acked-by: Thomas Monjalon <tho...@monjalon.net> > >>> --- > >>> lib/librte_ether/rte_ethdev.c | 192 > >>> +++++++++++++++++++++++++++--------------- > >>> lib/librte_ether/rte_ethdev.h | 51 ++++++++++- > >>> 2 files changed, 170 insertions(+), 73 deletions(-) > >>> > >>> diff --git a/lib/librte_ether/rte_ethdev.c > >>> b/lib/librte_ether/rte_ethdev.c index c93cec1..7044159 100644 > >>> --- a/lib/librte_ether/rte_ethdev.c > >>> +++ b/lib/librte_ether/rte_ethdev.c > >>> @@ -338,6 +338,16 @@ struct rte_eth_dev * > >>> return -ENODEV; > >>> } > >>> > >>> +static int > >>> +eth_err(uint16_t port_id, int ret) > >>> +{ > >>> + if (ret == 0) > >>> + return 0; > >>> + if (rte_eth_dev_is_removed(port_id)) > >>> + return -EIO; > >>> + return ret; > >>> +} > >>> + > >>> /* attach the new device, then store port_id of the device */ int > >>> rte_eth_dev_attach(const char *devargs, uint16_t *port_id) @@ -492,7 > >>> +502,8 @@ struct rte_eth_dev * > >>> return 0; > >>> } > >>> > >>> - return dev->dev_ops->rx_queue_start(dev, rx_queue_id); > >>> + return eth_err(port_id, dev->dev_ops->rx_queue_start(dev, > >>> + rx_queue_id)); > >>> > >>> } > >> > >> This patch updates *all* ethdev public APIs to add if device is removed > >> check? > > > > Yes. > > > >> And each check goes to ethdev is_removed() dev_ops to ask if dev is > >> removed. > > Probably, if the REMOVED state setted in will not call device is_remove. > > > >> These must be better way of doing this, am I missing something. > > > > Suggest. > > With a silly analogy, this is like a blind person asking each time if he is > dead > before talking to other person.
I am agree with Ferruh that it looks a bit clumsy... Though I don't have any bright ideas here too. As I can see right now is_removed() is implemented only for mlx PMD. Would I make sense to hide that check inside mlx implementation then? BTW: int +rte_eth_dev_is_removed(uint16_t port_id) +{ + struct rte_eth_dev *dev; + int ret; + + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0); + + dev = &rte_eth_devices[port_id]; + + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->is_removed, 0); I'd says these 2 checks have to be swapped. Konstantin + + if (dev->state == RTE_ETH_DEV_REMOVED) + return 1; + > > At first glance I can think of a kind of watchdog timer can be implemented in > ethdev layer. It provides periodic checks and if device is dead it calls the > registered user callback function. > > This method presented as synchronous method but not triggered from side where > event happens, I mean not triggered from PMD but from application. > So does application doing polling continuously if device is dead? > Or if application is relying this patch to add a check in each API, what > happens > if device removed during data processing, will app rely on asynchronous > method? > > I am including a few consumers of the ethdev to the mail thread, clearly I am > not very supportive of this patch, but specially taking release is being close > to the account, if there is no objection than me I will take as consensus to > get > the patch in. > > > > > This code will replace similar code in each PMD. > > > >> I definitely would like to see more comments for this patch. > >> > >> Another question is what happens if device removed while or before > >> dev_ops called? There is no synchronizations in drivers for removal, right? > >> > > > > Yes. You right, the device removal can be changed a moment after the call. > > Actually the caller suspected in removal before call it(and want to > > validate it) - so it makes sense. > > From this reason the check in ethdev APIs is called generally in error > > flows. > > > > > >> <...>