> -----Original Message----- > From: Konstantin Ananyev <konstantin.v.anan...@yandex.ru> > Sent: Thursday, March 2, 2023 6:22 PM > To: dev@dpdk.org > Subject: Re: [PATCH 1/5] ethdev: fix race-condition of proactive error > handling > mode > > > > > > >> -----Original Message----- > >> From: Chengwen Feng <fengcheng...@huawei.com> > >> Sent: Tuesday, February 28, 2023 9:06 PM > >> To: tho...@monjalon.net; ferruh.yi...@amd.com; > >> konstantin.anan...@huawei.com; Andrew Rybchenko > >> <andrew.rybche...@oktetlabs.ru>; Kalesh AP <kalesh- > >> anakkur.pura...@broadcom.com>; Ajit Khaparde > >> (ajit.khapa...@broadcom.com) <ajit.khapa...@broadcom.com> > >> Cc: dev@dpdk.org > >> Subject: [PATCH 1/5] ethdev: fix race-condition of proactive error > >> handling mode > >> > >> In the proactive error handling mode, the PMD will set the data path > >> pointers to dummy functions and then try recovery, in this period the > >> application may still invoking data path API. This will introduce a > >> race-condition with data path which may lead to crash [1]. > >> > >> Although the PMD added delay after setting data path pointers to > >> cover the above race-condition, it reduces the probability, but it > >> doesn't solve the problem. > >> > >> To solve the race-condition problem fundamentally, the following > >> requirements are added: > >> 1. The PMD should set the data path pointers to dummy functions after > >> report RTE_ETH_EVENT_ERR_RECOVERING event. > > Do you mean to say, PMD should set the data path pointers after calling the > call back function? > > The PMD is running in the context of multiple EAL threads. How do these > threads synchronize such that only one thread sets these data pointers? > > As I understand this event callback supposed to be called in the context of > EAL > interrupt thread (whoever is more familiar with original idea, feel free to > correct > me if I missed something). I could not figure this out. It looks to be called from the data plane thread context. I also have a thought on alternate design at the end, appreciate if you can take a look. > How it is going to signal data-path threads that they need to stop/suspend > calling data-path API - that's I suppose is left to application to decide... > Same as right now it is application responsibility to stop data-path threads > before doing dev_stop()/dev/_config()/etc. Ok, good, this expectation is not new. The application must have a mechanism already.
> > > > > >> 2. The application should stop data path API invocation when process > >> the RTE_ETH_EVENT_ERR_RECOVERING event. > > Any thoughts on how an application can do this? We can ignore this question as there is already similar expectation set for earlier functionalities. > > > >> 3. The PMD should set the data path pointers to valid functions before > >> report RTE_ETH_EVENT_RECOVERY_SUCCESS event. > >> 4. The application should enable data path API invocation when process > >> the RTE_ETH_EVENT_RECOVERY_SUCCESS event. > > Do you mean to say that the application should not call the datapath APIs > while the PMD is running the recovery process? > > Yes, I believe that's the intention. Ok, this is good and makes sense. > > >> > >> Also, this patch introduce a driver internal function > >> rte_eth_fp_ops_setup which used as an help function for PMD. > >> > >> [1] > >> > http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2 > >> - > >> ashok.k.kal...@intel.com/ > >> > >> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode") > >> Cc: sta...@dpdk.org > >> > >> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com> > >> --- > >> doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++--------- > >> lib/ethdev/ethdev_driver.c | 8 +++++++ > >> lib/ethdev/ethdev_driver.h | 10 ++++++++ > >> lib/ethdev/rte_ethdev.h | 32 +++++++++++++++---------- > >> lib/ethdev/version.map | 1 + > >> 5 files changed, 46 insertions(+), 25 deletions(-) > >> > >> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst > >> b/doc/guides/prog_guide/poll_mode_drv.rst > >> index c145a9066c..e380ff135a 100644 > >> --- a/doc/guides/prog_guide/poll_mode_drv.rst > >> +++ b/doc/guides/prog_guide/poll_mode_drv.rst > >> @@ -638,14 +638,9 @@ different from the application invokes recovery > >> in PASSIVE mode, the PMD automatically recovers from error in > >> PROACTIVE mode, and only a small amount of work is required for the > application. > >> > >> -During error detection and automatic recovery, -the PMD sets the > >> data path pointers to dummy functions -(which will prevent the > >> crash), -and also make sure the control path operations fail with a return > code ``-EBUSY``. > >> - > >> -Because the PMD recovers automatically, -the application can only > >> sense that the data flow is disconnected for a while -and the control > >> API returns an error in this period. > >> +During error detection and automatic recovery, the PMD sets the data > >> +path pointers to dummy functions and also make sure the control path > >> +operations failed with a return code ``-EBUSY``. > >> > >> In order to sense the error happening/recovering, as well as to > >> restore some additional configuration, @@ -653,9 +648,9 @@ three events > are available: > >> > >> ``RTE_ETH_EVENT_ERR_RECOVERING`` > >> Notify the application that an error is detected > >> - and the recovery is being started. > >> + and the recovery is about to start. > >> Upon receiving the event, the application should not invoke > >> - any control path function until receiving > >> + any control and data path API until receiving > >> ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or > >> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event. > >> > >> .. note:: > >> @@ -666,8 +661,9 @@ three events are available: > >> > >> ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` > >> Notify the application that the recovery from error is successful, > >> - the PMD already re-configures the port, > >> - and the effect is the same as a restart operation. > >> + the PMD already re-configures the port. > >> + The application should restore some additional configuration, and > >> + then > > What is the additional configuration? Is this specific to each NIC/PMD? > > I thought, this is an auto recovery process and the application does not > > require > to reconfigure anything. If the application has to restore the configuration, > how > does auto recovery differ from typical recovery process? > > > >> + enable data path API invocation. > >> > >> ``RTE_ETH_EVENT_RECOVERY_FAILED`` > >> Notify the application that the recovery from error failed, diff > >> --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c index > >> 0be1e8ca04..f994653fe9 100644 > >> --- a/lib/ethdev/ethdev_driver.c > >> +++ b/lib/ethdev/ethdev_driver.c > >> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev > >> *dev, const char *ring_name, > >> return rc; > >> } > >> > >> +void > >> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev) { > >> + if (dev == NULL) > >> + return; > >> + eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev); } > >> + > >> const struct rte_memzone * > >> rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char > >> *ring_name, > >> uint16_t queue_id, size_t size, unsigned int align, > >> diff - > -git > >> a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index > >> 2c9d615fb5..0d964d1f67 100644 > >> --- a/lib/ethdev/ethdev_driver.h > >> +++ b/lib/ethdev/ethdev_driver.h > >> @@ -1621,6 +1621,16 @@ int > >> rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char > >> *name, > >> uint16_t queue_id); > >> > >> +/** > >> + * @internal > >> + * Setup eth fast-path API to ethdev values. > >> + * > >> + * @param dev > >> + * Pointer to struct rte_eth_dev. > >> + */ > >> +__rte_internal > >> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev); > >> + > >> /** > >> * @internal > >> * Atomically set the link status for the specific device. > >> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index > >> 049641d57c..44ee7229c1 100644 > >> --- a/lib/ethdev/rte_ethdev.h > >> +++ b/lib/ethdev/rte_ethdev.h > >> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type { > >> */ > >> RTE_ETH_EVENT_RX_AVAIL_THRESH, > >> /** Port recovering from a hardware or firmware error. > >> - * If PMD supports proactive error recovery, > >> - * it should trigger this event to notify application > >> - * that it detected an error and the recovery is being started. > >> - * Upon receiving the event, the application should not invoke any > >> control path API > >> - * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving > >> - * RTE_ETH_EVENT_RECOVERY_SUCCESS or > >> RTE_ETH_EVENT_RECOVERY_FAILED event. > >> - * The PMD will set the data path pointers to dummy functions, > >> - * and re-set the data path pointers to non-dummy functions > >> - * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event. > >> - * It means that the application cannot send or receive any packets > >> - * during this period. > >> + * > >> + * If PMD supports proactive error recovery, it should trigger this > >> + * event to notify application that it detected an error and the > >> + * recovery is about to start. > >> + * > >> + * Upon receiving the event, the application should not invoke any > >> + * control and data path API until receiving > >> + * RTE_ETH_EVENT_RECOVERY_SUCCESS or > >> RTE_ETH_EVENT_RECOVERY_FAILED > >> + * event. > >> + * > >> + * Once this event is reported, the PMD will set the data path pointers > >> + * to dummy functions, and re-set the data path pointers to valid > >> + * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS > >> event. > > Why do we need to set the data path pointers to dummy functions if the > application is restricted from invoking any control and data path APIs till > the > recovery process is completed? > > You are right, in theory it is not mandatory. > Though it helps to flag a problem if user will still try to call them while > recovery is > in progress. Ok, may be in debug mode. I mean, we have already set an expectation to the application that it should not call and the application has implemented a method to do the same. Why do we need to complicate this? If the application calls the APIs, it is a programming error. > Again, same as we doing in dev_stop(). > > > > >> + * > >> * @note Before the PMD reports the recovery result, > >> * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event > >> again, > >> * because a larger error may occur during the recovery. > >> */ > >> RTE_ETH_EVENT_ERR_RECOVERING, > > I understand this is not a change in this patch. But, just wondering, what > > is the > purpose of this? How is the application supposed to use this? > > > >> /** Port recovers successfully from the error. > >> - * The PMD already re-configured the port, > >> - * and the effect is the same as a restart operation. > >> + * > >> + * The PMD already re-configured the port: > >> * a) The following operation will be retained: (alphabetically) > >> * - DCB configuration > >> * - FEC configuration > >> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type { > >> * (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP) > >> * c) Any other configuration will not be stored > >> * and will need to be re-configured. > >> + * > >> + * The application should restore some additional configuration > >> + * (see above case b/c), and then enable data path API invocation. > >> */ > >> RTE_ETH_EVENT_RECOVERY_SUCCESS, > >> /** Port recovery failed. > >> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index > >> 357d1a88c0..c273e0bdae 100644 > >> --- a/lib/ethdev/version.map > >> +++ b/lib/ethdev/version.map > >> @@ -320,6 +320,7 @@ INTERNAL { > >> rte_eth_devices; > >> rte_eth_dma_zone_free; > >> rte_eth_dma_zone_reserve; > >> + rte_eth_fp_ops_setup; > >> rte_eth_hairpin_queue_peer_bind; > >> rte_eth_hairpin_queue_peer_unbind; > >> rte_eth_hairpin_queue_peer_update; > >> -- > >> 2.17.1 > > Is there any reason not to design this in the same way as 'rte_eth_dev_reset'? Why does the PMD have to recover by itself? We could have a similar API 'rte_eth_dev_recover' to do the recovery functionality.