> > >>>>>>>>> In the proactive error handling mode, the PMD will set the data path
> > >>>>>>>>> pointers to dummy functions and then try recovery; in this period the
> > >>>>>>>>> application may still be invoking the data path API. This introduces a
> > >>>>>>>>> race condition with the data path which may lead to a crash [1].
> > >>>>>>>>>
> > >>>>>>>>> Although the PMD adds a delay after setting the data path pointers to
> > >>>>>>>>> cover the above race condition, this only reduces the probability;
> > >>>>>>>>> it doesn't solve the problem.
> > >>>>>>>>>
> > >>>>>>>>> To solve the race condition fundamentally, the following
> > >>>>>>>>> requirements are added:
> > >>>>>>>>> 1. The PMD should set the data path pointers to dummy functions
> > >>>>>>>>>     after reporting the RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>>>>>>>> 2. The application should stop data path API invocation when
> > >>>>>>>>>     processing the RTE_ETH_EVENT_ERR_RECOVERING event.
> > >>>>>>>>> 3. The PMD should set the data path pointers to valid functions
> > >>>>>>>>>     before reporting the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>> 4. The application should enable data path API invocation when
> > >>>>>>>>>     processing the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>>
> > >>>>>>>
> > >>>>>>> How is this solving the race condition? By pushing the responsibility
> > >>>>>>> for stopping the data path to the application?
> > >>>>>>
> > >>>>>> Exactly, it becomes the application's responsibility to make sure the
> > >>>>>> data path is stopped/suspended before recovery continues.
> > >>>>>>
> > >>>>>
> > >>>>> From documentation of the feature:
> > >>>>>
> > >>>>> ``
> > >>>>> Because the PMD recovers automatically,
> > >>>>> the application can only sense that the data flow is disconnected for 
> > >>>>> a
> > >>>>> while and the control API returns an error in this period.
> > >>>>>
> > >>>>> In order to sense the error happening/recovering, as well as to 
> > >>>>> restore
> > >>>>> some additional configuration, three events are available:
> > >>>>> ``
> > >>>>>
> > >>>>> It looks like the initial design is to use events mainly to inform the
> > >>>>> application about what happened, and mainly for re-configuration.
> > >>>>>
> > >>>>> Although I don't disagree with involving the application, I am not sure
> > >>>>> that is part of the current design.
> > >>>>
> > >>>> I thought we all agreed that the initial design contains some fallacies
> > >>>> that need to be fixed, no?
> > >>>> The statement that, with the current rte_ethdev design, error recovery can
> > >>>> be done without interaction with the app (to stop/suspend the data/control
> > >>>> path) is the main one, I think.
> > >>>> It needs some interaction with the app layer, one way or another.
> > >>>>
> > >>>>>>>
> > >>>>>>> What if the application is not interested in recovery modes at all and
> > >>>>>>> has not registered any callback for the recovery?
> > >>>>>>
> > >>>>>>
> > >>>>>> Are you saying there is no way for the application to disable
> > >>>>>> automatic recovery in the PMD if it is not interested
> > >>>>>> (or can't fulfill the prerequisites for it)?
> > >>>>>> If so, then yes, it is a problem and we need to fix it.
> > >>>>>> I assumed that such a mechanism to disable unwanted events already
> > >>>>>> exists, but I can't find anything.
> > >>>>>> I wonder what would be the easiest way here - can the PMD make a decision
> > >>>>>> based on the callback return value, or do we need a new API to
> > >>>>>> enable/disable callbacks, or ...?
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>> As far as I can see, automatic recovery is not configurable by the app.
> > >>>>>
> > >>>>> But that is not all: the PMD sends events to the application, but it
> > >>>>> can't know whether the application is handling them or not, so with the
> > >>>>> current design the PMD can't rely on the app.
> > >>>>
> > >>>> Well, the PMD invokes a user-provided callback.
> > >>>> One way to fix that problem - if there is no callback provided,
> > >>>> or the callback returns an error code - the PMD can assume that recovery
> > >>>> should not be done.
> > >>>> That is probably not the best design choice, but at least it will allow us
> > >>>> to fix the problem without too many changes and without introducing new API.
> > >>>> That could be a sort of 'quick fix'.
> > >>>> In the meantime we can think about a new/better approach for that.
> > >>>>
> > >>>
> > >>> -rc2 for 23.03 is a few days away.
> > >>>
> > >>> What do you think about having a 'quick fix' for this release that modifies
> > >>> how the driver updates burst ops to prevent the race condition?
> >
> > By 'quick fix', do you mean only updating the function pointers (without the
> > rxq setting)?
> > The PMDs which announce support for "proactive error handling mode" already
> > do this.
> 
> Really sorry guys, I was too fast on the keyboard, and didn't properly read
> what Ferruh suggested.
> Reading it once again - no, I do not agree with that.
> It wouldn't fix anything, but would just add extra mess to the code.
> Sorry again for the wrong reply.
> Konstantin
> 

Thinking about the 'quick fix' once again: I think the patches Fengchengwen
already provided:
https://patchwork.dpdk.org/project/dpdk/list/?series=27201
are a much better approach.
I believe they should stop the race condition (and the crashes), given a
properly written callback.
If we still have time for it, I'd suggest one extra change in the PMD:
check that a recovery callback is installed, and if not, simply do not start
recovery at all.
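
To make the intended ordering concrete, here is a minimal, self-contained C
model of it. This is only a sketch and not DPDK code: the model_* enum and
types, burst_fn, dp_allowed, dp_in_burst, worker_poll_once() and
pmd_try_recover() are all hypothetical names invented for illustration. A
real application would register its handler with
rte_eth_dev_callback_register(), and a real PMD would switch the fast-path
pointers itself (for example via the rte_eth_fp_ops_setup() helper from the
patch). The sketch assumes a single worker thread and shows the callback
stopping the data path and waiting for the worker to leave its burst call
before the pointers are swapped, plus the "no callback, no recovery" check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the RTE_ETH_EVENT_* recovery events (hypothetical enum). */
enum model_event {
    MODEL_EVENT_ERR_RECOVERING,
    MODEL_EVENT_RECOVERY_SUCCESS,
    MODEL_EVENT_RECOVERY_FAILED,
};

typedef int (*model_event_cb)(enum model_event ev);
typedef int (*model_burst_fn)(void);

static int real_burst(void)  { return 32; } /* pretend 32 packets arrived */
static int dummy_burst(void) { return 0; }  /* dummy op: never touches HW */

static model_burst_fn burst_fn = real_burst; /* models the fast-path pointer */
static model_event_cb app_cb;                /* NULL: no callback installed */

static atomic_bool dp_allowed = true;   /* application-owned gate */
static atomic_bool dp_in_burst = false; /* worker is inside a burst call */

/* One iteration of the worker's poll loop. */
static int worker_poll_once(void)
{
    int n = 0;

    atomic_store(&dp_in_burst, true);   /* announce before checking gate */
    if (atomic_load(&dp_allowed))
        n = burst_fn();
    atomic_store(&dp_in_burst, false);
    return n;
}

/* Application callback: stop the data path, then wait until it is idle. */
static int app_event_cb(enum model_event ev)
{
    switch (ev) {
    case MODEL_EVENT_ERR_RECOVERING:
        atomic_store(&dp_allowed, false);
        while (atomic_load(&dp_in_burst))
            ; /* spin until the worker has left its burst call */
        break;
    case MODEL_EVENT_RECOVERY_SUCCESS:
        /* restore any additional configuration here, then resume */
        atomic_store(&dp_allowed, true);
        break;
    case MODEL_EVENT_RECOVERY_FAILED:
        break; /* leave the data path stopped */
    }
    return 0;
}

/* PMD side: report first, swap pointers after; skip recovery if no callback. */
static bool pmd_try_recover(bool hw_reset_ok)
{
    if (app_cb == NULL || app_cb(MODEL_EVENT_ERR_RECOVERING) != 0)
        return false;                   /* do not start recovery at all */

    /* Model-only sanity check: the callback must have closed the gate. */
    assert(!atomic_load(&dp_allowed));

    burst_fn = dummy_burst;             /* after reporting ERR_RECOVERING */
    if (!hw_reset_ok) {
        app_cb(MODEL_EVENT_RECOVERY_FAILED);
        return false;
    }
    burst_fn = real_burst;              /* before reporting RECOVERY_SUCCESS */
    app_cb(MODEL_EVENT_RECOVERY_SUCCESS);
    return true;
}
```

The worker sets dp_in_burst before it reads dp_allowed, and the callback
clears dp_allowed before it reads dp_in_burst; with sequentially consistent
atomics, at least one side always sees the other, so the pointer swap cannot
overlap a burst call in this model. With several worker threads, the single
flag would have to become one flag per worker.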

> >
> > >>>
> > >>> And plan a design update for the next release?
> > >> +1 on the overall approach.
> > >
> > > Yep, agree.
> >
> > Hope for a better solution.
> > Also, I notice that among all open-source software based on DPDK, only
> > openvswitch registers an RTE_ETH_EVENT_INTR_RESET callback.
> >
> > Therefore, I hope we can build a recovery framework at the DPDK SDK level that
> > is compatible with both the RTE_ETH_EVENT_INTR_RESET and
> > RTE_ETH_EVENT_ERR_RECOVERING mechanisms.
> >
> > >
> > >>
> > >>>
> > >>>
> > >>>>>
> > >>>>>>> I think the driver should not rely on the application for this, unless
> > >>>>>>> the application explicitly says (to the driver) that it is handling
> > >>>>>>> recovery; right now there is no way for the driver to know this.
> > >>>>>>
> > >>>>>> I think it is vice versa:
> > >>>>>> the application should not enable auto-recovery if it can't meet the
> > >>>>>> prerequisites for it (provide an appropriate callback).
> > >>>>>>
> > >>>>>
> > >>>>> I agree with the above; we are saying a similar thing from different
> > >>>>> perspectives.
> > >>>>
> > >>>> Ok, that's good we are on the same page.
> > >>>>
> > >>>>
> > >>>>>
> > >>>>>>
> > >>>>>>>
> > >>>>>>>>> Also, this patch introduces a driver-internal function,
> > >>>>>>>>> rte_eth_fp_ops_setup, which is used as a helper function for the PMD.
> > >>>>>>>>>
> > >>>>>>>>> [1]
> > >>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kal...@intel.com/
> > >>>>>>>>>
> > >>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> > >>>>>>>>> Cc: sta...@dpdk.org
> > >>>>>>>>>
> > >>>>>>>>> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com>
> > >>>>>>>>> ---
> > >>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> > >>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> > >>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> > >>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
> > >>>>>>>>> +++++++++++++++----------
> > >>>>>>>>>   lib/ethdev/version.map                  |  1 +
> > >>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> > >>>>>>>>>
> > >>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>> index c145a9066c..e380ff135a 100644
> > >>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> > >>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes 
> > >>>>>>>>> recovery
> > >>>>>>>>> in PASSIVE mode,
> > >>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
> > >>>>>>>>>   and only a small amount of work is required for the application.
> > >>>>>>>>>
> > >>>>>>>>> -During error detection and automatic recovery,
> > >>>>>>>>> -the PMD sets the data path pointers to dummy functions
> > >>>>>>>>> -(which will prevent the crash),
> > >>>>>>>>> -and also make sure the control path operations fail with a return
> > >>>>>>>>> code ``-EBUSY``.
> > >>>>>>>>> -
> > >>>>>>>>> -Because the PMD recovers automatically,
> > >>>>>>>>> -the application can only sense that the data flow is disconnected
> > >>>>>>>>> for a while
> > >>>>>>>>> -and the control API returns an error in this period.
> > >>>>>>>>> +During error detection and automatic recovery, the PMD sets the
> > >>>>>>>>> data path
> > >>>>>>>>> +pointers to dummy functions and also make sure the control path
> > >>>>>>>>> operations
> > >>>>>>>>> +failed with a return code ``-EBUSY``.
> > >>>>>>>>>
> > >>>>>>>>>   In order to sense the error happening/recovering,
> > >>>>>>>>>   as well as to restore some additional configuration,
> > >>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
> > >>>>>>>>>
> > >>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> > >>>>>>>>>      Notify the application that an error is detected
> > >>>>>>>>> -   and the recovery is being started.
> > >>>>>>>>> +   and the recovery is about to start.
> > >>>>>>>>>      Upon receiving the event, the application should not invoke
> > >>>>>>>>> -   any control path function until receiving
> > >>>>>>>>> +   any control and data path API until receiving
> > >>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> > >>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> > >>>>>>>>>
> > >>>>>>>>>   .. note::
> > >>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
> > >>>>>>>>>
> > >>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> > >>>>>>>>>      Notify the application that the recovery from error is 
> > >>>>>>>>> successful,
> > >>>>>>>>> -   the PMD already re-configures the port,
> > >>>>>>>>> -   and the effect is the same as a restart operation.
> > >>>>>>>>> +   the PMD already re-configures the port.
> > >>>>>>>>> +   The application should restore some additional configuration,
> > >>>>>>>>> and then
> > >>>>>>>>> +   enable data path API invocation.
> > >>>>>>>>>
> > >>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> > >>>>>>>>>      Notify the application that the recovery from error failed,
> > >>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c 
> > >>>>>>>>> b/lib/ethdev/ethdev_driver.c
> > >>>>>>>>> index 0be1e8ca04..f994653fe9 100644
> > >>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
> > >>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
> > >>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct 
> > >>>>>>>>> rte_eth_dev
> > >>>>>>>>> *dev, const char *ring_name,
> > >>>>>>>>>       return rc;
> > >>>>>>>>>   }
> > >>>>>>>>>
> > >>>>>>>>> +void
> > >>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> > >>>>>>>>> +{
> > >>>>>>>>> +    if (dev == NULL)
> > >>>>>>>>> +        return;
> > >>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, 
> > >>>>>>>>> dev);
> > >>>>>>>>> +}
> > >>>>>>>>> +
> > >>>>>>>>>   const struct rte_memzone *
> > >>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const 
> > >>>>>>>>> char
> > >>>>>>>>> *ring_name,
> > >>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
> > >>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h 
> > >>>>>>>>> b/lib/ethdev/ethdev_driver.h
> > >>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
> > >>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
> > >>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
> > >>>>>>>>> @@ -1621,6 +1621,16 @@ int
> > >>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> > >>>>>>>>> char *name,
> > >>>>>>>>>            uint16_t queue_id);
> > >>>>>>>>>
> > >>>>>>>>> +/**
> > >>>>>>>>> + * @internal
> > >>>>>>>>> + * Setup eth fast-path API to ethdev values.
> > >>>>>>>>> + *
> > >>>>>>>>> + * @param dev
> > >>>>>>>>> + *  Pointer to struct rte_eth_dev.
> > >>>>>>>>> + */
> > >>>>>>>>> +__rte_internal
> > >>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> > >>>>>>>>> +
> > >>>>>>>>>   /**
> > >>>>>>>>>    * @internal
> > >>>>>>>>>    * Atomically set the link status for the specific device.
> > >>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > >>>>>>>>> index 049641d57c..44ee7229c1 100644
> > >>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
> > >>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
> > >>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> > >>>>>>>>>        */
> > >>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> > >>>>>>>>>       /** Port recovering from a hardware or firmware error.
> > >>>>>>>>> -     * If PMD supports proactive error recovery,
> > >>>>>>>>> -     * it should trigger this event to notify application
> > >>>>>>>>> -     * that it detected an error and the recovery is being 
> > >>>>>>>>> started.
> > >>>>>>>>> -     * Upon receiving the event, the application should not 
> > >>>>>>>>> invoke
> > >>>>>>>>> any control path API
> > >>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> > >>>>>>>>> receiving
> > >>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> > >>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> > >>>>>>>>> -     * The PMD will set the data path pointers to dummy 
> > >>>>>>>>> functions,
> > >>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
> > >>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> > >>>>>>>>> -     * It means that the application cannot send or receive any
> > >>>>>>>>> packets
> > >>>>>>>>> -     * during this period.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * If PMD supports proactive error recovery, it should 
> > >>>>>>>>> trigger
> > >>>>>>>>> this
> > >>>>>>>>> +     * event to notify application that it detected an error and 
> > >>>>>>>>> the
> > >>>>>>>>> +     * recovery is about to start.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * Upon receiving the event, the application should not 
> > >>>>>>>>> invoke any
> > >>>>>>>>> +     * control and data path API until receiving
> > >>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or 
> > >>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED
> > >>>>>>>>> +     * event.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * Once this event is reported, the PMD will set the data 
> > >>>>>>>>> path
> > >>>>>>>>> pointers
> > >>>>>>>>> +     * to dummy functions, and re-set the data path pointers to 
> > >>>>>>>>> valid
> > >>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> > >>>>>>>>> event.
> > >>>>>>>>> +     *
> > >>>>>>>>>        * @note Before the PMD reports the recovery result,
> > >>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> > >>>>>>>>> again,
> > >>>>>>>>>        * because a larger error may occur during the recovery.
> > >>>>>>>>>        */
> > >>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> > >>>>>>>>>       /** Port recovers successfully from the error.
> > >>>>>>>>> -     * The PMD already re-configured the port,
> > >>>>>>>>> -     * and the effect is the same as a restart operation.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * The PMD already re-configured the port:
> > >>>>>>>>>        * a) The following operation will be retained: 
> > >>>>>>>>> (alphabetically)
> > >>>>>>>>>        *    - DCB configuration
> > >>>>>>>>>        *    - FEC configuration
> > >>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> > >>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> > >>>>>>>>>        * c) Any other configuration will not be stored
> > >>>>>>>>>        *    and will need to be re-configured.
> > >>>>>>>>> +     *
> > >>>>>>>>> +     * The application should restore some additional 
> > >>>>>>>>> configuration
> > >>>>>>>>> +     * (see above case b/c), and then enable data path API 
> > >>>>>>>>> invocation.
> > >>>>>>>>>        */
> > >>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> > >>>>>>>>>       /** Port recovery failed.
> > >>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> > >>>>>>>>> index 357d1a88c0..c273e0bdae 100644
> > >>>>>>>>> --- a/lib/ethdev/version.map
> > >>>>>>>>> +++ b/lib/ethdev/version.map
> > >>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
> > >>>>>>>>>       rte_eth_devices;
> > >>>>>>>>>       rte_eth_dma_zone_free;
> > >>>>>>>>>       rte_eth_dma_zone_reserve;
> > >>>>>>>>> +    rte_eth_fp_ops_setup;
> > >>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
> > >>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
> > >>>>>>>>>       rte_eth_hairpin_queue_peer_update;
> > >>>>>>>>> --
> > >>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.anan...@huawei.com>
> > >>>>>>>>
> > >>>>>>>>> 2.17.1
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>
> > >>>
