> >>>>>>>>>>> In the proactive error handling mode, the PMD will set the data 
> >>>>>>>>>>> path
> >>>>>>>>>>> pointers to dummy functions and then try recovery, in this period 
> >>>>>>>>>>> the
> >>>>>>>>>>> application may still be invoking the data path API. This will introduce 
> >>>>>>>>>>> a
> >>>>>>>>>>> race-condition with data path which may lead to crash [1].
> >>>>>>>>>>>
> >>>>>>>>>>> Although the PMD added delay after setting data path pointers to 
> >>>>>>>>>>> cover
> >>>>>>>>>>> the above race-condition, it reduces the probability, but it 
> >>>>>>>>>>> doesn't
> >>>>>>>>>>> solve the problem.
> >>>>>>>>>>>
> >>>>>>>>>>> To solve the race-condition problem fundamentally, the following
> >>>>>>>>>>> requirements are added:
> >>>>>>>>>>> 1. The PMD should set the data path pointers to dummy functions 
> >>>>>>>>>>> after
> >>>>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>>>>>>>>> 2. The application should stop data path API invocation when 
> >>>>>>>>>>> process
> >>>>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
> >>>>>>>>>>> 3. The PMD should set the data path pointers to valid functions 
> >>>>>>>>>>> before
> >>>>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>>>> 4. The application should enable data path API invocation when 
> >>>>>>>>>>> process
> >>>>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> How this is solving the race-condition, by pushing responsibility to
> >>>>>>>>> stop data path to application?
> >>>>>>>>
> >>>>>>>> Exactly, it becomes application responsibility to make sure 
> >>>>>>>> data-path is
> >>>>>>>> stopped/suspended before recovery will continue.
> >>>>>>>>
> >>>>>>>
> >>>>>>> From documentation of the feature:
> >>>>>>>
> >>>>>>> ``
> >>>>>>> Because the PMD recovers automatically,
> >>>>>>> the application can only sense that the data flow is disconnected for 
> >>>>>>> a
> >>>>>>> while and the control API returns an error in this period.
> >>>>>>>
> >>>>>>> In order to sense the error happening/recovering, as well as to 
> >>>>>>> restore
> >>>>>>> some additional configuration, three events are available:
> >>>>>>> ``
> >>>>>>>
> >>>>>>> It looks like the initial design is to use events mainly to inform the
> >>>>>>> application about what happened, and mainly for re-configuration.
> >>>>>>>
> >>>>>>> Although I don't disagree with involving the application, I am not sure
> >>>>>>> that is part of the current design.
> >>>>>>
> >>>>>> I thought we all agreed that the initial design contains some fallacies
> >>>>>> that need to be fixed, no?
> >>>>>> The statement that with the current rte_ethdev design error recovery can
> >>>>>> be done without interaction with the app (to stop/suspend data/control
> >>>>>> path) is the main one, I think.
> >>>>>> It needs some interaction with app layer, one way or another.
> >>>>>>
> >>>>>>>>>
> >>>>>>>>> What if application is not interested in recovery modes at all and 
> >>>>>>>>> not
> >>>>>>>>> registered any callback for the recovery?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Are you saying there is no way for application to disable
> >>>>>>>> automatic recovery in PMD if it is not interested
> >>>>>>>> (or can't fulfill the prerequisites for it)?
> >>>>>>>> If so, then yes it is a problem and we need to fix it.
> >>>>>>>> I assumed that such mechanism to disable unwanted events already 
> >>>>>>>> exists,
> >>>>>>>> but I can't find anything.
> >>>>>>>> Wonder what would be the easiest way here - can PMD make a decision
> >>>>>>>> based on callback return value, or do we need a new API to
> >>>>>>>> enable/disable callbacks, or ...?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>> As far as I can see automatic recovery is not configurable by app.
> >>>>>>>
> >>>>>>> But that is not all: the PMD sends events to the application, but it
> >>>>>>> can't know whether the application is handling them, so with the
> >>>>>>> current design the PMD can't rely on the app.
> >>>>>>
> >>>>>> Well, PMD invokes user provided callback.
> >>>>>> One way to fix that problem - if there is no callback provided,
> >>>>>> or callback returns an error code - PMD can assume that recovery
> >>>>>> should not be done.
> >>>>>> That is probably not the best design choice, but at least it will allow
> >>>>>> to fix the problem without too many changes and introducing new API.
> >>>>>> That could be sort of a 'quick fix'.
> >>>>>> In the meantime we can think about a new/better approach for that.
> >>>>>>
> >>>>>
> >>>>> -rc2 for 23.03 is a few days away.
> >>>>>
> >>>>> What do you think to have 'quick fix' as modifying how driver updates
> >>>>> burst ops to prevent the race condition, for this release?
> >>
> >> The 'quick fix', do you mean only update function pointer (without rxq 
> >> setting) ?
> >> Currently the PMDs which announced support "proactive error handling mode" 
> >> already
> >> do this.
> >>
> >
> > Yes.
> > I checked hns3, it does as you said: 'hns3_eth_dev_fp_ops_config()'
> > updates all fields in 'rte_eth_fp_ops', but only the function pointers seem
> > to change in the driver, so only function pointers end up being updated.
> >
> > The discussion about race condition started with patch [1], which
> > mentions a crash because of a race condition. Later in the discussion, the
> > recovery event was given as an example of where the race can occur, which is
> > why we are here.
> >
> > But after above info, although there is race condition and a bigger
> > update (that needs application involvement) is required for recovery
> > mechanism, there is no crash and NO 'quick fix' is required for recovery.
> >
> > @Konstantin, @Chengwen, can you please confirm above understanding is
> > correct?
> 
> Yes, that's what.

Yes, I think with Chengwen's patch the race condition problem should be fixed.
Though for that, the user has to provide a properly implemented callback.
What is not currently addressed: the user cannot disable this auto-recovery
procedure at will.
So if the user does not provide a proper callback, the recovery can still
proceed and the race can happen.
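
To make "properly implemented callback" a bit more concrete, here is a minimal
sketch of the handshake I have in mind between the event callback and one
polling thread. It is deliberately written without any DPDK headers: the enum
values are only stand-ins for the RTE_ETH_EVENT_* names, and struct app_port
and all app_* functions are hypothetical names made up for illustration. A real
application would register its callback with rte_eth_dev_callback_register()
and would wrap its rx/tx burst calls with the enter/exit helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-ins for the RTE_ETH_EVENT_* values; NOT the DPDK definitions. */
enum app_event {
    APP_EVENT_ERR_RECOVERING,
    APP_EVENT_RECOVERY_SUCCESS,
    APP_EVENT_RECOVERY_FAILED,
};

/* Hypothetical per-port state shared by the callback and one polling thread. */
struct app_port {
    atomic_bool rx_tx_allowed; /* written by the event callback */
    atomic_bool in_burst;      /* written by the polling thread */
};

void app_port_init(struct app_port *p)
{
    atomic_init(&p->rx_tx_allowed, true);
    atomic_init(&p->in_burst, false);
}

/* Polling thread: call before every burst; false means "skip the data path". */
bool app_burst_enter(struct app_port *p)
{
    /* Announce the burst first, then check the gate (both seq_cst). */
    atomic_store(&p->in_burst, true);
    if (!atomic_load(&p->rx_tx_allowed)) {
        atomic_store(&p->in_burst, false);
        return false;
    }
    return true;
}

/* Polling thread: call after every burst that was entered successfully. */
void app_burst_exit(struct app_port *p)
{
    atomic_store(&p->in_burst, false);
}

/* Event callback body: returns only once the data path is really quiet. */
int app_event_cb(struct app_port *p, enum app_event ev)
{
    switch (ev) {
    case APP_EVENT_ERR_RECOVERING:
        /* Close the gate, then wait for a burst that may be in flight. */
        atomic_store(&p->rx_tx_allowed, false);
        while (atomic_load(&p->in_burst))
            ; /* spin; a real application might pause or yield here */
        break;
    case APP_EVENT_RECOVERY_SUCCESS:
        /* Restore any extra configuration here, then reopen the gate. */
        atomic_store(&p->rx_tx_allowed, true);
        break;
    case APP_EVENT_RECOVERY_FAILED:
        /* Keep the gate closed; the port is not usable. */
        break;
    }
    return 0;
}
```

The important part is the ordering: the polling thread sets in_burst before it
reads rx_tx_allowed, and the callback clears rx_tx_allowed before it reads
in_burst, so at least one side always observes the other. With more than one
polling thread per port, one in_burst flag per thread would be needed.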

> 
> >
> >
> >
> > [1]
> > https://patches.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kal...@intel.com/
> >
> >>>>>
> >>>>> And plan a design update for the next release?
> >>>> +1 on the overall approach.
> >>>
> >>> Yep, agree.
> >>
> >> Hope for better solution.
> >> Also, I notice that only openvswitch (of all the open-source software 
> >> based on DPDK)
> >> registers an RTE_ETH_EVENT_INTR_RESET callback.
> >>
> >> Therefore, hope we build a recovery framework at the DPDK SDK level and be 
> >> compatible
> >> with RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanism.
> >>
> >>>
> >>>>
> >>>>>
> >>>>>
> >>>>>>>
> >>>>>>>>> I think driver should not rely on application for this, unless
> >>>>>>>>> application explicitly says (to driver) that it is handling 
> >>>>>>>>> recovery,
> >>>>>>>>> right now there is no way for driver to know this.
> >>>>>>>>
> >>>>>>>> I think it is vice versa:
> >>>>>>>> application should not enable auto-recovery if it can't meet
> >>>>>>>> the prerequisites for it (provide appropriate callback).
> >>>>>>>>
> >>>>>>>
> >>>>>>> I agree on above, we are saying similar thing in different 
> >>>>>>> perspective.
> >>>>>>
> >>>>>> Ok, that's good we are on the same page.
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>> Also, this patch introduce a driver internal function
> >>>>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
> >>>>>>>>>>>
> >>>>>>>>>>> [1]
> >>>>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kal...@intel.com/
> >>>>>>>>>>>
> >>>>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
> >>>>>>>>>>> Cc: sta...@dpdk.org
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com>
> >>>>>>>>>>> ---
> >>>>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
> >>>>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
> >>>>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
> >>>>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
> >>>>>>>>>>> +++++++++++++++----------
> >>>>>>>>>>>   lib/ethdev/version.map                  |  1 +
> >>>>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>>>> index c145a9066c..e380ff135a 100644
> >>>>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
> >>>>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes 
> >>>>>>>>>>> recovery
> >>>>>>>>>>> in PASSIVE mode,
> >>>>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
> >>>>>>>>>>>   and only a small amount of work is required for the application.
> >>>>>>>>>>>
> >>>>>>>>>>> -During error detection and automatic recovery,
> >>>>>>>>>>> -the PMD sets the data path pointers to dummy functions
> >>>>>>>>>>> -(which will prevent the crash),
> >>>>>>>>>>> -and also make sure the control path operations fail with a return
> >>>>>>>>>>> code ``-EBUSY``.
> >>>>>>>>>>> -
> >>>>>>>>>>> -Because the PMD recovers automatically,
> >>>>>>>>>>> -the application can only sense that the data flow is disconnected
> >>>>>>>>>>> for a while
> >>>>>>>>>>> -and the control API returns an error in this period.
> >>>>>>>>>>> +During error detection and automatic recovery, the PMD sets the
> >>>>>>>>>>> data path
> >>>>>>>>>>> +pointers to dummy functions and also make sure the control path
> >>>>>>>>>>> operations
> >>>>>>>>>>> +failed with a return code ``-EBUSY``.
> >>>>>>>>>>>
> >>>>>>>>>>>   In order to sense the error happening/recovering,
> >>>>>>>>>>>   as well as to restore some additional configuration,
> >>>>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
> >>>>>>>>>>>
> >>>>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
> >>>>>>>>>>>      Notify the application that an error is detected
> >>>>>>>>>>> -   and the recovery is being started.
> >>>>>>>>>>> +   and the recovery is about to start.
> >>>>>>>>>>>      Upon receiving the event, the application should not invoke
> >>>>>>>>>>> -   any control path function until receiving
> >>>>>>>>>>> +   any control and data path API until receiving
> >>>>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
> >>>>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
> >>>>>>>>>>>
> >>>>>>>>>>>   .. note::
> >>>>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
> >>>>>>>>>>>
> >>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
> >>>>>>>>>>>      Notify the application that the recovery from error is 
> >>>>>>>>>>> successful,
> >>>>>>>>>>> -   the PMD already re-configures the port,
> >>>>>>>>>>> -   and the effect is the same as a restart operation.
> >>>>>>>>>>> +   the PMD already re-configures the port.
> >>>>>>>>>>> +   The application should restore some additional configuration,
> >>>>>>>>>>> and then
> >>>>>>>>>>> +   enable data path API invocation.
> >>>>>>>>>>>
> >>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
> >>>>>>>>>>>      Notify the application that the recovery from error failed,
> >>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c 
> >>>>>>>>>>> b/lib/ethdev/ethdev_driver.c
> >>>>>>>>>>> index 0be1e8ca04..f994653fe9 100644
> >>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
> >>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
> >>>>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct 
> >>>>>>>>>>> rte_eth_dev
> >>>>>>>>>>> *dev, const char *ring_name,
> >>>>>>>>>>>       return rc;
> >>>>>>>>>>>   }
> >>>>>>>>>>>
> >>>>>>>>>>> +void
> >>>>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
> >>>>>>>>>>> +{
> >>>>>>>>>>> +    if (dev == NULL)
> >>>>>>>>>>> +        return;
> >>>>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, 
> >>>>>>>>>>> dev);
> >>>>>>>>>>> +}
> >>>>>>>>>>> +
> >>>>>>>>>>>   const struct rte_memzone *
> >>>>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const 
> >>>>>>>>>>> char
> >>>>>>>>>>> *ring_name,
> >>>>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
> >>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h 
> >>>>>>>>>>> b/lib/ethdev/ethdev_driver.h
> >>>>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
> >>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
> >>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
> >>>>>>>>>>> @@ -1621,6 +1621,16 @@ int
> >>>>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
> >>>>>>>>>>> char *name,
> >>>>>>>>>>>            uint16_t queue_id);
> >>>>>>>>>>>
> >>>>>>>>>>> +/**
> >>>>>>>>>>> + * @internal
> >>>>>>>>>>> + * Setup eth fast-path API to ethdev values.
> >>>>>>>>>>> + *
> >>>>>>>>>>> + * @param dev
> >>>>>>>>>>> + *  Pointer to struct rte_eth_dev.
> >>>>>>>>>>> + */
> >>>>>>>>>>> +__rte_internal
> >>>>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
> >>>>>>>>>>> +
> >>>>>>>>>>>   /**
> >>>>>>>>>>>    * @internal
> >>>>>>>>>>>    * Atomically set the link status for the specific device.
> >>>>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> >>>>>>>>>>> index 049641d57c..44ee7229c1 100644
> >>>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
> >>>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
> >>>>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
> >>>>>>>>>>>        */
> >>>>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
> >>>>>>>>>>>       /** Port recovering from a hardware or firmware error.
> >>>>>>>>>>> -     * If PMD supports proactive error recovery,
> >>>>>>>>>>> -     * it should trigger this event to notify application
> >>>>>>>>>>> -     * that it detected an error and the recovery is being 
> >>>>>>>>>>> started.
> >>>>>>>>>>> -     * Upon receiving the event, the application should not 
> >>>>>>>>>>> invoke
> >>>>>>>>>>> any control path API
> >>>>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
> >>>>>>>>>>> receiving
> >>>>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
> >>>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
> >>>>>>>>>>> -     * The PMD will set the data path pointers to dummy 
> >>>>>>>>>>> functions,
> >>>>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
> >>>>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> >>>>>>>>>>> -     * It means that the application cannot send or receive any
> >>>>>>>>>>> packets
> >>>>>>>>>>> -     * during this period.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * If PMD supports proactive error recovery, it should 
> >>>>>>>>>>> trigger
> >>>>>>>>>>> this
> >>>>>>>>>>> +     * event to notify application that it detected an error and 
> >>>>>>>>>>> the
> >>>>>>>>>>> +     * recovery is about to start.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * Upon receiving the event, the application should not 
> >>>>>>>>>>> invoke any
> >>>>>>>>>>> +     * control and data path API until receiving
> >>>>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or 
> >>>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED
> >>>>>>>>>>> +     * event.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * Once this event is reported, the PMD will set the data 
> >>>>>>>>>>> path
> >>>>>>>>>>> pointers
> >>>>>>>>>>> +     * to dummy functions, and re-set the data path pointers to 
> >>>>>>>>>>> valid
> >>>>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
> >>>>>>>>>>> event.
> >>>>>>>>>>> +     *
> >>>>>>>>>>>        * @note Before the PMD reports the recovery result,
> >>>>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
> >>>>>>>>>>> again,
> >>>>>>>>>>>        * because a larger error may occur during the recovery.
> >>>>>>>>>>>        */
> >>>>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
> >>>>>>>>>>>       /** Port recovers successfully from the error.
> >>>>>>>>>>> -     * The PMD already re-configured the port,
> >>>>>>>>>>> -     * and the effect is the same as a restart operation.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * The PMD already re-configured the port:
> >>>>>>>>>>>        * a) The following operation will be retained: 
> >>>>>>>>>>> (alphabetically)
> >>>>>>>>>>>        *    - DCB configuration
> >>>>>>>>>>>        *    - FEC configuration
> >>>>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
> >>>>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
> >>>>>>>>>>>        * c) Any other configuration will not be stored
> >>>>>>>>>>>        *    and will need to be re-configured.
> >>>>>>>>>>> +     *
> >>>>>>>>>>> +     * The application should restore some additional 
> >>>>>>>>>>> configuration
> >>>>>>>>>>> +     * (see above case b/c), and then enable data path API 
> >>>>>>>>>>> invocation.
> >>>>>>>>>>>        */
> >>>>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
> >>>>>>>>>>>       /** Port recovery failed.
> >>>>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
> >>>>>>>>>>> index 357d1a88c0..c273e0bdae 100644
> >>>>>>>>>>> --- a/lib/ethdev/version.map
> >>>>>>>>>>> +++ b/lib/ethdev/version.map
> >>>>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
> >>>>>>>>>>>       rte_eth_devices;
> >>>>>>>>>>>       rte_eth_dma_zone_free;
> >>>>>>>>>>>       rte_eth_dma_zone_reserve;
> >>>>>>>>>>> +    rte_eth_fp_ops_setup;
> >>>>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
> >>>>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
> >>>>>>>>>>>       rte_eth_hairpin_queue_peer_update;
> >>>>>>>>>>> --
> >>>>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.anan...@huawei.com>
> >>>>>>>>>>
> >>>>>>>>>>> 2.17.1
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >
> > .
> >
