On 3/7/2023 8:25 AM, fengchengwen wrote:
> 
> 
> On 2023/3/6 19:13, Konstantin Ananyev wrote:
>>
>>
>>>>>>>>>> In the proactive error handling mode, the PMD will set the data path
>>>>>>>>>> pointers to dummy functions and then try recovery, in this period the
>>>>>>>>>> application may still invoking data path API. This will introduce a
>>>>>>>>>> race-condition with data path which may lead to crash [1].
>>>>>>>>>>
>>>>>>>>>> Although the PMD added delay after setting data path pointers to 
>>>>>>>>>> cover
>>>>>>>>>> the above race-condition, it reduces the probability, but it doesn't
>>>>>>>>>> solve the problem.
>>>>>>>>>>
>>>>>>>>>> To solve the race-condition problem fundamentally, the following
>>>>>>>>>> requirements are added:
>>>>>>>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>>>>>>>     report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>>>>>> 2. The application should stop data path API invocation when process
>>>>>>>>>>     the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>>>>>>>> 3. The PMD should set the data path pointers to valid functions 
>>>>>>>>>> before
>>>>>>>>>>     report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>> 4. The application should enable data path API invocation when 
>>>>>>>>>> process
>>>>>>>>>>     the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>>
>>>>>>>>
>>>>>>>> How this is solving the race-condition, by pushing responsibility to
>>>>>>>> stop data path to application?
>>>>>>>
>>>>>>> Exactly, it becomes application responsibility to make sure data-path is
>>>>>>> stopped/suspended before recovery will continue.
>>>>>>>
>>>>>>
>>>>>> From documentation of the feature:
>>>>>>
>>>>>> ``
>>>>>> Because the PMD recovers automatically,
>>>>>> the application can only sense that the data flow is disconnected for a
>>>>>> while and the control API returns an error in this period.
>>>>>>
>>>>>> In order to sense the error happening/recovering, as well as to restore
>>>>>> some additional configuration, three events are available:
>>>>>> ``
>>>>>>
>>>>>> It looks like initial design is to use events mainly inform application
>>>>>> about what happened and mainly for re-configuration.
>>>>>>
>>>>>> Although I am don't disagree to involve the application, I am not sure
>>>>>> that is part of current design.
>>>>>
>>>>> I thought we all agreed that initial design contain some fallacies that
>>>>> need to fixed, no?
>>>>> Statement that with current rte_ethdev design error recovery can be done
>>>>> without interaction with the app (to stop/suspend data/control path)
>>>>> is the main one I think.
>>>>> It needs some interaction with app layer, one way or another.
>>>>>
>>>>>>>>
>>>>>>>> What if application is not interested in recovery modes at all and not
>>>>>>>> registered any callback for the recovery?
>>>>>>>
>>>>>>>
>>>>>>> Are you saying there is no way for application to disable
>>>>>>> automatic recovery in PMD if it is not interested
>>>>>>> (or can't full-fill per-requesties for it)?
>>>>>>> If so, then yes it is a problem and we need to fix it.
>>>>>>> I assumed that such mechanism to disable unwanted events already exists,
>>>>>>> but I can't find anything.
>>>>>>> Wonder what would be the easiest way here - can PMD make a decision
>>>>>>> based on callback return value, or do we need a new API to
>>>>>>> enable/disable callbacks, or ...?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> As far as I can see automatic recovery is not configurable by app.
>>>>>>
>>>>>> But that is not all, PMD sends events to application but PMD can't know
>>>>>> if application is handling them or not, so with current design PMD can't
>>>>>> rely on to app.
>>>>>
>>>>> Well, PMD invokes user provided callback.
>>>>> One way to fix that problem - if there is no callback provided,
>>>>> or callback returns an error code - PMD can assume that recovery
>>>>> should not be done.
>>>>> That is probably not the best design choice, but at least it will allow
>>>>> to fix the problem without too many changes and introducing new API.
>>>>> That could be sort of a 'quick fix'.
>>>>> In a meanwhile we can think about new/better approach for that.
>>>>>
>>>>
>>>> -rc2 for 23.03 is a few days away.
>>>>
>>>> What do you think to have 'quick fix' as modifying how driver updates
>>>> burst ops to prevent the race condition, for this release?
> 
> The 'quick fix', do you mean only update function pointer (without rxq 
> setting) ?
> Currently the PMDs which announced support "proactive error handling mode" 
> already
> do this.
> 

Yes.
I checked hns3, it does as you said, hns3_eth_dev_fp_ops_config()'
updates all fields in 'rte_eth_fp_ops' but only function pointer seems
changed in the driver, resulting only function pointers to be updated.

The discussion about race condition started with patch [1], which
mentions a crash because of a race condition. Later in discussions,
recovery event given as a sample for where the race can occur, that is
why we are here.

But after above info, although there is race condition and a bigger
update (that needs application involvement) is required for recovery
mechanism, there is no crash and NO 'quick fix' is required for recovery.

@Konstantin, @Chengwen, can you please confirm above understanding is
correct?



[1]
https://patches.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kal...@intel.com/

>>>>
>>>> And plan a design update for the next release?
>>> +1 on the overall approach.
>>
>> Yep, agree.
> 
> Hope for better solution.
> And also, I notice only the openvswitch (from all open-source software which 
> based-on DPDK)
> registers RTE_ETH_EVENT_INTR_RESET callback .
> 
> Therefore, hope we build a recovery framework at the DPDK SDK level and be 
> compatible
> with RTE_ETH_EVENT_INTR_RESET and RTE_ETH_EVENT_ERR_RECOVERING mechanism.
> 
>>  
>>>
>>>>
>>>>
>>>>>>
>>>>>>>> I think driver should not rely on application for this, unless
>>>>>>>> application explicitly says (to driver) that it is handling recovery,
>>>>>>>> right now there is no way for driver to know this.
>>>>>>>
>>>>>>> I think it is visa-versa:
>>>>>>> application should not enable auto-recovery if it can't meet
>>>>>>> per-requeststies for it (provide appropriate callback).
>>>>>>>
>>>>>>
>>>>>> I agree on above, we are saying similar thing in different perspective.
>>>>>
>>>>> Ok, that's good we are on the same page.
>>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>> Also, this patch introduce a driver internal function
>>>>>>>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kal...@intel.com/
>>>>>>>>>>
>>>>>>>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>>>>>>>> Cc: sta...@dpdk.org
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com>
>>>>>>>>>> ---
>>>>>>>>>>   doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>>>>>>>   lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>>>>>>>   lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>>>>>>>   lib/ethdev/rte_ethdev.h                 | 32
>>>>>>>>>> +++++++++++++++----------
>>>>>>>>>>   lib/ethdev/version.map                  |  1 +
>>>>>>>>>>   5 files changed, 46 insertions(+), 25 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>> index c145a9066c..e380ff135a 100644
>>>>>>>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>>>>>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery
>>>>>>>>>> in PASSIVE mode,
>>>>>>>>>>   the PMD automatically recovers from error in PROACTIVE mode,
>>>>>>>>>>   and only a small amount of work is required for the application.
>>>>>>>>>>
>>>>>>>>>> -During error detection and automatic recovery,
>>>>>>>>>> -the PMD sets the data path pointers to dummy functions
>>>>>>>>>> -(which will prevent the crash),
>>>>>>>>>> -and also make sure the control path operations fail with a return
>>>>>>>>>> code ``-EBUSY``.
>>>>>>>>>> -
>>>>>>>>>> -Because the PMD recovers automatically,
>>>>>>>>>> -the application can only sense that the data flow is disconnected
>>>>>>>>>> for a while
>>>>>>>>>> -and the control API returns an error in this period.
>>>>>>>>>> +During error detection and automatic recovery, the PMD sets the
>>>>>>>>>> data path
>>>>>>>>>> +pointers to dummy functions and also make sure the control path
>>>>>>>>>> operations
>>>>>>>>>> +failed with a return code ``-EBUSY``.
>>>>>>>>>>
>>>>>>>>>>   In order to sense the error happening/recovering,
>>>>>>>>>>   as well as to restore some additional configuration,
>>>>>>>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>>>>>>>
>>>>>>>>>>   ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>>>>>>>      Notify the application that an error is detected
>>>>>>>>>> -   and the recovery is being started.
>>>>>>>>>> +   and the recovery is about to start.
>>>>>>>>>>      Upon receiving the event, the application should not invoke
>>>>>>>>>> -   any control path function until receiving
>>>>>>>>>> +   any control and data path API until receiving
>>>>>>>>>>      ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or
>>>>>>>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>>>>>>>
>>>>>>>>>>   .. note::
>>>>>>>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>>>>>>>
>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>>>>>>>      Notify the application that the recovery from error is 
>>>>>>>>>> successful,
>>>>>>>>>> -   the PMD already re-configures the port,
>>>>>>>>>> -   and the effect is the same as a restart operation.
>>>>>>>>>> +   the PMD already re-configures the port.
>>>>>>>>>> +   The application should restore some additional configuration,
>>>>>>>>>> and then
>>>>>>>>>> +   enable data path API invocation.
>>>>>>>>>>
>>>>>>>>>>   ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>>>>>>>      Notify the application that the recovery from error failed,
>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>>>>>>>> index 0be1e8ca04..f994653fe9 100644
>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.c
>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>>>>>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev
>>>>>>>>>> *dev, const char *ring_name,
>>>>>>>>>>       return rc;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>> +void
>>>>>>>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>>>>>>>> +{
>>>>>>>>>> +    if (dev == NULL)
>>>>>>>>>> +        return;
>>>>>>>>>> +    eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>   const struct rte_memzone *
>>>>>>>>>>   rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char
>>>>>>>>>> *ring_name,
>>>>>>>>>>                uint16_t queue_id, size_t size, unsigned int align,
>>>>>>>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>>>>>>>> index 2c9d615fb5..0d964d1f67 100644
>>>>>>>>>> --- a/lib/ethdev/ethdev_driver.h
>>>>>>>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>>>>>>>> @@ -1621,6 +1621,16 @@ int
>>>>>>>>>>   rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const
>>>>>>>>>> char *name,
>>>>>>>>>>            uint16_t queue_id);
>>>>>>>>>>
>>>>>>>>>> +/**
>>>>>>>>>> + * @internal
>>>>>>>>>> + * Setup eth fast-path API to ethdev values.
>>>>>>>>>> + *
>>>>>>>>>> + * @param dev
>>>>>>>>>> + *  Pointer to struct rte_eth_dev.
>>>>>>>>>> + */
>>>>>>>>>> +__rte_internal
>>>>>>>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>>>>>>>> +
>>>>>>>>>>   /**
>>>>>>>>>>    * @internal
>>>>>>>>>>    * Atomically set the link status for the specific device.
>>>>>>>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>>>>>>>> index 049641d57c..44ee7229c1 100644
>>>>>>>>>> --- a/lib/ethdev/rte_ethdev.h
>>>>>>>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>>>>>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>>>>>>>        */
>>>>>>>>>>       RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>>>>>>>       /** Port recovering from a hardware or firmware error.
>>>>>>>>>> -     * If PMD supports proactive error recovery,
>>>>>>>>>> -     * it should trigger this event to notify application
>>>>>>>>>> -     * that it detected an error and the recovery is being started.
>>>>>>>>>> -     * Upon receiving the event, the application should not invoke
>>>>>>>>>> any control path API
>>>>>>>>>> -     * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until
>>>>>>>>>> receiving
>>>>>>>>>> -     * RTE_ETH_EVENT_RECOVERY_SUCCESS or
>>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED event.
>>>>>>>>>> -     * The PMD will set the data path pointers to dummy functions,
>>>>>>>>>> -     * and re-set the data path pointers to non-dummy functions
>>>>>>>>>> -     * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>>>>>>> -     * It means that the application cannot send or receive any
>>>>>>>>>> packets
>>>>>>>>>> -     * during this period.
>>>>>>>>>> +     *
>>>>>>>>>> +     * If PMD supports proactive error recovery, it should trigger
>>>>>>>>>> this
>>>>>>>>>> +     * event to notify application that it detected an error and the
>>>>>>>>>> +     * recovery is about to start.
>>>>>>>>>> +     *
>>>>>>>>>> +     * Upon receiving the event, the application should not invoke 
>>>>>>>>>> any
>>>>>>>>>> +     * control and data path API until receiving
>>>>>>>>>> +     * RTE_ETH_EVENT_RECOVERY_SUCCESS or 
>>>>>>>>>> RTE_ETH_EVENT_RECOVERY_FAILED
>>>>>>>>>> +     * event.
>>>>>>>>>> +     *
>>>>>>>>>> +     * Once this event is reported, the PMD will set the data path
>>>>>>>>>> pointers
>>>>>>>>>> +     * to dummy functions, and re-set the data path pointers to 
>>>>>>>>>> valid
>>>>>>>>>> +     * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS
>>>>>>>>>> event.
>>>>>>>>>> +     *
>>>>>>>>>>        * @note Before the PMD reports the recovery result,
>>>>>>>>>>        * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event
>>>>>>>>>> again,
>>>>>>>>>>        * because a larger error may occur during the recovery.
>>>>>>>>>>        */
>>>>>>>>>>       RTE_ETH_EVENT_ERR_RECOVERING,
>>>>>>>>>>       /** Port recovers successfully from the error.
>>>>>>>>>> -     * The PMD already re-configured the port,
>>>>>>>>>> -     * and the effect is the same as a restart operation.
>>>>>>>>>> +     *
>>>>>>>>>> +     * The PMD already re-configured the port:
>>>>>>>>>>        * a) The following operation will be retained: 
>>>>>>>>>> (alphabetically)
>>>>>>>>>>        *    - DCB configuration
>>>>>>>>>>        *    - FEC configuration
>>>>>>>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>>>>>>>        *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>>>>>>>        * c) Any other configuration will not be stored
>>>>>>>>>>        *    and will need to be re-configured.
>>>>>>>>>> +     *
>>>>>>>>>> +     * The application should restore some additional configuration
>>>>>>>>>> +     * (see above case b/c), and then enable data path API 
>>>>>>>>>> invocation.
>>>>>>>>>>        */
>>>>>>>>>>       RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>>>>>>>       /** Port recovery failed.
>>>>>>>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>>>>>>>> index 357d1a88c0..c273e0bdae 100644
>>>>>>>>>> --- a/lib/ethdev/version.map
>>>>>>>>>> +++ b/lib/ethdev/version.map
>>>>>>>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>>>>>>>       rte_eth_devices;
>>>>>>>>>>       rte_eth_dma_zone_free;
>>>>>>>>>>       rte_eth_dma_zone_reserve;
>>>>>>>>>> +    rte_eth_fp_ops_setup;
>>>>>>>>>>       rte_eth_hairpin_queue_peer_bind;
>>>>>>>>>>       rte_eth_hairpin_queue_peer_unbind;
>>>>>>>>>>       rte_eth_hairpin_queue_peer_update;
>>>>>>>>>> --
>>>>>>>>>   Acked-by: Konstantin Ananyev <konstantin.anan...@huawei.com>
>>>>>>>>>
>>>>>>>>>> 2.17.1
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>

Reply via email to