On 3/6/2023 1:41 AM, fengchengwen wrote:
> On 2023/3/4 0:51, Ferruh Yigit wrote:
>> On 3/2/2023 12:08 PM, Konstantin Ananyev wrote:
>>>
>>>> In the proactive error handling mode, the PMD will set the data path
>>>> pointers to dummy functions and then try recovery, in this period the
>>>> application may still invoking data path API. This will introduce a
>>>> race-condition with data path which may lead to crash [1].
>>>>
>>>> Although the PMD added delay after setting data path pointers to cover
>>>> the above race-condition, it reduces the probability, but it doesn't
>>>> solve the problem.
>>>>
>>>> To solve the race-condition problem fundamentally, the following
>>>> requirements are added:
>>>> 1. The PMD should set the data path pointers to dummy functions after
>>>>    report RTE_ETH_EVENT_ERR_RECOVERING event.
>>>> 2. The application should stop data path API invocation when process
>>>>    the RTE_ETH_EVENT_ERR_RECOVERING event.
>>>> 3. The PMD should set the data path pointers to valid functions before
>>>>    report RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> 4. The application should enable data path API invocation when process
>>>>    the RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>>
>>
>> How this is solving the race-condition, by pushing responsibility to
>> stop data path to application?
> 
> Yes, I think it's more practical to collaborate with application.
> 
> The application will control API invocation (including control and data path),
> From a DPDK SDK perspective, it has a God perspective.
> 
>>
>> What if application is not interested in recovery modes at all and not
>> registered any callback for the recovery?
> 
> There's probably race-condition which may lead to crash, because DPDK worker
> threads runs busyloop and located on isolated core, and also PMDs add delay 
> time,
> the actual probability of occurence is very very low, at least for HNS3 pmd it
> has not run out for at least four years.
> 
>>
>> I think driver should not rely on application for this, unless
>> application explicitly says (to driver) that it is handling recovery,
> 
> If application register the event callback, the PMD could deduce that 
> application will know this.
> If application not register, then PMD will recovery itself and maybe 
> race-condition.
> 

If application support is required (that makes sense as you mentioned),
in that case I think application should explicitly enable this feature.

>> right now there is no way for driver to know this.
>>
>>
>>>> Also, this patch introduce a driver internal function
>>>> rte_eth_fp_ops_setup which used as an help function for PMD.
>>>>
>>>> [1] 
>>>> http://patchwork.dpdk.org/project/dpdk/patch/20230220060839.1267349-2-ashok.k.kal...@intel.com/
>>>>
>>>> Fixes: eb0d471a8941 ("ethdev: add proactive error handling mode")
>>>> Cc: sta...@dpdk.org
>>>>
>>>> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com>
>>>> ---
>>>>  doc/guides/prog_guide/poll_mode_drv.rst | 20 +++++++---------
>>>>  lib/ethdev/ethdev_driver.c              |  8 +++++++
>>>>  lib/ethdev/ethdev_driver.h              | 10 ++++++++
>>>>  lib/ethdev/rte_ethdev.h                 | 32 +++++++++++++++----------
>>>>  lib/ethdev/version.map                  |  1 +
>>>>  5 files changed, 46 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/doc/guides/prog_guide/poll_mode_drv.rst 
>>>> b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> index c145a9066c..e380ff135a 100644
>>>> --- a/doc/guides/prog_guide/poll_mode_drv.rst
>>>> +++ b/doc/guides/prog_guide/poll_mode_drv.rst
>>>> @@ -638,14 +638,9 @@ different from the application invokes recovery in 
>>>> PASSIVE mode,
>>>>  the PMD automatically recovers from error in PROACTIVE mode,
>>>>  and only a small amount of work is required for the application.
>>>>
>>>> -During error detection and automatic recovery,
>>>> -the PMD sets the data path pointers to dummy functions
>>>> -(which will prevent the crash),
>>>> -and also make sure the control path operations fail with a return code 
>>>> ``-EBUSY``.
>>>> -
>>>> -Because the PMD recovers automatically,
>>>> -the application can only sense that the data flow is disconnected for a 
>>>> while
>>>> -and the control API returns an error in this period.
>>>> +During error detection and automatic recovery, the PMD sets the data path
>>>> +pointers to dummy functions and also make sure the control path operations
>>>> +failed with a return code ``-EBUSY``.
>>>>
>>>>  In order to sense the error happening/recovering,
>>>>  as well as to restore some additional configuration,
>>>> @@ -653,9 +648,9 @@ three events are available:
>>>>
>>>>  ``RTE_ETH_EVENT_ERR_RECOVERING``
>>>>     Notify the application that an error is detected
>>>> -   and the recovery is being started.
>>>> +   and the recovery is about to start.
>>>>     Upon receiving the event, the application should not invoke
>>>> -   any control path function until receiving
>>>> +   any control and data path API until receiving
>>>>     ``RTE_ETH_EVENT_RECOVERY_SUCCESS`` or 
>>>> ``RTE_ETH_EVENT_RECOVERY_FAILED`` event.
>>>>
>>>>  .. note::
>>>> @@ -666,8 +661,9 @@ three events are available:
>>>>
>>>>  ``RTE_ETH_EVENT_RECOVERY_SUCCESS``
>>>>     Notify the application that the recovery from error is successful,
>>>> -   the PMD already re-configures the port,
>>>> -   and the effect is the same as a restart operation.
>>>> +   the PMD already re-configures the port.
>>>> +   The application should restore some additional configuration, and then
>>>> +   enable data path API invocation.
>>>>
>>>>  ``RTE_ETH_EVENT_RECOVERY_FAILED``
>>>>     Notify the application that the recovery from error failed,
>>>> diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
>>>> index 0be1e8ca04..f994653fe9 100644
>>>> --- a/lib/ethdev/ethdev_driver.c
>>>> +++ b/lib/ethdev/ethdev_driver.c
>>>> @@ -515,6 +515,14 @@ rte_eth_dma_zone_free(const struct rte_eth_dev *dev, 
>>>> const char *ring_name,
>>>>    return rc;
>>>>  }
>>>>
>>>> +void
>>>> +rte_eth_fp_ops_setup(struct rte_eth_dev *dev)
>>>> +{
>>>> +  if (dev == NULL)
>>>> +          return;
>>>> +  eth_dev_fp_ops_setup(rte_eth_fp_ops + dev->data->port_id, dev);
>>>> +}
>>>> +
>>>>  const struct rte_memzone *
>>>>  rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char 
>>>> *ring_name,
>>>>                     uint16_t queue_id, size_t size, unsigned int align,
>>>> diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
>>>> index 2c9d615fb5..0d964d1f67 100644
>>>> --- a/lib/ethdev/ethdev_driver.h
>>>> +++ b/lib/ethdev/ethdev_driver.h
>>>> @@ -1621,6 +1621,16 @@ int
>>>>  rte_eth_dma_zone_free(const struct rte_eth_dev *eth_dev, const char *name,
>>>>             uint16_t queue_id);
>>>>
>>>> +/**
>>>> + * @internal
>>>> + * Setup eth fast-path API to ethdev values.
>>>> + *
>>>> + * @param dev
>>>> + *  Pointer to struct rte_eth_dev.
>>>> + */
>>>> +__rte_internal
>>>> +void rte_eth_fp_ops_setup(struct rte_eth_dev *dev);
>>>> +
>>>>  /**
>>>>   * @internal
>>>>   * Atomically set the link status for the specific device.
>>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>>> index 049641d57c..44ee7229c1 100644
>>>> --- a/lib/ethdev/rte_ethdev.h
>>>> +++ b/lib/ethdev/rte_ethdev.h
>>>> @@ -3944,25 +3944,28 @@ enum rte_eth_event_type {
>>>>     */
>>>>    RTE_ETH_EVENT_RX_AVAIL_THRESH,
>>>>    /** Port recovering from a hardware or firmware error.
>>>> -   * If PMD supports proactive error recovery,
>>>> -   * it should trigger this event to notify application
>>>> -   * that it detected an error and the recovery is being started.
>>>> -   * Upon receiving the event, the application should not invoke any 
>>>> control path API
>>>> -   * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>>>> -   * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED 
>>>> event.
>>>> -   * The PMD will set the data path pointers to dummy functions,
>>>> -   * and re-set the data path pointers to non-dummy functions
>>>> -   * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> -   * It means that the application cannot send or receive any packets
>>>> -   * during this period.
>>>> +   *
>>>> +   * If PMD supports proactive error recovery, it should trigger this
>>>> +   * event to notify application that it detected an error and the
>>>> +   * recovery is about to start.
>>>> +   *
>>>> +   * Upon receiving the event, the application should not invoke any
>>>> +   * control and data path API until receiving
>>>> +   * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED
>>>> +   * event.
>>>> +   *
>>>> +   * Once this event is reported, the PMD will set the data path pointers
>>>> +   * to dummy functions, and re-set the data path pointers to valid
>>>> +   * functions before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
>>>> +   *
>>>>     * @note Before the PMD reports the recovery result,
>>>>     * the PMD may report the RTE_ETH_EVENT_ERR_RECOVERING event again,
>>>>     * because a larger error may occur during the recovery.
>>>>     */
>>>>    RTE_ETH_EVENT_ERR_RECOVERING,
>>>>    /** Port recovers successfully from the error.
>>>> -   * The PMD already re-configured the port,
>>>> -   * and the effect is the same as a restart operation.
>>>> +   *
>>>> +   * The PMD already re-configured the port:
>>>>     * a) The following operation will be retained: (alphabetically)
>>>>     *    - DCB configuration
>>>>     *    - FEC configuration
>>>> @@ -3989,6 +3992,9 @@ enum rte_eth_event_type {
>>>>     *      (@see RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP)
>>>>     * c) Any other configuration will not be stored
>>>>     *    and will need to be re-configured.
>>>> +   *
>>>> +   * The application should restore some additional configuration
>>>> +   * (see above case b/c), and then enable data path API invocation.
>>>>     */
>>>>    RTE_ETH_EVENT_RECOVERY_SUCCESS,
>>>>    /** Port recovery failed.
>>>> diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
>>>> index 357d1a88c0..c273e0bdae 100644
>>>> --- a/lib/ethdev/version.map
>>>> +++ b/lib/ethdev/version.map
>>>> @@ -320,6 +320,7 @@ INTERNAL {
>>>>    rte_eth_devices;
>>>>    rte_eth_dma_zone_free;
>>>>    rte_eth_dma_zone_reserve;
>>>> +  rte_eth_fp_ops_setup;
>>>>    rte_eth_hairpin_queue_peer_bind;
>>>>    rte_eth_hairpin_queue_peer_unbind;
>>>>    rte_eth_hairpin_queue_peer_update;
>>>> --
>>>  
>>> Acked-by: Konstantin Ananyev <konstantin.anan...@huawei.com>
>>>
>>>> 2.17.1
>>>
>>
>> .
>>

Reply via email to