Re: [PATCH 1/5] ethdev: fix race-condition of proactive error handling mode

fengchengwen Wed, 08 Mar 2023 17:00:06 -0800


On 2023/3/8 9:09, Honnappa Nagarahalli wrote:
> <snip>
> 
>>>>>>>
>>>>>
>>>>> Is there any reason not to design this in the same way as
>>>>> 'rte_eth_dev_reset'? Why does the PMD have to recover by itself?
>>>>
>>>> I suppose it is a question for the authors of original patch...
>>> Appreciate if the authors could comment on this.
>>
>> The main cause is a hardware implementation limit. I will try to explain
>> from the hns3 PMD's point of view.
>> For a global reset, all the functions need to respond within a certain
>> period of time; otherwise, the reset will fail. The reset also requires a
>> few steps (all of which may take a long time).
>>
>> With multiple functions in one DPDK process, when a global reset is
>> triggered, rte_eth_dev_reset does not cover this scenario:
>> 1. Each port reports RTE_ETH_EVENT_INTR_RESET in the interrupt thread.
>> 2. The application callback is then invoked, but because the callbacks
>>     run in the same thread and each port's recovery takes a long time,
>>     the later ports fail to reset.
> If the design were to introduce RTE_ETH_EVENT_INTR_RECOVER and 
> rte_eth_dev_recover, what problems do you see?

I see no difference between 'RTE_ETH_EVENT_INTR_RECOVER and
rte_eth_dev_recover' and the RTE_ETH_EVENT_INTR_RESET mechanism.
Could you give more detail?
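
To make the scenario quoted above concrete, here is a minimal sketch of the
timing problem. It is only a model, not DPDK or hns3 code: the struct, the
function names, and all numbers are hypothetical, and it assumes that a port
recovering on its own is not delayed by the other ports.

```c
#include <assert.h>

/* Hypothetical model of a global reset; not part of any DPDK API. */
struct reset_model {
	unsigned int nb_ports;       /* functions (ports) in one process */
	unsigned int recover_ms;     /* time one port's recovery takes */
	unsigned int hw_deadline_ms; /* window in which each function must respond */
};

/*
 * Callbacks run back to back in a single interrupt thread, so port i
 * only finishes after (i + 1) * recover_ms. Returns the number of ports
 * that finish after the hardware deadline.
 */
unsigned int
count_late_ports_serial(const struct reset_model *m)
{
	unsigned int late = 0;
	unsigned int elapsed = 0;
	unsigned int i;

	assert(m != 0);
	for (i = 0; i < m->nb_ports; i++) {
		elapsed += m->recover_ms;
		if (elapsed > m->hw_deadline_ms)
			late++;
	}
	return late;
}

/*
 * Assumption: each port recovers independently, so every port finishes
 * after recover_ms regardless of how many ports there are.
 */
unsigned int
count_late_ports_independent(const struct reset_model *m)
{
	assert(m != 0);
	return (m->recover_ms > m->hw_deadline_ms) ? m->nb_ports : 0;
}
```

With 4 ports, 100 ms per recovery and a 250 ms window (made-up values), the
serial model leaves the last two ports late, while the independent model
leaves none late.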

> 
>>
>>>
>>>>
>>>>> We could have a similar API 'rte_eth_dev_recover' to do the recovery
>>>>> functionality.
>>>>
>>>> I suppose such approach is also possible.
>>>> Personally I am fine with both ways: either the existing one or what
>>>> you propose, as long as we fix the existing race condition.
>>>> What is good about your suggestion is that we probably don't need
>>>> to worry about how to let the user enable/disable auto-recovery
>>>> inside the PMD.
>>>>
>>>> Konstantin
>>>>
>>>
