On 5/23/2023 2:45 AM, Deng, KaiwenX wrote:
> 
> 
>> -----Original Message-----
>> From: Deng, KaiwenX
>> Sent: Friday, May 5, 2023 10:31 AM
>> To: Ferruh Yigit <ferruh.yi...@amd.com>; dev@dpdk.org
>> Cc: sta...@dpdk.org; Yang, Qiming <qiming.y...@intel.com>; Zhou, YidingX
>> <yidingx.z...@intel.com>; Chas Williams <ch...@att.com>; Min Hu (Connor)
>> <humi...@huawei.com>; Wu, Jingjing <jingjing...@intel.com>; Xing, Beilei
>> <beilei.x...@intel.com>; Mike Pattrick <m...@redhat.com>; Zhang, Qi Z
>> <qi.z.zh...@intel.com>; Doherty, Declan <declan.dohe...@intel.com>;
>> Mrzyglod, Daniel T <daniel.t.mrzyg...@intel.com>; Dapeng Yu
>> <dapengx...@intel.com>; Zhang, Helin <helin.zh...@intel.com>;
>> Mcnamara, John <john.mcnam...@intel.com>; Thomas Monjalon
>> <tho...@monjalon.net>
>> Subject: RE: [PATCH v3] net/iavf: fix iavf query stats in intr thread
>>
>>
>>
>>> -----Original Message-----
>>> From: Ferruh Yigit <ferruh.yi...@amd.com>
>>> Sent: Monday, March 27, 2023 8:38 PM
>>> To: Deng, KaiwenX <kaiwenx.d...@intel.com>; dev@dpdk.org
>>> Cc: sta...@dpdk.org; Yang, Qiming <qiming.y...@intel.com>; Zhou,
>>> YidingX <yidingx.z...@intel.com>; Chas Williams <ch...@att.com>; Min
>>> Hu (Connor) <humi...@huawei.com>; Wu, Jingjing
>>> <jingjing...@intel.com>; Xing, Beilei <beilei.x...@intel.com>; Mike
>>> Pattrick <m...@redhat.com>; Zhang, Qi Z <qi.z.zh...@intel.com>;
>>> Doherty, Declan <declan.dohe...@intel.com>; Mrzyglod, Daniel T
>>> <daniel.t.mrzyg...@intel.com>; Dapeng Yu <dapengx...@intel.com>;
>>> Zhang, Helin <helin.zh...@intel.com>; Mcnamara, John
>>> <john.mcnam...@intel.com>; Thomas Monjalon <tho...@monjalon.net>
>>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
>>>
>>> On 3/27/2023 1:31 PM, Ferruh Yigit wrote:
>>>> On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ferruh Yigit <ferruh.yi...@amd.com>
>>>>>> Sent: Thursday, March 23, 2023 11:39 PM
>>>>>> To: Deng, KaiwenX <kaiwenx.d...@intel.com>; dev@dpdk.org
>>>>>> Cc: sta...@dpdk.org; Yang, Qiming <qiming.y...@intel.com>; Zhou,
>>>>>> YidingX <yidingx.z...@intel.com>; Chas Williams <ch...@att.com>;
>>>>>> Min Hu (Connor) <humi...@huawei.com>; Wu, Jingjing
>>>>>> <jingjing...@intel.com>; Xing, Beilei <beilei.x...@intel.com>;
>>>>>> Mike Pattrick <m...@redhat.com>; Zhang, Qi Z
>>>>>> <qi.z.zh...@intel.com>; Doherty, Declan
>>>>>> <declan.dohe...@intel.com>; Mrzyglod, Daniel T
>>>>>> <daniel.t.mrzyg...@intel.com>; Dapeng Yu <dapengx...@intel.com>
>>>>>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr
>>>>>> thread
>>>>>>
>>>>>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
>>>>>>> When iavf send query-stats command in eal-intr-thread through
>>>>>>> virtual channel, there will be no response received from
>>>>>>> iavf_dev_virtchnl_handler for this command during block and wait.
>>>>>>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-
>> thread.
>>>>>>>
>>>>>>> When vf device is bonded as BONDING_MODE_TLB mode, the slave
>>> device
>>>>>>> update callback will registered in alarm and called by
>>>>>>> eal-intr-thread, it would also raise the above issue.
>>>>>>>
>>>>>>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS
>>> when
>>>>>> it
>>>>>>> is called by eal-intr-thread to fix this issue.
>>>>>>>
>>>>>>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
>>>>>>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
>>>>>>> Fixes: 7c76a747e68c ("bond: add mode 5")
>>>>>>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
>>>>>>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
>>>>>>
>>>>>>
>>>>>> Hi Kaiwen,
>>>>>>
>>>>>> Above commit already seems trying to address same issue, it
>>>>>> creates
>>>>>> "iavf- event-thread" control thread to asyncroniously handle the
>>>>>> interrupts, in non- interrupt context, why it is not working?
>>>>>>
>>>>>> Instead of adding 'rte_thread_is_intr()' checks, can't you make
>>>>>> sure all interrupts handled in control tread?
>>>>>>
>>>>>> And can you please provide a stack trace in commit log, to
>>>>>> describe the issue better?
>>>>> Hi Ferru,
>>>>> Sorry for my late reply, And thanks for your review.
>>>>>
>>>>> The above commit does not fix this issue when we need to get the
>>> returned data.
>>>>> If we call iavf_query_stats and wait for response statistics in the
>>>>> intr-
>>> thread.
>>>>> iavf_handle_virtchnl_msg is also registered in the intr_thread and
>>>>> will not be executed while waiting.
>>>>>
>>>>
>>>> Got it, since return value is required, API can't be called asyncroniously.
>>>>
>>>>
>>>>
>>>> I think 'rte_thread_is_intr()' checks may cause more trouble for you
>>>> in long term,
>>>>
>>>> - why 'iavf_query_stats()' is called in the iterrupt thread, can it
>>>> be prevented?
>>>>
>>>> - does it make sense to allways poll messages from PF (for simplification)?
>>>>
>>>>
>>>> If answer to both are 'No', I am OK to continue with current
>>>> proposal if you are happy with it.
>>>>
>>>
>>>
>>> btw, how critical is this issue?
>>>
>>> If it is critical, I am OK to get it as it is for this release and
>>> investigate it further for next release, since only a few days left for this
>> release.
>>>
>> Hi Ferruh,
>>
>> I didn't find a more suitable solution after consideration, if you have a 
>> better
>> suggestion, please let me know, thanks.
> Hi Ferruh,
> Can you please take a look at this patch.
>

Hi Kaiwen,

Sorry for delay, lets continue with this solution.

I thought calling from "iavf-event-thread" can be an option but your
description clarifies why it is not an option, and I also don't have
better solution, so I think can continue as it is.


> Thanks.
>>>
>>>>
>>>>> This commit I changed it to polling for replies to commands
>>>>> executed in
>>> the interrupt thread.
>>>>>
>>>>> main thread                                                               
>>>>>                                  interrupt
>>> thread
>>>>>      |                                                                    
>>>>>                                                            |
>>>>>      |                                                                    
>>>>>                                                            |
>>>>> iavf_query_stats                                                          
>>>>>                                               |
>>>>> iavf_execute_vf_cmd                                                       
>>>>>                                         |
>>>>> iavf_aq_send_msg_to_pf  and wait handle complete
>> |
>>>>>      |                                                                    
>>>>>                                                           |
>>>>>
>>>>> |------------------------------------------------------------------
>>>>> -------------------
>>> ------->|
>>>>>      |                                                                    
>>>>>                                                           |
>>>>>      |
>>> iavf_handle_virtchnl_msg
>>>>>      |                                                                    
>>>>>                                                          |
>>>>>
>>>>> |<-----------------------------------------------------------------
>>>>> -------------------
>>> --------|
>>>>>      |                                                                    
>>>>>                                                          |
>>>>> iavf_execute_vf_cmd get response                                          
>>>>>                             |
>>>>>      |                                                                    
>>>>>                                                          |
>>>>>
>>>>> The above is the stack trace for the normal execution of
>>>>> iavf_query_stats in the main thread.
>>>>>
>>>>>  interrupt thread
>>>>>      |
>>>>>      |
>>>>> iavf_query_stats
>>>>> iavf_execute_vf_cmd
>>>>> iavf_aq_send_msg_to_pf wait handle complete(1 sec)
>>>>> iavf_execute_vf_cmd timeout
>>>>>      |
>>>>>      |
>>>>> iavf_handle_virtchnl_msg
>>>>>      |
>>>>>
>>>>> The above is the stack trace for the abnormal execution of
>>>>> iavf_query_stats in the interrupt thread.
>>>>>>
>>>>>>> Cc: sta...@dpdk.org
>>>>>>>
>>>>>>> Signed-off-by: Kaiwen Deng <kaiwenx.d...@intel.com>
>>>>>>> ---
>>>>>>> Changes since v2:
>>>>>>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
>>>>>>>
>>>>>>> Changes since v1:
>>>>>>> - Add lock to avoid race condition.
>>>>>>> ---
>>>>>>> ---
>>>>>>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
>>>>>>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
>>>>>>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
>>>>>>>  3 files changed, 58 insertions(+), 25 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>>>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>>>>> index f0c4f7d26b..edce621496 100644
>>>>>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>>>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>>>>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>>>>         uint8_t update_stats = 0;
>>>>>>>         uint16_t slave_id;
>>>>>>>         uint16_t i;
>>>>>>> +       int ret;
>>>>>>>
>>>>>>>         internals->slave_update_idx++;
>>>>>>>
>>>>>>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void
>> *arg)
>>>>>>>
>>>>>>>         for (i = 0; i < internals->active_slave_count; i++) {
>>>>>>>                 slave_id = internals->active_slaves[i];
>>>>>>> -               rte_eth_stats_get(slave_id, &slave_stats);
>>>>>>> +               ret = rte_eth_stats_get(slave_id, &slave_stats);
>>>>>>> +               if (ret)
>>>>>>> +                       goto OUT;
>>>>>>> +
>>>>>>>                 tx_bytes = slave_stats.obytes -
>> tlb_last_obytets[slave_id];
>>>>>>>                 bandwidth_left(slave_id, tx_bytes,
>>>>>>>                                 internals->slave_update_idx,
>> &bwg_array[i]);
>>>>>> @@ -922,6 +926,7 @@
>>>>>>> bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>>>>         for (i = 0; i < slave_count; i++)
>>>>>>>                 internals->tlb_slaves_order[i] = bwg_array[i].slave;
>>>>>>>
>>>>>>> +OUT:
>>>>>>>         rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
>>>>>> bond_ethdev_update_tlb_slave_cb,
>>>>>>>                         (struct bond_dev_private *)internals);  } diff -
>> -git
>>>>>>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
>>>>>>> index 3196210f2c..d6e1f1a7f4 100644
>>>>>>> --- a/drivers/net/iavf/iavf_ethdev.c
>>>>>>> +++ b/drivers/net/iavf/iavf_ethdev.c
>>>>>>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>>>>>         adapter->dev_data = eth_dev->data;
>>>>>>>         adapter->stopped = 1;
>>>>>>>
>>>>>>> +       if (iavf_dev_event_handler_init())
>>>>>>> +               goto init_vf_err;
>>>>>>> +
>>>>>>>         if (iavf_init_vf(eth_dev) != 0) {
>>>>>>>                 PMD_INIT_LOG(ERR, "Init vf failed");
>>>>>>>                 return -1;
>>>>>>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>>>>>         rte_ether_addr_copy((struct rte_ether_addr *)hw-
>>> mac.addr,
>>>>>>>                         &eth_dev->data->mac_addrs[0]);
>>>>>>>
>>>>>>> -       if (iavf_dev_event_handler_init())
>>>>>>> -               goto init_vf_err;
>>>>>>>
>>>>>>>         if (vf->vf_res->vf_cap_flags &
>> VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
>>>>>> {
>>>>>>>                 /* register callback func to eal lib */ diff --git
>>>>>>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
>>>>>>> index 9adaadb173..aeffb07cca 100644
>>>>>>> --- a/drivers/net/iavf/iavf_vchnl.c
>>>>>>> +++ b/drivers/net/iavf/iavf_vchnl.c
>>>>>>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
>>>>>> *adapter, uint16_t buf_len,
>>>>>>>                                 vf->link_speed =
>>>>>> iavf_convert_link_speed(speed);
>>>>>>>                         }
>>>>>>>                         iavf_dev_link_update(vf->eth_dev, 0);
>>>>>>> +                       iavf_dev_event_post(vf->eth_dev,
>>>>>> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
>>>>>>>                         PMD_DRV_LOG(INFO, "Link status
>> update:%s",
>>>>>>>                                         vf->link_up ? "up" : "down");
>>>>>>>                         break;
>>>>>>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
>>>>>> *adapter, struct iavf_cmd_info *args,
>>>>>>>                 _clear_cmd(vf);
>>>>>>>                 break;
>>>>>>>         default:
>>>>>>> -               /* For other virtchnl ops in running time,
>>>>>>> -                * wait for the cmd done flag.
>>>>>>> -                */
>>>>>>> -               do {
>>>>>>> -                       if (vf->pend_cmd ==
>> VIRTCHNL_OP_UNKNOWN)
>>>>>>> -                               break;
>>>>>>> -                       iavf_msec_delay(ASQ_DELAY_MS);
>>>>>>> -                       /* If don't read msg or read sys event,
>> continue */
>>>>>>> -               } while (i++ < MAX_TRY_TIMES);
>>>>>>> -
>>>>>>> -               if (i >= MAX_TRY_TIMES) {
>>>>>>> -                       PMD_DRV_LOG(ERR, "No response for
>> cmd %d",
>>>>>> args->ops);
>>>>>>> +               if (rte_thread_is_intr()) {
>>>>>>> +                       /* For virtchnl ops were executed in
>> eal_intr_thread,
>>>>>>> +                        * need to poll the response.
>>>>>>> +                        */
>>>>>>> +                       do {
>>>>>>> +                               result =
>> iavf_read_msg_from_pf(adapter,
>>>>>> args->out_size,
>>>>>>> +                                                       args-
>>> out_buffer);
>>>>>>> +                               if (result == IAVF_MSG_CMD)
>>>>>>> +                                       break;
>>>>>>> +                               iavf_msec_delay(ASQ_DELAY_MS);
>>>>>>> +                       } while (i++ < MAX_TRY_TIMES);
>>>>>>> +                       if (i >= MAX_TRY_TIMES ||
>>>>>>> +                               vf->cmd_retval !=
>>>>>> VIRTCHNL_STATUS_SUCCESS) {
>>>>>>> +                               err = -1;
>>>>>>> +                               PMD_DRV_LOG(ERR, "No response
>> or return
>>>>>> failure (%d)"
>>>>>>> +                                               " for cmd %d", vf-
>>>>>>> cmd_retval, args->ops);
>>>>>>> +                       }
>>>>>>>                         _clear_cmd(vf);
>>>>>>> -                       err = -EIO;
>>>>>>> -               } else if (vf->cmd_retval ==
>>>>>>> -                          VIRTCHNL_STATUS_ERR_NOT_SUPPORTED)
>> {
>>>>>>> -                       PMD_DRV_LOG(ERR, "Cmd %d not
>> supported", args-
>>>>>>> ops);
>>>>>>> -                       err = -ENOTSUP;
>>>>>>> -               } else if (vf->cmd_retval !=
>> VIRTCHNL_STATUS_SUCCESS) {
>>>>>>> -                       PMD_DRV_LOG(ERR, "Return failure %d for
>> cmd %d",
>>>>>>> -                                   vf->cmd_retval, args->ops);
>>>>>>> -                       err = -EINVAL;
>>>>>>> +               } else {
>>>>>>> +                       /* For other virtchnl ops in running time,
>>>>>>> +                        * wait for the cmd done flag.
>>>>>>> +                        */
>>>>>>> +                       do {
>>>>>>> +                               if (vf->pend_cmd ==
>>>>>> VIRTCHNL_OP_UNKNOWN)
>>>>>>> +                                       break;
>>>>>>> +                               iavf_msec_delay(ASQ_DELAY_MS);
>>>>>>> +                               /* If don't read msg or read sys event,
>>>>>> continue */
>>>>>>> +                       } while (i++ < MAX_TRY_TIMES);
>>>>>>> +
>>>>>>> +                       if (i >= MAX_TRY_TIMES) {
>>>>>>> +                               PMD_DRV_LOG(ERR, "No response
>> for
>>>>>> cmd %d", args->ops);
>>>>>>> +                               _clear_cmd(vf);
>>>>>>> +                               err = -EIO;
>>>>>>> +                       } else if (vf->cmd_retval ==
>>>>>>> +
>>      VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
>>>>>>> +                               PMD_DRV_LOG(ERR, "Cmd %d not
>>>>>> supported", args->ops);
>>>>>>> +                               err = -ENOTSUP;
>>>>>>> +                       } else if (vf->cmd_retval !=
>>>>>> VIRTCHNL_STATUS_SUCCESS) {
>>>>>>> +                               PMD_DRV_LOG(ERR, "Return
>> failure %d for
>>>>>> cmd %d",
>>>>>>> +                                               vf->cmd_retval, args-
>>> ops);
>>>>>>> +                               err = -EINVAL;
>>>>>>> +                       }
>>>>>>>                 }
>>>>>>>                 break;
>>>>>>>         }
>>>>>>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct
>>> iavf_adapter
>>>>>>> *adapter,  {
>>>>>>>         struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
>>>>>>>         int ret;
>>>>>>> +       int is_intr_thread = rte_thread_is_intr();
>>>>>>>
>>>>>>> -       rte_spinlock_lock(&vf->aq_lock);
>>>>>>> +       if (is_intr_thread) {
>>>>>>> +               if (!rte_spinlock_trylock(&vf->aq_lock))
>>>>>>> +                       return -EIO;
>>>>>>> +       } else {
>>>>>>> +               rte_spinlock_lock(&vf->aq_lock);
>>>>>>> +       }
>>>>>>>         ret = iavf_execute_vf_cmd(adapter, args, async);
>>>>>>>         rte_spinlock_unlock(&vf->aq_lock);
>>>>>>>
>>>>>
>>>>
> 

Reply via email to