On 3/27/2023 1:31 PM, Ferruh Yigit wrote:
> On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
>>
>>
>>> -----Original Message-----
>>> From: Ferruh Yigit <ferruh.yi...@amd.com>
>>> Sent: Thursday, March 23, 2023 11:39 PM
>>> To: Deng, KaiwenX <kaiwenx.d...@intel.com>; dev@dpdk.org
>>> Cc: sta...@dpdk.org; Yang, Qiming <qiming.y...@intel.com>; Zhou, YidingX
>>> <yidingx.z...@intel.com>; Chas Williams <ch...@att.com>; Min Hu (Connor)
>>> <humi...@huawei.com>; Wu, Jingjing <jingjing...@intel.com>; Xing, Beilei
>>> <beilei.x...@intel.com>; Mike Pattrick <m...@redhat.com>; Zhang, Qi Z
>>> <qi.z.zh...@intel.com>; Doherty, Declan <declan.dohe...@intel.com>;
>>> Mrzyglod, Daniel T <daniel.t.mrzyg...@intel.com>; Dapeng Yu
>>> <dapengx...@intel.com>
>>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
>>>
>>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
>>>> When iavf send query-stats command in eal-intr-thread through virtual
>>>> channel, there will be no response received from
>>>> iavf_dev_virtchnl_handler for this command during block and wait.
>>>> Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread.
>>>>
>>>> When vf device is bonded as BONDING_MODE_TLB mode, the slave device
>>>> update callback will registered in alarm and called by
>>>> eal-intr-thread, it would also raise the above issue.
>>>>
>>>> This commit add to poll the response for VIRTCHNL_OP_GET_STATS when
>>> it
>>>> is called by eal-intr-thread to fix this issue.
>>>>
>>>> Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
>>>> Fixes: 22b123a36d07 ("net/avf: initialize PMD")
>>>> Fixes: 7c76a747e68c ("bond: add mode 5")
>>>> Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
>>>> Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
>>>
>>>
>>> Hi Kaiwen,
>>>
>>> Above commit already seems trying to address same issue, it creates "iavf-
>>> event-thread" control thread to asyncroniously handle the interrupts, in 
>>> non-
>>> interrupt context, why it is not working?
>>>
>>> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure all
>>> interrupts handled in control tread?
>>>
>>> And can you please provide a stack trace in commit log, to describe the 
>>> issue
>>> better?
>> Hi Ferru,
>> Sorry for my late reply, And thanks for your review.
>>  
>> The above commit does not fix this issue when we need to get the returned 
>> data.
>> If we call iavf_query_stats and wait for response statistics in the 
>> intr-thread.
>> iavf_handle_virtchnl_msg is also registered in the intr_thread and will not 
>> be 
>> executed while waiting.
>>
> 
> Got it, since return value is required, API can't be called asyncroniously.
> 
> 
> 
> I think 'rte_thread_is_intr()' checks may cause more trouble for you in
> long term,
> 
> - why 'iavf_query_stats()' is called in the iterrupt thread, can it be
> prevented?
> 
> - does it make sense to allways poll messages from PF (for simplification)?
> 
> 
> If answer to both are 'No', I am OK to continue with current proposal if
> you are happy with it.
> 


btw, how critical is this issue?

If it is critical, I am OK to get it as it is for this release and
investigate it further for next release, since only a few days left for
this release.


> 
>> This commit I changed it to polling for replies to commands executed in the 
>> interrupt thread.
>>
>> main thread                                                                  
>>                               interrupt thread
>>      |                                                                       
>>                                                         |
>>      |                                                                       
>>                                                         |
>> iavf_query_stats                                                             
>>                                            |
>> iavf_execute_vf_cmd                                                          
>>                                      |
>> iavf_aq_send_msg_to_pf  and wait handle complete                             
>>           |
>>      |                                                                       
>>                                                        |
>>      
>> |-------------------------------------------------------------------------------------------->|
>>      |                                                                       
>>                                                        |
>>      |                                                                       
>>                                 iavf_handle_virtchnl_msg
>>      |                                                                       
>>                                                       |
>>      
>> |<--------------------------------------------------------------------------------------------|
>>      |                                                                       
>>                                                       |
>> iavf_execute_vf_cmd get response                                             
>>                          |
>>      |                                                                       
>>                                                       |
>>
>> The above is the stack trace for the normal execution of iavf_query_stats 
>> in the main thread.
>>
>>  interrupt thread
>>      |
>>      |
>> iavf_query_stats
>> iavf_execute_vf_cmd
>> iavf_aq_send_msg_to_pf wait handle complete(1 sec)
>> iavf_execute_vf_cmd timeout
>>      |
>>      |
>> iavf_handle_virtchnl_msg
>>      |
>>
>> The above is the stack trace for the abnormal execution of iavf_query_stats 
>> in the interrupt thread.
>>>
>>>> Cc: sta...@dpdk.org
>>>>
>>>> Signed-off-by: Kaiwen Deng <kaiwenx.d...@intel.com>
>>>> ---
>>>> Changes since v2:
>>>> - Add to poll the response for VIRTCHNL_OP_GET_STATS.
>>>>
>>>> Changes since v1:
>>>> - Add lock to avoid race condition.
>>>> ---
>>>> ---
>>>>  drivers/net/bonding/rte_eth_bond_pmd.c |  7 ++-
>>>>  drivers/net/iavf/iavf_ethdev.c         |  5 +-
>>>>  drivers/net/iavf/iavf_vchnl.c          | 71 ++++++++++++++++++--------
>>>>  3 files changed, 58 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> index f0c4f7d26b..edce621496 100644
>>>> --- a/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c
>>>> @@ -894,6 +894,7 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>    uint8_t update_stats = 0;
>>>>    uint16_t slave_id;
>>>>    uint16_t i;
>>>> +  int ret;
>>>>
>>>>    internals->slave_update_idx++;
>>>>
>>>> @@ -903,7 +904,10 @@ bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>
>>>>    for (i = 0; i < internals->active_slave_count; i++) {
>>>>            slave_id = internals->active_slaves[i];
>>>> -          rte_eth_stats_get(slave_id, &slave_stats);
>>>> +          ret = rte_eth_stats_get(slave_id, &slave_stats);
>>>> +          if (ret)
>>>> +                  goto OUT;
>>>> +
>>>>            tx_bytes = slave_stats.obytes - tlb_last_obytets[slave_id];
>>>>            bandwidth_left(slave_id, tx_bytes,
>>>>                            internals->slave_update_idx, &bwg_array[i]);
>>> @@ -922,6 +926,7 @@
>>>> bond_ethdev_update_tlb_slave_cb(void *arg)
>>>>    for (i = 0; i < slave_count; i++)
>>>>            internals->tlb_slaves_order[i] = bwg_array[i].slave;
>>>>
>>>> +OUT:
>>>>    rte_eal_alarm_set(REORDER_PERIOD_MS * 1000,
>>> bond_ethdev_update_tlb_slave_cb,
>>>>                    (struct bond_dev_private *)internals);  } diff --git
>>>> a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
>>>> index 3196210f2c..d6e1f1a7f4 100644
>>>> --- a/drivers/net/iavf/iavf_ethdev.c
>>>> +++ b/drivers/net/iavf/iavf_ethdev.c
>>>> @@ -2607,6 +2607,9 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>>    adapter->dev_data = eth_dev->data;
>>>>    adapter->stopped = 1;
>>>>
>>>> +  if (iavf_dev_event_handler_init())
>>>> +          goto init_vf_err;
>>>> +
>>>>    if (iavf_init_vf(eth_dev) != 0) {
>>>>            PMD_INIT_LOG(ERR, "Init vf failed");
>>>>            return -1;
>>>> @@ -2634,8 +2637,6 @@ iavf_dev_init(struct rte_eth_dev *eth_dev)
>>>>    rte_ether_addr_copy((struct rte_ether_addr *)hw->mac.addr,
>>>>                    &eth_dev->data->mac_addrs[0]);
>>>>
>>>> -  if (iavf_dev_event_handler_init())
>>>> -          goto init_vf_err;
>>>>
>>>>    if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_WB_ON_ITR)
>>> {
>>>>            /* register callback func to eal lib */ diff --git
>>>> a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c index
>>>> 9adaadb173..aeffb07cca 100644
>>>> --- a/drivers/net/iavf/iavf_vchnl.c
>>>> +++ b/drivers/net/iavf/iavf_vchnl.c
>>>> @@ -256,6 +256,7 @@ iavf_read_msg_from_pf(struct iavf_adapter
>>> *adapter, uint16_t buf_len,
>>>>                            vf->link_speed =
>>> iavf_convert_link_speed(speed);
>>>>                    }
>>>>                    iavf_dev_link_update(vf->eth_dev, 0);
>>>> +                  iavf_dev_event_post(vf->eth_dev,
>>> RTE_ETH_EVENT_INTR_LSC, NULL, 0);
>>>>                    PMD_DRV_LOG(INFO, "Link status update:%s",
>>>>                                    vf->link_up ? "up" : "down");
>>>>                    break;
>>>> @@ -368,28 +369,48 @@ iavf_execute_vf_cmd(struct iavf_adapter
>>> *adapter, struct iavf_cmd_info *args,
>>>>            _clear_cmd(vf);
>>>>            break;
>>>>    default:
>>>> -          /* For other virtchnl ops in running time,
>>>> -           * wait for the cmd done flag.
>>>> -           */
>>>> -          do {
>>>> -                  if (vf->pend_cmd == VIRTCHNL_OP_UNKNOWN)
>>>> -                          break;
>>>> -                  iavf_msec_delay(ASQ_DELAY_MS);
>>>> -                  /* If don't read msg or read sys event, continue */
>>>> -          } while (i++ < MAX_TRY_TIMES);
>>>> -
>>>> -          if (i >= MAX_TRY_TIMES) {
>>>> -                  PMD_DRV_LOG(ERR, "No response for cmd %d",
>>> args->ops);
>>>> +          if (rte_thread_is_intr()) {
>>>> +                  /* For virtchnl ops were executed in eal_intr_thread,
>>>> +                   * need to poll the response.
>>>> +                   */
>>>> +                  do {
>>>> +                          result = iavf_read_msg_from_pf(adapter,
>>> args->out_size,
>>>> +                                                  args->out_buffer);
>>>> +                          if (result == IAVF_MSG_CMD)
>>>> +                                  break;
>>>> +                          iavf_msec_delay(ASQ_DELAY_MS);
>>>> +                  } while (i++ < MAX_TRY_TIMES);
>>>> +                  if (i >= MAX_TRY_TIMES ||
>>>> +                          vf->cmd_retval !=
>>> VIRTCHNL_STATUS_SUCCESS) {
>>>> +                          err = -1;
>>>> +                          PMD_DRV_LOG(ERR, "No response or return
>>> failure (%d)"
>>>> +                                          " for cmd %d", vf-
>>>> cmd_retval, args->ops);
>>>> +                  }
>>>>                    _clear_cmd(vf);
>>>> -                  err = -EIO;
>>>> -          } else if (vf->cmd_retval ==
>>>> -                     VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
>>>> -                  PMD_DRV_LOG(ERR, "Cmd %d not supported", args-
>>>> ops);
>>>> -                  err = -ENOTSUP;
>>>> -          } else if (vf->cmd_retval != VIRTCHNL_STATUS_SUCCESS) {
>>>> -                  PMD_DRV_LOG(ERR, "Return failure %d for cmd %d",
>>>> -                              vf->cmd_retval, args->ops);
>>>> -                  err = -EINVAL;
>>>> +          } else {
>>>> +                  /* For other virtchnl ops in running time,
>>>> +                   * wait for the cmd done flag.
>>>> +                   */
>>>> +                  do {
>>>> +                          if (vf->pend_cmd ==
>>> VIRTCHNL_OP_UNKNOWN)
>>>> +                                  break;
>>>> +                          iavf_msec_delay(ASQ_DELAY_MS);
>>>> +                          /* If don't read msg or read sys event,
>>> continue */
>>>> +                  } while (i++ < MAX_TRY_TIMES);
>>>> +
>>>> +                  if (i >= MAX_TRY_TIMES) {
>>>> +                          PMD_DRV_LOG(ERR, "No response for
>>> cmd %d", args->ops);
>>>> +                          _clear_cmd(vf);
>>>> +                          err = -EIO;
>>>> +                  } else if (vf->cmd_retval ==
>>>> +                          VIRTCHNL_STATUS_ERR_NOT_SUPPORTED) {
>>>> +                          PMD_DRV_LOG(ERR, "Cmd %d not
>>> supported", args->ops);
>>>> +                          err = -ENOTSUP;
>>>> +                  } else if (vf->cmd_retval !=
>>> VIRTCHNL_STATUS_SUCCESS) {
>>>> +                          PMD_DRV_LOG(ERR, "Return failure %d for
>>> cmd %d",
>>>> +                                          vf->cmd_retval, args->ops);
>>>> +                          err = -EINVAL;
>>>> +                  }
>>>>            }
>>>>            break;
>>>>    }
>>>> @@ -403,8 +424,14 @@ iavf_execute_vf_cmd_safe(struct iavf_adapter
>>>> *adapter,  {
>>>>    struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(adapter);
>>>>    int ret;
>>>> +  int is_intr_thread = rte_thread_is_intr();
>>>>
>>>> -  rte_spinlock_lock(&vf->aq_lock);
>>>> +  if (is_intr_thread) {
>>>> +          if (!rte_spinlock_trylock(&vf->aq_lock))
>>>> +                  return -EIO;
>>>> +  } else {
>>>> +          rte_spinlock_lock(&vf->aq_lock);
>>>> +  }
>>>>    ret = iavf_execute_vf_cmd(adapter, args, async);
>>>>    rte_spinlock_unlock(&vf->aq_lock);
>>>>
>>
> 

Reply via email to