On Mon, Aug 04, 2025 at 04:17:04PM -0700, Mingming Cao wrote:
> POWER8 support a maximum of 16 subcrq indirect descriptor entries per
>  H_SEND_SUB_CRQ_INDIRECT call, while POWER9 and newer hypervisors
>  support up to 128 entries. Increasing the max number of indirect
> descriptor entries improves batching efficiency and reduces
> hcall overhead, which enhances throughput under large workload on POWER9+.
> 
> Currently, ibmvnic driver always uses a fixed number of max indirect
> descriptor entries (16). send_subcrq_indirect() treats all hypervisor
> errors the same:
>  - Cleanup and Drop the entire batch of descriptors.
>  - Return an error to the caller.
>  - Rely on TCP/IP retransmissions to recover.
>  - If the hypervisor returns H_PARAMETER (e.g., because 128
>    entries are not supported on POWER8), the driver will continue
>    to drop batches, resulting in unnecessary packet loss.
> 
> In this patch:
> Raise the default maximum indirect entries to 128 to improve ibmvnic
> batching on morden platform. But also gracefully fall back to
> 16 entries for Power 8 systems.
> 
> Since there is no VIO interface to query the hypervisor’s supported
> limit, vnic handles send_subcrq_indirect() H_PARAMETER errors:
>  - On first H_PARAMETER failure, log the failure context
>  - Reduce max_indirect_entries to 16 and allow the single batch to drop.
>  - Subsequent calls automatically use the correct lower limit,
>     avoiding repeated drops.
> 
> The goal is to  optimizes performance on modern systems while handles
> falling back for older POWER8 hypervisors.
> 
> Performance shows 40% improvements with MTU (1500) on largework load.
> 
> --------------------------------------
> Changes since v2:
> link to v2: https://www.spinics.net/lists/netdev/msg1104669.html
> 
> -- was Patch 4 from a patch series v2. v2 introduced a module parameter
> for backward compatibility. Based on review feedback, This patch handles
> older systems fall back case without adding a module parameter.
> 
> Signed-off-by: Mingming Cao <m...@linux.ibm.com>
> Reviewed-by: Brian King <bjki...@linux.ibm.com>
> Reviewed-by: Haren Myneni <ha...@linux.ibm.com>
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c | 56 ++++++++++++++++++++++++++----
>  drivers/net/ethernet/ibm/ibmvnic.h |  6 ++--
>  2 files changed, 53 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
> b/drivers/net/ethernet/ibm/ibmvnic.c

...

> @@ -862,6 +862,19 @@ static void replenish_rx_pool(struct ibmvnic_adapter 
> *adapter,
>  failure:
>       if (lpar_rc != H_PARAMETER && lpar_rc != H_CLOSED)
>               dev_err_ratelimited(dev, "rx: replenish packet buffer 
> failed\n");
> +
> +     /* Detect platform limit H_PARAMETER */
> +     if (lpar_rc == H_PARAMETER &&
> +         adapter->cur_max_ind_descs > IBMVNIC_MAX_IND_DESC_MIN) {
> +             netdev_info(adapter->netdev,
> +                         "H_PARAMETER, set ind desc to safe limit %u\n",
> +                         IBMVNIC_MAX_IND_DESC_MIN);
> +             adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MIN;
> +     }

Hi Mingming, all,

The logic above seems to appear twice in this patch.
I think it would be good to consolidate it somehow.
E.g. in a helper function.

> +
> +     /* for all error case, temporarily drop only this batch
> +      * Rely on TCP/IP retransmissions to retry and recover
> +      */

Thanks for adding this comment.
Although perhaps 'for' -> 'For'.

Likewise below.

>       for (i = ind_bufp->index - 1; i >= 0; --i) {
>               struct ibmvnic_rx_buff *rx_buff;
>  
> @@ -2381,16 +2394,33 @@ static int ibmvnic_tx_scrq_flush(struct 
> ibmvnic_adapter *adapter,
>               rc = send_subcrq_direct(adapter, handle,
>                                       (u64 *)ind_bufp->indir_arr);
>  
> -     if (rc)
> +     if (rc) {
> +             dev_err_ratelimited(&adapter->vdev->dev,
> +                                 "tx_flush failed, rc=%u (%llu entries 
> dma=%pad handle=%llx)\n",
> +                                 rc, entries, &dma_addr, handle);
> +             /* Detect platform limit H_PARAMETER */
> +             if (rc == H_PARAMETER &&
> +                 adapter->cur_max_ind_descs > IBMVNIC_MAX_IND_DESC_MIN) {
> +                     netdev_info(adapter->netdev,
> +                                 "H_PARAMETER, set ind descs to safe limit 
> %u\n",
> +                                 IBMVNIC_MAX_IND_DESC_MIN);
> +                     adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MIN;
> +             }
> +
> +             /* for all error case, temporarily drop only this batch
> +              * Rely on TCP/IP retransmissions to retry and recover
> +              */
>               ibmvnic_tx_scrq_clean_buffer(adapter, tx_scrq);
> -     else
> +     } else {
>               ind_bufp->index = 0;
> +     }
>       return rc;
>  }

...

> @@ -6369,6 +6399,17 @@ static int ibmvnic_reset_init(struct ibmvnic_adapter 
> *adapter, bool reset)
>                       rc = reset_sub_crq_queues(adapter);
>               }
>       } else {
> +             if (adapter->reset_reason == VNIC_RESET_MOBILITY) {
> +                     /* post migrtione reset the max
> +                      * indirect descriptors per hcall to be default max
> +                      * (e.g p8 ->p10)
> +                      * if the destination is on the platform supports
> +                      * do not support max (e.g. p10->p8) the threshold
> +                      * will be reduced to safe min limit for p8 later
> +                      */

nits: Post migration, reset.

      The line breaking seems uneven.

      And if p8 and p10 are POWER8 and POWER10 then I think it would
      be worth spelling that out.
 
> +                     adapter->cur_max_ind_descs = IBMVNIC_MAX_IND_DESC_MAX;
> +             }
> +
>               rc = init_sub_crqs(adapter);
>       }
>  

...

> diff --git a/drivers/net/ethernet/ibm/ibmvnic.h 
> b/drivers/net/ethernet/ibm/ibmvnic.h
> index 246ddce753f9..829a16116812 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.h
> +++ b/drivers/net/ethernet/ibm/ibmvnic.h
> @@ -29,8 +29,9 @@
>  #define IBMVNIC_BUFFS_PER_POOL       100
>  #define IBMVNIC_MAX_QUEUES   16
>  #define IBMVNIC_MAX_QUEUE_SZ   4096
> -#define IBMVNIC_MAX_IND_DESCS  16
> -#define IBMVNIC_IND_ARR_SZ   (IBMVNIC_MAX_IND_DESCS * 32)
> +#define IBMVNIC_MAX_IND_DESC_MAX 128
> +#define IBMVNIC_MAX_IND_DESC_MIN 16

...MAX...{MAX,MIN} seems like an unfortunate name.
But I don't feel particularly strongly about this one.

> +#define IBMVNIC_IND_MAX_ARR_SZ (IBMVNIC_MAX_IND_DESC_MAX * 32)
>  
>  #define IBMVNIC_TSO_BUF_SZ   65536
>  #define IBMVNIC_TSO_BUFS     64

...

Reply via email to