On Tue, Mar 03, 2026 at 11:56:29AM +0100, Paolo Abeni wrote:
> On 2/27/26 11:15 AM, Dipayaan Roy wrote:
> > On certain systems configured with 4K PAGE_SIZE, utilizing page_pool
> > fragments for RX buffers results in a significant throughput regression.
> > Profiling reveals that this regression correlates with high overhead in the
> > fragment allocation and reference counting paths on these specific
> > platforms, rendering the multi-buffer-per-page strategy counterproductive.
> > 
> > To mitigate this, bypass the page_pool fragment path and force a single RX
> > packet per page allocation when all the following conditions are met:
> >   1. The system is configured with a 4K PAGE_SIZE.
> >   2. A processor-specific quirk is detected via SMBIOS Type 4 data.
> > 
> > This approach restores expected line-rate performance by ensuring
> > predictable RX refill behavior on affected hardware.
> > 
> > There is no behavioral change for systems using larger page sizes
> > (16K/64K), or platforms where this processor-specific quirk does not
> > apply.
> > 
> > Signed-off-by: Dipayaan Roy <[email protected]>
> > ---
> >  .../net/ethernet/microsoft/mana/gdma_main.c   | 120 ++++++++++++++++++
> >  drivers/net/ethernet/microsoft/mana/mana_en.c |  23 +++-
> >  include/net/mana/gdma.h                       |  10 ++
> >  3 files changed, 151 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > index 0055c231acf6..26bbe736a770 100644
> > --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> > @@ -9,6 +9,7 @@
> >  #include <linux/msi.h>
> >  #include <linux/irqdomain.h>
> >  #include <linux/export.h>
> > +#include <linux/dmi.h>
> >  
> >  #include <net/mana/mana.h>
> >  #include <net/mana/hw_channel.h>
> > @@ -1955,6 +1956,115 @@ static bool mana_is_pf(unsigned short dev_id)
> >     return dev_id == MANA_PF_DEVICE_ID;
> >  }
> >  
> > +/*
> > + * Table for Processor Version strings found from SMBIOS Type 4 information,
> > + * for processors that need to force the single RX buffer per page quirk for
> > + * meeting line rate performance with ARM64 + 4K pages.
> > + * Note: These strings are exactly matched with version fetched from SMBIOS.
> > + */
> > +static const char * const mana_single_rxbuf_per_page_quirk_tbl[] = {
> > +   "Cobalt 200",
> > +};
> > +
> > +static const char *smbios_get_string(const struct dmi_header *hdr, u8 idx)
> > +{
> > +   const u8 *start, *end;
> > +   u8 i;
> > +
> > +   /* Indexing starts from 1. */
> > +   if (!idx)
> > +           return NULL;
> > +
> > +   start   = (const u8 *)hdr + hdr->length;
> > +   end = start + SMBIOS_STR_AREA_MAX;
> > +
> > +   for (i = 1; i < idx; i++) {
> > +           while (start < end && *start)
> > +                   start++;
> > +           if (start < end)
> > +                   start++;
> > +           if (start + 1 < end && start[0] == 0 && start[1] == 0)
> > +                   return NULL;
> > +   }
> > +
> > +   if (start >= end || *start == 0)
> > +           return NULL;
> > +
> > +   return (const char *)start;
> 
> If I read correctly, the above sort of duplicates dmi_decode_table().
>
Yes, dmi_decode_table() is not exported.
 
> I think you are better off:
> - use the mana_get_proc_ver_from_smbios() decoder to store the
> SMBIOS_TYPE4_PROC_VERSION_OFFSET index into gd
> - do a 2nd walk with a different decoder to fetch the string at the
> specified index.
Sure, will implement the 2nd walk for fetching string in v2.
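
For reference, the string lookup that the second decoder would perform can
be sketched as below. This is only a standalone userspace illustration, not
the v2 code: smbios_nth_string() and SMBIOS_HDR_LEN are hypothetical names,
and the caller is assumed to pass an upper bound on the structure size. It
relies only on the SMBIOS layout, where the formatted area (whose length is
in byte 1 of the header) is followed by NUL-terminated strings and the
string set ends with an additional NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SMBIOS structure header: type (1), length (1), handle (2). */
#define SMBIOS_HDR_LEN 4

/*
 * Hypothetical helper: return the idx-th (1-based) string of the SMBIOS
 * structure starting at buf, or NULL if idx is 0, the string does not
 * exist, or the string area is not properly terminated within buf_len.
 */
static const char *smbios_nth_string(const uint8_t *buf, size_t buf_len,
				     uint8_t idx)
{
	const uint8_t *p, *nul, *end;
	size_t fmt_len;

	assert(buf != NULL);

	/* String indices start at 1; 0 means "no string". */
	if (!idx || buf_len < SMBIOS_HDR_LEN)
		return NULL;

	/* Byte 1 holds the length of the formatted area, header included. */
	fmt_len = buf[1];
	if (fmt_len < SMBIOS_HDR_LEN || fmt_len >= buf_len)
		return NULL;

	p = buf + fmt_len;
	end = buf + buf_len;

	for (;;) {
		/* An empty string marks the end of the string set. */
		if (*p == '\0')
			return NULL;

		/* Refuse strings that are not terminated inside the buffer. */
		nul = memchr(p, 0, (size_t)(end - p));
		if (!nul)
			return NULL;

		if (--idx == 0)
			return (const char *)p;

		/* Step over the terminator to the next string. */
		p = nul + 1;
		if (p >= end)
			return NULL;
	}
}
```

With a structure whose strings are "Acme" and "Cobalt 200", index 2 yields
"Cobalt 200" and index 3 yields NULL. In the driver, the index would come
from the byte at SMBIOS_TYPE4_PROC_VERSION_OFFSET recorded during the first
walk.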

> 
> /P

Thank you, Paolo, for the comments, and apologies for the delay in my
response; I am on-call this week.
I will send out v2 with the suggested changes.

Regards
