> 
> On Fri, Jul 25, 2025 at 12:49:15PM +0000, Ciara Loftus wrote:
> > The code for determining which Rx path to select during initialisation
> > has become complicated in many intel drivers due to the amount of
> > different paths and features available within each path. This commit
> > aims to simplify and genericize the path selection logic.
> >
> > The following information about each Rx burst function is stored and
> > used by the new common function to select the appropriate Rx path:
> > - Rx Offloads
> > - SIMD bitwidth
> > - Flexible RXD usage
> > - Bulk alloc function
> > - Scattered function
> >
> > Signed-off-by: Ciara Loftus <ciara.lof...@intel.com>
> > ---
> 
> This is a good idea to simplify things. Some comments inline below.
> 
> /Bruce
> 
> >  drivers/net/intel/common/rx.h | 110 ++++++++++++++++++++++++++++++++++
> >  1 file changed, 110 insertions(+)
> >
> > diff --git a/drivers/net/intel/common/rx.h b/drivers/net/intel/common/rx.h
> > index 70b597e8dc..6e9d81fecf 100644
> > --- a/drivers/net/intel/common/rx.h
> > +++ b/drivers/net/intel/common/rx.h
> > @@ -10,6 +10,7 @@
> >  #include <unistd.h>
> >  #include <rte_mbuf.h>
> >  #include <rte_ethdev.h>
> > +#include <rte_vect.h>
> >
> >  #include "desc.h"
> >
> > @@ -20,6 +21,12 @@
> >  #define CI_VPMD_DESCS_PER_LOOP_WIDE 8
> >  #define CI_VPMD_RX_REARM_THRESH     64
> >
> > +#define CI_RX_BURST_NO_FEATURES            0
> > +#define CI_RX_BURST_FEATURE_SCATTERED      RTE_BIT32(0)
> > +#define CI_RX_BURST_FEATURE_FLEX   RTE_BIT32(1)
> > +#define CI_RX_BURST_FEATURE_BULK_ALLOC     RTE_BIT32(2)
> > +#define CI_RX_BURST_FEATURE_IS_DISABLED    RTE_BIT32(3)
> > +
> >  struct ci_rx_queue;
> >
> >  struct ci_rx_entry {
> > @@ -125,6 +132,19 @@ struct ci_rx_queue {
> >     };
> >  };
> >
> > +
> > +struct ci_rx_burst_features {
> > +   uint32_t rx_offloads;
> > +   enum rte_vect_max_simd simd_width;
> > +   uint32_t other_features_mask;
> 
> Rather than using a bitmask here and having to do bit ops, how about just
> using individual boolean variables for each setting. Given that we only
> have 4 settings right now (disabled, scattered, flex, and bulk-alloc), the
> actual struct size here will be the same, but the code will be simpler.
> Since this is not a datapath function, I'm not worried if the overall
> struct size grows a bit over time.
> 
> I'd also note that in the structure below, we need an extra 4-bytes of
> padding anyway, since the pointers require it to be 8-byte aligned, so we
> can have up to 8 flags in the rx_burst_features struct without affecting
> the size of the burst_info struct.

Thanks for the feedback on the series. No objections to the feedback on the 
other patches.

The choice to use a bitmask here wasn't for space-saving but rather for readability 
in later patches when we define the ci_rx_burst_features for each path in each 
driver.
For example, if a path just has the scattered feature, its features can be 
defined like:
{PATH_OFFLOADS, PATH_SIMD_WIDTH, CI_RX_BURST_FEATURE_SCATTERED}
instead of the following if we were to split the mask into booleans:
{PATH_OFFLOADS, PATH_SIMD_WIDTH, true, 0, 0, 0}

Maybe it's not that much of an improvement in readability. And I suppose it 
introduces a little bit of extra complexity in the select() function when 
reading the values of the mask instead of reading simple booleans. So I'm happy 
to go the boolean route if you feel it's better.
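
To make the comparison concrete, here is a rough sketch of the two options.
PATH_OFFLOADS and PATH_SIMD_WIDTH are just placeholders, and the boolean member
names below are only illustrative of what a split-out struct might look like:

	/* bitmask layout from this patch: only the relevant flag is named */
	static const struct ci_rx_burst_features scattered_path = {
		PATH_OFFLOADS, PATH_SIMD_WIDTH, CI_RX_BURST_FEATURE_SCATTERED
	};

	/* hypothetical boolean layout: with designated initializers the
	 * definition stays just as compact, since omitted members default
	 * to false
	 */
	struct ci_rx_path_features_bools {
		uint32_t rx_offloads;
		enum rte_vect_max_simd simd_width;
		bool scattered;
		bool flex;
		bool bulk_alloc;
		bool disabled;
	};

	static const struct ci_rx_path_features_bools scattered_path_bools = {
		.rx_offloads = PATH_OFFLOADS,
		.simd_width = PATH_SIMD_WIDTH,
		.scattered = true,
	};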

> 
> > +};
> > +
> > +struct ci_rx_burst_info {
> > +   eth_rx_burst_t pkt_burst;
> > +   const char *info;
> > +   struct ci_rx_burst_features features;
> > +};
> > +
> 
> Just on struct naming - would it be better to use the word "path" instead
> of "burst" in the struct names, as in:
>  ci_rx_burst_features -> ci_rx_path_features
>  ci_rx_burst_info     -> ci_rx_path_info
> 
> Given that the mode_select function below takes as parameter "num_paths", I
> think this would make the names more consistent and meaningful.
> 
> >  static inline uint16_t
> >  ci_rx_reassemble_packets(struct rte_mbuf **rx_bufs, uint16_t nb_bufs, uint8_t *split_flags,
> >             struct rte_mbuf **pkt_first_seg, struct rte_mbuf **pkt_last_seg,
> > @@ -222,4 +242,94 @@ ci_rxq_vec_capable(uint16_t nb_desc, uint16_t rx_free_thresh, uint64_t offloads)
> >     return true;
> >  }
> >
> > +/**
> > + * Select the best matching Rx burst mode function based on features
> > + *
> > + * @param req_features
> > + *   The requested features for the Rx burst mode
> > + *
> > + * @return
> > + *   The packet burst function index that best matches the requested features
> > + */
> 
> Missing doxygen documentation for some of the parameters.
> 
> > +static inline int
> > +ci_rx_burst_mode_select(const struct ci_rx_burst_info *infos,
> > +                   struct ci_rx_burst_features req_features,
> 
> I would swap the order of these first two parameters, so that the request
> comes first. As well as (at least to me) making more sense having the
> request first, it also means that the "num_paths" parameter immediately
> follows the array to which it refers.
> 
> > +                   int num_paths,
> > +                   int default_path)
> > +{
> > +   int i, idx = -1;
> > +   const struct ci_rx_burst_features *info_features;
> 
> This can be defined inside the loop.
> 
> > +   bool req_flex = req_features.other_features_mask & CI_RX_BURST_FEATURE_FLEX;
> > +   bool req_scattered = req_features.other_features_mask & CI_RX_BURST_FEATURE_SCATTERED;
> > +   bool req_bulk_alloc = req_features.other_features_mask & CI_RX_BURST_FEATURE_BULK_ALLOC;
> > +   bool info_flex, info_scattered, info_bulk_alloc;
> > +
> > +   for (i = 0; i < num_paths; i++) {
> > +           info_features =  &infos[i].features;
> > +
> "path_features" is probably a better name for this, since it refers to the
> current path we are examining.
> 
> > +           /* Do not select a disabled rx burst function. */
> > +           if (info_features->other_features_mask & CI_RX_BURST_FEATURE_IS_DISABLED)
> > +                   continue;
> > +
> > +           /* If requested, ensure the function uses the flexible descriptor. */
> > +           info_flex = info_features->other_features_mask & CI_RX_BURST_FEATURE_FLEX;
> > +           if (info_flex != req_flex)
> > +                   continue;
> > +
> > +           /* If requested, ensure the function supports scattered RX. */
> > +           info_scattered = info_features->other_features_mask & CI_RX_BURST_FEATURE_SCATTERED;
> > +           if (info_scattered != req_scattered)
> > +                   continue;
> > +
> > +           /* Do not use a bulk alloc function if not requested. However if it is the only
> > +            * feature requested, ensure it is supported in the selected function.
> > +            */
> What is the benefit or reason for the second part of this - checking if
> it's the only feature requested, vs being requested with other features?
> Can you please explain a bit?

Sure. I agree it's not clear.
This is a way to ensure that the scalar bulk alloc path will be chosen over the 
basic scalar path in the case that no other features are requested.
If other features are requested, e.g. scattered, they take precedence over the 
bulk alloc request. But if only bulk alloc is requested, it should be satisfied.
I will try to improve the clarity here, perhaps by splitting the check in two as 
sketched below.
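
Something along these lines might make the intent more obvious (just a rough,
untested sketch of the same logic, not the final code):

		/* Skip paths that use bulk alloc when it was not requested. */
		if (info_bulk_alloc && !req_bulk_alloc)
			continue;

		/* If bulk alloc is the only requested feature, insist on it so
		 * that the scalar bulk-alloc path is preferred over the basic
		 * scalar path.
		 */
		if (req_features.other_features_mask == CI_RX_BURST_FEATURE_BULK_ALLOC &&
				!info_bulk_alloc)
			continue;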

> 
> > +           info_bulk_alloc =
> > +                   info_features->other_features_mask & CI_RX_BURST_FEATURE_BULK_ALLOC;
> > +           if ((info_bulk_alloc && !req_bulk_alloc) ||
> > +                           (req_features.other_features_mask == CI_RX_BURST_FEATURE_BULK_ALLOC &&
> > +                                           !info_bulk_alloc))
> > +                   continue;
> > +
> > +           /* Ensure the function supports the requested RX offloads. */
> > +           if ((info_features->rx_offloads & req_features.rx_offloads) !=
> > +                           req_features.rx_offloads)
> > +                   continue;
> > +
> > +           /* Ensure the function's SIMD width is compatible with the requested width. */
> > +           if (info_features->simd_width > req_features.simd_width)
> > +                   continue;
> > +
> > +           /* If this is the first valid path found, select it. */
> > +           if (idx == -1) {
> > +                   idx = i;
> > +                   continue;
> > +           }
> > +
> > +           /* At this point, at least one path has already been found that has met the
> > +            * requested criteria. Analyse the current path and select it if it is
> > +            * better than the previously selected one. i.e. if it has a larger SIMD width or
> > +            * if it has the same SIMD width but fewer offloads enabled.
> > +            * if it has the same SIMD width but fewer offloads enabled.
> > +            */
> > +
> > +           if (info_features->simd_width > infos[idx].features.simd_width) {
> > +                   idx = i;
> > +                   continue;
> > +           }
> > +
> > +           /* Use the path with the least offloads that satisfies the requested offloads. */
> > +           if (info_features->simd_width == infos[idx].features.simd_width &&
> 
> I'd probably make this an "else if", since we don't want to bother with
> this if we have just updated idx based on the simd_width.
> 
> > +                           (rte_popcount32(info_features->rx_offloads) <
> > +                                           rte_popcount32(infos[idx].features.rx_offloads)))
> > +                   idx = i;
> > +   }
> > +
> > +   /* No path was found so use the default. */
> > +   if (idx == -1)
> > +           return default_path;
> > +
> > +   return idx;
> > +}
> > +
> >  #endif /* _COMMON_INTEL_RX_H_ */
> > --
> > 2.34.1
> >
