Re: rte_malloc() and alignment

2024-02-07 Thread Dmitry Kozlyuk
2024-02-06 20:46 (UTC-0800), Stephen Hemminger:
> On Tue, 6 Feb 2024 17:17:31 +0100
> Mattias Rönnblom  wrote:
> 
> > The rte_malloc() API documentation has the following to say about the 
> > align parameter:
> > 
> > "If 0, the return is a pointer that is suitably aligned for any kind of 
> > variable (in the same manner as malloc()). Otherwise, the return is a 
> > pointer that is a multiple of align. In this case, it must be a power of 
> > two. (Minimum alignment is the cacheline size, i.e. 64-bytes)"
> > 
> > After reading this, one might be left with the impression that the 
> > parenthesis refers to only the "otherwise" (non-zero-align) case, since 
> > surely, cache line alignment should be sufficient for any kind of 
> > variable and its semantics would be "in the same manner as malloc()".
> > 
> > However, in the actual RTE malloc implementation, any align parameter 
> > value less than RTE_CACHE_LINE_SIZE results in an alignment of 
> > RTE_CACHE_LINE_SIZE, unless I'm missing something.
> > 
> > Is there any conceivable scenario where passing a non-zero align 
> > parameter is useful?
> > 
> > Would it be an improvement to rephrase the documentation to:
> > 
> > "The alignment of the allocated memory meets all of the following criteria:
> > 1) able to hold any built-in type.
> > 2) be at least as large as the align parameter.
> > 3) be at least as large as RTE_CACHE_LINE_SIZE.
> > 
> > The align parameter must be a power-of-2 or 0.
> > "
> > 
> > ...so it actually describes what is implemented? And also adds the 
> > theoretical (?) case of a built-in type requiring > RTE_CACHE_LINE_SIZE 
> > amount of alignment.  
> 
> My reading is that an align of 0 means that rte_malloc() should act the
> same as malloc() and give alignment suitable for the largest type.
> 
> Walking through the code, the real work is in malloc_heap_alloc_on_heap_id(),
> and at this point an align of 0 has been converted to 1.
> 
> /*
>  * Iterates through the freelist for a heap to find a free element with the
>  * biggest size and requested alignment. Will also set size to whatever element
>  * size that was found.
>  * Returns null on failure, or pointer to element on success.
>  */
> static struct malloc_elem *
> find_biggest_element(struct malloc_heap *heap, size_t *size,
>   unsigned int flags, size_t align, bool contig)
> 
> 
> Then the elements are examined with:
> 
> size_t
> malloc_elem_find_max_iova_contig(struct malloc_elem *elem, size_t align)
> 
> But I don't see anywhere that 0 converts to being aligned on sizeof(double)
> which is the largest type.

One may also read "in the same manner as malloc()" as referring to "suitably
aligned", which means that the alignment is "as suitable as malloc()'s"
and it may also be larger. Then comes the assumption that no built-in type has
an alignment larger than a cache line (could vector types be an exception?).

> Not sure who has expertise here?

Added Anatoly.

> The allocator is a bit of a problem child.
> It is complex, slow and critical.


Re: [PATCH v5 4/5] baseband/fpga_5gnr_fec: add AGX100 support

2024-02-07 Thread Maxime Coquelin




On 2/6/24 16:30, Maxime Coquelin wrote:



On 1/23/24 17:54, Hernan Vargas wrote:

Add support for new FPGA variant AGX100 (on Arrow Creek N6000).

Signed-off-by: Hernan Vargas 
---
  doc/guides/bbdevs/fpga_5gnr_fec.rst   |   69 +-
  drivers/baseband/fpga_5gnr_fec/agx100_pmd.h   |  273 
  .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h    |   12 +-
  .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 1230 +++--
  drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h  |    1 -
  5 files changed, 1458 insertions(+), 127 deletions(-)
  create mode 100644 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h



...


+#endif /* _AGX100_H_ */
diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
index 982e956dc819..224684902569 100644
--- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
+++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
@@ -8,6 +8,7 @@
  #include 
  #include 
+#include "agx100_pmd.h"
  #include "vc_5gnr_pmd.h"
  /* Helper macro for logging */
@@ -131,12 +132,21 @@ struct fpga_5gnr_fec_device {
  uint64_t q_assigned_bit_map;
  /** True if this is a PF FPGA 5GNR device. */
  bool pf_device;
+    /** Maximum number of possible queues for this device. */
+    uint8_t total_num_queues;


You missed the comment below from the v4 review:
"
Introduction of total_num_queues should be in a dedicated patch as a
preliminary rework.
"


While at it, please look at the checkpatch too:

WARNING:TYPO_SPELLING: 'worload' may be misspelled - perhaps 'workload'?
#306: FILE: drivers/baseband/fpga_5gnr_fec/agx100_pmd.h:98:
+   uint32_t ea:21, /**< Value of E when worload is CB. */

WARNING:TYPO_SPELLING: 'worload' may be misspelled - perhaps 'workload'?
#405: FILE: drivers/baseband/fpga_5gnr_fec/agx100_pmd.h:197:
+   uint32_t ea:21, /**< Value of E when worload is CB. */

total: 0 errors, 2 warnings, 2030 lines checked
Warning in drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c:
Using rte_smp_[r/w]mb

+    /** FPGA Variant. VC_5GNR_FPGA_VARIANT = 0; AGX100_FPGA_VARIANT = 1. */
+    uint8_t fpga_variant;
  };






Re: rte_malloc() and alignment

2024-02-07 Thread Mattias Rönnblom

On 2024-02-07 05:46, Stephen Hemminger wrote:

> On Tue, 6 Feb 2024 17:17:31 +0100
> Mattias Rönnblom  wrote:
>
> > The rte_malloc() API documentation has the following to say about the
> > align parameter:
> >
> > "If 0, the return is a pointer that is suitably aligned for any kind of
> > variable (in the same manner as malloc()). Otherwise, the return is a
> > pointer that is a multiple of align. In this case, it must be a power of
> > two. (Minimum alignment is the cacheline size, i.e. 64-bytes)"
> >
> > After reading this, one might be left with the impression that the
> > parenthesis refers to only the "otherwise" (non-zero-align) case, since
> > surely, cache line alignment should be sufficient for any kind of
> > variable and its semantics would be "in the same manner as malloc()".
> >
> > However, in the actual RTE malloc implementation, any align parameter
> > value less than RTE_CACHE_LINE_SIZE results in an alignment of
> > RTE_CACHE_LINE_SIZE, unless I'm missing something.
> >
> > Is there any conceivable scenario where passing a non-zero align
> > parameter is useful?
> >
> > Would it be an improvement to rephrase the documentation to:
> >
> > "The alignment of the allocated memory meets all of the following criteria:
> > 1) able to hold any built-in type.
> > 2) be at least as large as the align parameter.
> > 3) be at least as large as RTE_CACHE_LINE_SIZE.
> >
> > The align parameter must be a power-of-2 or 0.
> > "
> >
> > ...so it actually describes what is implemented? And also adds the
> > theoretical (?) case of a built-in type requiring > RTE_CACHE_LINE_SIZE
> > amount of alignment.


> My reading is that an align of 0 means that rte_malloc() should act the
> same as malloc() and give alignment suitable for the largest type.



That would be mine as well, if my Bayesian prior hadn't been "doesn't 
DPDK cache-align *all* heap allocations?".



> Walking through the code, the real work is in malloc_heap_alloc_on_heap_id(),
> and at this point an align of 0 has been converted to 1.
>
> /*
>  * Iterates through the freelist for a heap to find a free element with the
>  * biggest size and requested alignment. Will also set size to whatever element
>  * size that was found.
>  * Returns null on failure, or pointer to element on success.
>  */
> static struct malloc_elem *
> find_biggest_element(struct malloc_heap *heap, size_t *size,
>   unsigned int flags, size_t align, bool contig)




I continued to heap_alloc() (malloc_heap.c:239), and there one can find:

align = RTE_CACHE_LINE_ROUNDUP(align);

That's where I stopped, still knowing I was pretty clueless in regards 
to the full picture.
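
The two steps traced so far (an align of 0 being treated as 1, then the round-up to a cache line in heap_alloc()) can be sketched as below. This is only an illustration under stated assumptions, not DPDK code: SKETCH_CACHE_LINE_SIZE and sketch_effective_align() are made-up names, and a 64-byte cache line is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines; DPDK uses RTE_CACHE_LINE_SIZE. */
#define SKETCH_CACHE_LINE_SIZE ((size_t)64)

/* Hypothetical helper mirroring the behaviour discussed in this thread. */
static size_t
sketch_effective_align(size_t align)
{
	/* Documented contract: align is 0 or a power of two. */
	assert((align & (align - 1)) == 0);

	/* Step 1: an align of 0 is treated as 1. */
	if (align == 0)
		align = 1;

	/* Step 2: round up to a multiple of the cache line size. */
	return (align + SKETCH_CACHE_LINE_SIZE - 1) &
		~(SKETCH_CACHE_LINE_SIZE - 1);
}
```

Under these assumptions, align values of 0, 8 and 64 all end up as 64, and only values above the cache line size change the result.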


There are two reasons to ask for aligned memory. One is to fit a 
certain primitive type into that region, knowing that certain CPUs may be 
slow at, or incapable of, unaligned loads and stores (e.g., for MMX 
registers).


Another is to avoid false sharing between the piece of memory you just 
allocated and adjacent memory blocks (which you know nothing about).


I wonder if we aren't best off keeping those two concerns separate, and 
maybe letting the memory allocator deal with both.


It seems you can solve alignment-for-load/store simply by having all 
allocations be naturally aligned up to a certain, ISA-specific, size (16 
bytes on x86_64, I think). By naturally aligned I mean that a two-byte 
allocation would be aligned by 2, a four-byte by 4 etc.
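
A minimal sketch of that natural-alignment rule follows. The names are hypothetical, the 16-byte cap is the guess from the paragraph above, and rounding sizes that are not a power of two up to the next power of two is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: no load/store needs more than 16-byte alignment. */
#define SKETCH_MAX_NATURAL_ALIGN ((size_t)16)

/* Hypothetical helper: alignment for an allocation of the given size. */
static size_t
sketch_natural_align(size_t size)
{
	size_t align = 1;

	assert(size > 0);

	/* Smallest power of two >= size, capped at the ISA-specific maximum. */
	while (align < size && align < SKETCH_MAX_NATURAL_ALIGN)
		align <<= 1;

	return align;
}
```

With this rule a 2-byte allocation is aligned by 2, a 4-byte one by 4, and anything of 16 bytes or more by 16.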


The false sharing issue is a more difficult one. In a world without 
next-line prefetchers (or more elaborate variants thereof), you could 
just cache-align every distinct allocation (which I'm guessing is the 
rationale for malloc_heap.c:239). In the situation we seem to be in today, 
where not only the line the core loads from or stores to is fetched but 
also the next (few?) line(s), no amount of struct alignment will fix the 
issue; you need guaranteed padding. You would also want a global knob to 
turn all that extra padding off, since disabling hardware prefetchers may 
well be possible (or it may not).


A scenario you want to take into account is one where you have a large 
amount of relatively rarely accessed data, where you don't need to worry 
about false sharing, and thus you don't want any alignment or padding 
beyond what the ISA requires. That would just make the whole thing grow, 
potentially by a lot.


That leads me to something like

void *rte_malloc_socket(size_t n, int socket, unsigned int flags);

Where you get memory which is naturally aligned, and "false 
sharing-protected" by default. Such protection would entail having 
enough padding *between* blocks (both before, and after). With this API, 
libs/apps/PMDs should not use any __rte_cache_aligned or RTE_CACHE_GUARD 
type constructs (except for block-internal padding, which one might 
argue shouldn't be used).


With a flag

#define RTE_MALLOC_FLAG_HINT_RARELY_USED

the application specifies that this data is rarely accessed, so false 
sharing is not a concern -> no padding is required. Or you turn it 
around, so "rarely used" is the default.
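
To make the cost of the default concrete, here is a rough sketch of how much memory one block would occupy with and without the "rarely used" hint. Everything in it is hypothetical: the flag value, the helper name, and the choice of 128 bytes of guard padding on each side (two assumed 64-byte lines) are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value; the proposal above does not define one. */
#define SKETCH_FLAG_HINT_RARELY_USED (1u << 0)

/* Assumption: two 64-byte cache lines of padding on each side. */
#define SKETCH_GUARD_BYTES ((size_t)128)

/* Bytes consumed by one block of n bytes, including guard padding. */
static size_t
sketch_block_footprint(size_t n, unsigned int flags)
{
	assert(n > 0);

	/* Rarely used data: false sharing is not a concern, so no padding. */
	if (flags & SKETCH_FLAG_HINT_RARELY_USED)
		return n;

	/* Default: padding both before and after the block. */
	return SKETCH_GUARD_BYTES + n + SKETCH_GUARD_BYTES;
}
```

With these numbers a 16-byte object occupies 272 bytes by default and 16 bytes with the hint, which is why a large set of small, rarely accessed objects would want to opt out.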


You could also have a flag
#define RTE_MALLOC_FLAG_NO_ALIGNMENT
which would turn off natural al

Re: [PATCH 1/2] baseband/acc: fix logtypes register

2024-02-07 Thread Maxime Coquelin




On 12/18/23 16:43, David Marchand wrote:

This library was calling RTE_LOG_REGISTER_DEFAULT twice, which means that
all logs for both acc100 and vrb drivers would be emitted for
pmd.baseband.acc logtype.

It seems the intent was to have dedicated logtypes per driver, so
register one for each with a suffix.

Fixes: c2d93488c7c3 ("baseband/acc200: introduce ACC200")

Signed-off-by: David Marchand 
---
  drivers/baseband/acc/rte_acc100_pmd.c | 4 ++--
  drivers/baseband/acc/rte_vrb_pmd.c| 4 ++--
  2 files changed, 4 insertions(+), 4 deletions(-)



Applied to next-baseband.

Thanks,
Maxime



Re: [PATCH 2/2] baseband/acc: fix common logs

2024-02-07 Thread Maxime Coquelin




On 12/18/23 16:43, David Marchand wrote:

Logs generated by helpers common to acc100 and vrb drivers were
emitted with a RTE_LOG_NOTICE == 6 == RTE_LOGTYPE_HASH.
Register a dedicated logtype for this.

Fixes: 32e8b7ea35dd ("baseband/acc100: refactor to segregate common code")

Signed-off-by: David Marchand 
---
  drivers/baseband/acc/acc_common.c | 7 +++
  drivers/baseband/acc/acc_common.h | 4 +++-
  drivers/baseband/acc/meson.build  | 2 +-
  3 files changed, 11 insertions(+), 2 deletions(-)
  create mode 100644 drivers/baseband/acc/acc_common.c



Applied to next-baseband.

Thanks,
Maxime




Re: [PATCH v1 0/1] baseband/acc: refactor of DMA response

2024-02-07 Thread Maxime Coquelin




On 1/10/24 23:28, Nicolas Chautru wrote:

Based on previous discussion last year with Maxime, refactoring a bit
the VRB PMD response as multiple functions have very similar code
when updating status based on DMA response.

Nicolas Chautru (1):
   baseband/acc: refactor of DMA response

  drivers/baseband/acc/rte_vrb_pmd.c | 139 +
  1 file changed, 40 insertions(+), 99 deletions(-)



Applied to next-baseband.

Thanks,
Maxime



Re: [PATCH v1 0/1] baseband/acc: remove ACC101 variant

2024-02-07 Thread Maxime Coquelin




On 1/12/24 21:36, Hernan Vargas wrote:

Removing obsolete code for ACC101 variant which will not be productized.

Hernan Vargas (1):
   baseband/acc: remove acc101

  doc/guides/bbdevs/acc100.rst   |  18 +-
  doc/guides/rel_notes/release_24_03.rst |   1 +
  drivers/baseband/acc/acc100_pmd.h  |   5 +-
  drivers/baseband/acc/acc101_pmd.h  |  40 ---
  drivers/baseband/acc/rte_acc100_pmd.c  | 425 -
  5 files changed, 6 insertions(+), 483 deletions(-)
  delete mode 100644 drivers/baseband/acc/acc101_pmd.h



Applied to next-baseband.

Thanks,
Maxime



Re: [PATCH 1/4] ethdev: introduce encap hash calculation

2024-02-07 Thread Thomas Monjalon
07/02/2024 07:56, Ori Kam:
> Hi Thomas,
> 
> > -Original Message-
> > From: Thomas Monjalon 
> > Sent: Wednesday, February 7, 2024 12:40 AM
> > 
> > 28/01/2024 10:39, Ori Kam:
> > > During the encapsulation of a packet, it is expected to calculate the
> > > hash value which is based on the original packet (the outer values,
> > > which will become the inner values).
> > 
> > It is not clear what the hash is for.
> 
> Will add explanation.
> 
> > 
> > > The tunnel protocol defines which tunnel field should hold this hash,
> > > but it doesn't define the hash calculation algorithm.
> > 
> > If the hash is stored in the packet header,
> > I expect it to be reproducible when being checked.
> > How can the algorithm be undefined?
> > 
> The hash is not being checked; it is used to distribute the packets to
> different queues.
> The actual value is not important. It is critical that all packets that
> belong to the same flow have the same hash value.

That's the missing explanation.
Please describe that you are talking about an internal hash
used for distributing packets among queues.
You should also explain how it differs from RSS.


> > > An application that uses flow offloads gets the first few packets
> > > and then decides to offload the flow. As a result, there are two
> > > different paths that a packet from a given flow may take.
> > > SW for the first few packets or HW for the rest.
> > > When the packet goes through the SW, the SW encapsulates the packet
> > > and must use the same hash calculation as the HW will do for
> > > the rest of the packets in this flow.
> > >
> > > This patch gives the SW a way to query the hash value
> > > for a given packet as if the packet was passed through the HW.
> > >
> > > Signed-off-by: Ori Kam 
> > > ---
> > > +Calculate encap hash
> > > +
> > > +
> > > +Calculating hash of a packet in SW as it would be calculated in HW for 
> > > the
> > encap action
> > 
> > We should give the real full name of the flow action.
> > 
> > > +
> > > +When the HW execute an encapsulation action, it may calculate an hash
> > value which is based
> > > +on the original packet. This hash is stored depending on the 
> > > encapsulation
> > protocol, in one
> > > +of the outer fields.
> > 
> > Give an example of such encapsulation protocol?
> 
> Sure,
> Just to be clear something like this?
> When the HW executes an encapsulation action, for example for a VXLAN tunnel, it
> may ...

I was more thinking about saying which fields are hashed in VXLAN,
and more importantly how/when the hash is used.
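
As one possible illustration: for VXLAN, RFC 7348 recommends deriving the outer UDP source port from a hash of inner packet fields, within the range 49152-65535. The sketch below uses FNV-1a purely as a stand-in hash. The struct, the function names and the choice of hashed fields are all hypothetical, and real hardware may hash differently, which is why software needs a way to query the value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key built from the original (soon to be inner) packet. */
struct sketch_flow_key {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t proto;
};

/* Mix the low nbytes bytes of value into a 32-bit FNV-1a hash. */
static uint32_t
sketch_fnv1a_step(uint32_t hash, uint32_t value, size_t nbytes)
{
	size_t i;

	assert(nbytes <= sizeof(value));
	for (i = 0; i < nbytes; i++) {
		hash ^= (value >> (8 * i)) & 0xffu;
		hash *= 16777619u; /* FNV prime */
	}
	return hash;
}

/* Same flow key in, same outer UDP source port out. */
static uint16_t
sketch_vxlan_src_port(const struct sketch_flow_key *key)
{
	uint32_t hash = 2166136261u; /* FNV offset basis */

	hash = sketch_fnv1a_step(hash, key->src_ip, 4);
	hash = sketch_fnv1a_step(hash, key->dst_ip, 4);
	hash = sketch_fnv1a_step(hash, key->src_port, 2);
	hash = sketch_fnv1a_step(hash, key->dst_port, 2);
	hash = sketch_fnv1a_step(hash, key->proto, 1);

	/* Fold into 49152-65535, i.e. 16384 possible values. */
	return (uint16_t)(49152u + (hash & 0x3fffu));
}
```

The property that matters here is determinism per flow: if the SW path and the HW path disagree on the function, packets of one flow end up with different outer headers and may be steered to different queues.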




Re: [PATCH v3] ethdev: fast path async flow API

2024-02-07 Thread Thomas Monjalon
07/02/2024 01:57, Ferruh Yigit:
> On 2/6/2024 10:21 PM, Thomas Monjalon wrote:
> > 06/02/2024 18:36, Dariusz Sosnowski:
> >> --- a/doc/guides/nics/build_and_test.rst
> >> +++ b/doc/guides/nics/build_and_test.rst
> >> +- ``RTE_FLOW_DEBUG`` (default **disabled**; enabled automatically on 
> >> debug builds)
> >> +
> >> +  Build with debug code in asynchronous flow APIs.
> >> +
> >>  .. Note::
> >>
> >> -   The ethdev library use above options to wrap debug code to trace 
> >> invalid parameters
> >> +   The ethdev library uses above options to wrap debug code to trace 
> >> invalid parameters
> >> on data path APIs, so performance downgrade is expected when enabling 
> >> those options.
> >> -   Each PMD can decide to reuse them to wrap their own debug code in the 
> >> Rx/Tx path.
> >> +   Each PMD can decide to reuse them to wrap their own debug code in the 
> >> Rx/Tx path
> >> +   and in asynchronous flow APIs implementation.
> > 
> > Good
> > 
> >> --- a/doc/guides/rel_notes/release_24_03.rst
> >> +++ b/doc/guides/rel_notes/release_24_03.rst
> >> +* ethdev: PMDs implementing asynchronous flow operations are required to 
> >> provide relevant functions
> >> +  implementation through ``rte_flow_fp_ops`` struct, instead of 
> >> ``rte_flow_ops`` struct.
> >> +  Pointer to device-dependent ``rte_flow_fp_ops`` should be provided to 
> >> ``rte_eth_dev.flow_fp_ops``.
> > 
> > That's a change only for the driver.
> > If there is no change for the application, it should not appear in the 
> > release notes.
> > BTW, API means Application Programming Interface :)
> > 
> >> +  This change applies to the following API functions:
> >> +
> >> +   * ``rte_flow_async_create``
> >> +   * ``rte_flow_async_create_by_index``
> >> +   * ``rte_flow_async_actions_update``
> >> +   * ``rte_flow_async_destroy``
> >> +   * ``rte_flow_push``
> >> +   * ``rte_flow_pull``
> >> +   * ``rte_flow_async_action_handle_create``
> >> +   * ``rte_flow_async_action_handle_destroy``
> >> +   * ``rte_flow_async_action_handle_update``
> >> +   * ``rte_flow_async_action_handle_query``
> >> +   * ``rte_flow_async_action_handle_query_update``
> >> +   * ``rte_flow_async_action_list_handle_create``
> >> +   * ``rte_flow_async_action_list_handle_destroy``
> >> +   * ``rte_flow_async_action_list_handle_query_update``
> >> +
> >> +* ethdev: Removed the following fields from ``rte_flow_ops`` struct:
> >> +
> >> +   * ``async_create``
> >> +   * ``async_create_by_index``
> >> +   * ``async_actions_update``
> >> +   * ``async_destroy``
> >> +   * ``push``
> >> +   * ``pull``
> >> +   * ``async_action_handle_create``
> >> +   * ``async_action_handle_destroy``
> >> +   * ``async_action_handle_update``
> >> +   * ``async_action_handle_query``
> >> +   * ``async_action_handle_query_update``
> >> +   * ``async_action_list_handle_create``
> >> +   * ``async_action_list_handle_destroy``
> >> +   * ``async_action_list_handle_query_update``
> > 
> > [...]
> >> --- a/lib/ethdev/ethdev_driver.h
> >> +++ b/lib/ethdev/ethdev_driver.h
> >> @@ -71,6 +71,10 @@ struct rte_eth_dev {
> >>struct rte_eth_dev_data *data;
> >>void *process_private; /**< Pointer to per-process device data */
> >>const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
> >> +  /**
> >> +   * Fast path flow API functions exported by PMD.
> >> +   */
> > 
> > This comment may be on one single line.
> > 
> >> +  const struct rte_flow_fp_ops *flow_fp_ops;
> >>struct rte_device *device; /**< Backing device */
> >>struct rte_intr_handle *intr_handle; /**< Device interrupt handle */
> > 
> >> --- a/lib/ethdev/meson.build
> >> +++ b/lib/ethdev/meson.build
> >> +if get_option('buildtype').contains('debug')
> >> +cflags += ['-DRTE_FLOW_DEBUG']
> >> +endif
> > 
> > This looks OK.
> > 
> > Acked-by: Thomas Monjalon 
> > 
> > 
> 
> Acked-by: Ferruh Yigit 
> 
> Applied to dpdk-next-net/main, thanks.

Ferruh, I was expecting a new version.
Did you address the comments above yourself?
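
For context, the kind of code that the RTE_FLOW_DEBUG option in the patch is meant to guard can be sketched as follows. This is a simplified illustration, not the patch itself: all sketch_* types and functions are hypothetical stand-ins, and only the macro name RTE_FLOW_DEBUG comes from the quoted diff.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down fast path ops table. */
struct sketch_flow_fp_ops {
	int (*push)(void *dev_private, uint32_t queue_id);
};

/* Hypothetical device structure holding a pointer to the ops table. */
struct sketch_eth_dev {
	void *dev_private;
	const struct sketch_flow_fp_ops *flow_fp_ops;
};

/* Fast path wrapper: parameter checks exist only in debug builds. */
static inline int
sketch_flow_push(const struct sketch_eth_dev *dev, uint32_t queue_id)
{
#ifdef RTE_FLOW_DEBUG
	if (dev == NULL || dev->flow_fp_ops == NULL ||
	    dev->flow_fp_ops->push == NULL)
		return -EINVAL;
#endif
	return dev->flow_fp_ops->push(dev->dev_private, queue_id);
}

/* Example driver callback, used only to exercise the wrapper. */
static int
sketch_pmd_push(void *dev_private, uint32_t queue_id)
{
	(void)dev_private;
	return (int)queue_id;
}
```

In a non-debug build the wrapper is a single indirect call, which is the point of moving these callbacks to a dedicated fast path table.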




Re: [PATCH v3 01/11] eventdev: improve doxygen introduction text

2024-02-07 Thread Jerin Jacob
On Fri, Feb 2, 2024 at 7:29 PM Bruce Richardson
 wrote:
>
> Make some textual improvements to the introduction to eventdev and event
> devices in the eventdev header file. This text appears in the doxygen
> output for the header file, and introduces the key concepts, for
> example: events, event devices, queues, ports and scheduling.
>
> This patch makes the following improvements:
> * small textual fixups, e.g. correcting use of singular/plural
> * rewrites of some sentences to improve clarity
> * using doxygen markdown to split the whole large block up into
>   sections, thereby making it easier to read.
>
> No large-scale changes are made, and blocks are not reordered
>
> Signed-off-by: Bruce Richardson 

Thanks Bruce. While you are cleaning up, please add the following or a
similar change to fix struct rte_event_vector not being parsed properly,
i.e. its members show up as global variables in the HTML files.

[dpdk.org] $ git diff
diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index e31c927905..ce4a195a8f 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1309,9 +1309,9 @@ struct rte_event_vector {
 */
struct {
uint16_t port;
-   /* Ethernet device port id. */
+   /**< Ethernet device port id. */
uint16_t queue;
-   /* Ethernet device queue id. */
+   /**< Ethernet device queue id. */
};
};
/**< Union to hold common attributes of the vector array. */
@@ -1340,7 +1340,11 @@ struct rte_event_vector {
 * vector array can be an array of mbufs or pointers or opaque u64
 * values.
 */
+#ifndef __DOXYGEN__
 } __rte_aligned(16);
+#else
+};
+#endif

 /* Scheduler type definitions */
 #define RTE_SCHED_TYPE_ORDERED  0
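
The two doxygen points in the diff above can be seen in a stand-alone sketch. The struct is hypothetical; `__attribute__((aligned(16)))` is the GCC/Clang spelling used here in place of `__rte_aligned(16)`, and it is assumed that the doxygen configuration predefines `__DOXYGEN__`, as the diff implies.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_vector {
	uint16_t nb_elem;
	/**< Number of elements. The "<" attaches this to the member above. */
	struct {
		uint16_t port;
		/**< Ethernet device port id. */
		uint16_t queue;
		/**< Ethernet device queue id. */
	};
#ifndef __DOXYGEN__
} __attribute__((aligned(16)));
#else
}; /* doxygen sees a plain struct end and parses the members correctly */
#endif

#ifndef __DOXYGEN__
/* The compiler still sees the 16-byte alignment. */
static_assert(_Alignof(struct sketch_vector) == 16, "alignment lost");
#endif
```

A plain `/* ... */` comment is ignored by doxygen, so only the `/**<` form documents the members.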

>
> ---
> V3: reworked following feedback from Mattias
> ---
>  lib/eventdev/rte_eventdev.h | 132 ++--
>  1 file changed, 81 insertions(+), 51 deletions(-)
>
> diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> index ec9b02455d..a741832e8e 100644
> --- a/lib/eventdev/rte_eventdev.h
> +++ b/lib/eventdev/rte_eventdev.h
> @@ -12,25 +12,33 @@
>   * @file
>   *
>   * RTE Event Device API
> + * 
>   *
> - * In a polling model, lcores poll ethdev ports and associated rx queues
> - * directly to look for packet. In an event driven model, by contrast, lcores
> - * call the scheduler that selects packets for them based on programmer
> - * specified criteria. Eventdev library adds support for event driven
> - * programming model, which offer applications automatic multicore scaling,
> - * dynamic load balancing, pipelining, packet ingress order maintenance and
> - * synchronization services to simplify application packet processing.
> + * In a traditional run-to-completion application model, lcores pick up 
> packets

Can we keep it as poll mode instead of run-to-completion, as event mode also
supports run-to-completion by having dequeue() and then Tx?

> + * from Ethdev ports and associated RX queues, run the packet processing to 
> completion,
> + * and enqueue the completed packets to a TX queue. NIC-level receive-side 
> scaling (RSS)
> + * may be used to balance the load across multiple CPU cores.
> + *
> + * In contrast, in an event-driver model, as supported by this "eventdev" 
> library,
> + * incoming packets are fed into an event device, which schedules those 
> packets across

packets -> events. We may need to bring in the Rx adapter if the event is a packet.

> + * the available lcores, in accordance with its configuration.
> + * This event-driven programming model offers applications automatic 
> multicore scaling,
> + * dynamic load balancing, pipelining, packet order maintenance, 
> synchronization,
> + * and prioritization/quality of service.
>   *
>   * The Event Device API is composed of two parts:
>   *
>   * - The application-oriented Event API that includes functions to setup
>   *   an event device (configure it, setup its queues, ports and start it), to
> - *   establish the link between queues to port and to receive events, and so 
> on.
> + *   establish the links between queues and ports to receive events, and so 
> on.
>   *
>   * - The driver-oriented Event API that exports a function allowing
> - *   an event poll Mode Driver (PMD) to simultaneously register itself as
> + *   an event poll Mode Driver (PMD) to register itself as
>   *   an event device driver.
>   *
> + * Application-oriented Event API
> + * --
> + *
>   * Event device components:
>   *
>   * +-+
> @@ -75,27 +83,39 @@
>   *|   |
>   *+---+
>   *
> - * Event device: A hardware or software-bas

Re: [PATCH v3 03/11] eventdev: update documentation on device capability flags

2024-02-07 Thread Jerin Jacob
On Sat, Feb 3, 2024 at 12:59 PM Bruce Richardson
 wrote:
>
> Update the device capability docs, to:
>
> * include more cross-references
> * split longer text into paragraphs, in most cases with each flag having
>   a single-line summary at the start of the doc block
> * general comment rewording and clarification as appropriate
>
> Signed-off-by: Bruce Richardson 
> ---
> V3: Updated following feedback from Mattias
> ---
>  lib/eventdev/rte_eventdev.h | 130 +---
>  1 file changed, 92 insertions(+), 38 deletions(-)

>   */
>  #define RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED   (1ULL << 2)
>  /**< Event device operates in distributed scheduling mode.
> + *
>   * In distributed scheduling mode, event scheduling happens in HW or
> - * rte_event_dequeue_burst() or the combination of these two.
> + * rte_event_dequeue_burst() / rte_event_enqueue_burst() or the combination 
> of these two.
>   * If the flag is not set then eventdev is centralized and thus needs a
>   * dedicated service core that acts as a scheduling thread .

Please remove the space between "thread" and "." in the existing code.

>   *
> - * @see rte_event_dequeue_burst()
> + * @see rte_event_dev_service_id_get

Could you add () to all the function names so that it looks good across the series?


>   */
>  #define RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES (1ULL << 3)
>  /**< Event device is capable of enqueuing events of any type to any queue.
> - * If this capability is not set, the queue only supports events of the
> - *  *RTE_SCHED_TYPE_* type that it was created with.
>   *
> - * @see RTE_SCHED_TYPE_* values
> + * If this capability is not set, each queue only supports events of the
> + * *RTE_SCHED_TYPE_* type that it was created with.
> + * The behaviour when events of other scheduling types are sent to the queue 
> is
> + * currently undefined.

I think, in the header file, we can remove "currently".


>   */
>
>  #define RTE_EVENT_DEV_CAP_PROFILE_LINK (1ULL << 12)
> -/**< Event device is capable of supporting multiple link profiles per event 
> port
> - * i.e., the value of `rte_event_dev_info::max_profiles_per_port` is greater
> - * than one.
> +/**< Event device is capable of supporting multiple link profiles per event 
> port.
> + *
> + *

The above line can be removed.

> + * When set, the value of `rte_event_dev_info::max_profiles_per_port` is 
> greater
> + * than one, and multiple profiles may be configured and then switched at 
> runtime.
> + * If not set, only a single profile may be configured, which may itself be
> + * runtime adjustable (if @ref RTE_EVENT_DEV_CAP_RUNTIME_PORT_LINK is set).
> + *
> + * @see rte_event_port_profile_links_set rte_event_port_profile_links_get
> + * @see rte_event_port_profile_switch
> + * @see RTE_EVENT_DEV_CAP_RUNTIME_PORT_LINK
>   */
>
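
As a small illustration of how an application might act on these flags, the sketch below tests capability bits. The bit positions are copied from the quoted header, but the SKETCH_* names and helper functions are hypothetical. In a real application the capability word would presumably come from the event_dev_cap field filled in by rte_event_dev_info_get().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as shown in the quoted header; names are local stand-ins. */
#define SKETCH_CAP_DISTRIBUTED_SCHED (1ULL << 2)
#define SKETCH_CAP_QUEUE_ALL_TYPES   (1ULL << 3)
#define SKETCH_CAP_PROFILE_LINK      (1ULL << 12)

/* Without distributed scheduling, a dedicated service core is needed. */
static bool
sketch_needs_service_core(uint64_t caps)
{
	return (caps & SKETCH_CAP_DISTRIBUTED_SCHED) == 0;
}

/* With PROFILE_LINK set, multiple link profiles per port may be used. */
static bool
sketch_supports_multiple_profiles(uint64_t caps)
{
	return (caps & SKETCH_CAP_PROFILE_LINK) != 0;
}
```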


Re: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop

2024-02-07 Thread Ferruh Yigit
On 2/7/2024 6:46 AM, Rahul Bhansali wrote:
> 
> 
>> -Original Message-
>> From: Ferruh Yigit 
>> Sent: Tuesday, February 6, 2024 11:55 PM
>> To: Rahul Bhansali ; dev@dpdk.org; Radu Nicolau
>> ; Akhil Goyal ; Konstantin
>> Ananyev ; Anoob Joseph
>> 
>> Subject: [EXT] Re: [PATCH] examples/ipsec-secgw: fix IPsec performance drop
>>
>> External Email
>>
>> --
>> On 2/6/2024 12:38 PM, Rahul Bhansali wrote:
>>> Single packet free using rte_pktmbuf_free_bulk() is dropping the
>>> performance. On cn10k, maximum of ~4% drop observed for IPsec event
>>> mode single SA outbound case.
>>>
>>> To fix this issue, single packet free will use rte_pktmbuf_free API.
>>>
>>> Fixes: bd7c063561b3 ("examples/ipsec-secgw: use bulk free")
>>>
>>> Signed-off-by: Rahul Bhansali 
>>> ---
>>>  examples/ipsec-secgw/ipsec-secgw.h | 7 +++
>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/examples/ipsec-secgw/ipsec-secgw.h
>>> b/examples/ipsec-secgw/ipsec-secgw.h
>>> index 8baab44ee7..ec33a982df 100644
>>> --- a/examples/ipsec-secgw/ipsec-secgw.h
>>> +++ b/examples/ipsec-secgw/ipsec-secgw.h
>>> @@ -229,11 +229,10 @@ free_reassembly_fail_pkt(struct rte_mbuf *mb)  }
>>>
>>>  /* helper routine to free bulk of packets */ -static inline void
>>> -free_pkts(struct rte_mbuf *mb[], uint32_t n)
>>> +static __rte_always_inline void
>>> +free_pkts(struct rte_mbuf *mb[], const uint32_t n)
>>>  {
>>> -   rte_pktmbuf_free_bulk(mb, n);
>>> -
>>> +   n == 1 ? rte_pktmbuf_free(mb[0]) : rte_pktmbuf_free_bulk(mb, n);
>>> core_stats_update_drop(n);
>>>  }
>>>
>>
>> Hi Rahul,
>>
>> Do you think the 'rte_pktmbuf_free_bulk()' API performance can be improved by
>> similar change?
> 
> Hi Ferruh,
> Currently 'rte_pktmbuf_free_bulk()' is not inline. If we make both it and
> '__rte_pktmbuf_free_seg_via_array()' inline, then performance can be
> improved similarly.
>

Ah, so the performance improvement comes from 'rte_pktmbuf_free()' being
inline, OK.

As you are doing performance testing in that area, can you please check
if '__rte_pktmbuf_free_seg_via_array()' is inlined? As it is a static
function I expect it to be inlined. If not, can you please test with
forced inlining (__rte_always_inline)?


And I wonder whether the bulk() API getting a single mbuf is a common
theme. If it brings a ~4% improvement, does it make sense to add a new
inline wrapper to the library to cover this case, like:
```
static inline void
rte_pktmbuf_free_bulk_or_one(... **mb, unsigned int n)
{
	if (n == 1)
		rte_pktmbuf_free(mb[0]);
	else
		rte_pktmbuf_free_bulk(mb, n);
}
```
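
A self-contained version of the same idea, which can be compiled without DPDK, might look like the sketch below. The sketch_* mbuf type and free functions are hypothetical stand-ins for struct rte_mbuf, rte_pktmbuf_free() and rte_pktmbuf_free_bulk(); the call counters exist only to show which path is taken.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_mbuf. */
struct sketch_mbuf {
	int freed;
};

static unsigned int sketch_single_calls;
static unsigned int sketch_bulk_calls;

/* Stand-in for the inline single-packet free. */
static inline void
sketch_free(struct sketch_mbuf *m)
{
	sketch_single_calls++;
	m->freed = 1;
}

/* Stand-in for the non-inline bulk free. */
static void
sketch_free_bulk(struct sketch_mbuf **mb, unsigned int n)
{
	unsigned int i;

	sketch_bulk_calls++;
	for (i = 0; i < n; i++)
		mb[i]->freed = 1;
}

/* Wrapper: take the cheap inline path when there is exactly one packet. */
static inline void
sketch_free_bulk_or_one(struct sketch_mbuf **mb, unsigned int n)
{
	assert(mb != NULL || n == 0);

	if (n == 1)
		sketch_free(mb[0]);
	else
		sketch_free_bulk(mb, n);
}
```

Whether such a wrapper pays off in general would still need measuring; the ~4% figure above is for one platform and one test case.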



Re: [PATCH v3] ethdev: fast path async flow API

2024-02-07 Thread Ferruh Yigit
On 2/7/2024 9:27 AM, Thomas Monjalon wrote:
> 07/02/2024 01:57, Ferruh Yigit:
>> On 2/6/2024 10:21 PM, Thomas Monjalon wrote:
>>> 06/02/2024 18:36, Dariusz Sosnowski:
>>>> --- a/doc/guides/nics/build_and_test.rst
>>>> +++ b/doc/guides/nics/build_and_test.rst
>>>> +- ``RTE_FLOW_DEBUG`` (default **disabled**; enabled automatically on
>>>> debug builds)
>>>> +
>>>> +  Build with debug code in asynchronous flow APIs.
>>>> +
>>>>  .. Note::
>>>>
>>>> -   The ethdev library use above options to wrap debug code to trace
>>>> invalid parameters
>>>> +   The ethdev library uses above options to wrap debug code to trace
>>>> invalid parameters
>>>> on data path APIs, so performance downgrade is expected when enabling
>>>> those options.
>>>> -   Each PMD can decide to reuse them to wrap their own debug code in the
>>>> Rx/Tx path.
>>>> +   Each PMD can decide to reuse them to wrap their own debug code in the
>>>> Rx/Tx path
>>>> +   and in asynchronous flow APIs implementation.
>>>
>>> Good
>>>
>>>> --- a/doc/guides/rel_notes/release_24_03.rst
>>>> +++ b/doc/guides/rel_notes/release_24_03.rst
>>>> +* ethdev: PMDs implementing asynchronous flow operations are required to
>>>> provide relevant functions
>>>> +  implementation through ``rte_flow_fp_ops`` struct, instead of
>>>> ``rte_flow_ops`` struct.
>>>> +  Pointer to device-dependent ``rte_flow_fp_ops`` should be provided to
>>>> ``rte_eth_dev.flow_fp_ops``.
>>>
>>> That's a change only for the driver.
>>> If there is no change for the application, it should not appear in the 
>>> release notes.
>>> BTW, API means Application Programming Interface :)
>>>
>>>> +  This change applies to the following API functions:
>>>> +
>>>> +   * ``rte_flow_async_create``
>>>> +   * ``rte_flow_async_create_by_index``
>>>> +   * ``rte_flow_async_actions_update``
>>>> +   * ``rte_flow_async_destroy``
>>>> +   * ``rte_flow_push``
>>>> +   * ``rte_flow_pull``
>>>> +   * ``rte_flow_async_action_handle_create``
>>>> +   * ``rte_flow_async_action_handle_destroy``
>>>> +   * ``rte_flow_async_action_handle_update``
>>>> +   * ``rte_flow_async_action_handle_query``
>>>> +   * ``rte_flow_async_action_handle_query_update``
>>>> +   * ``rte_flow_async_action_list_handle_create``
>>>> +   * ``rte_flow_async_action_list_handle_destroy``
>>>> +   * ``rte_flow_async_action_list_handle_query_update``
>>>> +
>>>> +* ethdev: Removed the following fields from ``rte_flow_ops`` struct:
>>>> +
>>>> +   * ``async_create``
>>>> +   * ``async_create_by_index``
>>>> +   * ``async_actions_update``
>>>> +   * ``async_destroy``
>>>> +   * ``push``
>>>> +   * ``pull``
>>>> +   * ``async_action_handle_create``
>>>> +   * ``async_action_handle_destroy``
>>>> +   * ``async_action_handle_update``
>>>> +   * ``async_action_handle_query``
>>>> +   * ``async_action_handle_query_update``
>>>> +   * ``async_action_list_handle_create``
>>>> +   * ``async_action_list_handle_destroy``
>>>> +   * ``async_action_list_handle_query_update``
>>>
>>> [...]
 --- a/lib/ethdev/ethdev_driver.h
 +++ b/lib/ethdev/ethdev_driver.h
 @@ -71,6 +71,10 @@ struct rte_eth_dev {
struct rte_eth_dev_data *data;
void *process_private; /**< Pointer to per-process device data */
const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
 +  /**
 +   * Fast path flow API functions exported by PMD.
 +   */
>>>
>>> This comment may be on one single line.
>>>
 +  const struct rte_flow_fp_ops *flow_fp_ops;
struct rte_device *device; /**< Backing device */
struct rte_intr_handle *intr_handle; /**< Device interrupt handle */
>>>
 --- a/lib/ethdev/meson.build
 +++ b/lib/ethdev/meson.build
 +if get_option('buildtype').contains('debug')
 +cflags += ['-DRTE_FLOW_DEBUG']
 +endif
>>>
>>> This looks OK.
>>>
>>> Acked-by: Thomas Monjalon 
>>>
>>>
>>
>> Acked-by: Ferruh Yigit 
>>
>> Applied to dpdk-next-net/main, thanks.
> 
> Ferruh, I was expecting a new version.
> Did you address yourself the comments above?
> 
> 

No, I missed the comment, if it is simple I can apply in next-net, let
me sync with Dariusz.



Re: [PATCH v3] ethdev: fast path async flow API

2024-02-07 Thread Ferruh Yigit
On 2/7/2024 10:47 AM, Ferruh Yigit wrote:
> On 2/7/2024 9:27 AM, Thomas Monjalon wrote:
>> 07/02/2024 01:57, Ferruh Yigit:
>>> On 2/6/2024 10:21 PM, Thomas Monjalon wrote:
 06/02/2024 18:36, Dariusz Sosnowski:
> --- a/doc/guides/nics/build_and_test.rst
> +++ b/doc/guides/nics/build_and_test.rst
> +- ``RTE_FLOW_DEBUG`` (default **disabled**; enabled automatically on 
> debug builds)
> +
> +  Build with debug code in asynchronous flow APIs.
> +
>  .. Note::
>
> -   The ethdev library use above options to wrap debug code to trace 
> invalid parameters
> +   The ethdev library uses above options to wrap debug code to trace 
> invalid parameters
> on data path APIs, so performance downgrade is expected when enabling 
> those options.
> -   Each PMD can decide to reuse them to wrap their own debug code in the 
> Rx/Tx path.
> +   Each PMD can decide to reuse them to wrap their own debug code in the 
> Rx/Tx path
> +   and in asynchronous flow APIs implementation.

 Good

> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> +* ethdev: PMDs implementing asynchronous flow operations are required to 
> provide relevant functions
> +  implementation through ``rte_flow_fp_ops`` struct, instead of 
> ``rte_flow_ops`` struct.
> +  Pointer to device-dependent ``rte_flow_fp_ops`` should be provided to 
> ``rte_eth_dev.flow_fp_ops``.

 That's a change only for the driver.
 If there is no change for the application, it should not appear in the 
 release notes.
 BTW, API means Application Programming Interface :)

> +  This change applies to the following API functions:
> +
> +   * ``rte_flow_async_create``
> +   * ``rte_flow_async_create_by_index``
> +   * ``rte_flow_async_actions_update``
> +   * ``rte_flow_async_destroy``
> +   * ``rte_flow_push``
> +   * ``rte_flow_pull``
> +   * ``rte_flow_async_action_handle_create``
> +   * ``rte_flow_async_action_handle_destroy``
> +   * ``rte_flow_async_action_handle_update``
> +   * ``rte_flow_async_action_handle_query``
> +   * ``rte_flow_async_action_handle_query_update``
> +   * ``rte_flow_async_action_list_handle_create``
> +   * ``rte_flow_async_action_list_handle_destroy``
> +   * ``rte_flow_async_action_list_handle_query_update``
> +
> +* ethdev: Removed the following fields from ``rte_flow_ops`` struct:
> +
> +   * ``async_create``
> +   * ``async_create_by_index``
> +   * ``async_actions_update``
> +   * ``async_destroy``
> +   * ``push``
> +   * ``pull``
> +   * ``async_action_handle_create``
> +   * ``async_action_handle_destroy``
> +   * ``async_action_handle_update``
> +   * ``async_action_handle_query``
> +   * ``async_action_handle_query_update``
> +   * ``async_action_list_handle_create``
> +   * ``async_action_list_handle_destroy``
> +   * ``async_action_list_handle_query_update``

 [...]
> --- a/lib/ethdev/ethdev_driver.h
> +++ b/lib/ethdev/ethdev_driver.h
> @@ -71,6 +71,10 @@ struct rte_eth_dev {
>   struct rte_eth_dev_data *data;
>   void *process_private; /**< Pointer to per-process device data */
>   const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
> + /**
> +  * Fast path flow API functions exported by PMD.
> +  */

 This comment may be on one single line.

> + const struct rte_flow_fp_ops *flow_fp_ops;
>   struct rte_device *device; /**< Backing device */
>   struct rte_intr_handle *intr_handle; /**< Device interrupt handle */

> --- a/lib/ethdev/meson.build
> +++ b/lib/ethdev/meson.build
> +if get_option('buildtype').contains('debug')
> +cflags += ['-DRTE_FLOW_DEBUG']
> +endif

 This looks OK.

 Acked-by: Thomas Monjalon 


>>>
>>> Acked-by: Ferruh Yigit 
>>>
>>> Applied to dpdk-next-net/main, thanks.
>>
>> Ferruh, I was expecting a new version.
>> Did you address yourself the comments above?
>>
>>
> 
> No, I missed the comment, if it is simple I can apply in next-net, let
> me sync with Dariusz.
> 

As synced with Dariusz, there is no good place to document
ethdev-driver interface changes in the release notes.

Also, this release had more ethdev-driver interface changes,
around get_ptype(), and those are not documented in the release notes
either, so I will remove these ones too.


But for future release notes, @Thomas, @John, what do you think about
adding a new section (or sub-section) for "internal interface"
(device abstraction - drivers) changes?



Re: [PATCH] net/mana: start secondary process queues by default

2024-02-07 Thread Ferruh Yigit
On 1/31/2024 12:46 AM, lon...@linuxonhyperv.com wrote:
> From: Long Li 
> 
> Secondary processes are started after primary, and in most cases with
> the device already started. Make them able to process packets as
> soon as they start.
> 
> This also works with the case where the primary process decides to start
> the device at a later time after secondary processes have started. The
> application should guarantee not to send any packets before the device is
> started.
> 
> Signed-off-by: Long Li 
> 

Applied to dpdk-next-net/main, thanks.



[PATCH] doc: remove cmdline deprecation notice

2024-02-07 Thread Dariusz Sosnowski
Remove the mention of the cmdline_poll() function from the deprecation
notices, because the function was removed in the 23.11 release.

Fixes: f44f2edd198a ("cmdline: remove poll function")
Cc: step...@networkplumber.org
Cc: sta...@dpdk.org

Signed-off-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/deprecation.rst | 4 
 1 file changed, 4 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 81b93515cb..10630ba255 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -27,10 +27,6 @@ Deprecation Notices
 * kvargs: The function ``rte_kvargs_process`` will get a new parameter
   for returning key match count. It will ease handling of no-match case.
 
-* cmdline: The function ``cmdline_poll`` does not work correctly on either
-  Linux or Windows and is unused by any part of DPDK.
-  This function is now deprecated and will be removed in DPDK 23.11.
-
 * telemetry: The functions ``rte_tel_data_add_array_u64`` and 
``rte_tel_data_add_dict_u64``,
   used by telemetry callbacks for adding unsigned integer values to be 
returned to the user,
   are renamed to ``rte_tel_data_add_array_uint`` and 
``rte_tel_data_add_dict_uint`` respectively.
-- 
2.25.1



RE: [PATCH 1/6] ethdev: add modify IPv4 next protocol field

2024-02-07 Thread Dariusz Sosnowski
> -Original Message-
> From: Ori Kam 
> Sent: Tuesday, February 6, 2024 14:01
> To: Slava Ovsiienko ; dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Dariusz Sosnowski 
> Subject: RE: [PATCH 1/6] ethdev: add modify IPv4 next protocol field
> 
> Hi Slava
> 
> > -Original Message-
> > From: Slava Ovsiienko 
> > Sent: Tuesday, February 6, 2024 2:18 PM 
> > Subject: [PATCH 1/6] ethdev: add modify IPv4 next protocol field
> >
> > Add IPv4 next protocol modify field definition.
> >
> > Signed-off-by: Viacheslav Ovsiienko 
> > ---
> >  doc/guides/rel_notes/release_24_03.rst | 1 +
> >  lib/ethdev/rte_flow.h  | 3 ++-
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/doc/guides/rel_notes/release_24_03.rst
> > b/doc/guides/rel_notes/release_24_03.rst
> > index 2b91217943..33e9303ae1 100644
> > --- a/doc/guides/rel_notes/release_24_03.rst
> > +++ b/doc/guides/rel_notes/release_24_03.rst
> > @@ -64,6 +64,7 @@ New Features
> >
> >* Added ``RTE_FLOW_ITEM_TYPE_RANDOM`` to match random value.
> >* Added ``RTE_FLOW_FIELD_RANDOM`` to represent it in field ID struct.
> > +  * Added ``RTE_FLOW_FIELD_IPV4_PROTO`` to represent it in field ID
> > struct.
> >
> >  * ** Support for getting the number of used descriptors of a Tx
> > queue. **
> >
> > diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
> > index 1267c146e5..84af730dc7 100644
> > --- a/lib/ethdev/rte_flow.h
> > +++ b/lib/ethdev/rte_flow.h
> > @@ -3933,7 +3933,8 @@ enum rte_flow_field_id {
> > RTE_FLOW_FIELD_IPV4_IHL,/**< IPv4 IHL. */
> > RTE_FLOW_FIELD_IPV4_TOTAL_LEN,  /**< IPv4 total length. */
> > RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN,/**< IPv6 payload length. */
> > -   RTE_FLOW_FIELD_RANDOM   /**< Random value. */
> > +   RTE_FLOW_FIELD_RANDOM,  /**< Random value. */
> > +   RTE_FLOW_FIELD_IPV4_PROTO   /**< IPv4 next protocol. */
> >  };
> >
> >  /**
> > --
> > 2.18.1
> 
> Acked-by: Ori Kam 
Acked-by: Dariusz Sosnowski 

Best regards,
Dariusz Sosnowski


RE: [PATCH 2/6] app/testpmd: add modify IPv4 next protocol command line

2024-02-07 Thread Dariusz Sosnowski
> -Original Message-
> From: Ori Kam 
> Sent: Tuesday, February 6, 2024 14:03
> To: Slava Ovsiienko ; dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Dariusz Sosnowski 
> Subject: RE: [PATCH 2/6] app/testpmd: add modify IPv4 next protocol
> command line
> 
> Hi Slava
> 
> > -Original Message-
> > From: Slava Ovsiienko 
> > Sent: Tuesday, February 6, 2024 2:18 PM
> > Subject: [PATCH 2/6] app/testpmd: add modify IPv4 next protocol
> > command line
> >
> > Add new modify field action type string: "ipv4_proto".
> >
> > Signed-off-by: Viacheslav Ovsiienko 
> > ---
> >  app/test-pmd/cmdline_flow.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> > index 4062879552..03b418a5d8 100644
> > --- a/app/test-pmd/cmdline_flow.c
> > +++ b/app/test-pmd/cmdline_flow.c
> > @@ -962,6 +962,7 @@ static const char *const modify_field_ids[] = {
> > "geneve_opt_type", "geneve_opt_class", "geneve_opt_data", "mpls",
> > "tcp_data_off", "ipv4_ihl", "ipv4_total_len", "ipv6_payload_len",
> > "random",
> > +   "ipv4_proto",
> > NULL
> >  };
> >
> > --
> > 2.18.1
> 
> Acked-by: Ori Kam 
Acked-by: Dariusz Sosnowski 

Best regards,
Dariusz Sosnowski


RE: [PATCH 3/6] net/mlx5: add modify IPv4 protocol implementation

2024-02-07 Thread Dariusz Sosnowski
> -Original Message-
> From: Slava Ovsiienko 
> Sent: Tuesday, February 6, 2024 13:18
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> 
> Subject: [PATCH 3/6] net/mlx5: add modify IPv4 protocol implementation
> 
> Add modify IPv4 protocol implementation for mlx5 PMD.
> 
> Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 

Best regards,
Dariusz Sosnowski


RE: [PATCH 5/6] app/testpmd: add modify ESP related fields command line

2024-02-07 Thread Dariusz Sosnowski
> -Original Message-
> From: Slava Ovsiienko 
> Sent: Tuesday, February 6, 2024 13:18
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> 
> Subject: [PATCH 5/6] app/testpmd: add modify ESP related fields command
> line
> 
> Add new modify field destination type strings:
> 
>   - "esp_spi", to modify Security Parameter Index field
>   - "esp_seq_num", to modify Sequence Number field
>   - "esp_proto", to modify next protocol field in ESP trailer
> 
> Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 

Best regards,
Dariusz Sosnowski


RE: [PATCH 4/6] ethdev: add modify action support for IPsec fields

2024-02-07 Thread Dariusz Sosnowski
Hi Slava,

> -Original Message-
> From: Slava Ovsiienko 
> Sent: Tuesday, February 6, 2024 13:18
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> 
> Subject: [PATCH 4/6] ethdev: add modify action support for IPsec fields
> 
> The following IPsec-related field definitions are added:
> 
>  - RTE_FLOW_FIELD_ESP_SPI - SPI value in IPsec header
>  - RTE_FLOW_FIELD_ESP_SEQ_NUM - sequence number in header
>  - RTE_FLOW_FIELD_ESP_PROTO - next protocol value in trailer
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  doc/guides/rel_notes/release_24_03.rst | 4 
>  lib/ethdev/rte_flow.h  | 5 -
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/rel_notes/release_24_03.rst
> b/doc/guides/rel_notes/release_24_03.rst
> index 3e33ff2d86..2f78009dd8 100644
> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> @@ -65,6 +65,10 @@ New Features
>* Added ``RTE_FLOW_ITEM_TYPE_RANDOM`` to match random value.
>* Added ``RTE_FLOW_FIELD_RANDOM`` to represent it in field ID struct.
>* Added ``RTE_FLOW_FIELD_IPV4_PROTO`` to represent it in field ID struct.
> +  * Added ``RTE_FLOW_FIELD_ESP_SPI`` to represent it in field ID struct.
> +  * Added ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` to represent it in field ID struct.
> + * Added ``RTE_FLOW_FIELD_ESP_PROTO`` to represent it in field ID struct.
Could you please align this line with the rest of the items in that list, so 
that release notes generate properly?

Other than that, looks good to me.

Best regards,
Dariusz Sosnowski


RE: [PATCH 6/6] net/mlx5: add modify field action IPsec support

2024-02-07 Thread Dariusz Sosnowski
Hi Slava,

> +  * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_SPI`` 
> flow action.
> +  * Added HW steering support for modify field 
> ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` flow action.
> + * Added HW steering support for modify field 
> ``RTE_FLOW_FIELD_ESP_PROTO`` flow action.
Could you please align this line with the rest of the items in that list, so 
that release notes generate properly?

Other than that, looks good to me.

Best regards,
Dariusz Sosnowski


Re: [PATCH v3] ethdev: fast path async flow API

2024-02-07 Thread Ferruh Yigit
On 2/7/2024 10:56 AM, Ferruh Yigit wrote:
> On 2/7/2024 10:47 AM, Ferruh Yigit wrote:
>> On 2/7/2024 9:27 AM, Thomas Monjalon wrote:
>>> 07/02/2024 01:57, Ferruh Yigit:
 On 2/6/2024 10:21 PM, Thomas Monjalon wrote:
> 06/02/2024 18:36, Dariusz Sosnowski:
>> --- a/doc/guides/nics/build_and_test.rst
>> +++ b/doc/guides/nics/build_and_test.rst
>> +- ``RTE_FLOW_DEBUG`` (default **disabled**; enabled automatically on 
>> debug builds)
>> +
>> +  Build with debug code in asynchronous flow APIs.
>> +
>>  .. Note::
>>
>> -   The ethdev library use above options to wrap debug code to trace 
>> invalid parameters
>> +   The ethdev library uses above options to wrap debug code to trace 
>> invalid parameters
>> on data path APIs, so performance downgrade is expected when 
>> enabling those options.
>> -   Each PMD can decide to reuse them to wrap their own debug code in 
>> the Rx/Tx path.
>> +   Each PMD can decide to reuse them to wrap their own debug code in 
>> the Rx/Tx path
>> +   and in asynchronous flow APIs implementation.
>
> Good
>
>> --- a/doc/guides/rel_notes/release_24_03.rst
>> +++ b/doc/guides/rel_notes/release_24_03.rst
>> +* ethdev: PMDs implementing asynchronous flow operations are required 
>> to provide relevant functions
>> +  implementation through ``rte_flow_fp_ops`` struct, instead of 
>> ``rte_flow_ops`` struct.
>> +  Pointer to device-dependent ``rte_flow_fp_ops`` should be provided to 
>> ``rte_eth_dev.flow_fp_ops``.
>
> That's a change only for the driver.
> If there is no change for the application, it should not appear in the 
> release notes.
> BTW, API means Application Programming Interface :)
>
>> +  This change applies to the following API functions:
>> +
>> +   * ``rte_flow_async_create``
>> +   * ``rte_flow_async_create_by_index``
>> +   * ``rte_flow_async_actions_update``
>> +   * ``rte_flow_async_destroy``
>> +   * ``rte_flow_push``
>> +   * ``rte_flow_pull``
>> +   * ``rte_flow_async_action_handle_create``
>> +   * ``rte_flow_async_action_handle_destroy``
>> +   * ``rte_flow_async_action_handle_update``
>> +   * ``rte_flow_async_action_handle_query``
>> +   * ``rte_flow_async_action_handle_query_update``
>> +   * ``rte_flow_async_action_list_handle_create``
>> +   * ``rte_flow_async_action_list_handle_destroy``
>> +   * ``rte_flow_async_action_list_handle_query_update``
>> +
>> +* ethdev: Removed the following fields from ``rte_flow_ops`` struct:
>> +
>> +   * ``async_create``
>> +   * ``async_create_by_index``
>> +   * ``async_actions_update``
>> +   * ``async_destroy``
>> +   * ``push``
>> +   * ``pull``
>> +   * ``async_action_handle_create``
>> +   * ``async_action_handle_destroy``
>> +   * ``async_action_handle_update``
>> +   * ``async_action_handle_query``
>> +   * ``async_action_handle_query_update``
>> +   * ``async_action_list_handle_create``
>> +   * ``async_action_list_handle_destroy``
>> +   * ``async_action_list_handle_query_update``
>
> [...]
>> --- a/lib/ethdev/ethdev_driver.h
>> +++ b/lib/ethdev/ethdev_driver.h
>> @@ -71,6 +71,10 @@ struct rte_eth_dev {
>>  struct rte_eth_dev_data *data;
>>  void *process_private; /**< Pointer to per-process device data 
>> */
>>  const struct eth_dev_ops *dev_ops; /**< Functions exported by 
>> PMD */
>> +/**
>> + * Fast path flow API functions exported by PMD.
>> + */
>
> This comment may be on one single line.
>
>> +const struct rte_flow_fp_ops *flow_fp_ops;
>>  struct rte_device *device; /**< Backing device */
>>  struct rte_intr_handle *intr_handle; /**< Device interrupt 
>> handle */
>
>> --- a/lib/ethdev/meson.build
>> +++ b/lib/ethdev/meson.build
>> +if get_option('buildtype').contains('debug')
>> +cflags += ['-DRTE_FLOW_DEBUG']
>> +endif
>
> This looks OK.
>
> Acked-by: Thomas Monjalon 
>
>

 Acked-by: Ferruh Yigit 

 Applied to dpdk-next-net/main, thanks.
>>>
>>> Ferruh, I was expecting a new version.
>>> Did you address yourself the comments above?
>>>
>>>
>>
>> No, I missed the comment, if it is simple I can apply in next-net, let
>> me sync with Dariusz.
>>
> 
> As synced with Dariusz, there is no good place to document
> ethdev-driver interface changes in the release notes.
> 
> Also, this release had more ethdev-driver interface changes,
> around get_ptype(), and those are not documented in the release notes
> either, so I will remove these ones too.
> 

Dariusz, I did the changes in next-net, can you please double check them:
https://git.dpdk.org/next/dpdk-next-net/commit/?h=main&id=23f1ee71a9c332210aaa5b1ec511609

RE: [PATCH v3] ethdev: fast path async flow API

2024-02-07 Thread Dariusz Sosnowski
> -Original Message-
> From: Ferruh Yigit 
> Sent: Wednesday, February 7, 2024 12:54
> To: NBU-Contact-Thomas Monjalon (EXTERNAL) ;
> Dariusz Sosnowski ; Mcnamara, John
> 
> Cc: Matan Azrad ; Slava Ovsiienko
> ; Ori Kam ; Suanming Mou
> ; Andrew Rybchenko
> ; dev@dpdk.org
> Subject: Re: [PATCH v3] ethdev: fast path async flow API
> 
> On 2/7/2024 10:56 AM, Ferruh Yigit wrote:
> > On 2/7/2024 10:47 AM, Ferruh Yigit wrote:
> >> On 2/7/2024 9:27 AM, Thomas Monjalon wrote:
> >>> 07/02/2024 01:57, Ferruh Yigit:
>  On 2/6/2024 10:21 PM, Thomas Monjalon wrote:
> > 06/02/2024 18:36, Dariusz Sosnowski:
> >> --- a/doc/guides/nics/build_and_test.rst
> >> +++ b/doc/guides/nics/build_and_test.rst
> >> +- ``RTE_FLOW_DEBUG`` (default **disabled**; enabled
> >> +automatically on debug builds)
> >> +
> >> +  Build with debug code in asynchronous flow APIs.
> >> +
> >>  .. Note::
> >>
> >> -   The ethdev library use above options to wrap debug code to trace
> invalid parameters
> >> +   The ethdev library uses above options to wrap debug code to
> >> + trace invalid parameters
> >> on data path APIs, so performance downgrade is expected when
> enabling those options.
> >> -   Each PMD can decide to reuse them to wrap their own debug code
> in the Rx/Tx path.
> >> +   Each PMD can decide to reuse them to wrap their own debug code
> in the Rx/Tx path
> >> +   and in asynchronous flow APIs implementation.
> >
> > Good
> >
> >> --- a/doc/guides/rel_notes/release_24_03.rst
> >> +++ b/doc/guides/rel_notes/release_24_03.rst
> >> +* ethdev: PMDs implementing asynchronous flow operations are
> >> +required to provide relevant functions
> >> +  implementation through ``rte_flow_fp_ops`` struct, instead of
> ``rte_flow_ops`` struct.
> >> +  Pointer to device-dependent ``rte_flow_fp_ops`` should be provided
> to ``rte_eth_dev.flow_fp_ops``.
> >
> > That's a change only for the driver.
> > If there is no change for the application, it should not appear in the
> release notes.
> > BTW, API means Application Programming Interface :)
> >
> >> +  This change applies to the following API functions:
> >> +
> >> +   * ``rte_flow_async_create``
> >> +   * ``rte_flow_async_create_by_index``
> >> +   * ``rte_flow_async_actions_update``
> >> +   * ``rte_flow_async_destroy``
> >> +   * ``rte_flow_push``
> >> +   * ``rte_flow_pull``
> >> +   * ``rte_flow_async_action_handle_create``
> >> +   * ``rte_flow_async_action_handle_destroy``
> >> +   * ``rte_flow_async_action_handle_update``
> >> +   * ``rte_flow_async_action_handle_query``
> >> +   * ``rte_flow_async_action_handle_query_update``
> >> +   * ``rte_flow_async_action_list_handle_create``
> >> +   * ``rte_flow_async_action_list_handle_destroy``
> >> +   * ``rte_flow_async_action_list_handle_query_update``
> >> +
> >> +* ethdev: Removed the following fields from ``rte_flow_ops`` struct:
> >> +
> >> +   * ``async_create``
> >> +   * ``async_create_by_index``
> >> +   * ``async_actions_update``
> >> +   * ``async_destroy``
> >> +   * ``push``
> >> +   * ``pull``
> >> +   * ``async_action_handle_create``
> >> +   * ``async_action_handle_destroy``
> >> +   * ``async_action_handle_update``
> >> +   * ``async_action_handle_query``
> >> +   * ``async_action_handle_query_update``
> >> +   * ``async_action_list_handle_create``
> >> +   * ``async_action_list_handle_destroy``
> >> +   * ``async_action_list_handle_query_update``
> >
> > [...]
> >> --- a/lib/ethdev/ethdev_driver.h
> >> +++ b/lib/ethdev/ethdev_driver.h
> >> @@ -71,6 +71,10 @@ struct rte_eth_dev {
> >>  struct rte_eth_dev_data *data;
> >>  void *process_private; /**< Pointer to per-process device 
> >> data */
> >>  const struct eth_dev_ops *dev_ops; /**< Functions
> >> exported by PMD */
> >> +/**
> >> + * Fast path flow API functions exported by PMD.
> >> + */
> >
> > This comment may be on one single line.
> >
> >> +const struct rte_flow_fp_ops *flow_fp_ops;
> >>  struct rte_device *device; /**< Backing device */
> >>  struct rte_intr_handle *intr_handle; /**< Device
> >> interrupt handle */
> >
> >> --- a/lib/ethdev/meson.build
> >> +++ b/lib/ethdev/meson.build
> >> +if get_option('buildtype').contains('debug')
> > >> +cflags += ['-DRTE_FLOW_DEBUG']
> > >> +endif
> >
> > This looks OK.
> >
> > Acked-by: Thomas Monjalon 
> >
> >
> 
>  Acked-by: Ferruh Yigit 
> 
>  Applied to dpdk-next-net/main, thanks.
> >>>
> >>> Ferruh, I was expecting a new version.
> >>> Did you address yourself the comments above?
> >>>

[PATCH v2 1/6] ethdev: add modify IPv4 next protocol field

2024-02-07 Thread Viacheslav Ovsiienko
Add IPv4 next protocol modify field definition.

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Ori Kam 
Acked-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/release_24_03.rst | 4 
 lib/ethdev/rte_flow.h  | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 2b91217943..1e9134cc81 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -65,6 +65,10 @@ New Features
   * Added ``RTE_FLOW_ITEM_TYPE_RANDOM`` to match random value.
   * Added ``RTE_FLOW_FIELD_RANDOM`` to represent it in field ID struct.
 
+* **Added new field IDs in the experimental ``enum rte_flow_field_id``:
+
+  * Added ``RTE_FLOW_FIELD_IPV4_PROTO`` to represent it in field ID struct.
+
 * ** Support for getting the number of used descriptors of a Tx queue. **
 
   * Added a fath path function ``rte_eth_tx_queue_count`` to get the number of 
used
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 1267c146e5..84af730dc7 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -3933,7 +3933,8 @@ enum rte_flow_field_id {
RTE_FLOW_FIELD_IPV4_IHL,/**< IPv4 IHL. */
RTE_FLOW_FIELD_IPV4_TOTAL_LEN,  /**< IPv4 total length. */
RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN,/**< IPv6 payload length. */
-   RTE_FLOW_FIELD_RANDOM   /**< Random value. */
+   RTE_FLOW_FIELD_RANDOM,  /**< Random value. */
+   RTE_FLOW_FIELD_IPV4_PROTO   /**< IPv4 next protocol. */
 };
 
 /**
-- 
2.18.1



[PATCH v2 2/6] app/testpmd: add modify IPv4 next protocol command line

2024-02-07 Thread Viacheslav Ovsiienko
Add new modify field action type string: "ipv4_proto".

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Ori Kam 
Acked-by: Dariusz Sosnowski 
---
 app/test-pmd/cmdline_flow.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 4062879552..03b418a5d8 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -962,6 +962,7 @@ static const char *const modify_field_ids[] = {
"geneve_opt_type", "geneve_opt_class", "geneve_opt_data", "mpls",
"tcp_data_off", "ipv4_ihl", "ipv4_total_len", "ipv6_payload_len",
"random",
+   "ipv4_proto",
NULL
 };
 
-- 
2.18.1
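
For illustration only (not part of the patch): a minimal, self-contained
sketch of why a new string such as "ipv4_proto" is appended at the same
position as the new enumerator. It assumes the parser walks a
NULL-terminated name table and uses the matching index as the field ID;
that is an assumption about the parser, not something the patch states.
The names field_id, field_names and field_id_from_name are hypothetical
stand-ins, and the table is shortened to three entries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, shortened stand-in for enum rte_flow_field_id. */
enum field_id {
	FIELD_IPV6_PAYLOAD_LEN,
	FIELD_RANDOM,
	FIELD_IPV4_PROTO, /* appended last, like the new enumerator */
};

/* Name table kept in the same order as the enum, NULL-terminated. */
static const char *const field_names[] = {
	"ipv6_payload_len",
	"random",
	"ipv4_proto", /* appended last, like the new string */
	NULL
};

/* Return the table index of name, or -1 if the name is not listed. */
static int
field_id_from_name(const char *name)
{
	assert(name != NULL);
	for (int i = 0; field_names[i] != NULL; i++)
		if (strcmp(field_names[i], name) == 0)
			return i;
	return -1;
}
```

Under that assumption, a string inserted in the middle of the table
would shift every later index away from its enumerator, which is why
both lists grow only at the end.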



[PATCH v2 3/6] net/mlx5: add modify IPv4 protocol implementation

2024-02-07 Thread Viacheslav Ovsiienko
Add modify IPv4 protocol implementation for mlx5 PMD.

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/release_24_03.rst | 1 +
 drivers/common/mlx5/mlx5_prm.h | 1 +
 drivers/net/mlx5/mlx5_flow_dv.c| 4 +++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 1e9134cc81..140f7b0ac5 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -94,6 +94,7 @@ New Features
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_TYPE`` flow action.
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_CLASS`` flow action.
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_DATA`` flow action.
+  * Added HW steering support for modify field ``RTE_FLOW_FIELD_IPV4_PROTO`` 
flow action.
 
 
 Removed Items
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index f64f25dbb7..44413517d0 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -839,6 +839,7 @@ enum mlx5_modification_field {
MLX5_MODI_IN_MPLS_LABEL_2,
MLX5_MODI_IN_MPLS_LABEL_3,
MLX5_MODI_IN_MPLS_LABEL_4,
+   MLX5_MODI_OUT_IP_PROTOCOL = 0x4A,
MLX5_MODI_OUT_IPV6_NEXT_HDR = 0x4A,
MLX5_MODI_META_REG_C_8 = 0x8F,
MLX5_MODI_META_REG_C_9 = 0x90,
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 6998be107f..764940b700 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1384,6 +1384,7 @@ mlx5_flow_item_field_width(struct rte_eth_dev *dev,
case RTE_FLOW_FIELD_IPV4_DSCP:
return 6;
case RTE_FLOW_FIELD_IPV4_TTL:
+   case RTE_FLOW_FIELD_IPV4_PROTO:
return 8;
case RTE_FLOW_FIELD_IPV4_SRC:
case RTE_FLOW_FIELD_IPV4_DST:
@@ -2194,10 +2195,11 @@ mlx5_flow_field_id_to_modify_info
info[idx].offset = data->offset;
}
break;
+   case RTE_FLOW_FIELD_IPV4_PROTO: /* Fall-through. */
case RTE_FLOW_FIELD_IPV6_PROTO:
MLX5_ASSERT(data->offset + width <= 8);
off_be = 8 - (data->offset + width);
-   info[idx] = (struct field_modify_info){1, 0, 
MLX5_MODI_OUT_IPV6_NEXT_HDR};
+   info[idx] = (struct field_modify_info){1, 0, 
MLX5_MODI_OUT_IP_PROTOCOL};
if (mask)
mask[idx] = flow_modify_info_mask_8(width, off_be);
else
-- 
2.18.1
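
For illustration only (not part of the patch): a self-contained sketch
of the offset arithmetic used in the hunk above for an 8-bit field,
off_be = 8 - (offset + width). The helper mask_8 is a hypothetical
stand-in for flow_modify_info_mask_8(), whose body is not shown in the
patch; the version here assumes it builds a run of `width` set bits
shifted left by `off`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Distance of the selected bits from the least significant end of an
 * 8-bit field, given a start offset and a width (both in bits).
 */
static uint32_t
off_be_8(uint32_t offset, uint32_t width)
{
	assert(width >= 1 && offset + width <= 8);
	return 8 - (offset + width);
}

/* Assumed mask shape: `width` set bits, shifted left by `off`. */
static uint8_t
mask_8(uint32_t width, uint32_t off)
{
	assert(width >= 1 && width <= 8 && off + width <= 8);
	return (uint8_t)((0xffu >> (8 - width)) << off);
}
```

Modifying the whole protocol byte (offset 0, width 8) gives off_be 0
and mask 0xff; a 4-bit slice starting at offset 2 gives off_be 2 and
mask 0x3c.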



[PATCH v2 6/6] net/mlx5: add modify field action IPsec support

2024-02-07 Thread Viacheslav Ovsiienko
Add mlx5 PMD support for the IPsec fields:

  - RTE_FLOW_FIELD_ESP_SPI - SPI value in IPsec header
  - RTE_FLOW_FIELD_ESP_SEQ_NUM - sequence number in header
  - RTE_FLOW_FIELD_ESP_PROTO - next protocol value in trailer

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/release_24_03.rst |  3 +++
 drivers/common/mlx5/mlx5_prm.h |  3 +++
 drivers/net/mlx5/mlx5_flow_dv.c| 31 ++
 3 files changed, 37 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 0403157202..189724f660 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -98,6 +98,9 @@ New Features
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_CLASS`` flow action.
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_DATA`` flow action.
   * Added HW steering support for modify field ``RTE_FLOW_FIELD_IPV4_PROTO`` 
flow action.
+  * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_SPI`` flow 
action.
+  * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` 
flow action.
+  * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_PROTO`` 
flow action.
 
 
 Removed Items
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 44413517d0..3150412580 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -854,6 +854,9 @@ enum mlx5_modification_field {
MLX5_MODI_OUT_IPV6_PAYLOAD_LEN = 0x11E,
MLX5_MODI_OUT_IPV4_IHL = 0x11F,
MLX5_MODI_OUT_TCP_DATA_OFFSET = 0x120,
+   MLX5_MODI_OUT_ESP_SPI = 0x5E,
+   MLX5_MODI_OUT_ESP_SEQ_NUM = 0x82,
+   MLX5_MODI_OUT_IPSEC_NEXT_HDR = 0x126,
MLX5_MODI_INVALID = INT_MAX,
 };
 
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 764940b700..90413f4a38 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1414,7 +1414,11 @@ mlx5_flow_item_field_width(struct rte_eth_dev *dev,
case RTE_FLOW_FIELD_GTP_TEID:
case RTE_FLOW_FIELD_MPLS:
case RTE_FLOW_FIELD_TAG:
+   case RTE_FLOW_FIELD_ESP_SPI:
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
return 32;
+   case RTE_FLOW_FIELD_ESP_PROTO:
+   return 8;
case RTE_FLOW_FIELD_MARK:
return rte_popcount32(priv->sh->dv_mark_mask);
case RTE_FLOW_FIELD_META:
@@ -2205,6 +2209,33 @@ mlx5_flow_field_id_to_modify_info
else
info[idx].offset = off_be;
break;
+   case RTE_FLOW_FIELD_ESP_PROTO:
+   MLX5_ASSERT(data->offset + width <= 8);
+   off_be = 8 - (data->offset + width);
+   info[idx] = (struct field_modify_info){1, 0, 
MLX5_MODI_OUT_IPSEC_NEXT_HDR};
+   if (mask)
+   mask[idx] = flow_modify_info_mask_8(width, off_be);
+   else
+   info[idx].offset = off_be;
+   break;
+   case RTE_FLOW_FIELD_ESP_SPI:
+   MLX5_ASSERT(data->offset + width <= 32);
+   off_be = 32 - (data->offset + width);
+   info[idx] = (struct field_modify_info){4, 0, 
MLX5_MODI_OUT_ESP_SPI};
+   if (mask)
+   mask[idx] = flow_modify_info_mask_32(width, off_be);
+   else
+   info[idx].offset = off_be;
+   break;
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
+   MLX5_ASSERT(data->offset + width <= 32);
+   off_be = 32 - (data->offset + width);
+   info[idx] = (struct field_modify_info){4, 0, MLX5_MODI_OUT_ESP_SEQ_NUM};
+   if (mask)
+   mask[idx] = flow_modify_info_mask_32(width, off_be);
+   else
+   info[idx].offset = off_be;
+   break;
case RTE_FLOW_FIELD_FLEX_ITEM:
MLX5_ASSERT(data->flex_handle != NULL && !(data->offset & 0x7));
mlx5_modify_flex_item(dev, (const struct mlx5_flex_item *)data->flex_handle,
-- 
2.18.1



[PATCH v2 5/6] app/testpmd: add modify ESP related fields command line

2024-02-07 Thread Viacheslav Ovsiienko
Add new modify field destination type strings:

  - "esp_spi", to modify Security Parameter Index field
  - "esp_seq_num", to modify Sequence Number field
  - "esp_proto", to modify next protocol field in ESP trailer

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 app/test-pmd/cmdline_flow.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 03b418a5d8..9e1048f945 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -963,6 +963,7 @@ static const char *const modify_field_ids[] = {
"tcp_data_off", "ipv4_ihl", "ipv4_total_len", "ipv6_payload_len",
"random",
"ipv4_proto",
+   "esp_spi", "esp_seq_num", "esp_proto",
NULL
 };
 
-- 
2.18.1



[PATCH v2 4/6] ethdev: add modify action support for IPsec fields

2024-02-07 Thread Viacheslav Ovsiienko
The following IPsec-related field definitions are added:

 - RTE_FLOW_FIELD_ESP_SPI - SPI value in IPsec header
 - RTE_FLOW_FIELD_ESP_SEQ_NUM - sequence number in header
 - RTE_FLOW_FIELD_ESP_PROTO - next protocol value in trailer

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/release_24_03.rst | 3 +++
 lib/ethdev/rte_flow.h  | 5 -
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 140f7b0ac5..0403157202 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -68,6 +68,9 @@ New Features
 * **Added new field IDs in the experimental ``enum rte_flow_field_id``:
 
   * Added ``RTE_FLOW_FIELD_IPV4_PROTO`` to represent it in field ID struct.
+  * Added ``RTE_FLOW_FIELD_ESP_SPI`` to represent it in field ID struct.
+  * Added ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` to represent it in field ID struct.
+  * Added ``RTE_FLOW_FIELD_ESP_PROTO`` to represent it in field ID struct.
 
 * ** Support for getting the number of used descriptors of a Tx queue. **
 
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 84af730dc7..6efba67f12 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -3934,7 +3934,10 @@ enum rte_flow_field_id {
RTE_FLOW_FIELD_IPV4_TOTAL_LEN,  /**< IPv4 total length. */
RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN,/**< IPv6 payload length. */
RTE_FLOW_FIELD_RANDOM,  /**< Random value. */
-   RTE_FLOW_FIELD_IPV4_PROTO   /**< IPv4 next protocol. */
+   RTE_FLOW_FIELD_IPV4_PROTO,  /**< IPv4 next protocol. */
+   RTE_FLOW_FIELD_ESP_SPI, /**< ESP SPI. */
+   RTE_FLOW_FIELD_ESP_SEQ_NUM, /**< ESP Sequence Number. */
+   RTE_FLOW_FIELD_ESP_PROTO/**< ESP next protocol value. */
 };
 
 /**
-- 
2.18.1



[PATCH v1] dts: strip whitespaces from stdout and stderr

2024-02-07 Thread Juraj Linkeš
The stdout or stderr of a remotely executed command may end with a
newline. This causes issues when the output is used later, such as when
joining paths built from it: a newline in the middle of a path is not
valid.

Fixes: ad80f550dbc5 ("dts: add SSH command verification")
Signed-off-by: Juraj Linkeš 
---
 .../remote_session/remote_session.py  | 24 +++
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/dts/framework/remote_session/remote_session.py b/dts/framework/remote_session/remote_session.py
index 2059f9a981..6bea1a2306 100644
--- a/dts/framework/remote_session/remote_session.py
+++ b/dts/framework/remote_session/remote_session.py
@@ -10,8 +10,8 @@
 """
 
 
-import dataclasses
 from abc import ABC, abstractmethod
+from dataclasses import InitVar, dataclass, field
 from pathlib import PurePath
 
 from framework.config import NodeConfiguration
@@ -20,7 +20,7 @@
 from framework.settings import SETTINGS
 
 
-@dataclasses.dataclass(slots=True, frozen=True)
+@dataclass(slots=True, frozen=True)
 class CommandResult:
 """The result of remote execution of a command.
 
@@ -34,9 +34,25 @@ class CommandResult:
 
 name: str
 command: str
-stdout: str
-stderr: str
+init_stdout: InitVar[str]
+init_stderr: InitVar[str]
 return_code: int
+stdout: str = field(init=False)
+stderr: str = field(init=False)
+
+def __post_init__(self, init_stdout, init_stderr):
+"""Strip the whitespaces from stdout and stderr.
+
+The generated __init__ method uses object.__setattr__() when the dataclass is frozen,
+so that's what we use here as well.
+
+In order to get access to dataclass fields in the __post_init__ method,
+we have to type them as InitVars. These InitVars are included in the __init__ method's
+signature, so we have to exclude the actual stdout and stderr fields
+from the __init__ method's signature, so that we have the proper number of arguments.
+"""
+object.__setattr__(self, "stdout", init_stdout.strip())
+object.__setattr__(self, "stderr", init_stderr.strip())
 
 def __str__(self) -> str:
 """Format the command outputs."""
-- 
2.34.1
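
The InitVar technique described in the docstring above can be tried outside DTS. What follows is a minimal, self-contained sketch and not the DTS code itself: the class name `StrippedResult` and the sample values are hypothetical, and `slots=True` is left out so that the sketch also runs on Python versions older than 3.10.

```python
from dataclasses import InitVar, dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class StrippedResult:
    """Hypothetical stand-in for CommandResult, for illustration only."""

    command: str
    # InitVars are accepted by the generated __init__ and handed to
    # __post_init__, but they are not stored as fields on the instance.
    init_stdout: InitVar[str]
    init_stderr: InitVar[str]
    return_code: int
    # The stored fields are excluded from __init__ and filled in below.
    stdout: str = field(init=False)
    stderr: str = field(init=False)

    def __post_init__(self, init_stdout: str, init_stderr: str) -> None:
        # A frozen dataclass rejects normal attribute assignment, so go
        # through object.__setattr__() as the generated __init__ does.
        object.__setattr__(self, "stdout", init_stdout.strip())
        object.__setattr__(self, "stderr", init_stderr.strip())


result = StrippedResult("pwd", "/tmp/dpdk\n", "", 0)
# Without the strip, the newline would sit in the middle of the path.
joined = PurePosixPath(result.stdout, "build")
print(joined)
```

One consequence of this design is that callers still pass the outputs positionally or as `init_stdout=`/`init_stderr=`, while readers of the object only ever see the stripped `stdout` and `stderr`.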



[PATCH v3 1/6] ethdev: add modify IPv4 next protocol field

2024-02-07 Thread Viacheslav Ovsiienko
Add IPv4 next protocol modify field definition.

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Ori Kam 
Acked-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/release_24_03.rst | 4 
 lib/ethdev/rte_flow.h  | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 80db117206..99981ae2ea 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -65,6 +65,10 @@ New Features
   * Added ``RTE_FLOW_ITEM_TYPE_RANDOM`` to match random value.
   * Added ``RTE_FLOW_FIELD_RANDOM`` to represent it in field ID struct.
 
+* **Added new field IDs in the experimental ``enum rte_flow_field_id``:
+
+  * Added ``RTE_FLOW_FIELD_IPV4_PROTO`` to represent it in field ID struct.
+
 * ** Support for getting the number of used descriptors of a Tx queue. **
 
   * Added a fath path function ``rte_eth_tx_queue_count`` to get the number of 
used
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 09c1b13381..9e76e53905 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -2421,7 +2421,8 @@ enum rte_flow_field_id {
RTE_FLOW_FIELD_IPV4_IHL,/**< IPv4 IHL. */
RTE_FLOW_FIELD_IPV4_TOTAL_LEN,  /**< IPv4 total length. */
RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN,/**< IPv6 payload length. */
-   RTE_FLOW_FIELD_RANDOM   /**< Random value. */
+   RTE_FLOW_FIELD_RANDOM,  /**< Random value. */
+   RTE_FLOW_FIELD_IPV4_PROTO   /**< IPv4 next protocol. */
 };
 
 /**
-- 
2.18.1



[PATCH v3 2/6] app/testpmd: add modify IPv4 next protocol command line

2024-02-07 Thread Viacheslav Ovsiienko
Add new modify field action type string: "ipv4_proto".

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Ori Kam 
Acked-by: Dariusz Sosnowski 
---
 app/test-pmd/cmdline_flow.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a4131e1b39..1b5919dd18 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -990,6 +990,7 @@ static const char *const flow_field_ids[] = {
"geneve_opt_type", "geneve_opt_class", "geneve_opt_data", "mpls",
"tcp_data_off", "ipv4_ihl", "ipv4_total_len", "ipv6_payload_len",
"random",
+   "ipv4_proto",
NULL
 };
 
-- 
2.18.1



[PATCH v3 3/6] net/mlx5: add modify IPv4 protocol implementation

2024-02-07 Thread Viacheslav Ovsiienko
Add modify IPv4 protocol implementation for mlx5 PMD.

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/release_24_03.rst | 1 +
 drivers/common/mlx5/mlx5_prm.h | 1 +
 drivers/net/mlx5/mlx5_flow_dv.c| 4 +++-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index 99981ae2ea..c9a4809254 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -108,6 +108,7 @@ New Features
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_TYPE`` flow action.
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_CLASS`` flow action.
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_DATA`` flow action.
+  * Added HW steering support for modify field ``RTE_FLOW_FIELD_IPV4_PROTO`` flow action.
 
 
 Removed Items
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index abff8e4dc3..3168ce76a5 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -839,6 +839,7 @@ enum mlx5_modification_field {
MLX5_MODI_IN_MPLS_LABEL_2,
MLX5_MODI_IN_MPLS_LABEL_3,
MLX5_MODI_IN_MPLS_LABEL_4,
+   MLX5_MODI_OUT_IP_PROTOCOL = 0x4A,
MLX5_MODI_OUT_IPV6_NEXT_HDR = 0x4A,
MLX5_MODI_META_REG_C_8 = 0x8F,
MLX5_MODI_META_REG_C_9 = 0x90,
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index a4ed7b1444..eb7cbf808c 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1384,6 +1384,7 @@ mlx5_flow_item_field_width(struct rte_eth_dev *dev,
case RTE_FLOW_FIELD_IPV4_DSCP:
return 6;
case RTE_FLOW_FIELD_IPV4_TTL:
+   case RTE_FLOW_FIELD_IPV4_PROTO:
return 8;
case RTE_FLOW_FIELD_IPV4_SRC:
case RTE_FLOW_FIELD_IPV4_DST:
@@ -2194,10 +2195,11 @@ mlx5_flow_field_id_to_modify_info
info[idx].offset = data->offset;
}
break;
+   case RTE_FLOW_FIELD_IPV4_PROTO: /* Fall-through. */
case RTE_FLOW_FIELD_IPV6_PROTO:
MLX5_ASSERT(data->offset + width <= 8);
off_be = 8 - (data->offset + width);
-   info[idx] = (struct field_modify_info){1, 0, MLX5_MODI_OUT_IPV6_NEXT_HDR};
+   info[idx] = (struct field_modify_info){1, 0, MLX5_MODI_OUT_IP_PROTOCOL};
if (mask)
mask[idx] = flow_modify_info_mask_8(width, off_be);
else
-- 
2.18.1



[PATCH v3 5/6] app/testpmd: add modify ESP related fields command line

2024-02-07 Thread Viacheslav Ovsiienko
Add new modify field destination type strings:

  - "esp_spi", to modify Security Parameter Index field
  - "esp_seq_num", to modify Sequence Number field
  - "esp_proto", to modify next protocol field in ESP trailer

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 app/test-pmd/cmdline_flow.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 1b5919dd18..102b4d67c9 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -991,6 +991,7 @@ static const char *const flow_field_ids[] = {
"tcp_data_off", "ipv4_ihl", "ipv4_total_len", "ipv6_payload_len",
"random",
"ipv4_proto",
+   "esp_spi", "esp_seq_num", "esp_proto",
NULL
 };
 
-- 
2.18.1



[PATCH v3 4/6] ethdev: add modify action support for IPsec fields

2024-02-07 Thread Viacheslav Ovsiienko
The following IPsec-related field definitions are added:

 - RTE_FLOW_FIELD_ESP_SPI - SPI value in IPsec header
 - RTE_FLOW_FIELD_ESP_SEQ_NUM - sequence number in header
 - RTE_FLOW_FIELD_ESP_PROTO - next protocol value in trailer

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/release_24_03.rst | 3 +++
 lib/ethdev/rte_flow.h  | 5 -
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index c9a4809254..d0c3389287 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -68,6 +68,9 @@ New Features
 * **Added new field IDs in the experimental ``enum rte_flow_field_id``:
 
   * Added ``RTE_FLOW_FIELD_IPV4_PROTO`` to represent it in field ID struct.
+  * Added ``RTE_FLOW_FIELD_ESP_SPI`` to represent it in field ID struct.
+  * Added ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` to represent it in field ID struct.
+  * Added ``RTE_FLOW_FIELD_ESP_PROTO`` to represent it in field ID struct.
 
 * ** Support for getting the number of used descriptors of a Tx queue. **
 
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 9e76e53905..627a856537 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -2422,7 +2422,10 @@ enum rte_flow_field_id {
RTE_FLOW_FIELD_IPV4_TOTAL_LEN,  /**< IPv4 total length. */
RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN,/**< IPv6 payload length. */
RTE_FLOW_FIELD_RANDOM,  /**< Random value. */
-   RTE_FLOW_FIELD_IPV4_PROTO   /**< IPv4 next protocol. */
+   RTE_FLOW_FIELD_IPV4_PROTO,  /**< IPv4 next protocol. */
+   RTE_FLOW_FIELD_ESP_SPI, /**< ESP SPI. */
+   RTE_FLOW_FIELD_ESP_SEQ_NUM, /**< ESP Sequence Number. */
+   RTE_FLOW_FIELD_ESP_PROTO/**< ESP next protocol value. */
 };
 
 /**
-- 
2.18.1



[PATCH v3 6/6] net/mlx5: add modify field action IPsec support

2024-02-07 Thread Viacheslav Ovsiienko
Add mlx5 PMD support for the IPsec fields:

  - RTE_FLOW_FIELD_ESP_SPI - SPI value in IPsec header
  - RTE_FLOW_FIELD_ESP_SEQ_NUM - sequence number in header
  - RTE_FLOW_FIELD_ESP_PROTO - next protocol value in trailer

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 doc/guides/rel_notes/release_24_03.rst |  3 +++
 drivers/common/mlx5/mlx5_prm.h |  3 +++
 drivers/net/mlx5/mlx5_flow_dv.c| 31 ++
 3 files changed, 37 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
index d0c3389287..0f8d2fd81c 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -112,6 +112,9 @@ New Features
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_CLASS`` flow action.
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_GENEVE_OPT_DATA`` flow action.
   * Added HW steering support for modify field ``RTE_FLOW_FIELD_IPV4_PROTO`` 
flow action.
+  * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_SPI`` flow action.
+  * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` flow action.
+  * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_PROTO`` flow action.
 
 
 Removed Items
diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 3168ce76a5..0035a1e616 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -854,6 +854,9 @@ enum mlx5_modification_field {
MLX5_MODI_OUT_IPV6_PAYLOAD_LEN = 0x11E,
MLX5_MODI_OUT_IPV4_IHL = 0x11F,
MLX5_MODI_OUT_TCP_DATA_OFFSET = 0x120,
+   MLX5_MODI_OUT_ESP_SPI = 0x5E,
+   MLX5_MODI_OUT_ESP_SEQ_NUM = 0x82,
+   MLX5_MODI_OUT_IPSEC_NEXT_HDR = 0x126,
MLX5_MODI_INVALID = INT_MAX,
 };
 
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index eb7cbf808c..6fded15d91 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1414,7 +1414,11 @@ mlx5_flow_item_field_width(struct rte_eth_dev *dev,
case RTE_FLOW_FIELD_GTP_TEID:
case RTE_FLOW_FIELD_MPLS:
case RTE_FLOW_FIELD_TAG:
+   case RTE_FLOW_FIELD_ESP_SPI:
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
return 32;
+   case RTE_FLOW_FIELD_ESP_PROTO:
+   return 8;
case RTE_FLOW_FIELD_MARK:
return rte_popcount32(priv->sh->dv_mark_mask);
case RTE_FLOW_FIELD_META:
@@ -2205,6 +2209,33 @@ mlx5_flow_field_id_to_modify_info
else
info[idx].offset = off_be;
break;
+   case RTE_FLOW_FIELD_ESP_PROTO:
+   MLX5_ASSERT(data->offset + width <= 8);
+   off_be = 8 - (data->offset + width);
+   info[idx] = (struct field_modify_info){1, 0, MLX5_MODI_OUT_IPSEC_NEXT_HDR};
+   if (mask)
+   mask[idx] = flow_modify_info_mask_8(width, off_be);
+   else
+   info[idx].offset = off_be;
+   break;
+   case RTE_FLOW_FIELD_ESP_SPI:
+   MLX5_ASSERT(data->offset + width <= 32);
+   off_be = 32 - (data->offset + width);
+   info[idx] = (struct field_modify_info){4, 0, MLX5_MODI_OUT_ESP_SPI};
+   if (mask)
+   mask[idx] = flow_modify_info_mask_32(width, off_be);
+   else
+   info[idx].offset = off_be;
+   break;
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
+   MLX5_ASSERT(data->offset + width <= 32);
+   off_be = 32 - (data->offset + width);
+   info[idx] = (struct field_modify_info){4, 0, MLX5_MODI_OUT_ESP_SEQ_NUM};
+   if (mask)
+   mask[idx] = flow_modify_info_mask_32(width, off_be);
+   else
+   info[idx].offset = off_be;
+   break;
case RTE_FLOW_FIELD_FLEX_ITEM:
MLX5_ASSERT(data->flex_handle != NULL && !(data->offset & 0x7));
mlx5_modify_flex_item(dev, (const struct mlx5_flex_item *)data->flex_handle,
-- 
2.18.1
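
The `off_be` arithmetic in the new ESP cases can be checked by hand. Below is a small worked sketch in Python, not driver code. It assumes (this is not taken from the patch) that the mask helpers produce `width` one-bits shifted left by `off_be`, and it deliberately ignores any byte-order conversion the driver may apply. The function names are hypothetical.

```python
def big_endian_offset(field_bits: int, offset: int, width: int) -> int:
    """Mirror `off_be = field_bits - (offset + width)` from the patch."""
    # Same bound that the MLX5_ASSERT lines check.
    assert offset + width <= field_bits
    return field_bits - (offset + width)


def bit_mask(width: int, off_be: int) -> int:
    """Assumed mask shape: `width` one-bits shifted left by `off_be`."""
    return ((1 << width) - 1) << off_be


# Modify the full 32-bit SPI or sequence number.
full = big_endian_offset(32, 0, 32)
# Modify 16 bits starting at bit offset 8 of a 32-bit field.
partial = big_endian_offset(32, 8, 16)
# Modify the whole 8-bit next protocol field of the ESP trailer.
proto = big_endian_offset(8, 0, 8)

print(full, partial, proto)
print(hex(bit_mask(16, partial)), hex(bit_mask(8, proto)))
```

A full-width modification always yields an offset of 0, which is why the 8-bit and 32-bit cases differ only in the field size constant and the mask helper used.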



Azure(hyperv) hugepages issues with Mellanox NICs(mlx5)

2024-02-07 Thread Vladimir Ratnikov
Hello!

We are seeing a problem with hugepages in an Azure environment with Mellanox
devices (currently MT27710 ConnectX-4) that use the mlx5 PMD.
We use a DPDK-based application, but we observe exactly the same issue when
using testpmd for debugging. After each restart of the process, 2
hugepages (2 Mellanox NICs) disappear from the pool until the pool is
completely exhausted.
We check this using /proc/meminfo.

That is very odd behavior. When a user-space application exits, its hugepages
are freed in any case (except when files are kept on the hugetlb FS).
So this probably means that something in the kernel holds those pages. We
tried to rmmod/modprobe all related modules and it didn't help, and there is
nothing on the hugetlb FS at this point.

We found that one place where the issue appears to happen is the
mlx5_malloc function, when ~2 MB of memory is allocated while creating
the device.

The stack trace is:

> #0  mlx5_malloc (flags=4, size=2097168, align=64, socket=-1) at
> ../src-dpdk/drivers/common/mlx5/mlx5_malloc.c:174
> #1  0x7fffae258cad in _mlx5_ipool_malloc_cache (pool=0xac036ac40,
> cidx=0, idx=0x7fffa8759e90) at ../src-dpdk/drivers/net/mlx5/mlx5_utils.c:410
> #2  0x7fffae258e42 in mlx5_ipool_malloc_cache (pool=0xac036ac40,
> idx=0x7fffa8759e90) at ../src-dpdk/drivers/net/mlx5/mlx5_utils.c:441
> #3  0x7fffae259208 in mlx5_ipool_malloc (pool=0xac036ac40,
> idx=0x7fffa8759e90) at ../src-dpdk/drivers/net/mlx5/mlx5_utils.c:521
> #4  0x7fffae2593d0 in mlx5_ipool_zmalloc (pool=0xac036ac40,
> idx=0x7fffa8759e90) at ../src-dpdk/drivers/net/mlx5/mlx5_utils.c:575
> #5  0x7fffabbb1b73 in flow_dv_discover_priorities (dev=0x7fffaf97ae80
> , vprio=0x7fffaf1e492e , vprio_n=2) at
> ../src-dpdk/drivers/net/mlx5/mlx5_flow_dv.c:19706
> #6  0x7fffabb4d126 in mlx5_flow_discover_priorities
> (dev=0x7fffaf97ae80 ) at
> ../src-dpdk/drivers/net/mlx5/mlx5_flow.c:11963
> #7  0x7fffae52cf48 in mlx5_dev_spawn (dpdk_dev=0x5575bca0,
> spawn=0xac03d7340, eth_da=0x7fffa875a1d0, mkvlist=0x0) at
> ../src-dpdk/drivers/net/mlx5/linux/mlx5_os.c:1649
> #8  0x7fffae52f1a3 in mlx5_os_pci_probe_pf (cdev=0xac03d9d80,
> req_eth_da=0x7fffa875a300, owner_id=0, mkvlist=0x0) at
> ../src-dpdk/drivers/net/mlx5/linux/mlx5_os.c:2348
> #9  0x7fffae52f91f in mlx5_os_pci_probe (cdev=0xac03d9d80,
> mkvlist=0x0) at ../src-dpdk/drivers/net/mlx5/linux/mlx5_os.c:2497
> #10 0x7fffae52fd09 in mlx5_os_net_probe (cdev=0xac03d9d80,
> mkvlist=0x0) at ../src-dpdk/drivers/net/mlx5/linux/mlx5_os.c:2578
> #11 0x7fffa93f297b in drivers_probe (cdev=0xac03d9d80, user_classes=1,
> mkvlist=0x0) at ../src-dpdk/drivers/common/mlx5/mlx5_common.c:937
> #12 0x7fffa93f2c95 in mlx5_common_dev_probe (eal_dev=0x5575bca0)
> at ../src-dpdk/drivers/common/mlx5/mlx5_common.c:1027
> #13 0x7fffa94105b3 in mlx5_common_pci_probe (pci_drv=0x7fffaf492680
> , pci_dev=0x5575bc90) at
> ../src-dpdk/drivers/common/mlx5/mlx5_common_pci.c:168
> #14 0x7fffa9297950 in rte_pci_probe_one_driver (dr=0x7fffaf492680
> , dev=0x5575bc90) at
> ../src-dpdk/drivers/bus/pci/pci_common.c:312
> #15 0x7fffa9297be4 in pci_probe_all_drivers (dev=0x5575bc90) at
> ../src-dpdk/drivers/bus/pci/pci_common.c:396
> #16 0x7fffa9297c6d in pci_probe () at
> ../src-dpdk/drivers/bus/pci/pci_common.c:423
> #17 0x7fffa9c8f551 in rte_bus_probe () at
> ../src-dpdk/lib/eal/common/eal_common_bus.c:78
> #18 0x7fffa9cd80c2 in rte_eal_init (argc=7, argv=0x7fffbba74ff8) at
> ../src-dpdk/lib/eal/linux/eal.c:1300


We found that the mlx5 PMD has a devarg, `sys_mem_en=1`, which allows
using system memory instead of RTE memory. It helped a bit: now just one
hugepage goes missing after each restart.
We also tried to reproduce the issue on physical hardware and it's fine there
(slightly different NICs, MT27800, but using the same mlx5 PMD).

So we think this could be related to the Azure (Hyper-V?) environment.
Has anyone observed the same issue?

Many thanks!
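
For anyone trying to reproduce this, the /proc/meminfo check can be scripted so that the loss per restart is easy to compare between machines. This is a rough sketch, not something from the original report: the function names are hypothetical and the sample snapshots below contain made-up numbers.

```python
import re


def hugepage_counters(meminfo_text: str) -> dict:
    """Extract the HugePages_* counters from /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(HugePages_\w+):\s+(\d+)", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def lost_pages(before: str, after: str) -> int:
    """Free hugepages that went missing between two snapshots."""
    free_before = hugepage_counters(before)["HugePages_Free"]
    free_after = hugepage_counters(after)["HugePages_Free"]
    return free_before - free_after


# Made-up snapshots, taken with the application stopped both times.
BEFORE = """\
MemTotal:       16384000 kB
HugePages_Total:    1024
HugePages_Free:     1024
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
"""
AFTER = BEFORE.replace("HugePages_Free:     1024", "HugePages_Free:     1022")

print(lost_pages(BEFORE, AFTER))
```

On a real system the two snapshots would come from reading /proc/meminfo before starting the process and again after it has exited.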


Re: [PATCH v3] ethdev: fast path async flow API

2024-02-07 Thread Thomas Monjalon
07/02/2024 11:56, Ferruh Yigit:
> As we synced with Dariusz, there is no good place to document
> ethdev-drivers interfaces in the release notes.
> 
> Also this release there were more ethdev-drivers interface changes,
> around get_ptype(), but those also not documented in the release notes,
> so will remove these ones too.
> 
> 
> But for further release notes, @Thomas, @John, what do you think to add
> a new section (or sub-section) for "internal interface" ?? (device
> abstraction - drivers) interface changes?

When a driver interface is changed, the drivers are updated accordingly.
If it's a driver addition, then we need to follow up with the driver maintainers.
Adding a new section in the release notes for driver interface changes
is possible but means more work. I'm afraid a lot of changes won't be described,
so I'm not sure of the value of such an incomplete doc.




RE: [PATCH 1/1] ethdev: add IPv6 FL and TC field identifiers

2024-02-07 Thread Dariusz Sosnowski
> -Original Message-
> From: Michael Baum 
> Sent: Tuesday, February 6, 2024 15:27
> To: dev@dpdk.org
> Cc: Ori Kam ; Dariusz Sosnowski
> ; Ferruh Yigit ; NBU-
> Contact-Thomas Monjalon (EXTERNAL) 
> Subject: [PATCH 1/1] ethdev: add IPv6 FL and TC field identifiers
> 
> Add new "rte_flow_field_id" enumeration values to describe both IPv6 traffic
> class and IPv6 flow label fields.
> 
> The TC value is "RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS" in flow API and
> "ipv6_traffic_class" in testpmd command.
> The FL value is "RTE_FLOW_FIELD_IPV6_FLOW_LABEL" in flow API and
> "ipv6_flow_label" in testpmd command.
> 
> Signed-off-by: Michael Baum 
Acked-by: Dariusz Sosnowski 

Best regards,
Dariusz Sosnowski


Re: [PATCH] net/hns3: fix Rx packet truncation when KEEP CRC enabled

2024-02-07 Thread Ferruh Yigit
On 2/6/2024 1:10 AM, Jie Hai wrote:
> From: Dengdui Huang 
> 
> When KEEP_CRC offload is enabled, some packets will be truncated and
> the CRC is still be stripped in following cases:
> 1. For HIP08 hardware, the packet type is TCP and the length
>is less than or equal to 60B.
> 2. For other hardwares, the packet type is IP and the length
>is less than or equal to 60B.
> 

If a device doesn't support the offload for some packets, one option is to
disable the offload for that device instead of calculating the CRC in
software and appending it, unless you have a specific use case or
requirement to support the offload.

<...>

> @@ -2492,10 +2544,16 @@ hns3_recv_pkts_simple(void *rx_queue,
>   goto pkt_err;
>  
>   rxm->packet_type = hns3_rx_calc_ptype(rxq, l234_info, ol_info);
> -
>   if (rxm->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC)
>   rxm->ol_flags |= RTE_MBUF_F_RX_IEEE1588_PTP;
>  
> + if (unlikely(rxq->crc_len > 0)) {
> + if (hns3_need_recalculate_crc(rxq, rxm))
> + hns3_recalculate_crc(rxq, rxm);
> + rxm->pkt_len -= rxq->crc_len;
> + rxm->data_len -= rxq->crc_len;
>

Removing 'crc_len' from 'mbuf->pkt_len' & 'mbuf->data_len' is
practically the same as stripping the CRC.

We don't count the CRC length in the statistics, but the CRC should be
accessible to the user in the payload.
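
The length point can be illustrated with a toy model. This is only a sketch: `ToyMbuf` is a hypothetical single-segment stand-in, not `struct rte_mbuf`, and `zlib.crc32` is used merely as a placeholder for the Ethernet FCS value.

```python
import zlib
from dataclasses import dataclass

CRC_LEN = 4  # Ethernet FCS length in bytes


@dataclass
class ToyMbuf:
    """Hypothetical, simplified stand-in for a single-segment mbuf."""

    buf: bytes     # bytes placed in the buffer by the (imaginary) NIC
    data_len: int  # number of bytes reported to the application

    def visible(self) -> bytes:
        """Bytes an application would read when it trusts data_len."""
        return self.buf[: self.data_len]


payload = bytes(60)
frame = payload + zlib.crc32(payload).to_bytes(CRC_LEN, "little")

# CRC kept: the reported length covers the trailing CRC bytes.
kept = ToyMbuf(frame, len(frame))
# Lengths reduced by crc_len: the CRC is in the buffer but not visible.
trimmed = ToyMbuf(frame, len(frame) - CRC_LEN)

print(len(kept.visible()), len(trimmed.visible()))
```

In the second case the application sees exactly what it would see with the CRC stripped, even though the bytes are still present in the buffer.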



Re: [PATCH] ethdev: recommend against using locks in event callbacks

2024-02-07 Thread Kevin Traynor
On 06/02/2024 20:33, Ferruh Yigit wrote:
> On 2/1/2024 10:08 AM, Kevin Traynor wrote:
>> On 01/02/2024 08:43, David Marchand wrote:
>>> As described in a recent bugzilla opened against the net/iavf driver,
>>> a driver may call a event callback from other calls of the ethdev API.
>>>
>>> Nothing guarantees in the ethdev API against such behavior.
>>>
>>> Add a notice against using locks in those callbacks.
>>>
>>> Bugzilla ID: 1337
>>>
>>> Signed-off-by: David Marchand 
>>> ---
>>>  lib/ethdev/rte_ethdev.h | 14 +-
>>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
>>> index 21e3a21903..5c6b104fb4 100644
>>> --- a/lib/ethdev/rte_ethdev.h
>>> +++ b/lib/ethdev/rte_ethdev.h
>>> @@ -4090,7 +4090,19 @@ enum rte_eth_event_type {
>>> RTE_ETH_EVENT_MAX   /**< max value of this enum */
>>>  };
>>>  
>>> -/** User application callback to be registered for interrupts. */
>>> +/**
>>> + * User application callback to be registered for interrupts.
>>> + *
>>> + * Note: there is no guarantee in the DPDK drivers that a callback won't be
>>> + *   called in the middle of other parts of the ethdev API. For 
>>> example,
>>> + *   imagine that thread A calls rte_eth_dev_start() and as part of 
>>> this
>>> + *   call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the
>>> + *   associated callback is ran on thread A. In that example, if the
>>> + *   application protects its internal data using locks before calling
>>> + *   rte_eth_dev_start(), and the callback takes a same lock, a 
>>> deadlock
>>> + *   occurs. Because of this, it is highly recommended NOT to take 
>>> locks in
>>> + *   those callbacks.
>>> + */
>>
>> That is a good practical recommendation for an application developer.
>>
>> I wonder if it should taken further so that the API formally states the
>> callback MUST be non-blocking?
>>
> 
> Application still can manage the locks in a safe way, but needs to be
> aware of above condition and possible deadlock.
> 

Just to explain a bit more, if you look at the original issue in the
Bugzilla [0], I think there was an assumption that
rte_eth_dev_configure() would not block or deadlock with the
eal-intr-thread. So then it was assumed that waiting for the lock in the
callback was ok, because rte_eth_dev_configure() would return and
callback would obtain the lock.

So I'm showing that in this example the lack of a guarantee or clarity,
or a bad assumption, about the behavior of rte_eth_dev_configure() made it
difficult for an app developer to know if their locks were safe or not.
That's why I was thinking about something more formal.

> I think above note is sufficient instead of forbidding locks in
> callbacks completely.
> 

In the end the difference between "highly recommended NOT to" and "must
not" is not much, and either way is probably enough to scare someone
into avoiding them.

[0] https://bugs.dpdk.org/show_bug.cgi?id=1337#c0
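
To make the hazard concrete, here is a toy sketch of the pattern in Python. It is not ethdev code: `toy_dev_start` and `on_event` are hypothetical names, and the callback uses a non-blocking acquire only so that the sketch terminates instead of hanging the way a blocking lock would.

```python
import threading

app_lock = threading.Lock()  # non-reentrant, like a plain mutex
events = []


def on_event(event: str) -> None:
    """Hypothetical application callback that wants the application lock."""
    # A blocking acquire would never return when the calling thread
    # already holds the lock; a non-blocking attempt reports it instead.
    if app_lock.acquire(blocking=False):
        try:
            events.append((event, "handled"))
        finally:
            app_lock.release()
    else:
        events.append((event, "would deadlock"))


def toy_dev_start(callback) -> None:
    """Hypothetical API call that raises an event on the caller's thread."""
    callback("INTR_RESET")


# The application protects its data with the lock while calling the API.
with app_lock:
    toy_dev_start(on_event)

# The same call without the lock held is fine.
toy_dev_start(on_event)

print(events)
```

The first call is the situation from the proposed note: the callback runs inline on the thread that already holds the lock, so a blocking acquire there can never succeed.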



Re: [PATCH] ethdev: recommend against using locks in event callbacks

2024-02-07 Thread Kevin Traynor
On 01/02/2024 08:43, David Marchand wrote:
> As described in a recent bugzilla opened against the net/iavf driver,
> a driver may call a event callback from other calls of the ethdev API.
> 
> Nothing guarantees in the ethdev API against such behavior.
> 
> Add a notice against using locks in those callbacks.
> 
> Bugzilla ID: 1337
> 
> Signed-off-by: David Marchand 
> ---
>  lib/ethdev/rte_ethdev.h | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 21e3a21903..5c6b104fb4 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -4090,7 +4090,19 @@ enum rte_eth_event_type {
>   RTE_ETH_EVENT_MAX   /**< max value of this enum */
>  };
>  
> -/** User application callback to be registered for interrupts. */
> +/**
> + * User application callback to be registered for interrupts.
> + *
> + * Note: there is no guarantee in the DPDK drivers that a callback won't be
> + *   called in the middle of other parts of the ethdev API. For example,
> + *   imagine that thread A calls rte_eth_dev_start() and as part of this
> + *   call, a RTE_ETH_EVENT_INTR_RESET event gets generated and the
> + *   associated callback is ran on thread A. In that example, if the
> + *   application protects its internal data using locks before calling
> + *   rte_eth_dev_start(), and the callback takes a same lock, a deadlock
> + *   occurs. Because of this, it is highly recommended NOT to take locks 
> in
> + *   those callbacks.
> + */
>  typedef int (*rte_eth_dev_cb_fn)(uint16_t port_id,
>   enum rte_eth_event_type event, void *cb_arg, void *ret_param);
>  

Acked-by: Kevin Traynor 



[PATCH] lib/hash,lib/rcu: feature hidden key count in hash

2024-02-07 Thread Abdullah Ömer Yamaç
This patch introduces a new API to get the hidden key count in the hash
table when RCU QSBR is enabled. With RCU QSBR enabled, rte_hash_count
returns the number of elements that are not in the free queue. Until
rte_rcu_qsbr_dq_reclaim is called, elements in the defer queue are not
freed and so are not counted as free. Therefore I added a new API to get
the number of hidden (defer queue) elements in the hash table. The user
can then calculate the total number of elements that are available in
the hash table.

Signed-off-by: Abdullah Ömer Yamaç 

---
Cc: Honnappa Nagarahalli 
Cc: Yipeng Wang 
Cc: Sameh Gobriel 
Cc: Bruce Richardson 
Cc: Vladimir Medvedkin 
---
 lib/hash/rte_cuckoo_hash.c |  9 +
 lib/hash/rte_hash.h| 13 +
 lib/hash/version.map   |  1 +
 lib/rcu/rte_rcu_qsbr.c |  8 
 lib/rcu/rte_rcu_qsbr.h | 11 +++
 lib/rcu/version.map|  1 +
 6 files changed, 43 insertions(+)

diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index 70456754c4..3553f3efc7 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -555,6 +555,15 @@ rte_hash_max_key_id(const struct rte_hash *h)
return h->entries;
 }
 
+int32_t
+rte_hash_dq_count(const struct rte_hash *h)
+{
+   if (h->dq == NULL)
+   return -EINVAL;
+
+   return rte_rcu_qsbr_dq_count(h->dq);
+}
+
 int32_t
 rte_hash_count(const struct rte_hash *h)
 {
diff --git a/lib/hash/rte_hash.h b/lib/hash/rte_hash.h
index 7ecc02..8ea97e297d 100644
--- a/lib/hash/rte_hash.h
+++ b/lib/hash/rte_hash.h
@@ -193,6 +193,19 @@ rte_hash_free(struct rte_hash *h);
 void
 rte_hash_reset(struct rte_hash *h);
 
+
+/**
+ * Return the number of records in the defer queue of the hash table 
+ * if RCU is enabled.
+ * @param h
+ *  Hash table to query from
+ * @return
+ *   - -EINVAL if parameters are invalid
+ *   - A value indicating how many records are in the defer queue.
+ */
+int32_t
+rte_hash_dq_count(const struct rte_hash *h);
+
 /**
  * Return the number of keys in the hash table
  * @param h
diff --git a/lib/hash/version.map b/lib/hash/version.map
index 6b2afebf6b..7f7b158cf1 100644
--- a/lib/hash/version.map
+++ b/lib/hash/version.map
@@ -9,6 +9,7 @@ DPDK_24 {
rte_hash_add_key_with_hash;
rte_hash_add_key_with_hash_data;
rte_hash_count;
+   rte_hash_dq_count;
rte_hash_crc32_alg;
rte_hash_crc_set_alg;
rte_hash_create;
diff --git a/lib/rcu/rte_rcu_qsbr.c b/lib/rcu/rte_rcu_qsbr.c
index bd0b83be0c..89f8da4c4c 100644
--- a/lib/rcu/rte_rcu_qsbr.c
+++ b/lib/rcu/rte_rcu_qsbr.c
@@ -450,6 +450,14 @@ rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq, unsigned int n,
return 0;
 }
 
+/**
+ * Return the number of entries in a defer queue.
+ */
+unsigned int rte_rcu_qsbr_dq_count(struct rte_rcu_qsbr_dq *dq)
+{
+   return rte_ring_count(dq->r);
+}
+
 /* Delete a defer queue. */
 int
 rte_rcu_qsbr_dq_delete(struct rte_rcu_qsbr_dq *dq)
diff --git a/lib/rcu/rte_rcu_qsbr.h b/lib/rcu/rte_rcu_qsbr.h
index 23c9f89805..ed5a590edd 100644
--- a/lib/rcu/rte_rcu_qsbr.h
+++ b/lib/rcu/rte_rcu_qsbr.h
@@ -794,6 +794,17 @@ int
 rte_rcu_qsbr_dq_reclaim(struct rte_rcu_qsbr_dq *dq, unsigned int n,
unsigned int *freed, unsigned int *pending, unsigned int *available);
 
+/**
+ * Return the number of entries in a defer queue.
+ *
+ * @param dq
+ *   Defer queue.
+ * @return
+ *   The number of entries in the defer queue.
+ */
+unsigned int
+rte_rcu_qsbr_dq_count(struct rte_rcu_qsbr_dq *dq);
+
 /**
  * Delete a defer queue.
  *
diff --git a/lib/rcu/version.map b/lib/rcu/version.map
index 982ffd59d9..f410ab41e7 100644
--- a/lib/rcu/version.map
+++ b/lib/rcu/version.map
@@ -5,6 +5,7 @@ DPDK_24 {
rte_rcu_qsbr_dq_create;
rte_rcu_qsbr_dq_delete;
rte_rcu_qsbr_dq_enqueue;
+   rte_rcu_qsbr_dq_count;
rte_rcu_qsbr_dq_reclaim;
rte_rcu_qsbr_dump;
rte_rcu_qsbr_get_memsize;
-- 
2.34.1



Re: [PATCH v3 00/13] net/ionic: miscellaneous fixes and improvements

2024-02-07 Thread Ferruh Yigit
On 2/7/2024 3:13 AM, Andrew Boyer wrote:
> This patchset provides miscellaneous fixes and improvements for
> the net/ionic driver used by AMD Pensando devices.
> 
> V3:
> - Resend to fix patchwork threading.
> 
> V2:
> - Update device stop and device start patches to use compound literals
>   as suggested by review.
> 
> Akshay Dorwat (1):
>   net/ionic: fix RSS query routine
> 
> Andrew Boyer (8):
>   net/ionic: add stat for completion queue entries processed
>   net/ionic: increase max supported MTU to 9750 bytes
>   net/ionic: don't auto-enable Rx scatter-gather a second time
>   net/ionic: replace non-standard type in structure definition
>   net/ionic: fix device close sequence to avoid crash
>   net/ionic: optimize device close operation
>   net/ionic: optimize device stop operation
>   net/ionic: optimize device start operation
> 
> Brad Larson (1):
>   net/ionic: add flexible firmware xstat counters
> 
> Neel Patel (2):
>   net/ionic: fix missing volatile type for cqe pointers
>   net/ionic: memcpy descriptors when using Q-in-CMB
> 
> Vamsi Krishna Atluri (1):
>   net/ionic: report 1G and 200G link speeds when applicable
> 

For series,
Acked-by: Ferruh Yigit 



[PATCH v2 0/1] ethdev: add IPv6 field identifiers

2024-02-07 Thread Michael Baum
Add new field identifiers for IPv6 traffic class and flow label.

Depends-on: series-31008 ("ethdev: add modify IPv4 next protocol field")

v2:
- Rebase.
- Add "Acked-by" label from v1.

Michael Baum (1):
  ethdev: add IPv6 FL and TC field identifiers

 app/test-pmd/cmdline_flow.c| 1 +
 doc/guides/rel_notes/release_24_03.rst | 2 ++
 lib/ethdev/rte_flow.h  | 4 +++-
 3 files changed, 6 insertions(+), 1 deletion(-)

-- 
2.25.1



[PATCH v2 1/1] ethdev: add IPv6 FL and TC field identifiers

2024-02-07 Thread Michael Baum
Add new "rte_flow_field_id" enumeration values to describe both IPv6
traffic class and IPv6 flow label fields.

The TC value is "RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS" in flow API and
"ipv6_traffic_class" in testpmd command.
The FL value is "RTE_FLOW_FIELD_IPV6_FLOW_LABEL" in flow API and
"ipv6_flow_label" in testpmd command.

Signed-off-by: Michael Baum 
Acked-by: Dariusz Sosnowski 
---
 app/test-pmd/cmdline_flow.c| 1 +
 doc/guides/rel_notes/release_24_03.rst | 2 ++
 lib/ethdev/rte_flow.h  | 4 +++-
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 102b4d67c9..ab8bece28e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -992,6 +992,7 @@ static const char *const flow_field_ids[] = {
"random",
"ipv4_proto",
"esp_spi", "esp_seq_num", "esp_proto",
+   "ipv6_flow_label", "ipv6_traffic_class",
NULL
 };
 
diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 0909a2245d..f548eacc5e 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -68,6 +68,8 @@ New Features
   * Added ``RTE_FLOW_FIELD_ESP_SPI`` to represent it in field ID struct.
   * Added ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` to represent it in field ID struct.
   * Added ``RTE_FLOW_FIELD_ESP_PROTO`` to represent it in field ID struct.
+  * Added ``RTE_FLOW_FIELD_IPV6_FLOW_LABEL`` to represent it in field ID 
struct.
+  * Added ``RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS`` to represent it in field ID 
struct.
 
 * ** Support for getting the number of used descriptors of a Tx queue. **
 
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index b8fc16b819..8b32a69d8d 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -2425,7 +2425,9 @@ enum rte_flow_field_id {
RTE_FLOW_FIELD_IPV4_PROTO,  /**< IPv4 next protocol. */
RTE_FLOW_FIELD_ESP_SPI, /**< ESP SPI. */
RTE_FLOW_FIELD_ESP_SEQ_NUM, /**< ESP Sequence Number. */
-   RTE_FLOW_FIELD_ESP_PROTO/**< ESP next protocol value. */
+   RTE_FLOW_FIELD_ESP_PROTO,   /**< ESP next protocol value. */
+   RTE_FLOW_FIELD_IPV6_FLOW_LABEL, /**< IPv6 flow label. */
+   RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS/**< IPv6 traffic class. */
 };
 
 /**
-- 
2.25.1
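
For readers unfamiliar with the two fields named by the new identifiers, here is a minimal standalone sketch of where they sit in the first 32-bit word of the IPv6 header (RFC 8200). The DEMO_* macros and demo_* helpers are hypothetical names used only for illustration; they are not part of DPDK or of this patch, and the word is assumed to be already converted to host byte order.

```c
#include <assert.h>
#include <stdint.h>

/*
 * First 32 bits of the IPv6 header, host byte order (RFC 8200):
 *   version (4 bits) | traffic class (8 bits) | flow label (20 bits)
 * All DEMO_* names are hypothetical, for illustration only.
 */
#define DEMO_IPV6_VER_BITS 4
#define DEMO_IPV6_TC_BITS  8
#define DEMO_IPV6_FL_BITS  20

static_assert(DEMO_IPV6_VER_BITS + DEMO_IPV6_TC_BITS + DEMO_IPV6_FL_BITS == 32,
	      "the three fields fill the first word exactly");

/* Extract the 8-bit traffic class from the host-order first word. */
static inline uint8_t
demo_ipv6_traffic_class(uint32_t vtc_flow)
{
	return (uint8_t)((vtc_flow >> DEMO_IPV6_FL_BITS) & UINT32_C(0xff));
}

/* Extract the 20-bit flow label from the host-order first word. */
static inline uint32_t
demo_ipv6_flow_label(uint32_t vtc_flow)
{
	return vtc_flow & UINT32_C(0xfffff);
}
```

With a first word of 0x6ab12345, the helpers yield traffic class 0xab and flow label 0x12345.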



[PATCH v2 2/7] common/mlx5: reorder modification field PRM list

2024-02-07 Thread Michael Baum
Reorder the modification field PRM list by value, from lowest to
highest.
This patch also removes the explicit value from every field whose
value is one more than that of the previous field.

Signed-off-by: Michael Baum 
---
 drivers/common/mlx5/mlx5_prm.h | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index af16bf4cf6..b758557ef9 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -816,6 +816,7 @@ enum mlx5_modification_field {
MLX5_MODI_OUT_IPV6_HOPLIMIT,
MLX5_MODI_IN_IPV6_HOPLIMIT,
MLX5_MODI_META_DATA_REG_A,
+   MLX5_MODI_OUT_IP_PROTOCOL,
MLX5_MODI_META_DATA_REG_B = 0x50,
MLX5_MODI_META_REG_C_0,
MLX5_MODI_META_REG_C_1,
@@ -829,32 +830,31 @@ enum mlx5_modification_field {
MLX5_MODI_IN_TCP_SEQ_NUM,
MLX5_MODI_OUT_TCP_ACK_NUM,
MLX5_MODI_IN_TCP_ACK_NUM = 0x5C,
+   MLX5_MODI_OUT_ESP_SPI = 0x5E,
MLX5_MODI_GTP_TEID = 0x6E,
MLX5_MODI_OUT_IP_ECN = 0x73,
MLX5_MODI_TUNNEL_HDR_DW_1 = 0x75,
-   MLX5_MODI_GTPU_FIRST_EXT_DW_0 = 0x76,
+   MLX5_MODI_GTPU_FIRST_EXT_DW_0,
MLX5_MODI_HASH_RESULT = 0x81,
+   MLX5_MODI_OUT_ESP_SEQ_NUM,
MLX5_MODI_IN_MPLS_LABEL_0 = 0x8a,
MLX5_MODI_IN_MPLS_LABEL_1,
MLX5_MODI_IN_MPLS_LABEL_2,
MLX5_MODI_IN_MPLS_LABEL_3,
MLX5_MODI_IN_MPLS_LABEL_4,
-   MLX5_MODI_OUT_IP_PROTOCOL = 0x4A,
-   MLX5_MODI_META_REG_C_8 = 0x8F,
-   MLX5_MODI_META_REG_C_9 = 0x90,
-   MLX5_MODI_META_REG_C_10 = 0x91,
-   MLX5_MODI_META_REG_C_11 = 0x92,
-   MLX5_MODI_META_REG_C_12 = 0x93,
-   MLX5_MODI_META_REG_C_13 = 0x94,
-   MLX5_MODI_META_REG_C_14 = 0x95,
-   MLX5_MODI_META_REG_C_15 = 0x96,
+   MLX5_MODI_META_REG_C_8,
+   MLX5_MODI_META_REG_C_9,
+   MLX5_MODI_META_REG_C_10,
+   MLX5_MODI_META_REG_C_11,
+   MLX5_MODI_META_REG_C_12,
+   MLX5_MODI_META_REG_C_13,
+   MLX5_MODI_META_REG_C_14,
+   MLX5_MODI_META_REG_C_15,
MLX5_MODI_OUT_IPV6_TRAFFIC_CLASS = 0x11C,
-   MLX5_MODI_OUT_IPV4_TOTAL_LEN = 0x11D,
-   MLX5_MODI_OUT_IPV6_PAYLOAD_LEN = 0x11E,
-   MLX5_MODI_OUT_IPV4_IHL = 0x11F,
-   MLX5_MODI_OUT_TCP_DATA_OFFSET = 0x120,
-   MLX5_MODI_OUT_ESP_SPI = 0x5E,
-   MLX5_MODI_OUT_ESP_SEQ_NUM = 0x82,
+   MLX5_MODI_OUT_IPV4_TOTAL_LEN,
+   MLX5_MODI_OUT_IPV6_PAYLOAD_LEN,
+   MLX5_MODI_OUT_IPV4_IHL,
+   MLX5_MODI_OUT_TCP_DATA_OFFSET,
MLX5_MODI_OUT_IPSEC_NEXT_HDR = 0x126,
MLX5_MODI_INVALID = INT_MAX,
 };
-- 
2.25.1
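
Dropping the explicit values is safe because a C enumerator without an initializer takes the value of the previous enumerator plus one. The sketch below checks this at compile time on a hypothetical excerpt; the DEMO_ names mirror a few entries of the patch but are not the driver's enum.

```c
#include <assert.h>

/*
 * Hypothetical excerpt mirroring part of the reordered list.
 * The DEMO_ prefix marks these as illustration only.
 */
enum demo_modification_field {
	DEMO_MODI_TUNNEL_HDR_DW_1 = 0x75,
	DEMO_MODI_GTPU_FIRST_EXT_DW_0,	/* implicit: 0x76 */
	DEMO_MODI_HASH_RESULT = 0x81,
	DEMO_MODI_OUT_ESP_SEQ_NUM,	/* implicit: 0x82 */
	DEMO_MODI_IN_MPLS_LABEL_0 = 0x8a,
	DEMO_MODI_IN_MPLS_LABEL_1,
	DEMO_MODI_IN_MPLS_LABEL_2,
	DEMO_MODI_IN_MPLS_LABEL_3,
	DEMO_MODI_IN_MPLS_LABEL_4,	/* implicit: 0x8e */
	DEMO_MODI_META_REG_C_8,		/* implicit: 0x8f */
};

/* The implicit values match the explicit ones removed by the patch. */
static_assert(DEMO_MODI_GTPU_FIRST_EXT_DW_0 == 0x76, "was = 0x76");
static_assert(DEMO_MODI_OUT_ESP_SEQ_NUM == 0x82, "was = 0x82");
static_assert(DEMO_MODI_META_REG_C_8 == 0x8f, "was = 0x8F");
```

If a later change inserted an entry in the wrong place, checks of this kind would fail at build time rather than silently shift the hardware values.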



[PATCH v2 0/7] net/mlx5: support copy from inner fields

2024-02-07 Thread Michael Baum
This patch-set adds support for encapsulation levels to the HWS modify
field action in the MLX5 PMD.
The outermost header is represented by level 0 or 1, and the inner
header is represented by level 2.
In addition, inner/outer modification is added for both the IPv6 flow
label and the IPv6 traffic class.

Depends-on: series-31008 ("ethdev: add modify IPv4 next protocol field")
Depends-on: series-31010 ("ethdev: add IPv6 field identifiers")

v2:
- Rebase.
- Add "copy from inner" to release notes.

Michael Baum (7):
  common/mlx5: remove enum value duplication
  common/mlx5: reorder modification field PRM list
  common/mlx5: add inner PRM fields
  common/mlx5: add IPv6 flow label PRM field
  net/mlx5: add support for modify inner fields
  net/mlx5: support modify IPv6 traffic class field
  net/mlx5: support modify IPv6 flow label field

 doc/guides/nics/mlx5.rst   |  28 -
 doc/guides/rel_notes/release_24_03.rst |   4 +
 drivers/common/mlx5/mlx5_prm.h |  49 +
 drivers/net/mlx5/hws/mlx5dr_action.c   |   4 +-
 drivers/net/mlx5/hws/mlx5dr_pat_arg.c  |   2 +-
 drivers/net/mlx5/mlx5_flow.c   |  12 ++-
 drivers/net/mlx5/mlx5_flow_dv.c| 136 +++--
 drivers/net/mlx5/mlx5_flow_hw.c| 134 +++-
 8 files changed, 284 insertions(+), 85 deletions(-)

-- 
2.25.1



[PATCH v2 1/7] common/mlx5: remove enum value duplication

2024-02-07 Thread Michael Baum
The "mlx5_modification_field" enumeration has 2 different fields
representing the same value 0x4A.
 1. "MLX5_MODI_OUT_IPV6_NEXT_HDR" - specific for IPv6.
 2. "MLX5_MODI_OUT_IP_PROTOCOL" - for both IPv4 and IPv6.

This patch removes "MLX5_MODI_OUT_IPV6_NEXT_HDR" and replaces all its
usages with "MLX5_MODI_OUT_IP_PROTOCOL".

Signed-off-by: Michael Baum 
---
 drivers/common/mlx5/mlx5_prm.h| 1 -
 drivers/net/mlx5/hws/mlx5dr_action.c  | 4 ++--
 drivers/net/mlx5/hws/mlx5dr_pat_arg.c | 2 +-
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 0035a1e616..af16bf4cf6 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -840,7 +840,6 @@ enum mlx5_modification_field {
MLX5_MODI_IN_MPLS_LABEL_3,
MLX5_MODI_IN_MPLS_LABEL_4,
MLX5_MODI_OUT_IP_PROTOCOL = 0x4A,
-   MLX5_MODI_OUT_IPV6_NEXT_HDR = 0x4A,
MLX5_MODI_META_REG_C_8 = 0x8F,
MLX5_MODI_META_REG_C_9 = 0x90,
MLX5_MODI_META_REG_C_10 = 0x91,
diff --git a/drivers/net/mlx5/hws/mlx5dr_action.c 
b/drivers/net/mlx5/hws/mlx5dr_action.c
index 862ee3e332..2828a82d5b 100644
--- a/drivers/net/mlx5/hws/mlx5dr_action.c
+++ b/drivers/net/mlx5/hws/mlx5dr_action.c
@@ -2287,7 +2287,7 @@ mlx5dr_action_create_pop_ipv6_route_ext_mhdr3(struct 
mlx5dr_action *action)
MLX5_SET(copy_action_in, cmd, length, 8);
MLX5_SET(copy_action_in, cmd, src_offset, 24);
MLX5_SET(copy_action_in, cmd, src_field, mod_id);
-   MLX5_SET(copy_action_in, cmd, dst_field, MLX5_MODI_OUT_IPV6_NEXT_HDR);
+   MLX5_SET(copy_action_in, cmd, dst_field, MLX5_MODI_OUT_IP_PROTOCOL);
 
pattern.data = (__be64 *)cmd;
pattern.sz = sizeof(cmd);
@@ -2348,7 +2348,7 @@ mlx5dr_action_create_push_ipv6_route_ext_mhdr1(struct 
mlx5dr_action *action)
/* Set ipv6.protocol to IPPROTO_ROUTING */
MLX5_SET(set_action_in, cmd, action_type, MLX5_MODIFICATION_TYPE_SET);
MLX5_SET(set_action_in, cmd, length, 8);
-   MLX5_SET(set_action_in, cmd, field, MLX5_MODI_OUT_IPV6_NEXT_HDR);
+   MLX5_SET(set_action_in, cmd, field, MLX5_MODI_OUT_IP_PROTOCOL);
MLX5_SET(set_action_in, cmd, data, IPPROTO_ROUTING);
 
pattern.data = (__be64 *)cmd;
diff --git a/drivers/net/mlx5/hws/mlx5dr_pat_arg.c 
b/drivers/net/mlx5/hws/mlx5dr_pat_arg.c
index a949844d24..513549ff3c 100644
--- a/drivers/net/mlx5/hws/mlx5dr_pat_arg.c
+++ b/drivers/net/mlx5/hws/mlx5dr_pat_arg.c
@@ -67,7 +67,7 @@ bool mlx5dr_pat_require_reparse(__be64 *actions, uint16_t 
num_of_actions)
 
/* Below fields can change packet structure require a reparse */
if (field == MLX5_MODI_OUT_ETHERTYPE ||
-   field == MLX5_MODI_OUT_IPV6_NEXT_HDR)
+   field == MLX5_MODI_OUT_IP_PROTOCOL)
return true;
}
 
-- 
2.25.1



[PATCH v2 4/7] common/mlx5: add IPv6 flow label PRM field

2024-02-07 Thread Michael Baum
Add IPv6 flow label field into PRM modify field list.
The new values are "MLX5_MODI_OUT_IPV6_FLOW_LABEL" and
"MLX5_MODI_IN_IPV6_FLOW_LABEL".

Signed-off-by: Michael Baum 
---
 drivers/common/mlx5/mlx5_prm.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index c6d409833a..59d885e43b 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -864,6 +864,8 @@ enum mlx5_modification_field {
MLX5_MODI_IN_IPV4_IHL,
MLX5_MODI_IN_TCP_DATA_OFFSET,
MLX5_MODI_OUT_IPSEC_NEXT_HDR,
+   MLX5_MODI_OUT_IPV6_FLOW_LABEL,
+   MLX5_MODI_IN_IPV6_FLOW_LABEL,
MLX5_MODI_INVALID = INT_MAX,
 };
 
-- 
2.25.1



[PATCH v2 3/7] common/mlx5: add inner PRM fields

2024-02-07 Thread Michael Baum
This patch adds inner values into PRM modify field list for each
existing outer field.

Signed-off-by: Michael Baum 
---
 drivers/common/mlx5/mlx5_prm.h | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index b758557ef9..c6d409833a 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -829,14 +829,17 @@ enum mlx5_modification_field {
MLX5_MODI_OUT_TCP_SEQ_NUM,
MLX5_MODI_IN_TCP_SEQ_NUM,
MLX5_MODI_OUT_TCP_ACK_NUM,
-   MLX5_MODI_IN_TCP_ACK_NUM = 0x5C,
+   MLX5_MODI_IN_TCP_ACK_NUM,
MLX5_MODI_OUT_ESP_SPI = 0x5E,
+   MLX5_MODI_IN_ESP_SPI,
MLX5_MODI_GTP_TEID = 0x6E,
MLX5_MODI_OUT_IP_ECN = 0x73,
-   MLX5_MODI_TUNNEL_HDR_DW_1 = 0x75,
+   MLX5_MODI_IN_IP_ECN,
+   MLX5_MODI_TUNNEL_HDR_DW_1,
MLX5_MODI_GTPU_FIRST_EXT_DW_0,
MLX5_MODI_HASH_RESULT = 0x81,
MLX5_MODI_OUT_ESP_SEQ_NUM,
+   MLX5_MODI_IN_ESP_SEQ_NUM,
MLX5_MODI_IN_MPLS_LABEL_0 = 0x8a,
MLX5_MODI_IN_MPLS_LABEL_1,
MLX5_MODI_IN_MPLS_LABEL_2,
@@ -855,7 +858,12 @@ enum mlx5_modification_field {
MLX5_MODI_OUT_IPV6_PAYLOAD_LEN,
MLX5_MODI_OUT_IPV4_IHL,
MLX5_MODI_OUT_TCP_DATA_OFFSET,
-   MLX5_MODI_OUT_IPSEC_NEXT_HDR = 0x126,
+   MLX5_MODI_IN_IPV6_TRAFFIC_CLASS,
+   MLX5_MODI_IN_IPV4_TOTAL_LEN,
+   MLX5_MODI_IN_IPV6_PAYLOAD_LEN,
+   MLX5_MODI_IN_IPV4_IHL,
+   MLX5_MODI_IN_TCP_DATA_OFFSET,
+   MLX5_MODI_OUT_IPSEC_NEXT_HDR,
MLX5_MODI_INVALID = INT_MAX,
 };
 
-- 
2.25.1



[PATCH v2 5/7] net/mlx5: add support for modify inner fields

2024-02-07 Thread Michael Baum
This patch adds support for copying from inner fields using "level" 2.

Signed-off-by: Michael Baum 
---
 doc/guides/nics/mlx5.rst   |  28 +-
 doc/guides/rel_notes/release_24_03.rst |   2 +
 drivers/net/mlx5/mlx5_flow.c   |  12 ++-
 drivers/net/mlx5/mlx5_flow_dv.c| 113 +++--
 drivers/net/mlx5/mlx5_flow_hw.c| 130 -
 5 files changed, 224 insertions(+), 61 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index fa013b03bb..5439e8fd7d 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -595,7 +595,6 @@ Limitations
 Only DWs configured in :ref:`parser creation ` can be 
modified,
 'type' and 'class' fields can be modified when ``match_on_class_mode=2``.
   - Modification of GENEVE TLV option data supports one DW per action.
-  - Encapsulation levels are not supported, can modify outermost header fields 
only.
   - Offsets cannot skip past the boundary of a field.
   - If the field type is ``RTE_FLOW_FIELD_MAC_TYPE``
 and packet contains one or more VLAN headers,
@@ -609,6 +608,33 @@ Limitations
   - For flow metadata fields (e.g. META or TAG)
 offset specifies the number of bits to skip from field's start,
 starting from LSB in the least significant byte, in the host order.
+  - Modification of the MPLS header is supported with some limitations:
+
+- Only in HW steering.
+- Only in ``src`` field.
+- Only for outermost tunnel header (``level=2``).
+  For ``RTE_FLOW_FIELD_MPLS``,
+  the default encapsulation level ``0`` describes the outermost tunnel 
header.
+
+  .. note::
+
+ the default encapsulation level ``0`` describes the "outermost that 
match is supported",
+ currently it is first tunnel, but it can be changed to outer when it 
is supported.
+
+  - Default encapsulation level ``0`` describes outermost.
+  - Encapsulation level ``1`` is supported.
+  - Encapsulation level ``2`` is supported with some limitations:
+
+- Only in HW steering.
+- Only in ``src`` field.
+- ``RTE_FLOW_FIELD_VLAN_ID`` is not supported.
+- ``RTE_FLOW_FIELD_IPV4_PROTO`` is not supported.
+- ``RTE_FLOW_FIELD_IPV6_PROTO/DSCP/ECN`` are not supported.
+- ``RTE_FLOW_FIELD_ESP_PROTO/SPI/SEQ_NUM`` are not supported.
+- ``RTE_FLOW_FIELD_TCP_SEQ/ACK_NUM`` are not supported.
+- Second tunnel fields are not supported.
+
+  - Encapsulation levels greater than ``2`` are not supported.
 
 - Age action:
 
diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index f548eacc5e..a504aebe15 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -92,6 +92,8 @@ New Features
 
   * Added support for accumulating from src field to dst field.
 
+  * Added support for copy inner fields in HW Steering flow engine.
+
   * Added support for VXLAN-GPE flags/rsvd0/rsvd fields matching in DV flow
 engine (``dv_flow_en`` = 1).
 
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 40376c99ba..b8cb385564 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -2507,10 +2507,14 @@ flow_validate_modify_field_level(const struct 
rte_flow_field_data *data,
if (data->level == 0)
return 0;
if (data->field != RTE_FLOW_FIELD_TAG &&
-   data->field != (enum rte_flow_field_id)MLX5_RTE_FLOW_FIELD_META_REG)
-   return rte_flow_error_set(error, ENOTSUP,
- RTE_FLOW_ERROR_TYPE_ACTION, NULL,
- "inner header fields modification is 
not supported");
+   data->field != (enum 
rte_flow_field_id)MLX5_RTE_FLOW_FIELD_META_REG) {
+   if (data->level > 1)
+   return rte_flow_error_set(error, ENOTSUP,
+ RTE_FLOW_ERROR_TYPE_ACTION,
+ NULL,
+ "inner header fields 
modification is not supported");
+   return 0;
+   }
if (data->tag_index != 0)
return rte_flow_error_set(error, EINVAL,
  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 6fded15d91..46f9f59e67 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -82,6 +82,9 @@
} \
} while (0)
 
+#define CALC_MODI_ID(field, level) \
+   (((level) > 1) ? MLX5_MODI_IN_##field : MLX5_MODI_OUT_##field)
+
 union flow_dv_attr {
struct {
uint32_t valid:1;
@@ -1638,8 +1641,8 @@ mlx5_flow_field_id_to_modify_info
MLX5_ASSERT(data->offset + width <= 48);
off_be = 48 - (data->offset + width);
if (off_be < 16) {
-   

[PATCH v2 6/7] net/mlx5: support modify IPv6 traffic class field

2024-02-07 Thread Michael Baum
Add HW steering support for IPv6 traffic class field modification.
Copy from inner IPv6 traffic class field is also supported using
"level=2".

Signed-off-by: Michael Baum 
---
 doc/guides/rel_notes/release_24_03.rst |  1 +
 drivers/net/mlx5/mlx5_flow_dv.c| 11 +++
 drivers/net/mlx5/mlx5_flow_hw.c|  3 ++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index a504aebe15..50ba66daac 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -113,6 +113,7 @@ New Features
   * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_SPI`` flow 
action.
   * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` 
flow action.
   * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_PROTO`` 
flow action.
+  * Added HW steering support for modify field 
``RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS`` flow action.
 
 
 Removed Items
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 46f9f59e67..7d92c1cc24 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1394,6 +1394,7 @@ mlx5_flow_item_field_width(struct rte_eth_dev *dev,
return 32;
case RTE_FLOW_FIELD_IPV6_DSCP:
return 6;
+   case RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS:
case RTE_FLOW_FIELD_IPV6_HOPLIMIT:
case RTE_FLOW_FIELD_IPV6_PROTO:
return 8;
@@ -1795,6 +1796,16 @@ mlx5_flow_field_id_to_modify_info
else
info[idx].offset = off_be;
break;
+   case RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS:
+   MLX5_ASSERT(data->offset + width <= 8);
+   off_be = 8 - (data->offset + width);
+   modi_id = CALC_MODI_ID(IPV6_TRAFFIC_CLASS, data->level);
+   info[idx] = (struct field_modify_info){1, 0, modi_id};
+   if (mask)
+   mask[idx] = flow_modify_info_mask_8(width, off_be);
+   else
+   info[idx].offset = off_be;
+   break;
case RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN:
MLX5_ASSERT(data->offset + width <= 16);
off_be = 16 - (data->offset + width);
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 7a1821f457..a41b46e18f 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -2874,7 +2874,7 @@ flow_hw_modify_field_construct(struct mlx5_hw_q_job *job,
 * bits left. Shift the data left for IPV6 DSCP
 */
if (field->id == MLX5_MODI_OUT_IPV6_TRAFFIC_CLASS &&
-   !(mask & MLX5_IPV6_HDR_ECN_MASK))
+   mhdr_action->dst.field == RTE_FLOW_FIELD_IPV6_DSCP)
data <<= MLX5_IPV6_HDR_DSCP_SHIFT;
data = (data & mask) >> off_b;
job->mhdr_cmd[i++].data1 = rte_cpu_to_be_32(data);
@@ -5063,6 +5063,7 @@ flow_hw_validate_modify_field_level(const struct 
rte_flow_field_data *data,
case RTE_FLOW_FIELD_IPV4_TTL:
case RTE_FLOW_FIELD_IPV4_SRC:
case RTE_FLOW_FIELD_IPV4_DST:
+   case RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS:
case RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN:
case RTE_FLOW_FIELD_IPV6_HOPLIMIT:
case RTE_FLOW_FIELD_IPV6_SRC:
-- 
2.25.1
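
The case added above relies on the CALC_MODI_ID() macro from patch 5/7, which pastes the field name onto an IN or OUT prefix depending on the encapsulation level. A minimal standalone sketch of that selection follows; the DEMO_ enum and its values are hypothetical stand-ins, not the PRM values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical IDs with arbitrary values, for illustration only. */
enum demo_modi_id {
	DEMO_MODI_OUT_IPV6_TRAFFIC_CLASS = 1,
	DEMO_MODI_IN_IPV6_TRAFFIC_CLASS,
};

/*
 * Same shape as the macro in the patch: levels 0 and 1 select the
 * outer (OUT) field, any level above 1 selects the inner (IN) field.
 */
#define DEMO_CALC_MODI_ID(field, level) \
	(((level) > 1) ? DEMO_MODI_IN_##field : DEMO_MODI_OUT_##field)

/* Resolve the traffic class ID for a given encapsulation level. */
static inline enum demo_modi_id
demo_traffic_class_id(uint32_t level)
{
	return DEMO_CALC_MODI_ID(IPV6_TRAFFIC_CLASS, level);
}
```

Note that the macro itself does not reject levels above 2; in the series that is left to the separate level validation.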



[PATCH v2 7/7] net/mlx5: support modify IPv6 flow label field

2024-02-07 Thread Michael Baum
Add HW steering support for IPv6 flow label field modification.
Copy from inner IPv6 flow label field is also supported using "level=2".

Signed-off-by: Michael Baum 
---
 doc/guides/rel_notes/release_24_03.rst |  1 +
 drivers/net/mlx5/mlx5_flow_dv.c| 12 
 drivers/net/mlx5/mlx5_flow_hw.c|  1 +
 3 files changed, 14 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 50ba66daac..267851f920 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -114,6 +114,7 @@ New Features
   * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_SEQ_NUM`` 
flow action.
   * Added HW steering support for modify field ``RTE_FLOW_FIELD_ESP_PROTO`` 
flow action.
   * Added HW steering support for modify field 
``RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS`` flow action.
+  * Added HW steering support for modify field 
``RTE_FLOW_FIELD_IPV6_FLOW_LABEL`` flow action.
 
 
 Removed Items
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 7d92c1cc24..bf5cd37f2f 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1394,6 +1394,8 @@ mlx5_flow_item_field_width(struct rte_eth_dev *dev,
return 32;
case RTE_FLOW_FIELD_IPV6_DSCP:
return 6;
+   case RTE_FLOW_FIELD_IPV6_FLOW_LABEL:
+   return 20;
case RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS:
case RTE_FLOW_FIELD_IPV6_HOPLIMIT:
case RTE_FLOW_FIELD_IPV6_PROTO:
@@ -1806,6 +1808,16 @@ mlx5_flow_field_id_to_modify_info
else
info[idx].offset = off_be;
break;
+   case RTE_FLOW_FIELD_IPV6_FLOW_LABEL:
+   MLX5_ASSERT(data->offset + width <= 20);
+   off_be = 20 - (data->offset + width);
+   modi_id = CALC_MODI_ID(IPV6_FLOW_LABEL, data->level);
+   info[idx] = (struct field_modify_info){4, 0, modi_id};
+   if (mask)
+   mask[idx] = flow_modify_info_mask_32(width, off_be);
+   else
+   info[idx].offset = off_be;
+   break;
case RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN:
MLX5_ASSERT(data->offset + width <= 16);
off_be = 16 - (data->offset + width);
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index a41b46e18f..1d003c9389 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -5064,6 +5064,7 @@ flow_hw_validate_modify_field_level(const struct 
rte_flow_field_data *data,
case RTE_FLOW_FIELD_IPV4_SRC:
case RTE_FLOW_FIELD_IPV4_DST:
case RTE_FLOW_FIELD_IPV6_TRAFFIC_CLASS:
+   case RTE_FLOW_FIELD_IPV6_FLOW_LABEL:
case RTE_FLOW_FIELD_IPV6_PAYLOAD_LEN:
case RTE_FLOW_FIELD_IPV6_HOPLIMIT:
case RTE_FLOW_FIELD_IPV6_SRC:
-- 
2.25.1
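
The flow label case computes off_be = 20 - (offset + width) and then builds a mask from the width and that offset. The sketch below reproduces the arithmetic in isolation. demo_off_be() and demo_mask_32() are hypothetical helpers; in particular, demo_mask_32() is only an assumption about what a width/offset mask looks like in host order and is not the driver's flow_modify_info_mask_32(), whose body is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an (offset, width) pair inside a field of field_bits bits
 * into an offset counted from the opposite end of the field, as the
 * patch does with field_bits = 20 for the flow label.
 */
static inline uint32_t
demo_off_be(uint32_t field_bits, uint32_t offset, uint32_t width)
{
	assert(offset + width <= field_bits);
	return field_bits - (offset + width);
}

/*
 * Hypothetical mask builder: width one-bits shifted left by off_be,
 * in host byte order. Assumes width + off_be <= 32.
 */
static inline uint32_t
demo_mask_32(uint32_t width, uint32_t off_be)
{
	uint32_t ones = (width >= 32) ? UINT32_MAX
				      : ((UINT32_C(1) << width) - 1);

	return ones << off_be;
}
```

For the whole 20-bit flow label (offset 0, width 20) this gives an offset of 0 and a mask of 0xfffff; for offset 4 and width 8 it gives an offset of 8 and a mask of 0xff00.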



[PATCH v4 0/2] net/mlx5: add random compare support

2024-02-07 Thread Michael Baum
Add support for compare item with "RTE_FLOW_FIELD_RANDOM".

v2:
 - Rebase.
 - Add "RTE_FLOW_FIELD_META" compare support.
 - Reduce the "Depends-on" list.

v3:
 - Rebase.
 - Fix typo in function name, r/tranlate/translate.
 - Fix adding a line without newline at end of file.

v4:
 - Rebase.
 - Update documentation.
 - Remove the "Depends-on" label.

Hamdan Igbaria (1):
  net/mlx5/hws: add support for compare matcher

Michael Baum (1):
  net/mlx5: add support to compare random value

 doc/guides/nics/mlx5.rst  |   9 +-
 drivers/common/mlx5/mlx5_prm.h|  16 ++
 drivers/net/mlx5/hws/mlx5dr_cmd.c |   9 +-
 drivers/net/mlx5/hws/mlx5dr_cmd.h |   1 +
 drivers/net/mlx5/hws/mlx5dr_debug.c   |   4 +-
 drivers/net/mlx5/hws/mlx5dr_debug.h   |   1 +
 drivers/net/mlx5/hws/mlx5dr_definer.c | 243 +-
 drivers/net/mlx5/hws/mlx5dr_definer.h |  33 
 drivers/net/mlx5/hws/mlx5dr_matcher.c |  48 +
 drivers/net/mlx5/hws/mlx5dr_matcher.h |  12 +-
 drivers/net/mlx5/mlx5_flow_hw.c   |  70 ++--
 11 files changed, 417 insertions(+), 29 deletions(-)

-- 
2.25.1



[PATCH v4 1/2] net/mlx5/hws: add support for compare matcher

2024-02-07 Thread Michael Baum
From: Hamdan Igbaria 

Add support for a compare matcher. This matcher allows direct
comparison between two packet fields, or between a packet field
and a value, with a fully masked DW.
For now the hash table of this matcher is limited to size 1x1,
so it supports only one rule STE.

Signed-off-by: Hamdan Igbaria 
Signed-off-by: Michael Baum 
---
 drivers/common/mlx5/mlx5_prm.h|  16 ++
 drivers/net/mlx5/hws/mlx5dr_cmd.c |   9 +-
 drivers/net/mlx5/hws/mlx5dr_cmd.h |   1 +
 drivers/net/mlx5/hws/mlx5dr_debug.c   |   4 +-
 drivers/net/mlx5/hws/mlx5dr_debug.h   |   1 +
 drivers/net/mlx5/hws/mlx5dr_definer.c | 243 +-
 drivers/net/mlx5/hws/mlx5dr_definer.h |  33 
 drivers/net/mlx5/hws/mlx5dr_matcher.c |  48 +
 drivers/net/mlx5/hws/mlx5dr_matcher.h |  12 +-
 9 files changed, 358 insertions(+), 9 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 0035a1e616..f8956c8a87 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -3459,6 +3459,7 @@ enum mlx5_ifc_rtc_ste_format {
MLX5_IFC_RTC_STE_FORMAT_8DW = 0x4,
MLX5_IFC_RTC_STE_FORMAT_11DW = 0x5,
MLX5_IFC_RTC_STE_FORMAT_RANGE = 0x7,
+   MLX5_IFC_RTC_STE_FORMAT_4DW_RANGE = 0x8,
 };
 
 enum mlx5_ifc_rtc_reparse_mode {
@@ -3497,6 +3498,21 @@ struct mlx5_ifc_rtc_bits {
u8 reserved_at_1a0[0x260];
 };
 
+struct mlx5_ifc_ste_match_4dw_range_ctrl_dw_bits {
+   u8 match[0x1];
+   u8 reserved_at_1[0x2];
+   u8 base1[0x1];
+   u8 inverse1[0x1];
+   u8 reserved_at_5[0x1];
+   u8 operator1[0x2];
+   u8 reserved_at_8[0x3];
+   u8 base0[0x1];
+   u8 inverse0[0x1];
+   u8 reserved_at_a[0x1];
+   u8 operator0[0x2];
+   u8 compare_delta[0x10];
+};
+
 struct mlx5_ifc_alias_context_bits {
u8 vhca_id_to_be_accessed[0x10];
u8 reserved_at_10[0xd];
diff --git a/drivers/net/mlx5/hws/mlx5dr_cmd.c 
b/drivers/net/mlx5/hws/mlx5dr_cmd.c
index 876a47147d..702d6fadac 100644
--- a/drivers/net/mlx5/hws/mlx5dr_cmd.c
+++ b/drivers/net/mlx5/hws/mlx5dr_cmd.c
@@ -370,9 +370,12 @@ mlx5dr_cmd_rtc_create(struct ibv_context *ctx,
 attr, obj_type, MLX5_GENERAL_OBJ_TYPE_RTC);
 
attr = MLX5_ADDR_OF(create_rtc_in, in, rtc);
-   MLX5_SET(rtc, attr, ste_format_0, rtc_attr->is_frst_jumbo ?
-   MLX5_IFC_RTC_STE_FORMAT_11DW :
-   MLX5_IFC_RTC_STE_FORMAT_8DW);
+   if (rtc_attr->is_compare) {
+   MLX5_SET(rtc, attr, ste_format_0, 
MLX5_IFC_RTC_STE_FORMAT_4DW_RANGE);
+   } else {
+   MLX5_SET(rtc, attr, ste_format_0, rtc_attr->is_frst_jumbo ?
+MLX5_IFC_RTC_STE_FORMAT_11DW : 
MLX5_IFC_RTC_STE_FORMAT_8DW);
+   }
 
if (rtc_attr->is_scnd_range) {
MLX5_SET(rtc, attr, ste_format_1, 
MLX5_IFC_RTC_STE_FORMAT_RANGE);
diff --git a/drivers/net/mlx5/hws/mlx5dr_cmd.h 
b/drivers/net/mlx5/hws/mlx5dr_cmd.h
index 18c2b07fc8..073ffd9633 100644
--- a/drivers/net/mlx5/hws/mlx5dr_cmd.h
+++ b/drivers/net/mlx5/hws/mlx5dr_cmd.h
@@ -82,6 +82,7 @@ struct mlx5dr_cmd_rtc_create_attr {
uint8_t reparse_mode;
bool is_frst_jumbo;
bool is_scnd_range;
+   bool is_compare;
 };
 
 struct mlx5dr_cmd_alias_obj_create_attr {
diff --git a/drivers/net/mlx5/hws/mlx5dr_debug.c 
b/drivers/net/mlx5/hws/mlx5dr_debug.c
index 11557bcab8..a9094cd35b 100644
--- a/drivers/net/mlx5/hws/mlx5dr_debug.c
+++ b/drivers/net/mlx5/hws/mlx5dr_debug.c
@@ -99,6 +99,7 @@ static int
 mlx5dr_debug_dump_matcher_match_template(FILE *f, struct mlx5dr_matcher 
*matcher)
 {
bool is_root = matcher->tbl->level == MLX5DR_ROOT_LEVEL;
+   bool is_compare = mlx5dr_matcher_is_compare(matcher);
enum mlx5dr_debug_res_type type;
int i, ret;
 
@@ -117,7 +118,8 @@ mlx5dr_debug_dump_matcher_match_template(FILE *f, struct 
mlx5dr_matcher *matcher
return rte_errno;
}
 
-   type = MLX5DR_DEBUG_RES_TYPE_MATCHER_TEMPLATE_MATCH_DEFINER;
+   type = is_compare ? 
MLX5DR_DEBUG_RES_TYPE_MATCHER_TEMPLATE_COMPARE_MATCH_DEFINER :
+   
MLX5DR_DEBUG_RES_TYPE_MATCHER_TEMPLATE_MATCH_DEFINER;
ret = mlx5dr_debug_dump_matcher_template_definer(f, mt, 
mt->definer, type);
if (ret)
return ret;
diff --git a/drivers/net/mlx5/hws/mlx5dr_debug.h 
b/drivers/net/mlx5/hws/mlx5dr_debug.h
index 5cffdb10b5..a89a6a0b1d 100644
--- a/drivers/net/mlx5/hws/mlx5dr_debug.h
+++ b/drivers/net/mlx5/hws/mlx5dr_debug.h
@@ -24,6 +24,7 @@ enum mlx5dr_debug_res_type {
MLX5DR_DEBUG_RES_TYPE_MATCHER_ACTION_TEMPLATE = 4204,
MLX5DR_DEBUG_RES_TYPE_MATCHER_TEMPLATE_HASH_DEFINER = 4205,
MLX5DR_DEBUG_RES_TYPE_MATCHER_TEMPLATE_RANGE_DEFINER = 4206,
+   MLX5DR_DEBUG_RES_TYPE_MATCHER_TEMPLATE_COMPARE_MATCH_DEFINER = 4207,
 };
 
 static inline uint64_t
diff --git a/drivers/net/mlx5/hws/mlx5dr_defi

[PATCH v4 2/2] net/mlx5: add support to compare random value

2024-02-07 Thread Michael Baum
Add support for using "RTE_FLOW_ITEM_TYPE_COMPARE" with
"RTE_FLOW_FIELD_RANDOM" as an argument.
The random field is supported only when the base is an immediate value;
the random field cannot be compared with another field.
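
The rule above can be sketched in isolation as follows. This is a simplified standalone illustration with hypothetical demo_ names and plain errno return codes; it follows the validation added by this patch but is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the flow field identifiers. */
enum demo_field {
	DEMO_FIELD_TAG,
	DEMO_FIELD_META,
	DEMO_FIELD_RANDOM,
	DEMO_FIELD_VALUE,
	DEMO_FIELD_OTHER,
};

/* Return 0 if the (argument, base) pair is accepted, else -errno. */
static inline int
demo_compare_fields_validate(enum demo_field arg, enum demo_field base)
{
	switch (arg) {
	case DEMO_FIELD_TAG:
	case DEMO_FIELD_META:
		break;
	case DEMO_FIELD_RANDOM:
		/* Random is accepted only against an immediate value. */
		return base == DEMO_FIELD_VALUE ? 0 : -EINVAL;
	default:
		return -ENOTSUP;
	}
	switch (base) {
	case DEMO_FIELD_TAG:
	case DEMO_FIELD_META:
	case DEMO_FIELD_VALUE:
		return 0;
	default:
		return -ENOTSUP;
	}
}

/* Comparison width in bits per argument field, 0 if unsupported. */
static inline uint32_t
demo_compare_width(enum demo_field field)
{
	switch (field) {
	case DEMO_FIELD_TAG:
	case DEMO_FIELD_META:
		return 32;
	case DEMO_FIELD_RANDOM:
		return 16;
	default:
		return 0;
	}
}
```

In this sketch an immediate value is never accepted as the argument field, and the random field is never accepted as the base.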

Signed-off-by: Michael Baum 
---
 doc/guides/nics/mlx5.rst|  9 -
 drivers/net/mlx5/mlx5_flow_hw.c | 70 -
 2 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index fa013b03bb..43ef8a99dc 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -820,8 +820,13 @@ Limitations
 
   - Only supported in HW steering(``dv_flow_en`` = 2) mode.
   - Only single flow is supported to the flow table.
-  - Only 32-bit comparison is supported.
-  - Only match with compare result between packet fields is supported.
+  - Only single item is supported per pattern template.
+  - Only 32-bit comparison is supported or 16-bits for random field.
+  - Only supported for ``RTE_FLOW_FIELD_META``, ``RTE_FLOW_FIELD_TAG``,
+``RTE_FLOW_FIELD_RANDOM`` and ``RTE_FLOW_FIELD_VALUE``.
+  - The field type ``RTE_FLOW_FIELD_VALUE`` must be the base (``b``) field.
+  - The field type ``RTE_FLOW_FIELD_RANDOM`` can only be compared with
+``RTE_FLOW_FIELD_VALUE``.
 
 
 Statistics
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 3af5e1f160..b5741f0817 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -6717,18 +6717,55 @@ flow_hw_prepend_item(const struct rte_flow_item *items,
return copied_items;
 }
 
-static inline bool
-flow_hw_item_compare_field_supported(enum rte_flow_field_id field)
+static int
+flow_hw_item_compare_field_validate(enum rte_flow_field_id arg_field,
+   enum rte_flow_field_id base_field,
+   struct rte_flow_error *error)
 {
-   switch (field) {
+   switch (arg_field) {
+   case RTE_FLOW_FIELD_TAG:
+   case RTE_FLOW_FIELD_META:
+   break;
+   case RTE_FLOW_FIELD_RANDOM:
+   if (base_field == RTE_FLOW_FIELD_VALUE)
+   return 0;
+   return rte_flow_error_set(error, EINVAL,
+ RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ "compare random is supported only 
with immediate value");
+   default:
+   return rte_flow_error_set(error, ENOTSUP,
+ RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ "compare item argument field is not 
supported");
+   }
+   switch (base_field) {
case RTE_FLOW_FIELD_TAG:
case RTE_FLOW_FIELD_META:
case RTE_FLOW_FIELD_VALUE:
-   return true;
+   break;
+   default:
+   return rte_flow_error_set(error, ENOTSUP,
+ RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
+ NULL,
+ "compare item base field is not supported");
+   }
+   return 0;
+}
+
+static inline uint32_t
+flow_hw_item_compare_width_supported(enum rte_flow_field_id field)
+{
+   switch (field) {
+   case RTE_FLOW_FIELD_TAG:
+   case RTE_FLOW_FIELD_META:
+   return 32;
+   case RTE_FLOW_FIELD_RANDOM:
+   return 16;
default:
break;
}
-   return false;
+   return 0;
 }
 
 static int
@@ -6737,6 +6774,7 @@ flow_hw_validate_item_compare(const struct rte_flow_item *item,
 {
const struct rte_flow_item_compare *comp_m = item->mask;
const struct rte_flow_item_compare *comp_v = item->spec;
+   int ret;
 
if (unlikely(!comp_m))
return rte_flow_error_set(error, EINVAL,
@@ -6748,19 +6786,13 @@ flow_hw_validate_item_compare(const struct rte_flow_item *item,
   RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
   NULL,
   "compare item only support full mask");
-   if (!flow_hw_item_compare_field_supported(comp_m->a.field) ||
-   !flow_hw_item_compare_field_supported(comp_m->b.field))
-   return rte_flow_error_set(error, ENOTSUP,
-  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
-  NULL,
-  "compare item field not support");
-   if (comp_m->a.field == RTE_FLOW_FIELD_VALUE &&
-   comp_m->b.field == RTE_FLOW_FIELD_VALUE)
-   return rte_flow_error_set(error, EINVAL,
-  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
-  NULL,
-  "compare between value is not valid");
+   ret = flow_hw_

[PATCH] net/mlx5/hws: add compare ESP sequence number support

2024-02-07 Thread Michael Baum
Add support for compare item with "RTE_FLOW_FIELD_ESP_SEQ_NUM" field.

Signed-off-by: Michael Baum 
---

Depends-on: series-31008 ("ethdev: add modify IPv4 next protocol field")
Depends-on: series-31041 ("net/mlx5: add random compare support")

 doc/guides/nics/mlx5.rst  |  1 +
 drivers/net/mlx5/hws/mlx5dr_definer.c | 22 --
 drivers/net/mlx5/mlx5_flow_hw.c   |  3 +++
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 43ef8a99dc..b793f1ef58 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -823,6 +823,7 @@ Limitations
   - Only single item is supported per pattern template.
   - Only 32-bit comparison is supported or 16-bits for random field.
   - Only supported for ``RTE_FLOW_FIELD_META``, ``RTE_FLOW_FIELD_TAG``,
+``RTE_FLOW_FIELD_ESP_SEQ_NUM``,
 ``RTE_FLOW_FIELD_RANDOM`` and ``RTE_FLOW_FIELD_VALUE``.
   - The field type ``RTE_FLOW_FIELD_VALUE`` must be the base (``b``) field.
   - The field type ``RTE_FLOW_FIELD_RANDOM`` can only be compared with
diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c b/drivers/net/mlx5/hws/mlx5dr_definer.c
index 2d86175ca2..b29d7451e7 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.c
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
@@ -396,10 +396,20 @@ mlx5dr_definer_compare_base_value_set(const void *item_spec,
 
value = (const uint32_t *)&b->value[0];
 
-   if (a->field == RTE_FLOW_FIELD_RANDOM)
+   switch (a->field) {
+   case RTE_FLOW_FIELD_RANDOM:
*base = htobe32(*value << 16);
-   else
+   break;
+   case RTE_FLOW_FIELD_TAG:
+   case RTE_FLOW_FIELD_META:
*base = htobe32(*value);
+   break;
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
+   *base = *value;
+   break;
+   default:
+   break;
+   }
 
MLX5_SET(ste_match_4dw_range_ctrl_dw, ctrl, base0, 1);
 }
@@ -2887,6 +2897,14 @@ mlx5dr_definer_conv_item_compare_field(const struct rte_flow_field_data *f,
fc->compare_idx = dw_offset;
DR_CALC_SET_HDR(fc, random_number, random_number);
break;
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
+   fc = &cd->fc[MLX5DR_DEFINER_FNAME_ESP_SEQUENCE_NUMBER];
+   fc->item_idx = item_idx;
+   fc->tag_set = &mlx5dr_definer_compare_set;
+   fc->tag_mask_set = &mlx5dr_definer_ones_set;
+   fc->compare_idx = dw_offset;
+   DR_CALC_SET_HDR(fc, ipsec, sequence_number);
+   break;
default:
DR_LOG(ERR, "%u field is not supported", f->field);
goto err_notsup;
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index b5741f0817..4d6fb489b2 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -6725,6 +6725,7 @@ flow_hw_item_compare_field_validate(enum rte_flow_field_id arg_field,
switch (arg_field) {
case RTE_FLOW_FIELD_TAG:
case RTE_FLOW_FIELD_META:
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
break;
case RTE_FLOW_FIELD_RANDOM:
if (base_field == RTE_FLOW_FIELD_VALUE)
@@ -6743,6 +6744,7 @@ flow_hw_item_compare_field_validate(enum rte_flow_field_id arg_field,
case RTE_FLOW_FIELD_TAG:
case RTE_FLOW_FIELD_META:
case RTE_FLOW_FIELD_VALUE:
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
break;
default:
return rte_flow_error_set(error, ENOTSUP,
@@ -6759,6 +6761,7 @@ flow_hw_item_compare_width_supported(enum rte_flow_field_id field)
switch (field) {
case RTE_FLOW_FIELD_TAG:
case RTE_FLOW_FIELD_META:
+   case RTE_FLOW_FIELD_ESP_SEQ_NUM:
return 32;
case RTE_FLOW_FIELD_RANDOM:
return 16;
-- 
2.25.1



Re: [PATCH v2 1/7] ethdev: support report register names and filter

2024-02-07 Thread Ferruh Yigit
On 2/5/2024 10:51 AM, Jie Hai wrote:
> This patch adds "filter" and "names" fields to "rte_dev_reg_info"
> structure. Names of registers in data fields can be reported and
> the registers can be filtered by their names.
> 
> For compatibility, the original API rte_eth_dev_get_reg_info()
> does not use the name and filter fields. The new API
> rte_eth_dev_get_reg_info_ext() is added to support reporting
> names and filtering by names. If the drivers does not report
> the names, set them to "offset_XXX".
> 
> Signed-off-by: Jie Hai 
> ---
>  doc/guides/rel_notes/release_24_03.rst |  8 ++
>  lib/ethdev/rte_dev_info.h  | 11 
>  lib/ethdev/rte_ethdev.c| 36 ++
>  lib/ethdev/rte_ethdev.h| 22 
>  4 files changed, 77 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/release_24_03.rst b/doc/guides/rel_notes/release_24_03.rst
> index 84d3144215c6..5d402341223a 100644
> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> @@ -75,6 +75,11 @@ New Features
>* Added support for Atomic Rules' TK242 packet-capture family of devices
>  with PCI IDs: ``0x1024, 0x1025, 0x1026``.
>  
> +* **Added support for dumping regiters with names and filter.**
>

s/regiters/registers/

> +
> +  * Added new API functions ``rte_eth_dev_get_reg_info_ext()`` to and filter
> +  * the registers by their names and get the information of registers(names,
> +  * values and other attributes).
>  

'*' makes a bullet, but the text above seems to be a single sentence; if
so, please keep only the first '*'.

>  Removed Items
>  -
> @@ -124,6 +129,9 @@ ABI Changes
>  
>  * No ABI change that would break compatibility with 23.11.
>  
> +* ethdev: Added ``filter`` and ``names`` fields to ``rte_dev_reg_info``
> +  structure for reporting names of regiters and filtering them by names.
> +
>  

This will break the ABI.

Consider an application compiled against an old version of DPDK that is
later run against this version without being recompiled. The application
will pass the old layout of 'struct rte_dev_reg_info', but the new version
of DPDK will try to access or update the new fields of that struct, which
lie beyond the memory the application allocated.

One option is:
- to add a new 'struct rte_dev_reg_info_ext',
- 'rte_eth_dev_get_reg_info()' still uses old 'struct rte_dev_reg_info'
- 'get_reg()' dev_ops will use this new 'struct rte_dev_reg_info_ext'
- Add deprecation notice to update 'rte_eth_dev_get_reg_info()' to use
new struct in next LTS release
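
The size mismatch behind this concern can be sketched as below. This is
only an illustration: 'reg_info_v1' and 'reg_info_v2' are hypothetical
stand-ins for the old and new layouts of 'struct rte_dev_reg_info' (field
list taken from the quoted diff), and the two helpers are not DPDK
functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: layout an old binary was compiled against. */
struct reg_info_v1 {
	void *data;
	uint32_t offset;
	uint32_t length;
	uint32_t width;
	uint32_t version;
};

/* Hypothetical stand-in: layout after the two new fields are appended. */
struct reg_info_v2 {
	void *data;
	uint32_t offset;
	uint32_t length;
	uint32_t width;
	uint32_t version;
	char *filter;
	void *names;
};

/* Bytes the new library may touch beyond what an old caller allocated. */
static size_t
bytes_past_old_struct(void)
{
	return sizeof(struct reg_info_v2) - sizeof(struct reg_info_v1);
}

/* True only if caller_size bytes can hold every field the new library uses. */
static bool
caller_buffer_is_large_enough(size_t caller_size)
{
	return caller_size >= sizeof(struct reg_info_v2);
}
```

An old caller hands over sizeof(struct reg_info_v1) bytes, so any read or
write of 'filter' or 'names' falls outside its object. Keeping the old
struct untouched and adding a separate extended struct avoids that.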


>  Known Issues
>  
> diff --git a/lib/ethdev/rte_dev_info.h b/lib/ethdev/rte_dev_info.h
> index 67cf0ae52668..2f4541bd46c8 100644
> --- a/lib/ethdev/rte_dev_info.h
> +++ b/lib/ethdev/rte_dev_info.h
> @@ -11,6 +11,11 @@ extern "C" {
>  
>  #include 
>  
> +#define RTE_ETH_REG_NAME_SIZE 128
> +struct rte_eth_reg_name {
> + char name[RTE_ETH_REG_NAME_SIZE];
> +};
> +
>  /*
>   * Placeholder for accessing device registers
>   */
> @@ -20,6 +25,12 @@ struct rte_dev_reg_info {
>   uint32_t length; /**< Number of registers to fetch */
>   uint32_t width; /**< Size of device register */
>   uint32_t version; /**< Device version */
> + /**
> +  * Filter for target subset of registers.
> +  * This field could affects register selection for data/length/names.
> +  */
> + char *filter;
> + struct rte_eth_reg_name *names; /**< Registers name saver */
>  };
>  
>  /*
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index f1c658f49e80..3e0294e49092 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -6388,8 +6388,39 @@ rte_eth_read_clock(uint16_t port_id, uint64_t *clock)
>  
>  int
>  rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info)
> +{
> + struct rte_dev_reg_info reg_info;
> + int ret;
> +
> + if (info == NULL) {
> + RTE_ETHDEV_LOG_LINE(ERR,
> + "Cannot get ethdev port %u register info to NULL",
> + port_id);
> + return -EINVAL;
> + }
> +
> + reg_info.length = info->length;
> + reg_info.data = info->data;
> + reg_info.names = NULL;
> + reg_info.filter = NULL;
> +
> + ret = rte_eth_dev_get_reg_info_ext(port_id, &reg_info);
> + if (ret != 0)
> + return ret;
> +
> + info->length = reg_info.length;
> + info->width = reg_info.width;
> + info->version = reg_info.version;
> + info->offset = reg_info.offset;
> +
> + return 0;
> +}
> +
> +int
> +rte_eth_dev_get_reg_info_ext(uint16_t port_id, struct rte_dev_reg_info *info)
>  {
>   struct rte_eth_dev *dev;
> + uint32_t i;
>   int ret;
>  
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> @@ -6408,6 +6439,11 @@ rte_eth_dev_get_reg_info(uint16_t port_id, struct rte_dev_reg_info *info)
>  
>   rte_ethdev_trace_get_reg_info(port_id, info, ret);
>  
> + /* Report the default names if drivers not

Re: [PATCH v2 2/7] ethdev: add telemetry cmd for registers

2024-02-07 Thread Ferruh Yigit
On 2/5/2024 10:51 AM, Jie Hai wrote:
> This patch adds a telemetry command for registers dump,
> and supports get registers with specified names.
> The length of the string exported by telemetry is limited
> by MAX_OUTPUT_LEN. Therefore, the filter should be more
> precise.
> 
> An example usage is shown below:
> --> /ethdev/regs,0,INTR
> {
>   "/ethdev/regs": {
> "registers_length": 318,
> "registers_width": 4,
> "register_offset": "0x0",
> "version": "0x1140011",
> "group_0": {
>   "HNS3_CMDQ_INTR_STS_REG": "0x0",
>   "HNS3_CMDQ_INTR_EN_REG": "0x2",
>   "HNS3_CMDQ_INTR_GEN_REG": "0x0",
>   "queue_0_HNS3_TQP_INTR_CTRL_REG": "0x0",
>   "queue_0_HNS3_TQP_INTR_GL0_REG": "0xa",
>   "queue_0_HNS3_TQP_INTR_GL1_REG": "0xa",
>   "queue_0_HNS3_TQP_INTR_GL2_REG": "0x0",
>   ...
>   },
> "group_1": {
> ...
> },
> ...
> }
> 

What is the intention of 'RTE_TEL_MAX_DICT_ENTRIES' and grouping above?
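
One possible reading of the grouping, offered only as a sketch and not as
what the patch actually does: registers are chunked into "group_N"
sub-dictionaries once their count exceeds the per-dictionary entry limit.
The limit of 256 below is an assumption standing in for
'RTE_TEL_MAX_DICT_ENTRIES', both helper names are hypothetical, and the
header entries (length, width, offset, version) are ignored.

```c
#include <stdint.h>

/* Assumed per-dictionary limit; stands in for RTE_TEL_MAX_DICT_ENTRIES. */
#define MAX_DICT_ENTRIES 256u

/* Number of "group_N" sub-dictionaries; 0 means the flat layout fits. */
static uint32_t
reg_group_count(uint32_t nb_regs)
{
	if (nb_regs <= MAX_DICT_ENTRIES)
		return 0;
	/* Round up so a partially filled last group is still counted. */
	return (nb_regs + MAX_DICT_ENTRIES - 1) / MAX_DICT_ENTRIES;
}

/* Index of the group that would hold register number reg_idx. */
static uint32_t
reg_group_index(uint32_t reg_idx)
{
	return reg_idx / MAX_DICT_ENTRIES;
}
```

Under that assumption, the 318 registers of the first example would need
two groups, while the 156 registers of the second example would fit in a
flat dictionary.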

> or as below if the number of registers not exceed the
> RTE_TEL_MAX_DICT_ENTRIES:
> --> /ethdev/regs,0,ppp
> {
>   "/ethdev/regs": {
> "registers_length": 156,
> "registers_width": 4,
> "register_offset": "0x0",
> "version": "0x1140011",
> "ppp_key_drop_num": "0x0",
> "ppp_rlt_drop_num": "0x0",
> "ssu_ppp_mac_key_num_l": "0x1",
> "ssu_ppp_mac_key_num_h": "0x0",
> "ssu_ppp_host_key_num_l": "0x1",
> "ssu_ppp_host_key_num_h": "0x0",
> "ppp_ssu_mac_rlt_num_l": "0x1",
> ...
>}
> }
> 
> Signed-off-by: Jie Hai 



[PATCH v3] app/testpmd: command to get descriptor used count

2024-02-07 Thread skoteshwar
From: Satha Rao 

Extend the existing Rx descriptor used count command to also report
the Tx queue used count.
testpmd> show port 0 rxq 0 desc used count
testpmd> show port 0 txq 0 desc used count

Signed-off-by: Satha Rao 
---
Depends-on: series-30833 ("ethdev: support Tx queue used count")

v2:
 extended rx_queue_desc_used_count command to support Tx
 updated testpmd_app_ug with new command

v3:
 Updated help string, log message as per review comments

 app/test-pmd/cmdline.c  | 116 
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  10 +--
 2 files changed, 70 insertions(+), 56 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index f704319..d8ea2b7 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -237,9 +237,8 @@ static void cmd_help_long_parsed(void *parsed_result,
"show port (port_id) rxq|txq (queue_id) desc (desc_id) 
status"
"   Show status of rx|tx descriptor.\n\n"
 
-   "show port (port_id) rxq (queue_id) desc used count\n"
-   "Show current number of filled receive"
-   " packet descriptors.\n\n"
+   "show port (port_id) rxq|txq (queue_id) desc used count\n"
+   "Show current number of used descriptor count for rx|tx.\n\n"
 
"show port (port_id) macs|mcast_macs"
"   Display list of mac addresses added to 
port.\n\n"
@@ -12745,11 +12744,11 @@ struct cmd_show_rx_tx_desc_status_result {
},
 };
 
-/* *** display rx queue desc used count *** */
-struct cmd_show_rx_queue_desc_used_count_result {
+/* *** display rx/tx queue descriptor used count *** */
+struct cmd_show_rx_tx_queue_desc_used_count_result {
cmdline_fixed_string_t cmd_show;
cmdline_fixed_string_t cmd_port;
-   cmdline_fixed_string_t cmd_rxq;
+   cmdline_fixed_string_t cmd_dir;
cmdline_fixed_string_t cmd_desc;
cmdline_fixed_string_t cmd_used;
cmdline_fixed_string_t cmd_count;
@@ -12758,73 +12757,88 @@ struct cmd_show_rx_queue_desc_used_count_result {
 };
 
 static void
-cmd_show_rx_queue_desc_used_count_parsed(void *parsed_result,
-   __rte_unused struct cmdline *cl,
-   __rte_unused void *data)
+cmd_show_rx_tx_queue_desc_used_count_parsed(void *parsed_result, __rte_unused struct cmdline *cl,
+   __rte_unused void *data)
 {
-   struct cmd_show_rx_queue_desc_used_count_result *res = parsed_result;
+   struct cmd_show_rx_tx_queue_desc_used_count_result *res = parsed_result;
int rc;
 
-   if (rte_eth_rx_queue_is_valid(res->cmd_pid, res->cmd_qid) != 0) {
-   fprintf(stderr,
-   "Invalid input: port id = %d, queue id = %d\n",
-   res->cmd_pid, res->cmd_qid);
-   return;
-   }
+   if (!strcmp(res->cmd_dir, "rxq")) {
+   if (rte_eth_rx_queue_is_valid(res->cmd_pid, res->cmd_qid) != 0) {
+   fprintf(stderr, "Invalid input: port id = %d, queue id = %d\n",
+   res->cmd_pid, res->cmd_qid);
+   return;
+   }
 
-   rc = rte_eth_rx_queue_count(res->cmd_pid, res->cmd_qid);
-   if (rc < 0) {
-   fprintf(stderr, "Invalid queueid = %d\n", res->cmd_qid);
-   return;
+   rc = rte_eth_rx_queue_count(res->cmd_pid, res->cmd_qid);
+   if (rc < 0) {
+   fprintf(stderr, "Rx queue count get failed rc=%d queue_id=%d\n", rc,
+   res->cmd_qid);
+   return;
+   }
+   printf("RxQ %d used desc count = %d\n", res->cmd_qid, rc);
+   } else if (!strcmp(res->cmd_dir, "txq")) {
+   if (rte_eth_tx_queue_is_valid(res->cmd_pid, res->cmd_qid) != 0) {
+   fprintf(stderr, "Invalid input: port id = %d, queue id = %d\n",
+   res->cmd_pid, res->cmd_qid);
+   return;
+   }
+
+   rc = rte_eth_tx_queue_count(res->cmd_pid, res->cmd_qid);
+   if (rc < 0) {
+   fprintf(stderr, "Tx queue count get failed rc=%d queue_id=%d\n", rc,
+   res->cmd_qid);
+   return;
+   }
+   printf("TxQ %d used desc count = %d\n", res->cmd_qid, rc);
}
-   printf("Used desc count = %d\n", rc);
 }
 
-static cmdline_parse_token_string_t cmd_show_rx_queue_desc_used_count_show =
+static cmdline_parse_token_string_t cmd_show_rx_tx_queue_desc_used_count_show =
TOKEN_STRING_INITIALIZER
-   (struct cmd_show_rx_queue_desc_used_count_result,
+   (struct cmd_show_rx_tx_queue_desc_used_count_result,
 cmd_show, "show");
-stati

[PATCH v6 0/6] changes for 24.03

2024-02-07 Thread Hernan Vargas
v6: Rework total_num_queues in separate commit. Fix typo in comment.
v5: Created separate commit for doc fix. Cosmetic uppercase changes.
v4: Targeting 24.03. Updated FPGA PMD based on review comments.
v3: Made changes requested during review.
v2: Targeting 23.11. Update in commits 1,2 based on review comments.
v1: Targeting 23.07 if possible. Add support for AGX100 (N6000) and corner case fixes.

Hernan Vargas (6):
  doc: fix fpga 5gnr configuration values
  baseband/fpga_5gnr_fec: renaming for consistency
  baseband/fpga_5gnr_fec: add Vista Creek variant
  baseband/fpga_5gnr_fec: rework total number queues
  baseband/fpga_5gnr_fec: add AGX100 support
  baseband/fpga_5gnr_fec: cosmetic comment changes

 doc/guides/bbdevs/fpga_5gnr_fec.rst   |   76 +-
 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h   |  273 ++
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h|  353 +--
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 2270 -
 .../fpga_5gnr_fec/rte_pmd_fpga_5gnr_fec.h |   27 +-
 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h  |  139 +
 6 files changed, 2204 insertions(+), 934 deletions(-)
 create mode 100644 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h
 create mode 100644 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h

-- 
2.37.1



[PATCH v6 1/6] doc: fix fpga 5gnr configuration values

2024-02-07 Thread Hernan Vargas
flr_time_out was removed from the code a while ago; update the doc accordingly.
Fix minor typo in 5GNR example.

Fixes: 2d4306438c92 ("baseband/fpga_5gnr_fec: add configure function")
Cc: sta...@dpdk.org

Signed-off-by: Hernan Vargas 
Reviewed-by: Maxime Coquelin 
---
 doc/guides/bbdevs/fpga_5gnr_fec.rst | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/doc/guides/bbdevs/fpga_5gnr_fec.rst b/doc/guides/bbdevs/fpga_5gnr_fec.rst
index 956dd6bed560..99fc936829a8 100644
--- a/doc/guides/bbdevs/fpga_5gnr_fec.rst
+++ b/doc/guides/bbdevs/fpga_5gnr_fec.rst
@@ -100,7 +100,6 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure:
   uint8_t dl_bandwidth;
   uint8_t ul_load_balance;
   uint8_t dl_load_balance;
-  uint16_t flr_time_out;
   };
 
 - ``pf_mode_en``: identifies whether only PF is to be used, or the VFs. PF and
@@ -126,10 +125,6 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure:
   If all hardware queues exceeds the watermark, no code blocks will be
   streamed in from UL/DL code block FIFO.
 
-- ``flr_time_out``: specifies how many 16.384us to be FLR time out. The
-  time_out = flr_time_out x 16.384us. For instance, if you want to set 10ms for
-  the FLR time out then set this setting to 0x262=610.
-
 
 An example configuration code calling the function ``rte_fpga_5gnr_fec_configure()`` is shown
 below:
@@ -154,7 +149,7 @@ below:
   /* setup FPGA PF */
   ret = rte_fpga_5gnr_fec_configure(info->dev_name, &conf);
   TEST_ASSERT_SUCCESS(ret,
-  "Failed to configure 4G FPGA PF for bbdev %s",
+  "Failed to configure 5GNR FPGA PF for bbdev %s",
   info->dev_name);
 
 
-- 
2.37.1



[PATCH v6 2/6] baseband/fpga_5gnr_fec: renaming for consistency

2024-02-07 Thread Hernan Vargas
Rename generic functions and constants using the FPGA 5GNR prefix naming
to prepare for code reuse by a new FPGA implementation variant.
No functional impact.

Signed-off-by: Hernan Vargas 
Reviewed-by: Maxime Coquelin 
---
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h| 117 +++--
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 455 --
 .../fpga_5gnr_fec/rte_pmd_fpga_5gnr_fec.h |  17 +-
 3 files changed, 269 insertions(+), 320 deletions(-)

diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
index e3038112fabb..9300349a731b 100644
--- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
+++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
@@ -31,26 +31,26 @@
 #define FPGA_5GNR_FEC_VF_DEVICE_ID (0x0D90)
 
 /* Align DMA descriptors to 256 bytes - cache-aligned */
-#define FPGA_RING_DESC_ENTRY_LENGTH (8)
+#define FPGA_5GNR_RING_DESC_ENTRY_LENGTH (8)
 /* Ring size is in 256 bits (32 bytes) units */
 #define FPGA_RING_DESC_LEN_UNIT_BYTES (32)
 /* Maximum size of queue */
-#define FPGA_RING_MAX_SIZE (1024)
+#define FPGA_5GNR_RING_MAX_SIZE (1024)
 
 #define FPGA_NUM_UL_QUEUES (32)
 #define FPGA_NUM_DL_QUEUES (32)
 #define FPGA_TOTAL_NUM_QUEUES (FPGA_NUM_UL_QUEUES + FPGA_NUM_DL_QUEUES)
 #define FPGA_NUM_INTR_VEC (FPGA_TOTAL_NUM_QUEUES - RTE_INTR_VEC_RXTX_OFFSET)
 
-#define FPGA_INVALID_HW_QUEUE_ID (0x)
+#define FPGA_5GNR_INVALID_HW_QUEUE_ID (0x)
 
-#define FPGA_QUEUE_FLUSH_TIMEOUT_US (1000)
-#define FPGA_HARQ_RDY_TIMEOUT (10)
-#define FPGA_TIMEOUT_CHECK_INTERVAL (5)
-#define FPGA_DDR_OVERFLOW (0x10)
+#define FPGA_5GNR_QUEUE_FLUSH_TIMEOUT_US (1000)
+#define FPGA_5GNR_HARQ_RDY_TIMEOUT (10)
+#define FPGA_5GNR_TIMEOUT_CHECK_INTERVAL (5)
+#define FPGA_5GNR_DDR_OVERFLOW (0x10)
 
-#define FPGA_5GNR_FEC_DDR_WR_DATA_LEN_IN_BYTES 8
-#define FPGA_5GNR_FEC_DDR_RD_DATA_LEN_IN_BYTES 8
+#define FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES 8
+#define FPGA_5GNR_DDR_RD_DATA_LEN_IN_BYTES 8
 
 /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
 #define N_ZC_1 66 /* N = 66 Zc for BG 1 */
@@ -152,7 +152,7 @@ struct __rte_packed fpga_dma_enc_desc {
};
 
uint8_t sw_ctxt[FPGA_RING_DESC_LEN_UNIT_BYTES *
-   (FPGA_RING_DESC_ENTRY_LENGTH - 1)];
+   (FPGA_5GNR_RING_DESC_ENTRY_LENGTH - 1)];
};
 };
 
@@ -197,7 +197,7 @@ struct __rte_packed fpga_dma_dec_desc {
uint8_t cbs_in_op;
};
 
-   uint32_t sw_ctxt[8 * (FPGA_RING_DESC_ENTRY_LENGTH - 1)];
+   uint32_t sw_ctxt[8 * (FPGA_5GNR_RING_DESC_ENTRY_LENGTH - 1)];
};
 };
 
@@ -207,8 +207,8 @@ union fpga_dma_desc {
struct fpga_dma_dec_desc dec_req;
 };
 
-/* FPGA 5GNR FEC Ring Control Register */
-struct __rte_packed fpga_ring_ctrl_reg {
+/* FPGA 5GNR Ring Control Register. */
+struct __rte_packed fpga_5gnr_ring_ctrl_reg {
uint64_t ring_base_addr;
uint64_t ring_head_addr;
uint16_t ring_size:11;
@@ -226,38 +226,37 @@ struct __rte_packed fpga_ring_ctrl_reg {
uint16_t rsrvd3;
uint16_t head_point;
uint16_t rsrvd4;
-
 };
 
-/* Private data structure for each FPGA FEC device */
+/* Private data structure for each FPGA 5GNR device. */
 struct fpga_5gnr_fec_device {
-   /** Base address of MMIO registers (BAR0) */
+   /** Base address of MMIO registers (BAR0). */
void *mmio_base;
-   /** Base address of memory for sw rings */
+   /** Base address of memory for sw rings. */
void *sw_rings;
-   /** Physical address of sw_rings */
+   /** Physical address of sw_rings. */
rte_iova_t sw_rings_phys;
/** Number of bytes available for each queue in device. */
uint32_t sw_ring_size;
-   /** Max number of entries available for each queue in device */
+   /** Max number of entries available for each queue in device. */
uint32_t sw_ring_max_depth;
-   /** Base address of response tail pointer buffer */
+   /** Base address of response tail pointer buffer. */
uint32_t *tail_ptrs;
-   /** Physical address of tail pointers */
+   /** Physical address of tail pointers. */
rte_iova_t tail_ptr_phys;
-   /** Queues flush completion flag */
+   /** Queues flush completion flag. */
uint64_t *flush_queue_status;
-   /* Bitmap capturing which Queues are bound to the PF/VF */
+   /** Bitmap capturing which Queues are bound to the PF/VF. */
uint64_t q_bound_bit_map;
-   /* Bitmap capturing which Queues have already been assigned */
+   /** Bitmap capturing which Queues have already been assigned. */
uint64_t q_assigned_bit_map;
-   /** True if this is a PF FPGA FEC device */
+   /** True if this is a PF FPGA 5GNR device. */
bool pf_device;
 };
 
-/* Structure associated with each queue. */
-struct __rte_cache_aligned fpga_queue {

[PATCH v6 3/6] baseband/fpga_5gnr_fec: add Vista Creek variant

2024-02-07 Thread Hernan Vargas
Create a new file vc_5gnr_pmd.h to store structures and macros specific
to Vista Creek 5G FPGA implementation and rename functions specific to
the Vista Creek variant.

Signed-off-by: Hernan Vargas 
Reviewed-by: Maxime Coquelin 
---
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h| 183 ++-
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 475 +-
 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h  | 140 ++
 3 files changed, 398 insertions(+), 400 deletions(-)
 create mode 100644 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h

diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
index 9300349a731b..982e956dc819 100644
--- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
+++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
@@ -8,6 +8,8 @@
 #include 
 #include 
 
+#include "vc_5gnr_pmd.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
rte_log(RTE_LOG_ ## level, fpga_5gnr_fec_logtype, fmt "\n", \
@@ -25,32 +27,20 @@
 #define FPGA_5GNR_FEC_PF_DRIVER_NAME intel_fpga_5gnr_fec_pf
 #define FPGA_5GNR_FEC_VF_DRIVER_NAME intel_fpga_5gnr_fec_vf
 
-/* FPGA 5GNR FEC PCI vendor & device IDs */
-#define FPGA_5GNR_FEC_VENDOR_ID (0x8086)
-#define FPGA_5GNR_FEC_PF_DEVICE_ID (0x0D8F)
-#define FPGA_5GNR_FEC_VF_DEVICE_ID (0x0D90)
-
-/* Align DMA descriptors to 256 bytes - cache-aligned */
-#define FPGA_5GNR_RING_DESC_ENTRY_LENGTH (8)
-/* Ring size is in 256 bits (32 bytes) units */
-#define FPGA_RING_DESC_LEN_UNIT_BYTES (32)
-/* Maximum size of queue */
-#define FPGA_5GNR_RING_MAX_SIZE (1024)
-
-#define FPGA_NUM_UL_QUEUES (32)
-#define FPGA_NUM_DL_QUEUES (32)
-#define FPGA_TOTAL_NUM_QUEUES (FPGA_NUM_UL_QUEUES + FPGA_NUM_DL_QUEUES)
-#define FPGA_NUM_INTR_VEC (FPGA_TOTAL_NUM_QUEUES - RTE_INTR_VEC_RXTX_OFFSET)
-
 #define FPGA_5GNR_INVALID_HW_QUEUE_ID (0x)
-
 #define FPGA_5GNR_QUEUE_FLUSH_TIMEOUT_US (1000)
 #define FPGA_5GNR_HARQ_RDY_TIMEOUT (10)
 #define FPGA_5GNR_TIMEOUT_CHECK_INTERVAL (5)
 #define FPGA_5GNR_DDR_OVERFLOW (0x10)
-
 #define FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES 8
 #define FPGA_5GNR_DDR_RD_DATA_LEN_IN_BYTES 8
+/* Align DMA descriptors to 256 bytes - cache-aligned. */
+#define FPGA_5GNR_RING_DESC_ENTRY_LENGTH (8)
+/* Maximum size of queue. */
+#define FPGA_5GNR_RING_MAX_SIZE (1024)
+
+#define VC_5GNR_FPGA_VARIANT   0
+#define AGX100_FPGA_VARIANT1
 
 /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
 #define N_ZC_1 66 /* N = 66 Zc for BG 1 */
@@ -62,32 +52,7 @@
 #define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
 #define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
 
-/* FPGA 5GNR FEC Register mapping on BAR0 */
-enum {
-   FPGA_5GNR_FEC_VERSION_ID = 0x, /* len: 4B */
-   FPGA_5GNR_FEC_CONFIGURATION = 0x0004, /* len: 2B */
-   FPGA_5GNR_FEC_QUEUE_PF_VF_MAP_DONE = 0x0008, /* len: 1B */
-   FPGA_5GNR_FEC_LOAD_BALANCE_FACTOR = 0x000a, /* len: 2B */
-   FPGA_5GNR_FEC_RING_DESC_LEN = 0x000c, /* len: 2B */
-   FPGA_5GNR_FEC_VFQ_FLUSH_STATUS_LW = 0x0018, /* len: 4B */
-   FPGA_5GNR_FEC_VFQ_FLUSH_STATUS_HI = 0x001c, /* len: 4B */
-   FPGA_5GNR_FEC_QUEUE_MAP = 0x0040, /* len: 256B */
-   FPGA_5GNR_FEC_RING_CTRL_REGS = 0x0200, /* len: 2048B */
-   FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS = 0x0A00, /* len: 4B */
-   FPGA_5GNR_FEC_DDR4_WR_DATA_REGS = 0x0A08, /* len: 8B */
-   FPGA_5GNR_FEC_DDR4_WR_DONE_REGS = 0x0A10, /* len: 1B */
-   FPGA_5GNR_FEC_DDR4_RD_ADDR_REGS = 0x0A18, /* len: 4B */
-   FPGA_5GNR_FEC_DDR4_RD_DONE_REGS = 0x0A20, /* len: 1B */
-   FPGA_5GNR_FEC_DDR4_RD_RDY_REGS = 0x0A28, /* len: 1B */
-   FPGA_5GNR_FEC_DDR4_RD_DATA_REGS = 0x0A30, /* len: 8B */
-   FPGA_5GNR_FEC_DDR4_ADDR_RDY_REGS = 0x0A38, /* len: 1B */
-   FPGA_5GNR_FEC_HARQ_BUF_SIZE_RDY_REGS = 0x0A40, /* len: 1B */
-   FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS = 0x0A48, /* len: 4B */
-   FPGA_5GNR_FEC_MUTEX = 0x0A60, /* len: 4B */
-   FPGA_5GNR_FEC_MUTEX_RESET = 0x0A68  /* len: 4B */
-};
-
-/* FPGA 5GNR FEC Ring Control Registers */
+/* FPGA 5GNR Ring Control Registers. */
 enum {
FPGA_5GNR_FEC_RING_HEAD_ADDR = 0x0008,
FPGA_5GNR_FEC_RING_SIZE = 0x0010,
@@ -98,113 +63,27 @@ enum {
FPGA_5GNR_FEC_RING_HEAD_POINT = 0x001C
 };
 
-/* FPGA 5GNR FEC DESCRIPTOR ERROR */
+/* VC 5GNR and AGX100 common register mapping on BAR0. */
 enum {
-   DESC_ERR_NO_ERR = 0x0,
-   DESC_ERR_K_P_OUT_OF_RANGE = 0x1,
-   DESC_ERR_Z_C_NOT_LEGAL = 0x2,
-   DESC_ERR_DESC_OFFSET_ERR = 0x3,
-   DESC_ERR_DESC_READ_FAIL = 0x8,
-   DESC_ERR_DESC_READ_TIMEOUT = 0x9,
-   DESC_ERR_DESC_READ_TLP_POISONED = 0xA,
-   DESC_ERR_HARQ_INPUT_LEN = 0xB,
-   DESC_ERR_CB_READ_FAIL = 0xC,
-   DESC_ERR_CB_READ_TIMEOUT = 0xD,
-   DESC_ERR_CB_READ_TLP_POISONED = 0xE,
-   DESC_ERR_HBSTORE_ERR = 0xF
-};
-
-
-/* FPGA 5GNR 

[PATCH v6 4/6] baseband/fpga_5gnr_fec: rework total number queues

2024-02-07 Thread Hernan Vargas
Add total_num_queues to the FPGA device struct as a preliminary rework
for the introduction of different FPGA variants.

Signed-off-by: Hernan Vargas 
---
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h|  2 +
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 37 +++
 2 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
index 982e956dc819..879e5467ef3d 100644
--- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
+++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
@@ -131,6 +131,8 @@ struct fpga_5gnr_fec_device {
uint64_t q_assigned_bit_map;
/** True if this is a PF FPGA 5GNR device. */
bool pf_device;
+   /** Maximum number of possible queues for this device. */
+   uint8_t total_num_queues;
 };
 
 /** Structure associated with each queue. */
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index f9a776e6aea5..3fb505775f61 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -203,7 +203,7 @@ fpga_5gnr_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id
 * replaced with a queue ID and if it's not then
 * FPGA_5GNR_INVALID_HW_QUEUE_ID is returned.
 */
-   for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) {
+   for (q_id = 0; q_id < d->total_num_queues; ++q_id) {
uint32_t hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base,
VC_5GNR_QUEUE_MAP + (q_id << 2));
 
@@ -367,7 +367,7 @@ fpga_5gnr_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_
 
/* Calculates number of queues assigned to device */
dev_info->max_num_queues = 0;
-   for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) {
+   for (q_id = 0; q_id < d->total_num_queues; ++q_id) {
uint32_t hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base,
VC_5GNR_QUEUE_MAP + (q_id << 2));
if (hw_q_id != FPGA_5GNR_INVALID_HW_QUEUE_ID)
@@ -394,11 +394,11 @@ fpga_5gnr_find_free_queue_idx(struct rte_bbdev *dev,
struct fpga_5gnr_fec_device *d = dev->data->dev_private;
uint64_t q_idx;
uint8_t i = 0;
-   uint8_t range = VC_5GNR_TOTAL_NUM_QUEUES >> 1;
+   uint8_t range = d->total_num_queues >> 1;
 
if (conf->op_type == RTE_BBDEV_OP_LDPC_ENC) {
-   i = VC_5GNR_NUM_DL_QUEUES;
-   range = VC_5GNR_TOTAL_NUM_QUEUES;
+   i = d->total_num_queues >> 1;
+   range = d->total_num_queues;
}
 
for (; i < range; ++i) {
@@ -661,7 +661,7 @@ fpga_5gnr_dev_interrupt_handler(void *cb_arg)
uint8_t i;
 
/* Scan queue assigned to this device */
-   for (i = 0; i < VC_5GNR_TOTAL_NUM_QUEUES; ++i) {
+   for (i = 0; i < d->total_num_queues; ++i) {
q_idx = 1ULL << i;
if (d->q_bound_bit_map & q_idx) {
queue_id = get_queue_id(dev->data, i);
@@ -710,22 +710,25 @@ fpga_5gnr_intr_enable(struct rte_bbdev *dev)
 {
int ret;
uint8_t i;
+   struct fpga_5gnr_fec_device *d = dev->data->dev_private;
+   uint8_t num_intr_vec;
 
+   num_intr_vec = d->total_num_queues - RTE_INTR_VEC_RXTX_OFFSET;
if (!rte_intr_cap_multiple(dev->intr_handle)) {
rte_bbdev_log(ERR, "Multiple intr vector is not supported by FPGA (%s)",
dev->data->name);
return -ENOTSUP;
}
 
-   /* Create event file descriptors for each of 64 queue. Event fds will be
-* mapped to FPGA IRQs in rte_intr_enable(). This is a 1:1 mapping where
-* the IRQ number is a direct translation to the queue number.
+   /* Create event file descriptors for each of the supported queues (Maximum 64).
+* Event fds will be mapped to FPGA IRQs in rte_intr_enable().
+* This is a 1:1 mapping where the IRQ number is a direct translation to the queue number.
 *
-* 63 (VC_5GNR_NUM_INTR_VEC) event fds are created as rte_intr_enable()
+* num_intr_vec event fds are created as rte_intr_enable()
 * mapped the first IRQ to already created interrupt event file
 * descriptor (intr_handle->fd).
 */
-   if (rte_intr_efd_enable(dev->intr_handle, VC_5GNR_NUM_INTR_VEC)) {
+   if (rte_intr_efd_enable(dev->intr_handle, num_intr_vec)) {
rte_bbdev_log(ERR, "Failed to create fds for %u queues", dev->data->num_queues);
return -1;
}
@@ -735,7 +738,7 @@ fpga_5gnr_intr_enable(struct rte_bbdev *dev)
 * It ensures that callback function assigned to that descriptor will
 * invoked when any FPGA queue issues interrupt.
 */
-   for (i = 0; i < VC_5GNR_NUM_INTR_VEC; ++i)

[PATCH v6 5/6] baseband/fpga_5gnr_fec: add AGX100 support

2024-02-07 Thread Hernan Vargas
Add support for new FPGA variant AGX100 (on Arrow Creek N6000).

Signed-off-by: Hernan Vargas 
---
 doc/guides/bbdevs/fpga_5gnr_fec.rst   |   69 +-
 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h   |  273 
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h|   10 +-
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 1199 +++--
 drivers/baseband/fpga_5gnr_fec/vc_5gnr_pmd.h  |1 -
 5 files changed, 1438 insertions(+), 114 deletions(-)
 create mode 100644 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h

diff --git a/doc/guides/bbdevs/fpga_5gnr_fec.rst 
b/doc/guides/bbdevs/fpga_5gnr_fec.rst
index 99fc936829a8..1ae192a86b25 100644
--- a/doc/guides/bbdevs/fpga_5gnr_fec.rst
+++ b/doc/guides/bbdevs/fpga_5gnr_fec.rst
@@ -6,12 +6,13 @@ Intel(R) FPGA 5GNR FEC Poll Mode Driver
 
 The BBDEV FPGA 5GNR FEC poll mode driver (PMD) supports an FPGA implementation 
of a VRAN
 LDPC Encode / Decode 5GNR wireless acceleration function, using Intel's PCI-e 
and FPGA
-based Vista Creek device.
+based Vista Creek (N3000, referred to as VC_5GNR in the code) as well as Arrow 
Creek (N6000,
+referred to as AGX100 in the code).
 
 Features
 
 
-FPGA 5GNR FEC PMD supports the following features:
+FPGA 5GNR FEC PMD supports the following BBDEV capabilities:
 
 - LDPC Encode in the DL
 - LDPC Decode in the UL
@@ -67,10 +68,18 @@ Initialization
 
 When the device first powers up, its PCI Physical Functions (PF) can be listed 
through this command:
 
+Vista Creek (N3000)
+
 .. code-block:: console
 
   sudo lspci -vd8086:0d8f
 
+Arrow Creek (N6000)
+
+.. code-block:: console
+
+  sudo lspci -vd8086:5799
+
 The physical and virtual functions are compatible with Linux UIO drivers:
 ``vfio_pci`` and ``igb_uio``. However, in order to work the FPGA 5GNR FEC 
device firstly needs
 to be bound to one of these linux drivers through DPDK.
@@ -78,6 +87,7 @@ to be bound to one of these linux drivers through DPDK.
 For more details on how to bind the PF device and create VF devices, see
 :ref:`linux_gsg_binding_kernel`.
 
+
 Configure the VFs through PF
 
 
@@ -110,12 +120,12 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` 
structure:
 
 - ``vf_*l_queues_number``: defines the hardware queue mapping for every VF.
 
-- ``*l_bandwidth``: in case of congestion on PCIe interface. The device
-  allocates different bandwidth to UL and DL. The weight is configured by this
-  setting. The unit of weight is 3 code blocks. For example, if the code block
-  cbps (code block per second) ratio between UL and DL is 12:1, then the
-  configuration value should be set to 36:3. The schedule algorithm is based
-  on code block regardless the length of each block.
+- ``*l_bandwidth``: Only used for the Vista Creek schedule algorithm in case of
+  congestion on PCIe interface. The device allocates different bandwidth to UL
+  and DL. The weight is configured by this setting. The unit of weight is 3 
code
+  blocks. For example, if the code block cbps (code block per second) ratio 
between
+  UL and DL is 12:1, then the configuration value should be set to 36:3.
+  The schedule algorithm is based on code block regardless the length of each 
block.
 
 - ``*l_load_balance``: hardware queues are load-balanced in a round-robin
   fashion. Queues get filled first-in first-out until they reach a pre-defined
@@ -159,8 +169,38 @@ Test Application
 BBDEV provides a test application, ``test-bbdev.py`` and range of test data 
for testing
 the functionality of the device, depending on the device's capabilities.
 
-For more details on how to use the test application,
-see :ref:`test_bbdev_application`.
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params" : EAL arguments which are passed to the test app.
+  "-t", "--timeout": Timeout in seconds (default=300).
+  "-c", "--test-cases" : Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector": Test vector path 
(default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops": Number of operations to process on device 
(default=32).
+  "-b", "--burst-size" : Operations enqueue/dequeue burst size (default=32).
+  "-l", "--num-lcores" : Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py``, supports the ability to configure the 
PF device with
+a default set of values, if the "-i" or "- -init-device" option is included. 
The default values
+are defined in test_bbdev_perf.c as:
+
+- VF_UL_QUEUE_VALUE 4
+- VF_DL_QUEUE_VALUE 4
+- UL_BANDWIDTH 3
+- DL_BANDWIDTH 3
+- UL_LOAD_BALANCE 128
+- DL_LOAD_BALANCE 1

[PATCH v6 6/6] baseband/fpga_5gnr_fec: cosmetic comment changes

2024-02-07 Thread Hernan Vargas
Cosmetic changes for comments.
No functional impact.

Signed-off-by: Hernan Vargas 
Reviewed-by: Maxime Coquelin 
---
 .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h|  49 ++--
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 258 +-
 .../fpga_5gnr_fec/rte_pmd_fpga_5gnr_fec.h |  16 +-
 3 files changed, 160 insertions(+), 163 deletions(-)

diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h 
b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
index 224684902569..6e97a3e9e2d4 100644
--- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
+++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
@@ -11,7 +11,7 @@
 #include "agx100_pmd.h"
 #include "vc_5gnr_pmd.h"
 
-/* Helper macro for logging */
+/* Helper macro for logging. */
 #define rte_bbdev_log(level, fmt, ...) \
rte_log(RTE_LOG_ ## level, fpga_5gnr_fec_logtype, fmt "\n", \
##__VA_ARGS__)
@@ -24,7 +24,7 @@
 #define rte_bbdev_log_debug(fmt, ...)
 #endif
 
-/* FPGA 5GNR FEC driver names */
+/* FPGA 5GNR FEC driver names. */
 #define FPGA_5GNR_FEC_PF_DRIVER_NAME intel_fpga_5gnr_fec_pf
 #define FPGA_5GNR_FEC_VF_DRIVER_NAME intel_fpga_5gnr_fec_vf
 
@@ -43,15 +43,15 @@
 #define VC_5GNR_FPGA_VARIANT   0
 #define AGX100_FPGA_VARIANT1
 
-/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
-#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
-#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
-#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
-#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
-#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
-#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
-#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
-#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2. */
+#define N_ZC_1 66 /**< N = 66 Zc for BG 1. */
+#define N_ZC_2 50 /**< N = 50 Zc for BG 2. */
+#define K0_1_1 17 /**< K0 fraction numerator for rv 1 and BG 1. */
+#define K0_1_2 13 /**< K0 fraction numerator for rv 1 and BG 2. */
+#define K0_2_1 33 /**< K0 fraction numerator for rv 2 and BG 1. */
+#define K0_2_2 25 /**< K0 fraction numerator for rv 2 and BG 2. */
+#define K0_3_1 56 /**< K0 fraction numerator for rv 3 and BG 1. */
+#define K0_3_2 43 /**< K0 fraction numerator for rv 3 and BG 2. */
 
 /* FPGA 5GNR Ring Control Registers. */
 enum {
@@ -93,7 +93,7 @@ struct __rte_packed fpga_5gnr_ring_ctrl_reg {
uint64_t ring_head_addr;
uint16_t ring_size:11;
uint16_t rsrvd0;
-   union { /* Miscellaneous register */
+   union { /* Miscellaneous register. */
uint8_t misc;
uint8_t max_ul_dec:5,
max_ul_dec_en:1,
@@ -140,26 +140,23 @@ struct fpga_5gnr_fec_device {
 
 /** Structure associated with each queue. */
 struct __rte_cache_aligned fpga_5gnr_queue {
-   struct fpga_5gnr_ring_ctrl_reg ring_ctrl_reg;  /**< Ring Control 
Register */
+   struct fpga_5gnr_ring_ctrl_reg ring_ctrl_reg;  /**< Ring Control 
Register. */
union {
/** Virtual address of VC 5GNR software ring. */
union vc_5gnr_dma_desc *vc_5gnr_ring_addr;
/** Virtual address of AGX100 software ring. */
union agx100_dma_desc *agx100_ring_addr;
};
-   uint64_t *ring_head_addr;  /* Virtual address of completion_head */
-   uint64_t shadow_completion_head; /* Shadow completion head value */
-   uint16_t head_free_desc;  /* Ring head */
-   uint16_t tail;  /* Ring tail */
-   /* Mask used to wrap enqueued descriptors on the sw ring */
-   uint32_t sw_ring_wrap_mask;
-   uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
-   uint8_t q_idx;  /* Queue index */
-   /** uuid used for MUTEX acquision for DDR */
-   uint16_t ddr_mutex_uuid;
-   struct fpga_5gnr_fec_device *d;
-   /* MMIO register of shadow_tail used to enqueue descriptors */
-   void *shadow_tail_addr;
+   uint64_t *ring_head_addr;  /**< Virtual address of completion_head. */
+   uint64_t shadow_completion_head; /**< Shadow completion head value. */
+   uint16_t head_free_desc;  /**< Ring head. */
+   uint16_t tail;  /**< Ring tail. */
+   uint32_t sw_ring_wrap_mask; /**< Mask used to wrap enqueued descriptors 
on the sw ring. */
+   uint32_t irq_enable;  /**< Enable ops dequeue interrupts if set to 1. */
+   uint8_t q_idx;  /**< Queue index. */
+   uint16_t ddr_mutex_uuid; /**< uuid used for MUTEX acquision for DDR. */
+   struct fpga_5gnr_fec_device *d; /**< FPGA 5GNR device structure. */
+   void *shadow_tail_addr; /**< MMIO register of shadow_tail used to 
enqueue descriptors. */
 };
 
 /* Write to 16 bit MMIO register address. */
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c 
b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 6beb10e546c4..f59516610f3e 100644

[PATCH v4] ethdev: add template table resize API

2024-02-07 Thread Gregory Etelson
The template table creation API sets the table's flow capacity.
If an application needs more flows than the table was designed for,
the following steps must be completed:
1. Create a new template table with larger flows capacity.
2. Re-create existing flows in the new table and delete flows from
   the original table.
3. Destroy original table.

Application cannot always execute that procedure:
* Port may not have sufficient resources to allocate a new table
  while maintaining original table.
* Application may not have existing flows "recipes" to re-create
  flows in a new table.

The patch defines a new API that allows an application to resize
an existing template table:

* Resizable template table must be created with the
RTE_FLOW_TABLE_SPECIALIZE_RESIZABLE_TABLE bit set.

* Application resizes existing table with the
  `rte_flow_template_table_resize()` function call.
  The table resize procedure updates the table maximal flow number
  only. Other table attributes are not affected by the table resize.
  ** The table resize procedure must not interrupt
 existing table flows operations in hardware.
  ** The table resize procedure must not alter flow handlers held by
 application.

* After `rte_flow_template_table_resize()` returned, application must
  update all existing table flow rules by calling
  `rte_flow_async_update_resized()`.
  The table resize procedure does not change application flow handler.
  However, flow object can reference internal PMD resources that are
  obsolete after table resize.
  `rte_flow_async_update_resized()` moves internal flow references
  to the updated table resources.
  The flow update must not interrupt hardware flow operations.

* When all table flows have been updated, the application must call
  `rte_flow_template_table_resize_complete()`.
  The function releases PMD resources related to the original
  table.
  Application can start new table resize after
  `rte_flow_template_table_resize_complete()` returned.
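
The call sequence described above can be modelled in a few lines of C.
This is a self-contained sketch, not DPDK code: `model_table`,
`model_flow` and the `model_*` functions are hypothetical stand-ins for
`rte_flow_template_table_resize()`, `rte_flow_async_update_resized()` and
`rte_flow_template_table_resize_complete()`. The real functions take port,
queue and error arguments that are omitted here, and the "grow only" check
is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a resizable template table; not a DPDK type. */
struct model_table {
	uint32_t capacity;       /* maximal flow number */
	uint32_t generation;     /* bumped on every resize */
	bool resize_in_progress; /* set between resize and resize_complete */
};

/* Hypothetical model of a flow rule; its handle (address) never changes. */
struct model_flow {
	uint32_t generation;     /* table resources the flow references */
};

/* Step 1: resize the table; only the capacity attribute changes. */
static int model_table_resize(struct model_table *t, uint32_t nb_rules)
{
	assert(t != NULL);
	/* Model assumption: one resize at a time, and only growing. */
	if (t->resize_in_progress || nb_rules <= t->capacity)
		return -1;
	t->capacity = nb_rules;
	t->generation++;
	t->resize_in_progress = true;
	return 0;
}

/* Step 2: move one flow's internal references to the resized table. */
static void model_flow_update_resized(const struct model_table *t,
				      struct model_flow *f)
{
	assert(t != NULL && f != NULL);
	f->generation = t->generation;
}

/* Step 3: release old resources once every flow has been updated. */
static int model_table_resize_complete(struct model_table *t,
					const struct model_flow *flows,
					size_t n)
{
	assert(t != NULL);
	for (size_t i = 0; i < n; i++)
		if (flows[i].generation != t->generation)
			return -1; /* a flow still references old resources */
	t->resize_in_progress = false;
	return 0;
}
```

In this model, completing the resize fails until every flow has gone
through the update step, which mirrors the ordering the commit message
requires from the application.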

Testpmd commands:

* Create resizable template table
flow template_table  create table_id  resizable \
  [transfer|ingress|egress] group  \
  rules_number  \
  pattern_template   [ pattern_template  [ ... ]] \
  actions_template   [ actions_template  [ ... ]]

* Resize table:
flow template_table  resize table_resize_id  \
  table_resize_rules_num 

* Queue a flow update:
flow queue  update_resized  rule 

* Complete table resize:
flow template_table  resize_complete table 

Signed-off-by: Gregory Etelson 
Acked-by: Ori Kam 
---
v2: Update the patch comment.
Add table resize commands to testpmd user guide.
v3: Rename RTE_FLOW_TABLE_SPECIALIZE_RESIZABLE macro.
v4: Remove inline.
Add use case to rte_flow.rst.
---
 app/test-pmd/cmdline_flow.c |  86 ++-
 app/test-pmd/config.c   | 102 ++
 app/test-pmd/testpmd.h  |   6 ++
 doc/guides/howto/rte_flow.rst   | 111 
 doc/guides/rel_notes/release_24_03.rst  |   2 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  15 ++-
 lib/ethdev/ethdev_trace.h   |  33 ++
 lib/ethdev/ethdev_trace_points.c|   9 ++
 lib/ethdev/rte_flow.c   |  77 ++
 lib/ethdev/rte_flow.h   | 111 
 lib/ethdev/rte_flow_driver.h|  15 +++
 lib/ethdev/version.map  |   6 ++
 12 files changed, 567 insertions(+), 6 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index ce71818705..1a2556d53b 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -134,6 +134,7 @@ enum index {
/* Queue arguments. */
QUEUE_CREATE,
QUEUE_DESTROY,
+   QUEUE_FLOW_UPDATE_RESIZED,
QUEUE_UPDATE,
QUEUE_AGED,
QUEUE_INDIRECT_ACTION,
@@ -191,8 +192,12 @@ enum index {
/* Table arguments. */
TABLE_CREATE,
TABLE_DESTROY,
+   TABLE_RESIZE,
+   TABLE_RESIZE_COMPLETE,
TABLE_CREATE_ID,
TABLE_DESTROY_ID,
+   TABLE_RESIZE_ID,
+   TABLE_RESIZE_RULES_NUMBER,
TABLE_INSERTION_TYPE,
TABLE_INSERTION_TYPE_NAME,
TABLE_HASH_FUNC,
@@ -204,6 +209,7 @@ enum index {
TABLE_TRANSFER,
TABLE_TRANSFER_WIRE_ORIG,
TABLE_TRANSFER_VPORT_ORIG,
+   TABLE_RESIZABLE,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1323,6 +1329,8 @@ static const enum index next_group_attr[] = {
 static const enum index next_table_subcmd[] = {
TABLE_CREATE,
TABLE_DESTROY,
+   TABLE_RESIZE,
+   TABLE_RESIZE_COMPLETE,
ZERO,
 };
 
@@ -1337,6 +1345,7 @@ static const enum index next_table_attr[] = {
TABLE_TRANSFER,
TABLE_TRANSFER_WIRE_ORIG,
TABLE_TRANSFER_VPORT_ORIG,
+   TABLE_RESIZABLE,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -13

Re: [PATCH] net/hns3: support power monitor

2024-02-07 Thread Ferruh Yigit
On 2/5/2024 8:35 AM, Jie Hai wrote:
> From: Chengwen Feng 
> 
> This commit supports power monitor on the Rx queue descriptor of the
> next poll.
> 
> Note: Although rte_power_monitor() on the ARM platform does not support
> callback, this commit still implements the callback so that it does not
> need to be adjusted after the ARM platform supports callback.
> 
> Signed-off-by: Chengwen Feng 
> Signed-off-by: Jie Hai 
>

Applied to dpdk-next-net/main, thanks.


Re: [PATCH v3] app/testpmd: command to get descriptor used count

2024-02-07 Thread Ferruh Yigit
On 2/7/2024 5:04 PM, skotesh...@marvell.com wrote:
> From: Satha Rao 
> 
> Existing Rx desc used count command extended to get Tx queue
> used count.
> testpmd> show port 0 rxq 0 desc used count
> testpmd> show port 0 txq 0 desc used count
> 
> Signed-off-by: Satha Rao 
>

Reviewed-by: Ferruh Yigit 

Applied to dpdk-next-net/main, thanks.



Re: [PATCH 1/2] net/mana: use a MR variable on the stack instead of allocating it

2024-02-07 Thread Ferruh Yigit
On 1/30/2024 1:24 AM, lon...@linuxonhyperv.com wrote:
> From: Long Li 
> 
> The content of the MR is copied to the cache trees, it's not necessary to
> allocate a MR to do this. Use a variable on the stack instead.
> 
> This also fixes the memory leak in the code where a MR is allocated but
> never freed.
> 

The patch title describes what is done, but gives no information about the
reasoning and impact.

Is this a performance improvement (if so how much), or is this a fix for
the memory leak (if so we need fixes tag for backport), or just a
refactoring?

> Signed-off-by: Long Li 
> ---
>  drivers/net/mana/mr.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/mana/mr.c b/drivers/net/mana/mr.c
> index d6a5ad1460..c9d0f7ef5a 100644
> --- a/drivers/net/mana/mr.c
> +++ b/drivers/net/mana/mr.c
> @@ -40,7 +40,7 @@ mana_new_pmd_mr(struct mana_mr_btree *local_tree, struct 
> mana_priv *priv,
>   struct ibv_mr *ibv_mr;
>   struct mana_range ranges[pool->nb_mem_chunks];
>   uint32_t i;
> - struct mana_mr_cache *mr;
> + struct mana_mr_cache mr;
>   int ret;
>  
>   rte_mempool_mem_iter(pool, mana_mempool_chunk_cb, ranges);
> @@ -75,14 +75,13 @@ mana_new_pmd_mr(struct mana_mr_btree *local_tree, struct 
> mana_priv *priv,
>   DP_LOG(DEBUG, "MR lkey %u addr %p len %zu",
>  ibv_mr->lkey, ibv_mr->addr, ibv_mr->length);
>  
> - mr = rte_calloc("MANA MR", 1, sizeof(*mr), 0);
> - mr->lkey = ibv_mr->lkey;
> - mr->addr = (uintptr_t)ibv_mr->addr;
> - mr->len = ibv_mr->length;
> - mr->verb_obj = ibv_mr;
> + mr.lkey = ibv_mr->lkey;
> + mr.addr = (uintptr_t)ibv_mr->addr;
> + mr.len = ibv_mr->length;
> + mr.verb_obj = ibv_mr;
>  
>   rte_spinlock_lock(&priv->mr_btree_lock);
> - ret = mana_mr_btree_insert(&priv->mr_btree, mr);
> + ret = mana_mr_btree_insert(&priv->mr_btree, &mr);
>   rte_spinlock_unlock(&priv->mr_btree_lock);
>   if (ret) {
>   ibv_dereg_mr(ibv_mr);
> @@ -90,7 +89,7 @@ mana_new_pmd_mr(struct mana_mr_btree *local_tree, struct 
> mana_priv *priv,
>   return ret;
>   }
>  
> - ret = mana_mr_btree_insert(local_tree, mr);
> + ret = mana_mr_btree_insert(local_tree, &mr);
>   if (ret) {
>   /* Don't need to clean up MR as it's already
>* in the global tree



Re: [PATCH 2/2] net/mana: properly deal with MR cache expansion failure

2024-02-07 Thread Ferruh Yigit
On 1/30/2024 1:24 AM, lon...@linuxonhyperv.com wrote:
> From: Long Li 
> 
> On MR cache expansion failure, the request should fail as there is no path
> to get a new MR into the tree. Attempting to insert a new MR to the cache
> tree will result in memory violation.
>

If this patch fixes a memory violation, can you please update the commit
log to mark it as a fix and add a Fixes tag?



RE: [PATCH 1/2] net/mana: use a MR variable on the stack instead of allocating it

2024-02-07 Thread Long Li
> > From: Long Li 
> >
> > The content of the MR is copied to the cache trees, it's not necessary
> > to allocate a MR to do this. Use a variable on the stack instead.
> >
> > This also fixes the memory leak in the code where a MR is allocated
> > but never freed.
> >
> 
> patch title describes what is done, but not gives information about reasoning 
> and
> impact.
> 
> Is this a performance improvement (if so how much), or is this a fix for the
> memory leak (if so we need fixes tag for backport), or just a refactoring?

It's for fixing the memory leak. I'll send a v2 with better wording.

> 
> > Signed-off-by: Long Li 
> > ---
> >  drivers/net/mana/mr.c | 15 +++
> >  1 file changed, 7 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/net/mana/mr.c b/drivers/net/mana/mr.c index
> > d6a5ad1460..c9d0f7ef5a 100644
> > --- a/drivers/net/mana/mr.c
> > +++ b/drivers/net/mana/mr.c
> > @@ -40,7 +40,7 @@ mana_new_pmd_mr(struct mana_mr_btree *local_tree,
> struct mana_priv *priv,
> > struct ibv_mr *ibv_mr;
> > struct mana_range ranges[pool->nb_mem_chunks];
> > uint32_t i;
> > -   struct mana_mr_cache *mr;
> > +   struct mana_mr_cache mr;
> > int ret;
> >
> > rte_mempool_mem_iter(pool, mana_mempool_chunk_cb, ranges); @@
> -75,14
> > +75,13 @@ mana_new_pmd_mr(struct mana_mr_btree *local_tree, struct
> mana_priv *priv,
> > DP_LOG(DEBUG, "MR lkey %u addr %p len %zu",
> >ibv_mr->lkey, ibv_mr->addr, ibv_mr->length);
> >
> > -   mr = rte_calloc("MANA MR", 1, sizeof(*mr), 0);
> > -   mr->lkey = ibv_mr->lkey;
> > -   mr->addr = (uintptr_t)ibv_mr->addr;
> > -   mr->len = ibv_mr->length;
> > -   mr->verb_obj = ibv_mr;
> > +   mr.lkey = ibv_mr->lkey;
> > +   mr.addr = (uintptr_t)ibv_mr->addr;
> > +   mr.len = ibv_mr->length;
> > +   mr.verb_obj = ibv_mr;
> >
> > rte_spinlock_lock(&priv->mr_btree_lock);
> > -   ret = mana_mr_btree_insert(&priv->mr_btree, mr);
> > +   ret = mana_mr_btree_insert(&priv->mr_btree, &mr);
> > rte_spinlock_unlock(&priv->mr_btree_lock);
> > if (ret) {
> > ibv_dereg_mr(ibv_mr);
> > @@ -90,7 +89,7 @@ mana_new_pmd_mr(struct mana_mr_btree *local_tree,
> struct mana_priv *priv,
> > return ret;
> > }
> >
> > -   ret = mana_mr_btree_insert(local_tree, mr);
> > +   ret = mana_mr_btree_insert(local_tree, &mr);
> > if (ret) {
> > /* Don't need to clean up MR as it's already
> >  * in the global tree
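
The reasoning in the quoted commit message, that the cache tree stores a
copy of the entry so the caller's object does not need to outlive the call,
can be illustrated with a small self-contained sketch. The `mr_entry` and
`mr_table` types and `mr_table_insert()` below are hypothetical
simplifications and not the mana driver's B-tree code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MR_TABLE_SIZE 8

/* Hypothetical, simplified stand-in for struct mana_mr_cache. */
struct mr_entry {
	uint32_t lkey;
	uintptr_t addr;
	size_t len;
};

/* Hypothetical flat table standing in for the driver's cache tree. */
struct mr_table {
	struct mr_entry entries[MR_TABLE_SIZE];
	size_t count;
};

/* Insert by value: the table keeps its own copy of *entry. */
static int mr_table_insert(struct mr_table *t, const struct mr_entry *entry)
{
	assert(t != NULL && entry != NULL);
	if (t->count == MR_TABLE_SIZE)
		return -1; /* table full */
	t->entries[t->count++] = *entry; /* struct copy */
	return 0;
}

/* The caller can use a stack variable; nothing is allocated or leaked. */
static int register_region(struct mr_table *t, uint32_t lkey,
			   uintptr_t addr, size_t len)
{
	struct mr_entry mr = { .lkey = lkey, .addr = addr, .len = len };

	return mr_table_insert(t, &mr);
}
```

Because the insert copies the structure, a heap allocation made only to
fill in the fields would have to be freed right after the insert; using a
local variable removes that obligation altogether.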



RE: [PATCH 2/2] net/mana: properly deal with MR cache expansion failure

2024-02-07 Thread Long Li
> > On MR cache expansion failure, the request should fail as there is no
> > path to get a new MR into the tree. Attempting to insert a new MR to
> > the cache tree will result in memory violation.
> >
> 
> if this patch is fixing memory violation, can you please update commit log as 
> fix
> commit and add fixes tag?

Will send v2.


Re: [PATCH] net/tap: Modified TAP BPF program as per the Kernel-version upgrade requirements.

2024-02-07 Thread Stephen Hemminger
On Fri, 12 Jan 2024 19:18:21 +0530
madhuker.myt...@oracle.com wrote:

> +struct  {
> + __uint(type,   BPF_MAP_TYPE_HASH);
> + __type(key,  __u32);
> + __type(value, struct rss_key);
> + __uint(max_entries,  256);
> +} map_keys SEC(".maps");
>  

Overall this patch is a big step forward in getting TAP BPF going again.
But using the new BTF maps won't work with how the tap device is
loading the BPF program. Getting BTF to work requires more steps and
is best done by using libbpf. With this part of your version, current
kernels will report a type mismatch in the verifier because the type
information for map_keys is not loaded.

See my follow-on RFC for what libbpf integration looks like.
It ends up being a deeper rewrite.


Re: [PATCH v2] common/sfc: replace out of bounds condition with static_assert

2024-02-07 Thread Ferruh Yigit
On 1/19/2024 10:13 PM, Stephen Hemminger wrote:
> The sfc base code had its own definition of static assertions
> using the out of bound array access hack. Replace it with a
> static_assert like rte_common.h.
> 
> Fixes: f67e4719147d ("net/sfc/base: fix coding style")
> Signed-off-by: Stephen Hemminger 
> Acked-by: Morten Brørup 
> ---
> v2 - add assert.h to make sure it works in other environments
> 
>  drivers/common/sfc_efx/base/efx.h | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/common/sfc_efx/base/efx.h 
> b/drivers/common/sfc_efx/base/efx.h
> index 3312c2fa8f81..38f2aed3e336 100644
> --- a/drivers/common/sfc_efx/base/efx.h
> +++ b/drivers/common/sfc_efx/base/efx.h
> @@ -7,6 +7,8 @@
>  #ifndef  _SYS_EFX_H
>  #define  _SYS_EFX_H
>  
> +#include 
> +
>  #include "efx_annote.h"
>  #include "efsys.h"
>  #include "efx_types.h"
> @@ -17,8 +19,8 @@
>  extern "C" {
>  #endif
>  
> -#define  EFX_STATIC_ASSERT(_cond)\
> - ((void)sizeof (char[(_cond) ? 1 : -1]))
> +#define  EFX_STATIC_ASSERT(_cond) \
> + do { static_assert((_cond), "assert failed" #_cond); } while (0)
>  
>  #define  EFX_ARRAY_SIZE(_array)  \
>   (sizeof (_array) / sizeof ((_array)[0]))

I am getting the following build errors with clang:

FAILED: drivers/common/sfc_efx/base/libsfc_base.a.p/ef10_filter.c.o

./drivers/common/sfc_efx/base/ef10_filter.c
../drivers/common/sfc_efx/base/ef10_filter.c:503:2: error: static_assert
expression is not an integral constant expression
EFX_STATIC_ASSERT((EFX_FIELD_OFFSET(efx_filter_spec_t,
efs_outer_vid) %

^~~
../drivers/common/sfc_efx/base/efx.h:23:21: note: expanded from macro
'EFX_STATIC_ASSERT'
do { static_assert((_cond), "assert failed" #_cond); } while (0)
   ^~~
../drivers/common/sfc_efx/base/ef10_filter.c:503:21: note: cast that
performs the conversions of a reinterpret_cast is not allowed in a
constant expression
EFX_STATIC_ASSERT((EFX_FIELD_OFFSET(efx_filter_spec_t,
efs_outer_vid) %
   ^
../drivers/common/sfc_efx/base/efx.h:29:3: note: expanded from macro
'EFX_FIELD_OFFSET'
((size_t)&(((_type *)0)->_field))
 ^
../drivers/common/sfc_efx/base/ef10_filter.c:1246:18: error: shift count
>= width of type [-Werror,-Wshift-count-overflow]
matches_count = MCDI_OUT_DWORD(req,
^~~
../drivers/common/sfc_efx/base/efx_mcdi.h:493:2: note: expanded from
macro 'MCDI_OUT_DWORD'
EFX_DWORD_FIELD(*MCDI_OUT2(_emr, efx_dword_t, _ofst),   \
^
../drivers/common/sfc_efx/base/efx_types.h:533:30: note: expanded from
macro 'EFX_DWORD_FIELD'
EFX_HIGH_BIT(_field)) & EFX_MASK32(_field))
^~
../drivers/common/sfc_efx/base/efx_types.h:145:23: note: expanded from
macro 'EFX_MASK32'
(uint32_t)1) << EFX_WIDTH(_field))) - 1))
 ^  ~
2 errors generated.
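
As the clang notes indicate, the first error comes from `EFX_FIELD_OFFSET`
computing the offset with a null-pointer cast, which is not accepted as an
integer constant expression inside `static_assert`. A minimal sketch of the
difference, using the standard `offsetof` from `<stddef.h>`, is shown below.
The `demo_spec` structure and the `DEMO_*` macros are made up for
illustration, and this is not necessarily how the patch should be fixed.

```c
#include <assert.h>  /* static_assert (C11) */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>

/* Hypothetical structure standing in for efx_filter_spec_t. */
struct demo_spec {
	uint32_t flags;
	uint16_t outer_vid;
	uint16_t inner_vid;
};

/* offsetof() yields an integer constant expression. */
#define DEMO_FIELD_OFFSET(_type, _field) offsetof(_type, _field)

#define DEMO_STATIC_ASSERT(_cond) \
	static_assert((_cond), "assert failed: " #_cond)

/* Checked at compile time; a violation stops the build. */
DEMO_STATIC_ASSERT(DEMO_FIELD_OFFSET(struct demo_spec, outer_vid) % 2 == 0);

/* Run-time helper so the same offset can also be inspected. */
static size_t demo_outer_vid_offset(void)
{
	return DEMO_FIELD_OFFSET(struct demo_spec, outer_vid);
}
```

The second error in the log (`-Wshift-count-overflow`) is unrelated to the
offset idiom and is not addressed by this sketch.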



RE: [PATCH v3 2/3] config/arm: add support for fallback march

2024-02-07 Thread Wathsala Wathawana Vithanage



> 
> From: Pavan Nikhilesh 
> 
> Some ARM CPUs have specific march requirements and are not compatible
> with the supported march list.
> Add fallback march in case the mcpu and the march advertised in the
> part_number_config are not supported by the compiler.
> 
> Example
> mcpu = neoverse-n2
> march = armv9-a
> fallback_march = armv8.5-a
> 
> mcpu, march not supported
> machine_args = ['-march=armv8.5-a']
> 
> mcpu, march, fallback_march not supported
> least march supported = armv8-a
> 
> machine_args = ['-march=armv8-a']
> 

Similar to "[v3,1/3] config/arm: avoid mcpu and march conflicts" here
also we can avoid selecting march if it's not supported by the compiler.
Ideally, we should exit the build with an error saying march/mcpu is not
supported and suggest further actions (like trying again with 
-Dmarch -Dmcpu or -Dplatform=generic-armv9 as discussed in [v3,1/3])

> Signed-off-by: Pavan Nikhilesh 
> ---
>  config/arm/meson.build | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/config/arm/meson.build b/config/arm/meson.build index
> ba859bd060b5..4e44d1850bae 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -94,6 +94,7 @@ part_number_config_arm = {
>  '0xd49': {
>  'march': 'armv9-a',
>  'march_features': ['sve2'],
> +'fallback_march': 'armv8.5-a',
>  'mcpu': 'neoverse-n2',
>  'flags': [
>  ['RTE_MACHINE', '"neoverse-n2"'], @@ -708,6 +709,7 @@ if
> update_flags
> 
>  # probe supported archs and their features
>  candidate_march = ''
> +fallback_march = ''
>  if part_number_config.has_key('march')
>  if part_number_config.get('force_march', false) or candidate_mcpu != 
> ''
>  if cc.has_argument('-march=' +  part_number_config['march']) @@ -
> 728,10 +730,18 @@ if update_flags
>  # highest supported march version found
>  break
>  endif
> +if (part_number_config.has_key('fallback_march') and
> +supported_march == part_number_config['fallback_march'] 
> and
> +cc.has_argument('-march=' + supported_march))
> +fallback_march = supported_march
> +endif
>  endforeach
>  endif
> 
>  if candidate_march != part_number_config['march']
> +if fallback_march != ''
> +candidate_march = fallback_march
> +endif
>  warning('Configuration march version is @0@, not supported.'
>  .format(part_number_config['march']))
>  if candidate_march != ''
> --
> 2.43.0



Re: [dpdk-dev] [v2] ethdev: support Tx queue used count

2024-02-07 Thread Ferruh Yigit
On 1/23/2024 11:46 AM, Ferruh Yigit wrote:
> On 1/22/2024 1:00 PM, Konstantin Ananyev wrote:
>>
>>> From: Jerin Jacob 
>>>
>>> Introduce a new API to retrieve the number of used descriptors
>>> in a Tx queue. Applications can leverage this API in the fast path to
>>> inspect the Tx queue occupancy and take appropriate actions based on the
>>> available free descriptors.
>>>
>>> A notable use case could be implementing Random Early Discard (RED)
>>> in software based on Tx queue occupancy.
>>>
>>> Signed-off-by: Jerin Jacob 
>>> Reviewed-by: Andrew Rybchenko 
>>> Acked-by: Morten Brørup 
>>>
>>
>> Acked-by: Konstantin Ananyev 
>>
> 
> Reviewed-by: Ferruh Yigit 
> 
> Applied to dpdk-next-net/main, thanks.
> 

There is a build error related to the tracing object.

As 'rte_eth_tx_queue_count()' is static inline, the application needs to be
able to access the '__rte_eth_trace_tx_queue_count' tracing object, which is
a problem in a shared library build.

'.../ethdev/version.map' needs to be updated to add
'__rte_eth_trace_tx_queue_count'. I am making the change in next-net and
will force push. FYI.


Since there was no user of the 'rte_eth_tx_queue_count()' API, the issue
could not be detected with this patch alone. With testpmd support the
problem became visible.
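
The RED use case mentioned in the quoted commit message can be sketched as
follows. The helper is an illustrative assumption and not part of DPDK:
`red_should_drop()` and its threshold parameters are hypothetical, and
`used` is meant to be the value an application would obtain from
`rte_eth_tx_queue_count()`. The sketch uses the instantaneous occupancy and
a maximum drop probability of 1, whereas classic RED works on an averaged
queue length with a configurable maximum probability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical RED-style decision based on Tx queue occupancy.
 * used:   descriptors currently in use
 * min_th: below this occupancy, never drop
 * max_th: at or above this occupancy, always drop
 * rnd:    caller-supplied uniform random value in [0, UINT16_MAX]
 * Between the thresholds the drop probability rises linearly.
 */
static bool red_should_drop(uint32_t used, uint32_t min_th,
			    uint32_t max_th, uint16_t rnd)
{
	assert(min_th < max_th);
	if (used < min_th)
		return false;
	if (used >= max_th)
		return true;
	/* Scale the occupancy into the same range as rnd. */
	uint64_t p = (uint64_t)(used - min_th) * UINT16_MAX / (max_th - min_th);
	return rnd < p;
}
```

Passing the random value in as a parameter keeps the helper deterministic
and easy to test; a fast-path caller would supply it from whatever cheap
generator it already uses.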


[PATCH v2 0/7] net/tap: RSS using BPF overhaul

2024-02-07 Thread Stephen Hemminger
The support for doing RSS for rte_flow_action was a cool idea
but it has been broken for several releases of DPDK as the
kernel and BPF infrastructure changed.

This series cleans up the BPF program, implements several
features that were never completed in the original code
and changes to use the current BPF toolchain.

The result should be easier to read and maintain.
The build process checks for the required components
and, if they are missing, stubs the feature out as not supported.

This patch series is mostly the same as the original RFC,
most of the changes are to split it up and always build
the BPF from source.

Stephen Hemminger (7):
  net/tap: remove unused RSS hash types
  net/tap: validate and setup parameters for BPF RSS
  net/tap: stop "vendoring" linux bpf headers
  net/tap: rewrite the RSS BPF program
  net/tap: use libbpf to load new BPF program
  net/tap: remove no longer used files
  MAINTAINERS: add maintainer for TAP device

 .gitignore|3 -
 MAINTAINERS   |1 +
 drivers/net/tap/bpf/Makefile  |   19 -
 drivers/net/tap/bpf/README|   12 +
 drivers/net/tap/bpf/bpf_api.h |  276 
 drivers/net/tap/bpf/bpf_elf.h |   53 -
 drivers/net/tap/bpf/bpf_extract.py|   86 --
 drivers/net/tap/bpf/meson.build   |   81 ++
 drivers/net/tap/bpf/tap_bpf_program.c |  255 
 drivers/net/tap/bpf/tap_rss.c |  272 
 drivers/net/tap/meson.build   |   26 +-
 drivers/net/tap/rte_eth_tap.c |2 +
 drivers/net/tap/rte_eth_tap.h |9 +-
 drivers/net/tap/tap_bpf.h |  121 --
 drivers/net/tap/tap_bpf_api.c |  190 ---
 drivers/net/tap/tap_bpf_insns.h   | 1743 -
 drivers/net/tap/tap_flow.c|  531 +++-
 drivers/net/tap/tap_flow.h|   11 +-
 drivers/net/tap/tap_rss.h |   14 +-
 drivers/net/tap/tap_rss.stub.h|   45 +
 drivers/net/tap/tap_tcmsgs.h  |4 +-
 21 files changed, 584 insertions(+), 3170 deletions(-)
 delete mode 100644 drivers/net/tap/bpf/Makefile
 create mode 100644 drivers/net/tap/bpf/README
 delete mode 100644 drivers/net/tap/bpf/bpf_api.h
 delete mode 100644 drivers/net/tap/bpf/bpf_elf.h
 delete mode 100644 drivers/net/tap/bpf/bpf_extract.py
 create mode 100644 drivers/net/tap/bpf/meson.build
 delete mode 100644 drivers/net/tap/bpf/tap_bpf_program.c
 create mode 100644 drivers/net/tap/bpf/tap_rss.c
 delete mode 100644 drivers/net/tap/tap_bpf.h
 delete mode 100644 drivers/net/tap/tap_bpf_api.c
 delete mode 100644 drivers/net/tap/tap_bpf_insns.h
 create mode 100644 drivers/net/tap/tap_rss.stub.h

-- 
2.43.0



[PATCH v2 1/7] net/tap: remove unused RSS hash types

2024-02-07 Thread Stephen Hemminger
The driver doesn't support these other hash types, and there
is no reason to implement these in future.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/tap_rss.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/tap/tap_rss.h b/drivers/net/tap/tap_rss.h
index dff46a012f94..8766ffc244f6 100644
--- a/drivers/net/tap/tap_rss.h
+++ b/drivers/net/tap/tap_rss.h
@@ -21,12 +21,6 @@ enum hash_field {
HASH_FIELD_IPV4_L3_L4,  /* IPv4 src/dst addr + L4 src/dst ports */
HASH_FIELD_IPV6_L3, /* IPv6 src/dst addr */
HASH_FIELD_IPV6_L3_L4,  /* IPv6 src/dst addr + L4 src/dst ports */
-   HASH_FIELD_L2_SRC,  /* Ethernet src addr */
-   HASH_FIELD_L2_DST,  /* Ethernet dst addr */
-   HASH_FIELD_L3_SRC,  /* L3 src addr */
-   HASH_FIELD_L3_DST,  /* L3 dst addr */
-   HASH_FIELD_L4_SRC,  /* TCP/UDP src ports */
-   HASH_FIELD_L4_DST,  /* TCP/UDP dst ports */
 };
 
 struct rss_key {
-- 
2.43.0



[PATCH v2 2/7] net/tap: validate and setup parameters for BPF RSS

2024-02-07 Thread Stephen Hemminger
The flow RSS support via BPF was not using the key or
hash type parameters, which is just as well because they were never
set up properly.

Fix the setup and validate the flow parameters; the BPF
side gets fixed later.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/tap_bpf_insns.h | 16 
 drivers/net/tap/tap_flow.c  | 65 ++---
 drivers/net/tap/tap_rss.h   |  5 +--
 3 files changed, 70 insertions(+), 16 deletions(-)

diff --git a/drivers/net/tap/tap_bpf_insns.h b/drivers/net/tap/tap_bpf_insns.h
index 53fa76c4e6b0..ee26cf885ed7 100644
--- a/drivers/net/tap/tap_bpf_insns.h
+++ b/drivers/net/tap/tap_bpf_insns.h
@@ -1709,13 +1709,13 @@ static struct bpf_insn l3_l4_hash_insns[] = {
{0x57,1,0,0, 0x0001},
{0x15,1,0,1, 0x},
{0xa7,3,0,0, 0xfe0fee15},
-   {0x71,1,0,  201, 0x},
+   {0x71,1,0,   45, 0x},
{0x67,1,0,0, 0x0008},
-   {0x71,2,0,  200, 0x},
+   {0x71,2,0,   44, 0x},
{0x4f,1,2,0, 0x},
-   {0x71,2,0,  202, 0x},
+   {0x71,2,0,   46, 0x},
{0x67,2,0,0, 0x0010},
-   {0x71,4,0,  203, 0x},
+   {0x71,4,0,   47, 0x},
{0x67,4,0,0, 0x0018},
{0x4f,4,2,0, 0x},
{0x4f,4,1,0, 0x},
@@ -1725,13 +1725,13 @@ static struct bpf_insn l3_l4_hash_insns[] = {
{0x57,3,0,0, 0x000f},
{0x67,3,0,0, 0x0002},
{0x0f,0,3,0, 0x},
-   {0x71,1,0,  137, 0x},
+   {0x71,1,0,   49, 0x},
{0x67,1,0,0, 0x0008},
-   {0x71,2,0,  136, 0x},
+   {0x71,2,0,   48, 0x},
{0x4f,1,2,0, 0x},
-   {0x71,2,0,  138, 0x},
+   {0x71,2,0,   50, 0x},
{0x67,2,0,0, 0x0010},
-   {0x71,3,0,  139, 0x},
+   {0x71,3,0,   51, 0x},
{0x67,3,0,0, 0x0018},
{0x4f,3,2,0, 0x},
{0x4f,3,1,0, 0x},
diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
index ed4d42f92f9f..cd49aa51c8b0 100644
--- a/drivers/net/tap/tap_flow.c
+++ b/drivers/net/tap/tap_flow.c
@@ -11,8 +11,10 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
+
 #include 
 #include 
 #include 
@@ -2053,6 +2055,21 @@ static int bpf_rss_key(enum bpf_rss_key_e cmd, __u32 
*key_idx)
return err;
 }
 
+
+/* Default RSS hash key also used by mlx devices */
+static const uint8_t rss_hash_default_key[] = {
+   0x2c, 0xc6, 0x81, 0xd1,
+   0x5b, 0xdb, 0xf4, 0xf7,
+   0xfc, 0xa2, 0x83, 0x19,
+   0xdb, 0x1a, 0x3e, 0x94,
+   0x6b, 0x9e, 0x38, 0xd9,
+   0x2c, 0x9c, 0x03, 0xd1,
+   0xad, 0x99, 0x44, 0xa7,
+   0xd9, 0x56, 0x3d, 0x59,
+   0x06, 0x3c, 0x25, 0xf3,
+   0xfc, 0x1f, 0xdc, 0x2a,
+};
+
 /**
  * Add RSS hash calculations and queue selection
  *
@@ -2071,11 +2088,11 @@ static int rss_add_actions(struct rte_flow *flow, 
struct pmd_internals *pmd,
   const struct rte_flow_action_rss *rss,
   struct rte_flow_error *error)
 {
-   /* 4096 is the maximum number of instructions for a BPF program */
+   struct rss_key rss_entry = { };
+   const uint8_t *key_in;
+   uint32_t hash_type = 0;
unsigned int i;
int err;
-   struct rss_key rss_entry = { .hash_fields = 0,
-.key_size = 0 };
 
/* Check supported RSS features */
if (rss->func != RTE_ETH_HASH_FUNCTION_DEFAULT)
@@ -2087,6 +2104,41 @@ static int rss_add_actions(struct rte_flow *flow, struct 
pmd_internals *pmd,
(error, ENOTSUP, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
 "a nonzero RSS encapsulation level is not supported");
 
+   if (rss->queue_num == 0 || rss->queue_num >= TAP_MAX_QUEUES)
+   return rte_flow_error_set(error, EINVAL, 
RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+ "invalid number of queues");
+
+   /* allow RSS key_len 0 in case of NULL (default) RSS key. */
+   if (rss->key_len == 0) {
+   if (rss->key != NULL)
+   return rte_flow_error_set(error, ENOTSUP,
+ 
RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+ &rss->key_len, "RSS hash key 
length 0");
+   key_in = rss_hash_default_key;
+   } else {
+   

[PATCH v2 3/7] net/tap: stop "vendoring" linux bpf headers

2024-02-07 Thread Stephen Hemminger
The proper place for finding bpf structures and functions is
in linux/bpf.h. The original version was trying to work around the
case where the build environment was running on an old pre-BPF
version of Glibc, but the target environment had BPF. This is not
a supportable build method, and not how the rest of DPDK works.

Having our own private (and divergent) version of the headers leads to
future problems when BPF definitions evolve.

Since DPDK officially supports only LTS or later kernels,
there is no need for the #ifdef workarounds in the TAP flow
code. Cloning headers leads to problems and is no longer needed.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/bpf/bpf_extract.py |   1 -
 drivers/net/tap/tap_bpf.h  | 121 -
 drivers/net/tap/tap_bpf_api.c  |  20 +++--
 drivers/net/tap/tap_bpf_insns.h|   1 -
 drivers/net/tap/tap_flow.c |  89 -
 5 files changed, 13 insertions(+), 219 deletions(-)
 delete mode 100644 drivers/net/tap/tap_bpf.h

diff --git a/drivers/net/tap/bpf/bpf_extract.py 
b/drivers/net/tap/bpf/bpf_extract.py
index b630c42b809f..73c4dafe4eca 100644
--- a/drivers/net/tap/bpf/bpf_extract.py
+++ b/drivers/net/tap/bpf/bpf_extract.py
@@ -65,7 +65,6 @@ def write_header(out, source):
 print(f' * Auto-generated from {source}', file=out)
 print(" * This not the original source file. Do NOT edit it.", file=out)
 print(" */\n", file=out)
-print("#include ", file=out)
 
 
 def main():
diff --git a/drivers/net/tap/tap_bpf.h b/drivers/net/tap/tap_bpf.h
deleted file mode 100644
index 0d38bc111fe0..
--- a/drivers/net/tap/tap_bpf.h
+++ /dev/null
@@ -1,121 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0
- * Copyright 2017 Mellanox Technologies, Ltd
- */
-
-#ifndef __TAP_BPF_H__
-#define __TAP_BPF_H__
-
-#include 
-
-/* Do not #include  since eBPF must compile on different
- * distros which may include partial definitions for eBPF (while the
- * kernel itself may support eBPF). Instead define here all that is needed
- */
-
-/* BPF_MAP_UPDATE_ELEM command flags */
-#defineBPF_ANY 0 /* create a new element or update an existing */
-
-/* BPF architecture instruction struct */
-struct bpf_insn {
-   __u8code;
-   __u8dst_reg:4;
-   __u8src_reg:4;
-   __s16   off;
-   __s32   imm; /* immediate value */
-};
-
-/* BPF program types */
-enum bpf_prog_type {
-   BPF_PROG_TYPE_UNSPEC,
-   BPF_PROG_TYPE_SOCKET_FILTER,
-   BPF_PROG_TYPE_KPROBE,
-   BPF_PROG_TYPE_SCHED_CLS,
-   BPF_PROG_TYPE_SCHED_ACT,
-};
-
-/* BPF commands types */
-enum bpf_cmd {
-   BPF_MAP_CREATE,
-   BPF_MAP_LOOKUP_ELEM,
-   BPF_MAP_UPDATE_ELEM,
-   BPF_MAP_DELETE_ELEM,
-   BPF_MAP_GET_NEXT_KEY,
-   BPF_PROG_LOAD,
-};
-
-/* BPF maps types */
-enum bpf_map_type {
-   BPF_MAP_TYPE_UNSPEC,
-   BPF_MAP_TYPE_HASH,
-};
-
-/* union of anonymous structs used with TAP BPF commands */
-union bpf_attr {
-   /* BPF_MAP_CREATE command */
-   struct {
-   __u32   map_type;
-   __u32   key_size;
-   __u32   value_size;
-   __u32   max_entries;
-   __u32   map_flags;
-   __u32   inner_map_fd;
-   };
-
-   /* BPF_MAP_UPDATE_ELEM, BPF_MAP_DELETE_ELEM commands */
-   struct {
-   __u32   map_fd;
-   __aligned_u64   key;
-   union {
-   __aligned_u64 value;
-   __aligned_u64 next_key;
-   };
-   __u64   flags;
-   };
-
-   /* BPF_PROG_LOAD command */
-   struct {
-   __u32   prog_type;
-   __u32   insn_cnt;
-   __aligned_u64   insns;
-   __aligned_u64   license;
-   __u32   log_level;
-   __u32   log_size;
-   __aligned_u64   log_buf;
-   __u32   kern_version;
-   __u32   prog_flags;
-   };
-} __rte_aligned(8);
-
-#ifndef __NR_bpf
-# if defined(__i386__)
-#  define __NR_bpf 357
-# elif defined(__x86_64__)
-#  define __NR_bpf 321
-# elif defined(__arm__)
-#  define __NR_bpf 386
-# elif defined(__aarch64__)
-#  define __NR_bpf 280
-# elif defined(__sparc__)
-#  define __NR_bpf 349
-# elif defined(__s390__)
-#  define __NR_bpf 351
-# elif defined(__powerpc__)
-#  define __NR_bpf 361
-# elif defined(__riscv)
-#  define __NR_bpf 280
-# elif defined(__loongarch__)
-#  define __NR_bpf 280
-# else
-#  error __NR_bpf not defined
-# endif
-#endif
-
-enum {
-   BPF_MAP_ID_KEY,
-   BPF_MAP_ID_SIMPLE,
-};
-
-static int bpf_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-   size_t insns_cnt, const char *license);
-
-#endif /* __TAP_BPF_H__ */
diff --git a/drivers/net/tap/tap_bpf_api.c b/drivers/net/tap/tap_bpf_api.c
index 15283f8917ed..9e05e2ddf19b 100644
--- a/drivers/net/tap/tap_bpf_api.c

[PATCH v2 4/7] net/tap: rewrite the RSS BPF program

2024-02-07 Thread Stephen Hemminger
Rewrite the BPF program used to do queue based RSS.
Important changes:
- uses newer BPF map format BTF
- accepts key as parameter rather than constant default
- can do L3 or L4 hashing
- supports IPv4 options
- supports IPv6 extension headers
- restructured for readability

The usage of BPF is different as well:
- the incoming configuration is looked up based on
  class parameters rather than patching the BPF.
- the resulting queue is placed in skb rather
  than requiring a second pass through classifier step.

Note: This version only works with a later patch to enable it on
the DPDK driver side. It is submitted as an incremental patch
to allow for easier review. Bisection still works because
the old instructions are still present for now.

Signed-off-by: Stephen Hemminger 
---
 .gitignore|   3 -
 drivers/net/tap/bpf/Makefile  |  19 --
 drivers/net/tap/bpf/README|  12 ++
 drivers/net/tap/bpf/bpf_api.h | 276 --
 drivers/net/tap/bpf/bpf_elf.h |  53 -
 drivers/net/tap/bpf/bpf_extract.py|  85 
 drivers/net/tap/bpf/meson.build   |  81 
 drivers/net/tap/bpf/tap_bpf_program.c | 255 
 drivers/net/tap/bpf/tap_rss.c | 272 +
 9 files changed, 365 insertions(+), 691 deletions(-)
 delete mode 100644 drivers/net/tap/bpf/Makefile
 create mode 100644 drivers/net/tap/bpf/README
 delete mode 100644 drivers/net/tap/bpf/bpf_api.h
 delete mode 100644 drivers/net/tap/bpf/bpf_elf.h
 delete mode 100644 drivers/net/tap/bpf/bpf_extract.py
 create mode 100644 drivers/net/tap/bpf/meson.build
 delete mode 100644 drivers/net/tap/bpf/tap_bpf_program.c
 create mode 100644 drivers/net/tap/bpf/tap_rss.c

diff --git a/.gitignore b/.gitignore
index 3f444dcace2e..01a47a760660 100644
--- a/.gitignore
+++ b/.gitignore
@@ -36,9 +36,6 @@ TAGS
 # ignore python bytecode files
 *.pyc
 
-# ignore BPF programs
-drivers/net/tap/bpf/tap_bpf_program.o
-
 # DTS results
 dts/output
 
diff --git a/drivers/net/tap/bpf/Makefile b/drivers/net/tap/bpf/Makefile
deleted file mode 100644
index 9efeeb1bc704..
--- a/drivers/net/tap/bpf/Makefile
+++ /dev/null
@@ -1,19 +0,0 @@
-# SPDX-License-Identifier: BSD-3-Clause
-# This file is not built as part of normal DPDK build.
-# It is used to generate the eBPF code for TAP RSS.
-
-CLANG=clang
-CLANG_OPTS=-O2
-TARGET=../tap_bpf_insns.h
-
-all: $(TARGET)
-
-clean:
-   rm tap_bpf_program.o $(TARGET)
-
-tap_bpf_program.o: tap_bpf_program.c
-   $(CLANG) $(CLANG_OPTS) -emit-llvm -c $< -o - | \
-   llc -march=bpf -filetype=obj -o $@
-
-$(TARGET): tap_bpf_program.o
-   python3 bpf_extract.py -stap_bpf_program.c -o $@ $<
diff --git a/drivers/net/tap/bpf/README b/drivers/net/tap/bpf/README
new file mode 100644
index ..960a10da73b8
--- /dev/null
+++ b/drivers/net/tap/bpf/README
@@ -0,0 +1,12 @@
+This is the BPF program used to implement the RSS across queues
+flow action. It works like the skbedit tc filter but instead of mapping
+to only one queues, it maps to multiple queues based on RSS hash.
+
+This version is built the BPF Compile Once — Run Everywhere (CO-RE)
+framework and uses libbpf and bpftool.
+
+Limitations
+- requires libbpf version XX or later
+- rebuilding the BPF requires clang and bpftool
+- only Toeplitz hash with standard 40 byte key is supported
+- the number of queues per RSS action is limited to 16
diff --git a/drivers/net/tap/bpf/bpf_api.h b/drivers/net/tap/bpf/bpf_api.h
deleted file mode 100644
index 2638a8a4ac9a..
--- a/drivers/net/tap/bpf/bpf_api.h
+++ /dev/null
@@ -1,276 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 or BSD-3-Clause */
-
-#ifndef __BPF_API__
-#define __BPF_API__
-
-/* Note:
- *
- * This file can be included into eBPF kernel programs. It contains
- * a couple of useful helper functions, map/section ABI (bpf_elf.h),
- * misc macros and some eBPF specific LLVM built-ins.
- */
-
-#include 
-
-#include 
-#include 
-#include 
-
-#include 
-
-#include "bpf_elf.h"
-
-/** libbpf pin type. */
-enum libbpf_pin_type {
-   LIBBPF_PIN_NONE,
-   /* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */
-   LIBBPF_PIN_BY_NAME,
-};
-
-/** Type helper macros. */
-
-#define __uint(name, val) int (*name)[val]
-#define __type(name, val) typeof(val) *name
-#define __array(name, val) typeof(val) *name[]
-
-/** Misc macros. */
-
-#ifndef __stringify
-# define __stringify(X)#X
-#endif
-
-#ifndef __maybe_unused
-# define __maybe_unused__attribute__((__unused__))
-#endif
-
-#ifndef offsetof
-# define offsetof(TYPE, MEMBER)__builtin_offsetof(TYPE, MEMBER)
-#endif
-
-#ifndef likely
-# define likely(X) __builtin_expect(!!(X), 1)
-#endif
-
-#ifndef unlikely
-# define unlikely(X)   __builtin_expect(!!(X), 0)
-#endif
-
-#ifndef htons
-# define htons(X)
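
For readers unfamiliar with the hash named in the README above ("only Toeplitz hash with standard 40 byte key is supported"), here is a minimal userspace sketch of a Toeplitz hash. It is not the BPF program from this patch (tap_rss.c is not shown in this message). The function name toeplitz_hash and the RSS_KEY_LEN macro are illustrative assumptions; the key bytes are the default key quoted in patch 2/7.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSS_KEY_LEN 40	/* illustrative name for the 40 byte key size */

/* Default key as quoted in patch 2/7 of this series. */
const uint8_t rss_hash_default_key[RSS_KEY_LEN] = {
	0x2c, 0xc6, 0x81, 0xd1,
	0x5b, 0xdb, 0xf4, 0xf7,
	0xfc, 0xa2, 0x83, 0x19,
	0xdb, 0x1a, 0x3e, 0x94,
	0x6b, 0x9e, 0x38, 0xd9,
	0x2c, 0x9c, 0x03, 0xd1,
	0xad, 0x99, 0x44, 0xa7,
	0xd9, 0x56, 0x3d, 0x59,
	0x06, 0x3c, 0x25, 0xf3,
	0xfc, 0x1f, 0xdc, 0x2a,
};

/*
 * Toeplitz hash: for every set bit of the input (most significant bit
 * first), XOR in the 32-bit window of the key that starts at the same
 * bit position.
 *
 * data_len must be at most RSS_KEY_LEN - 4 so that the sliding window
 * never reads past the end of the key.
 */
uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data, size_t data_len)
{
	uint32_t hash = 0;
	/* The window starts out as the first 32 bits of the key. */
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (size_t i = 0; i < data_len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window by one bit, shifting in the next key bit. */
			window = (window << 1) | ((key[i + 4] >> bit) & 1u);
		}
	}
	return hash;
}
```

The input would typically be the concatenated source/destination addresses (L3), optionally followed by the ports (L3/L4), which stays within the 36 byte limit for both IPv4 and IPv6. Because the hash is a pure XOR of key windows, it is linear: hashing the XOR of two inputs gives the XOR of their hashes.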

[PATCH v2 5/7] net/tap: use libbpf to load new BPF program

2024-02-07 Thread Stephen Hemminger
There were multiple issues in the RSS queue support in the TAP
driver. This required extensive rework of the BPF support.

Change the BPF loading to use bpftool to
create a skeleton header file, and load with libbpf.
The BPF is always compiled from source so less chance that
source and instructions diverge. Also resolves issue where
libbpf and source get out of sync. The program
is only loaded once, so if multiple rules are created
only one BPF program is loaded in kernel.

The new BPF program only needs a single action.
No need for action and re-classification step.

It also fixes the missing bits from the original:
- supports setting RSS key per flow
- level of hash can be L3 or L3/L4.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/meson.build|  26 +--
 drivers/net/tap/rte_eth_tap.c  |   2 +
 drivers/net/tap/rte_eth_tap.h  |   9 +-
 drivers/net/tap/tap_flow.c | 391 +
 drivers/net/tap/tap_flow.h |  11 +-
 drivers/net/tap/tap_rss.h  |   3 +
 drivers/net/tap/tap_rss.stub.h |  45 
 drivers/net/tap/tap_tcmsgs.h   |   4 +-
 8 files changed, 163 insertions(+), 328 deletions(-)
 create mode 100644 drivers/net/tap/tap_rss.stub.h

diff --git a/drivers/net/tap/meson.build b/drivers/net/tap/meson.build
index 5099ccdff11b..ad51b67c 100644
--- a/drivers/net/tap/meson.build
+++ b/drivers/net/tap/meson.build
@@ -7,33 +7,21 @@ if not is_linux
 endif
 sources = files(
 'rte_eth_tap.c',
-'tap_bpf_api.c',
 'tap_flow.c',
 'tap_intr.c',
 'tap_netlink.c',
 'tap_tcmsgs.c',
 )
 
+subdir('bpf')
+if enable_tap_rss
+cflags += '-DHAVE_BPF_RSS'
+ext_deps += libbpf
+sources += tap_rss_skel_h
+endif
+
 deps = ['bus_vdev', 'gso', 'hash']
 
 cflags += '-DTAP_MAX_QUEUES=16'
 
-# input array for meson symbol search:
-# [ "MACRO to define if found", "header for the search",
-#   "enum/define", "symbol to search" ]
-#
-args = [
-[ 'HAVE_TC_FLOWER', 'linux/pkt_cls.h', 'TCA_FLOWER_UNSPEC' ],
-[ 'HAVE_TC_VLAN_ID', 'linux/pkt_cls.h', 'TCA_FLOWER_KEY_VLAN_PRIO' ],
-[ 'HAVE_TC_BPF', 'linux/pkt_cls.h', 'TCA_BPF_UNSPEC' ],
-[ 'HAVE_TC_BPF_FD', 'linux/pkt_cls.h', 'TCA_BPF_FD' ],
-[ 'HAVE_TC_ACT_BPF', 'linux/tc_act/tc_bpf.h', 'TCA_ACT_BPF_UNSPEC' ],
-[ 'HAVE_TC_ACT_BPF_FD', 'linux/tc_act/tc_bpf.h', 'TCA_ACT_BPF_FD' ],
-]
-config = configuration_data()
-foreach arg:args
-config.set(arg[0], cc.has_header_symbol(arg[1], arg[2]))
-endforeach
-configure_file(output : 'tap_autoconf.h', configuration : config)
-
 require_iova_in_mbuf = false
diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index b41fa971cb7e..a98cc8f01ae1 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -1138,6 +1138,7 @@ tap_dev_close(struct rte_eth_dev *dev)
tap_flow_implicit_flush(internals, NULL);
tap_nl_final(internals->nlsk_fd);
internals->nlsk_fd = -1;
+   tap_flow_bpf_destroy(internals);
}
 
for (i = 0; i < RTE_PMD_TAP_MAX_QUEUES; i++) {
@@ -1959,6 +1960,7 @@ eth_dev_tap_create(struct rte_vdev_device *vdev, const 
char *tap_name,
strlcpy(pmd->name, tap_name, sizeof(pmd->name));
pmd->type = type;
pmd->ka_fd = -1;
+   pmd->rss = NULL;
pmd->nlsk_fd = -1;
pmd->gso_ctx_mp = NULL;
 
diff --git a/drivers/net/tap/rte_eth_tap.h b/drivers/net/tap/rte_eth_tap.h
index 5ac93f93e961..0cf2b30bb03b 100644
--- a/drivers/net/tap/rte_eth_tap.h
+++ b/drivers/net/tap/rte_eth_tap.h
@@ -79,12 +79,11 @@ struct pmd_internals {
int flow_isolate; /* 1 if flow isolation is enabled */
int flower_support;   /* 1 if kernel supports, else 0 */
int flower_vlan_support;  /* 1 if kernel supports, else 0 */
-   int rss_enabled;  /* 1 if RSS is enabled, else 0 */
int persist;  /* 1 if keep link up, else 0 */
-   /* implicit rules set when RSS is enabled */
-   int map_fd;   /* BPF RSS map fd */
-   int bpf_fd[RTE_PMD_TAP_MAX_QUEUES];/* List of bpf fds per queue */
-   LIST_HEAD(tap_rss_flows, rte_flow) rss_flows;
+
+   struct tap_rss *rss;  /* BPF program */
+   uint16_t bpf_flowid;  /* next BPF class id */
+
LIST_HEAD(tap_flows, rte_flow) flows;/* rte_flow rules */
/* implicit rte_flow rules set when a remote device is active */
LIST_HEAD(tap_implicit_flows, rte_flow) implicit_flows;
diff --git a/drivers/net/tap/tap_flow.c b/drivers/net/tap/tap_flow.c
index 94436af55ce8..ef34e85c423b 100644
--- a/drivers/net/tap/tap_flow.c
+++ b/drivers/net/tap/tap_flow.c
@@ -16,24 +16,19 @@
 #include 
 
 #include 
-#include 
 #include 
 #include 
 
-
-/* RSS key management */
-enum bpf_rss_key_e {
-   KEY_CMD_GET = 1,
-   KEY_CMD_RELEASE,
-   KEY_CMD_INIT,
-   

[PATCH v2 7/7] MAINTAINERS: add maintainer for TAP device

2024-02-07 Thread Stephen Hemminger
Add myself as maintainer for TAP device.

Signed-off-by: Stephen Hemminger 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5fb3a73f840e..92d27e97aa9e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1015,6 +1015,7 @@ F: doc/guides/nics/pcap_ring.rst
 F: doc/guides/nics/features/pcap.ini
 
 Tap PMD
+M: Stephen Hemminger 
 F: drivers/net/tap/
 F: doc/guides/nics/tap.rst
 F: doc/guides/nics/features/tap.ini
-- 
2.43.0



[PATCH v2 6/7] net/tap: remove no longer used files

2024-02-07 Thread Stephen Hemminger
The BPF api was replaced by use of libbpf.
And the BPF instruction header was replaced by the skeleton.

Signed-off-by: Stephen Hemminger 
---
 drivers/net/tap/tap_bpf_api.c   |  196 
 drivers/net/tap/tap_bpf_insns.h | 1742 ---
 2 files changed, 1938 deletions(-)
 delete mode 100644 drivers/net/tap/tap_bpf_api.c
 delete mode 100644 drivers/net/tap/tap_bpf_insns.h

diff --git a/drivers/net/tap/tap_bpf_api.c b/drivers/net/tap/tap_bpf_api.c
deleted file mode 100644
index 9e05e2ddf19b..
--- a/drivers/net/tap/tap_bpf_api.c
+++ /dev/null
@@ -1,196 +0,0 @@
-/* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2017 Mellanox Technologies, Ltd
- */
-
-#include 
-#include 
-#include 
-
-#include 
-#include 
-
-#include 
-
-
-static int bpf_load(enum bpf_prog_type type, const struct bpf_insn *insns,
-   size_t insns_cnt, const char *license);
-
-/**
- * Load BPF program (section cls_q) into the kernel and return a bpf fd
- *
- * @param queue_idx
- *   Queue index matching packet cb
- *
- * @return
- *   -1 if the BPF program couldn't be loaded. An fd (int) otherwise.
- */
-int tap_flow_bpf_cls_q(__u32 queue_idx)
-{
-   cls_q_insns[1].imm = queue_idx;
-
-   return bpf_load(BPF_PROG_TYPE_SCHED_CLS,
-   (struct bpf_insn *)cls_q_insns,
-   RTE_DIM(cls_q_insns),
-   "Dual BSD/GPL");
-}
-
-/**
- * Load BPF program (section l3_l4) into the kernel and return a bpf fd.
- *
- * @param[in] key_idx
- *   RSS MAP key index
- *
- * @param[in] map_fd
- *   BPF RSS map file descriptor
- *
- * @return
- *   -1 if the BPF program couldn't be loaded. An fd (int) otherwise.
- */
-int tap_flow_bpf_calc_l3_l4_hash(__u32 key_idx, int map_fd)
-{
-   l3_l4_hash_insns[4].imm = key_idx;
-   l3_l4_hash_insns[9].imm = map_fd;
-
-   return bpf_load(BPF_PROG_TYPE_SCHED_ACT,
-   (struct bpf_insn *)l3_l4_hash_insns,
-   RTE_DIM(l3_l4_hash_insns),
-   "Dual BSD/GPL");
-}
-
-/**
- * Helper function to convert a pointer to unsigned 64 bits
- *
- * @param[in] ptr
- *   pointer to address
- *
- * @return
- *   64 bit unsigned long type of pointer address
- */
-static inline __u64 ptr_to_u64(const void *ptr)
-{
-   return (__u64)(unsigned long)ptr;
-}
-
-/**
- * Call BPF system call
- *
- * @param[in] cmd
- *   BPF command for program loading, map creation, map entry update, etc
- *
- * @param[in] attr
- *   System call attributes relevant to system call command
- *
- * @param[in] size
- *   size of attr parameter
- *
- * @return
- *   -1 if BPF system call failed, 0 otherwise
- */
-static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
-   unsigned int size)
-{
-#ifdef __NR_bpf
-   return syscall(__NR_bpf, cmd, attr, size);
-#else
-   TAP_LOG(ERR, "No bpf syscall, kernel headers too old?\n");
-   errno = ENOSYS;
-   return -1;
-#endif
-}
-
-/**
- * Load BPF instructions to kernel
- *
- * @param[in] type
- *   BPF program type: classifier or action
- *
- * @param[in] insns
- *   Array of BPF instructions (equivalent to BPF instructions)
- *
- * @param[in] insns_cnt
- *   Number of BPF instructions (size of array)
- *
- * @param[in] license
- *   License string that must be acknowledged by the kernel
- *
- * @return
- *   -1 if the BPF program couldn't be loaded, fd (file descriptor) otherwise
- */
-static int bpf_load(enum bpf_prog_type type,
- const struct bpf_insn *insns,
- size_t insns_cnt,
- const char *license)
-{
-   union bpf_attr attr = {};
-
-   bzero(&attr, sizeof(attr));
-   attr.prog_type = type;
-   attr.insn_cnt = (__u32)insns_cnt;
-   attr.insns = ptr_to_u64(insns);
-   attr.license = ptr_to_u64(license);
-   attr.log_buf = ptr_to_u64(NULL);
-   attr.log_level = 0;
-   attr.kern_version = 0;
-
-   return sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
-}
-
-/**
- * Create BPF map for RSS rules
- *
- * @param[in] key_size
- *   map RSS key size
- *
- * @param[in] value_size
- *   Map RSS value size
- *
- * @param[in] max_entries
- *   Map max number of RSS entries (limit on max RSS rules)
- *
- * @return
- *   -1 if BPF map couldn't be created, map fd otherwise
- */
-int tap_flow_bpf_rss_map_create(unsigned int key_size,
-   unsigned int value_size,
-   unsigned int max_entries)
-{
-   union bpf_attr attr = {};
-
-   bzero(&attr, sizeof(attr));
-   attr.map_type= BPF_MAP_TYPE_HASH;
-   attr.key_size= key_size;
-   attr.value_size  = value_size;
-   attr.max_entries = max_entries;
-
-   return sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
-}
-
-/**
- * Update RSS entry in BPF map
- *
- * @param[in] fd
- *   RSS map fd
- *
- * @param[in] key
- *   Pointer to RSS key whose entry is updated
- *
- * @param[in] value
- *   Pointer to RSS new updated value
- *
- * @return
- *   -1 if RSS entry failed to be upd

Re: [PATCH v2] common/sfc: replace out of bounds condition with static_assert

2024-02-07 Thread Stephen Hemminger
On Wed, 7 Feb 2024 19:10:37 +
Ferruh Yigit  wrote:

> On 1/19/2024 10:13 PM, Stephen Hemminger wrote:
> > The sfc base code had its own definition of static assertions
> > using the out of bound array access hack. Replace it with a
> > static_assert like rte_common.h.
> > 
> > Fixes: f67e4719147d ("net/sfc/base: fix coding style")
> > Signed-off-by: Stephen Hemminger 
> > Acked-by: Morten Brørup 
> > ---
> > v2 - add assert.h to make sure it works in other environments
> > 
> >  drivers/common/sfc_efx/base/efx.h | 6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/common/sfc_efx/base/efx.h 
> > b/drivers/common/sfc_efx/base/efx.h
> > index 3312c2fa8f81..38f2aed3e336 100644
> > --- a/drivers/common/sfc_efx/base/efx.h
> > +++ b/drivers/common/sfc_efx/base/efx.h
> > @@ -7,6 +7,8 @@
> >  #ifndef_SYS_EFX_H
> >  #define_SYS_EFX_H
> >  
> > +#include 
> > +
> >  #include "efx_annote.h"
> >  #include "efsys.h"
> >  #include "efx_types.h"
> > @@ -17,8 +19,8 @@
> >  extern "C" {
> >  #endif
> >  
> > -#defineEFX_STATIC_ASSERT(_cond)\
> > -   ((void)sizeof (char[(_cond) ? 1 : -1]))
> > +#defineEFX_STATIC_ASSERT(_cond) \
> > +   do { static_assert((_cond), "assert failed" #_cond); } while (0)
> >  
> >  #defineEFX_ARRAY_SIZE(_array)  \
> > (sizeof (_array) / sizeof ((_array)[0]))  
> 
> Getting following build error with clang:

What version of clang?
It works for me with clang 16.0.6
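
For reference, the two assertion styles under discussion can be compared side by side in a minimal sketch. The macro names OLD_STATIC_ASSERT and NEW_STATIC_ASSERT and the function check_layout are illustrative only, not names from the sfc base code, and the sketch does not attempt to reproduce the reported clang error.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

/*
 * Old style: a false condition yields a negative array size,
 * which the compiler rejects.
 */
#define OLD_STATIC_ASSERT(_cond) \
	((void)sizeof(char[(_cond) ? 1 : -1]))

/*
 * New style, same shape as in the patch: wrap C11 static_assert in
 * do/while so the macro can be used wherever a statement is allowed.
 */
#define NEW_STATIC_ASSERT(_cond) \
	do { static_assert((_cond), "assert failed" #_cond); } while (0)

/* Illustrative function: both forms compile when the condition holds. */
int check_layout(void)
{
	OLD_STATIC_ASSERT(sizeof(uint32_t) == 4);
	NEW_STATIC_ASSERT(sizeof(uint32_t) == 4);
	return 0;
}
```

One difference to keep in mind when comparing compilers: static_assert requires an integer constant expression, while the array form can fall back to a variable-length array type if the condition is not constant. Whether that is related to the error reported in this thread is not established here.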


Re: [PATCH v2] common/sfc: replace out of bounds condition with static_assert

2024-02-07 Thread Stephen Hemminger
On Wed, 7 Feb 2024 19:10:37 +
Ferruh Yigit  wrote:

> ../drivers/common/sfc_efx/base/ef10_filter.c:1246:18: error: shift count
> >= width of type [-Werror,-Wshift-count-overflow]  
> matches_count = MCDI_OUT_DWORD(req,
> ^~~
> ../drivers/common/sfc_efx/base/efx_mcdi.h:493:2: note: expanded from
> macro 'MCDI_OUT_DWORD'
> EFX_DWORD_FIELD(*MCDI_OUT2(_emr, efx_dword_t, _ofst),   \
> ^
> ../drivers/common/sfc_efx/base/efx_types.h:533:30: note: expanded from
> macro 'EFX_DWORD_FIELD'
> EFX_HIGH_BIT(_field)) & EFX_MASK32(_field))
> ^~
> ../drivers/common/sfc_efx/base/efx_types.h:145:23: note: expanded from
> macro 'EFX_MASK32'
> (uint32_t)1) << EFX_WIDTH(_field))) - 1))

None of this got changed by the patch. Looks like it would not compile
even without the patch on your version of clang.


Re: [PATCH v5] gro : packets not getting flushed in heavy-weight mode API

2024-02-07 Thread Ferruh Yigit
On 1/18/2024 8:36 AM, Kumara Parameshwaran wrote:
> In heavy-weight mode GRO which is based on timer, the GRO packets
> will not be flushed in spite of timer expiry if there is no packet
> in the current poll. If timer mode GRO is enabled the
> rte_gro_timeout_flush API should be invoked.
> 

Agree on the problem: we need a way to flush existing packets in the GRO
context when no more packets are being received.


> Signed-off-by: Kumara Parameshwaran 
> ---
> v1:
> Changes to make sure that the GRO flush API is invoked if there are no 
> packets in 
> current poll and timer expiry.
> 
> v2:
> Fix code organisation issue
> 
> v3:
> Fix warnings
> 
> v4:
> Fix error and warnings
> 
> v5:
> Fix compilation issue when GRO is not defined
> 
>  app/test-pmd/csumonly.c | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> index c103e54111..6d9ce99500 100644
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -863,16 +863,23 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
>  
>   /* receive a burst of packet */
>   nb_rx = common_fwd_stream_receive(fs, pkts_burst, nb_pkt_per_burst);
> +#ifndef RTE_LIB_GRO
>   if (unlikely(nb_rx == 0))
>   return false;
> -
> +#else
> + gro_enable = gro_ports[fs->rx_port].enable;
> + if (unlikely(nb_rx == 0)) {
> + if (gro_enable && (gro_flush_cycles != 
> GRO_DEFAULT_FLUSH_CYCLES))
> + goto init;
>

The rest of the function expects 'nb_rx' to be non-zero, so I am concerned
about unexpected side effects, like:

- GRO reassembles packets and flushes them; how does reassembly behave with
zero nb_rx?

- If there is no packet in the GRO context, what happens on a call with zero
nb_rx?


To address the above issues, what about adding a 'rte_gro_get_pkt_count()'
check and only continuing if it returns non-zero?
Also, in the GRO block below, call 'rte_gro_reassemble()' only when 'nb_rx'
is not zero.

> + else
> + return false;
>

The 'goto' can be avoided by changing the condition in the if block.


> + }
> +init:
>

Label name 'init' is not a good name.

> +#endif
>   rx_bad_ip_csum = 0;
>   rx_bad_l4_csum = 0;
>   rx_bad_outer_l4_csum = 0;
>   rx_bad_outer_ip_csum = 0;
> -#ifdef RTE_LIB_GRO
> - gro_enable = gro_ports[fs->rx_port].enable;
> -#endif
>  
>   txp = &ports[fs->tx_port];
>   tx_offloads = txp->dev_conf.txmode.offloads;
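
The review comments above (check the GRO packet count, avoid the goto) can be sketched as a single decision function. This is only an illustration under stated assumptions: struct gro_state, enum burst_action and classify_burst are hypothetical names, and the pkts_in_ctx field stands in for the value that 'rte_gro_get_pkt_count()' would return for the port's GRO context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-port GRO state used in csumonly.c. */
struct gro_state {
	bool enabled;		/* gro_ports[port].enable */
	bool timer_mode;	/* gro_flush_cycles != GRO_DEFAULT_FLUSH_CYCLES */
	uint64_t pkts_in_ctx;	/* what rte_gro_get_pkt_count() would report */
};

enum burst_action {
	BURST_RETURN,		/* nothing received, nothing held: return early */
	BURST_FLUSH_ONLY,	/* nothing received, but held packets may be due */
	BURST_PROCESS,		/* packets received: normal path */
};

/*
 * Decide what to do with a burst without using goto. The zero-packet
 * case continues only when timer-mode GRO is on and the context holds
 * packets.
 */
enum burst_action classify_burst(uint16_t nb_rx, const struct gro_state *gro)
{
	if (nb_rx != 0)
		return BURST_PROCESS;
	if (gro->enabled && gro->timer_mode && gro->pkts_in_ctx != 0)
		return BURST_FLUSH_ONLY;
	return BURST_RETURN;
}
```

In the BURST_FLUSH_ONLY case the caller would skip reassembly and go straight to the timeout flush, which matches the suggestion to call 'rte_gro_reassemble()' only when 'nb_rx' is not zero.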



Re: [v7 1/1] net/af_xdp: fix multi interface support for K8s

2024-02-07 Thread Ferruh Yigit
On 1/11/2024 2:21 PM, Ferruh Yigit wrote:
> On 1/11/2024 12:21 PM, Maryam Tahhan wrote:
>> On 11/01/2024 11:35, Ferruh Yigit wrote:
>>> Devarg is user interface, changing it impacts the user.
>>>
>>> Assume that user of '22.11.3' using 'use_cni' dev_arg, it will be broken
>>> when user upgrades DPDK to '22.11.4', which is not expected.
>>>
>>> dev_arg is not API/ABI but as it impacts the user, it is in the gray
>>> area to backport to the LTS release.
>> Fair enough
>>> Current patch doesn't have Fixes tag or stable tag, so it doesn't
>>> request to be backported to LTS release. I took this as an improvement,
>>> more than a fix.
>>
>> This was overlooked by me apologies. It's been a while since I've
>> contributed to DPDK and I must've missed this detail in the contribution
>> guide.
>>> As far as I understand existing code (that use 'use_cni' dev_arg)
>>> supports only single netdev, this patch adds support for multiple netdevs.
>>
>> The use_cni implementation will no longer work with the AF_XDP DP as the
>> use_cni was originally implemented as it has hard coded what's now an
>> incorrect path for the UDS.
>>
>>> So what do you think keep LTS with 'use_cni' dev_arg, is there a
>>> requirement to update LTS release?
>>> If so, can it be an option to keep 'use_cni' for backward compatibility
>>> but add only add 'uds_path' and remove 'use_cni' in next LTS?
>>
>>
>> Yeah we can go back to the version of the patch that had the 'use_cni'
>> flag that was used in combination with the path argument. We can add
>> better documentation re the "use_cni" misnomer... What we can then do is
>> if no path argument is set by the user assume their intent and and
>> generate the path internally in the AF_XDP PMD (which was suggested by
>> Shibin at some stage). That way there should be no surprises to the End
>> User.
>>
> 
> Ack, this keeps backward compatibility,
> 
> BUT if 'use_cni' is already broken in v23.11 (that is what I understand
> from your above comment), means there is no user of it in LTS, and we
> can be more pragmatic and replace the dev_args, by backporting this
> patch, assuming LTS maintainer is also OK with it.
> 

Hi Maryam,

How do you want to continue with the patch? I think the options we
considered are:

1. Fix the 'use_cni' documentation (which we can backport to LTS) and
overload the argument for the new purpose. This will enable the new feature
while keeping backward compatibility, and requires a new version of this
patch.

2. If 'use_cni' is completely broken in the 23.11 LTS, which means
there is no user or backward compatibility to worry about, we can merge
this patch and backport it to LTS.

3. Don't backport this fix to LTS, merge only to the current release, which
means your new feature won't be available to some users for as long as a
few years.


(1.) is the most user friendly, but if 'use_cni' is already broken in LTS we
can go with option (2.). What do you think?



btw, @Ciara, @Maryam, if (2.) is true, how did we end up having a feature
('use_cni' dev_args) completely broken in an LTS release?



> 
>> Long term I would like to keep a (renamed) path argument (in case the
>> path does ever change from the AF_XDP DP POV) and use it also in
>> combination with another (maybe boolean) param for passing pinned bpf
>> maps rather than another separate path.
>>
>> WDYT? Would this work for the LTS release?
>>
>>
> 



Re: [PATCH v2] common/sfc: replace out of bounds condition with static_assert

2024-02-07 Thread Ferruh Yigit
On 2/7/2024 10:36 PM, Stephen Hemminger wrote:
> On Wed, 7 Feb 2024 19:10:37 +
> Ferruh Yigit  wrote:
> 
>> ../drivers/common/sfc_efx/base/ef10_filter.c:1246:18: error: shift count
>> >= width of type [-Werror,-Wshift-count-overflow]
>> matches_count = MCDI_OUT_DWORD(req,
>> ^~~
>> ../drivers/common/sfc_efx/base/efx_mcdi.h:493:2: note: expanded from
>> macro 'MCDI_OUT_DWORD'
>> EFX_DWORD_FIELD(*MCDI_OUT2(_emr, efx_dword_t, _ofst),   \
>> ^
>> ../drivers/common/sfc_efx/base/efx_types.h:533:30: note: expanded from
>> macro 'EFX_DWORD_FIELD'
>> EFX_HIGH_BIT(_field)) & EFX_MASK32(_field))
>> ^~
>> ../drivers/common/sfc_efx/base/efx_types.h:145:23: note: expanded from
>> macro 'EFX_MASK32'
>> (uint32_t)1) << EFX_WIDTH(_field))) - 1))
> 
> None of this got changed by the patch. Looks like it would not compile
> even without the patch on your version of clang.
>

Nope, error only happens with the patch.

And CI seems to be reporting the errors:
https://mails.dpdk.org/archives/test-report/2024-January/558546.html
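
For readers unfamiliar with the diagnostic: clang emits -Wshift-count-overflow when a value is shifted by a count equal to or larger than its width, which is undefined behaviour in C. The following is a minimal sketch of the general pattern only, not the sfc_efx code; `FIELD_WIDTH` and `mask32()` are hypothetical names chosen for illustration. It shows a compile-time width check combined with a mask builder that never shifts a 32-bit value by 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field width; a real driver would derive it from its own macros. */
#define FIELD_WIDTH 32

/* C11 compile-time check: a field wider than 32 bits cannot fit in a dword. */
static_assert(FIELD_WIDTH <= 32, "field must fit in 32 bits");

/*
 * Build a mask with the 'width' lowest bits set (width in 1..32).
 * Shifting a uint32_t by 32 is undefined, so the full-width case is
 * handled without a shift.
 */
static inline uint32_t mask32(unsigned int width)
{
	assert(width >= 1 && width <= 32);
	return width == 32 ? UINT32_MAX : (((uint32_t)1 << width) - 1);
}
```

With this shape, `mask32(FIELD_WIDTH)` is safe for any width the static assertion accepts.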


Re: [PATCH v2] app/testpmd: use Tx preparation in txonly engine

2024-02-07 Thread Ferruh Yigit
On 1/11/2024 5:25 AM, Kaiwen Deng wrote:
> Txonly forwarding engine does not call the Tx preparation API
> before transmitting packets. This may cause some problems.
> 
> TSO breaks when MSS spans more than 8 data fragments. Those
> packets will be dropped by Tx preparation API, but it will cause
> MDD event if txonly forwarding engine does not call the Tx preparation
> API before transmitting packets.
> 

txonly is commonly used, so adding Tx prepare for a specific case may
impact performance for users.

What happens when the driver throws an MDD (Malicious Driver Detection) event,
can't it be ignored? As you are already OK with dropping the packet, can the
device be configured to drop these packets?


Or, as Jerin suggested, adding a new forwarding engine is a solution, but
that will create code duplication. I prefer not to have it if this can
be handled at the device level.
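
To make the trade-off concrete, here is a minimal self-contained model of the "prepare, then burst" pattern under discussion. It does not use the DPDK API: `model_tx_prepare()` and `model_tx_burst()` are hypothetical stand-ins, and the rule "more than 8 segments is invalid" is a simplification of the TSO limit quoted in the commit message. The extra pass over every packet in the prepare step is the per-burst cost that the performance concern refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TSO_SEGS 8 /* limit quoted in the commit message, simplified */

/* Hypothetical, stripped-down packet: only the segment count matters here. */
struct model_pkt {
	uint16_t nb_segs;
};

/*
 * Stand-in for a Tx preparation step: returns how many leading packets
 * pass validation, stopping at the first one that violates the limit.
 */
static uint16_t model_tx_prepare(const struct model_pkt *pkts, uint16_t nb_pkts)
{
	uint16_t i;

	for (i = 0; i < nb_pkts; i++)
		if (pkts[i].nb_segs > MAX_TSO_SEGS)
			break;
	return i;
}

/* Stand-in for the burst call: pretends the device accepts everything. */
static uint16_t model_tx_burst(const struct model_pkt *pkts, uint16_t nb_pkts)
{
	(void)pkts;
	return nb_pkts;
}

/* Validate first, then hand only the validated prefix to the device. */
static uint16_t send_prepared(const struct model_pkt *pkts, uint16_t nb_pkts)
{
	uint16_t nb_ok;

	assert(pkts != NULL || nb_pkts == 0);
	nb_ok = model_tx_prepare(pkts, nb_pkts);
	return model_tx_burst(pkts, nb_ok);
}
```

In this model the caller still has to decide what to do with the packet that failed validation (drop it or fix it up), which is the same decision the forwarding engine would face.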



Re: [PATCH 0/3] net/nfb: driver cleanups

2024-02-07 Thread Ferruh Yigit
On 1/12/2024 1:50 PM, Martin Spinler wrote:
> Tested-by: Martin Spinler 
> Acked-by: Martin Spinler 
> 
> ---
> 
> Hi! Thanks for the cleanup. I've tested that patchset and it works fine.
> 
> I'm just not sure if the "net/nfb: use dynamic logtype" patch merges
> with the "Remove uses of PMD logtype" series, as they differ slightly
> (both links below).
> Stephen, would it make sense to remove the last patch from this series?
> 
> https://patchwork.dpdk.org/project/dpdk/patch/20231207185720.19913-4-step...@networkplumber.org/
> https://patchwork.dpdk.org/project/dpdk/patch/20231222171820.8778-9-step...@networkplumber.org/
> 

The second one is a larger set with multiple components involved and this one is
more specific, so I will proceed with this one.

@Thomas may drop the nfb patch in that series.

> 
> On Fri, 2024-01-12 at 12:16 +, Ferruh Yigit wrote:
>> On 12/7/2023 6:56 PM, Stephen Hemminger wrote:
>>> Replace static logtype with dynamic logtype and
>>> remove dead code. Compile tested on Fedora.
>>>
>>> Stephen Hemminger (3):
>>>   net/nfb: remove unused device args
>>>   net/nfb: make device path local to init function
>>>   net/nfb: use dynamic logtype
>>>
>>>  
>>
>> Hi Martin,
>>
>> Can you please review the set?
> 



Re: [PATCH v1] net/memif: remove extra mbuf refcnt update in zero copy Tx

2024-02-07 Thread Ferruh Yigit
On 12/8/2023 1:44 PM, Ferruh Yigit wrote:
> On 12/8/2023 2:38 AM, Liangxing Wang wrote:
>> The refcnt update of stored mbufs in memif driver is redundant since
>> those mbufs are only freed in eth_memif_tx_zc(). No other place
>> can free those stored mbufs quietly. So remove the redundant mbuf
>> refcnt update in dpdk memif driver to avoid extra heavy cost.
>> Performance of dpdk memif zero copy tx is improved with this change.
>>
> 
> As mentioned above, since free is called only from 'eth_memif_tx_zc()',
> this change looks good to me.
> Did you measure the performance improvement, if so can you please share it?
> 
> 
> 
> In addition to this being an optimization, it may be a required fix,
> can you please check the following case:
> 
> - When 'memif_tx_one_zc()' is called, it gets the number of free slots
> as a parameter
> - If the mbuf is a chained mbuf, only the first mbuf's reference count is
> increased
> - If the number of segments in the mbuf chain is bigger than the number of
> free slots, the function returns 0
> - In this error case 'eth_memif_tx_zc()' breaks the sending loop and returns
> - In this scenario the application decides whether to free the
> mbuf or re-send it. But in this case the application can't free the mbuf
> because of the reference count, which may cause a memory leak
> - If the application decides to re-send, the reference count is increased
> again; I guess eventually 'memif_free_stored_mbufs()' will decrease the
> refcount to be able to free it
> 
> Assuming the above is not done intentionally to make sure all mbufs are sent.
> 
> This refcount prevents the application from dropping packets at its
> discretion, so your change is required to fix this. Can you please double
> check if I am missing anything?
> 
> 

Hi Liangxing,

Let me summarize the two points above:

1. Can you quantify the performance improvement and document it in the commit log
of the next version?

2. For some cases this optimization can be required as a fix; can you
please make this a fix patch, with a Fixes tag etc., in the next version?
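
The leak scenario described in the quoted message can be sketched with a small self-contained model. This is not the memif driver code: `struct model_mbuf`, `model_tx_one_zc()` and `model_free()` are hypothetical stand-ins, and the model assumes, as the description above does, that the reference count is raised before the free-slot check fails.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, stripped-down mbuf: a reference count and a segment count. */
struct model_mbuf {
	uint16_t refcnt;
	uint16_t nb_segs;
};

/* Drop one reference; returns 1 if the buffer was actually released. */
static int model_free(struct model_mbuf *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
	return m->refcnt == 0;
}

/*
 * Model of a zero-copy Tx of one packet. With 'bump_refcnt' set it mimics
 * the behaviour described in the thread: the reference is taken first and
 * is not released when the chain does not fit in the free slots.
 * Returns 1 if the packet was queued, 0 otherwise.
 */
static int model_tx_one_zc(struct model_mbuf *m, uint16_t free_slots,
			   int bump_refcnt)
{
	if (bump_refcnt)
		m->refcnt++;
	if (m->nb_segs > free_slots)
		return 0;
	return 1;
}
```

In the model, a failed send with the extra reference leaves `refcnt` at 2, so a single free by the application does not release the buffer; without the extra reference, the same free releases it.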



Re: [PATCH v6] app/testpmd: enable cli for programmable action

2024-02-07 Thread Ferruh Yigit
On 10/11/2023 1:03 PM, Qi Zhang wrote:
> Parsing command line for rte_flow_action_prog.
> 
> Syntax:
> 
> "prog name  [arguments   \
>... end]"
> 
> Use parse_string0 to parse name string.
> Use parse_hex to parse hex string.
> Use struct action_prog_data to store parsed result.
> 
> Example:
> 
> Action with 2 arguments:
> 
> "prog name action0 arguments field0 03FF field1 55AA end"
> 
> Action without argument:
> 
> "prog name action1"
> 
> Signed-off-by: Qi Zhang 
> 
>

Hi Ori, Cristian, can you please help review this patch?
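
As background for reviewers, the argument values in the example ("03FF", "55AA") are hex strings that must be turned into bytes. The sketch below shows one way such a conversion can work; `hex_to_bytes()` is a hypothetical helper written for illustration and is not testpmd's `parse_hex`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the value of one hex digit, or -1 if the character is not hex. */
static int hex_nibble(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	c = (char)tolower((unsigned char)c);
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Convert an even-length hex string (no "0x" prefix) into bytes.
 * Returns the number of bytes written, or -1 on malformed input or if
 * the output buffer is too small.
 */
static int hex_to_bytes(const char *str, uint8_t *out, size_t out_len)
{
	size_t len, i;

	assert(str != NULL && out != NULL);
	len = strlen(str);
	if (len == 0 || len % 2 != 0 || len / 2 > out_len)
		return -1;
	for (i = 0; i < len; i += 2) {
		int hi = hex_nibble(str[i]);
		int lo = hex_nibble(str[i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i / 2] = (uint8_t)((hi << 4) | lo);
	}
	return (int)(len / 2);
}
```

For the first argument in the example above, "03FF" would yield the two bytes 0x03 and 0xFF.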



Re: [PATCH v2] app/testpmd: fix crash in multi-process packet forwarding

2024-02-07 Thread Ferruh Yigit
On 1/30/2024 1:32 AM, Dengdui Huang wrote:
> In a multi-process scenario, each process creates flows based on the
> number of queues. When nbcore is greater than 1, multiple cores may
> use the same queue to forward packets, like:
> dpdk-testpmd -a BDF --proc-type=auto -- -i --rxq=4 --txq=4
> --nb-cores=2 --num-procs=2 --proc-id=0
> testpmd> start
> mac packet forwarding - ports=1 - cores=2 - streams=4 - NUMA support
> enabled, MP allocation mode: native
> Logical Core 2 (socket 0) forwards packets on 2 streams:
>   RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
>   RX P=0/Q=1 (socket 0) -> TX P=0/Q=1 (socket 0) peer=02:00:00:00:00:00
> Logical Core 3 (socket 0) forwards packets on 2 streams:
>   RX P=0/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00
>   RX P=0/Q=1 (socket 0) -> TX P=0/Q=1 (socket 0) peer=02:00:00:00:00:00
> 
> After this commit, the result will be:
> dpdk-testpmd -a BDF --proc-type=auto -- -i --rxq=4 --txq=4
> --nb-cores=2 --num-procs=2 --proc-id=0
> testpmd> start
> io packet forwarding - ports=1 - cores=2 - streams=2 - NUMA support
> enabled, MP allocation mode: native
> Logical Core 2 (socket 0) forwards packets on 1 streams:
>   RX P=0/Q=0 (socket 2) -> TX P=0/Q=0 (socket 2) peer=02:00:00:00:00:00
> Logical Core 3 (socket 0) forwards packets on 1 streams:
>   RX P=0/Q=1 (socket 2) -> TX P=0/Q=1 (socket 2) peer=02:00:00:00:00:00
> 
> Fixes: a550baf24af9 ("app/testpmd: support multi-process")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Dengdui Huang 
> Acked-by: Chengwen Feng 
>

Thanks for the fix.

Acked-by: Ferruh Yigit 

Applied to dpdk-next-net/main, thanks.
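
The "after" output in the commit message corresponds to each core polling a disjoint set of the queues owned by its process. A minimal sketch of such a split follows. It is an assumption-laden illustration, not the testpmd implementation: `core_queues()` is a hypothetical helper, queues are assumed to be divided evenly and contiguously between processes, and streams are dealt round-robin to cores.

```c
#include <assert.h>

/*
 * Fill 'out' with the queue ids that 'core' polls in process 'proc_id'
 * and return how many there are. Each process owns nb_q / num_procs
 * consecutive queues, and no queue is given to more than one core.
 */
static unsigned int core_queues(unsigned int nb_q, unsigned int num_procs,
				unsigned int proc_id, unsigned int nb_cores,
				unsigned int core, unsigned int *out)
{
	unsigned int per_proc, first, active, s, n = 0;

	assert(num_procs > 0 && proc_id < num_procs && nb_cores > 0);
	per_proc = nb_q / num_procs;       /* queues owned by this process */
	first = proc_id * per_proc;        /* first queue id of this process */
	/* Never use more cores than there are streams to forward. */
	active = nb_cores < per_proc ? nb_cores : per_proc;
	if (core >= active)
		return 0;
	for (s = core; s < per_proc; s += active)
		out[n++] = first + s;
	return n;
}
```

With `--rxq=4 --num-procs=2 --proc-id=0 --nb-cores=2` this model gives queue 0 to the first core and queue 1 to the second, matching the one-stream-per-core layout shown above.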


[PATCH 0/3] Support IPv6 flow label based RSS

2024-02-07 Thread Ajit Khaparde
Using the 5-tuple of source address, destination address,
source port, destination port, and transport protocol type
may not be possible due to IP fragmentation, encryption, or
inability to parse past IPv6 extension headers.

Flow label values can be chosen such that they can be
used as part of the input to a hash function used in a load
distribution scheme.

On supporting hardware, the 20-bit Flow Label field in the
IPv6 header can be used to perform RSS in the ingress path.

Please apply.

Example to configure IPv6 flow label based RSS:

flow create 0 ingress pattern eth / ipv6 / tcp / end actions rss types ipv6-flow-label end / end
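
For reference, the flow label occupies the low 20 bits of the first 32-bit word of the IPv6 header (after the 4-bit version and 8-bit traffic class). The sketch below extracts it in software and maps it to a queue. It is illustrative only: `ipv6_flow_label()` and `pick_queue()` are hypothetical helpers, and the modulo is a toy stand-in for whatever hash the hardware applies.

```c
#include <assert.h>
#include <stdint.h>

#define IPV6_FLOW_LABEL_MASK 0x000FFFFFu /* low 20 bits of the first word */

/* Read the 20-bit flow label from the first four bytes of an IPv6 header. */
static uint32_t ipv6_flow_label(const uint8_t hdr[4])
{
	/* The header is in network byte order: assemble the word big-endian. */
	uint32_t vtc_flow = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
			    ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];

	return vtc_flow & IPV6_FLOW_LABEL_MASK;
}

/* Toy queue selection; real NICs feed the label into their RSS hash. */
static uint16_t pick_queue(uint32_t flow_label, uint16_t nb_queues)
{
	assert(nb_queues > 0);
	return (uint16_t)(flow_label % nb_queues);
}
```

Because the label sits in the fixed header, it can be read even when the L4 ports are hidden behind fragmentation, encryption, or extension headers.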

Ajit Khaparde (3):
  ethdev: add support for RSS based on IPv6 flow label
  app/testpmd: add IPv6 flow label to RSS types
  net/bnxt: add IPv6 flow label based RSS support

 app/test-pmd/config.c          | 1 +
 drivers/net/bnxt/bnxt.h        | 1 +
 drivers/net/bnxt/bnxt_ethdev.c | 2 ++
 drivers/net/bnxt/bnxt_hwrm.c   | 7 +++
 drivers/net/bnxt/bnxt_vnic.c   | 9 +++--
 lib/ethdev/rte_ethdev.h        | 1 +
 6 files changed, 19 insertions(+), 2 deletions(-)

-- 
2.39.2 (Apple Git-143)





[PATCH 1/3] ethdev: add support for RSS based on IPv6 flow label

2024-02-07 Thread Ajit Khaparde
On supporting hardware, the 20-bit Flow Label field in the
IPv6 header can be used to perform RSS in the ingress path.

Signed-off-by: Ajit Khaparde 
---
 lib/ethdev/rte_ethdev.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 2687c23fa6..75a3f5f2c7 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -587,6 +587,7 @@ struct rte_eth_rss_conf {
 #define RTE_ETH_RSS_L4_CHKSUM  RTE_BIT64(35)
 
 #define RTE_ETH_RSS_L2TPV2 RTE_BIT64(36)
+#define RTE_ETH_RSS_IPV6_FLOW_LABEL RTE_BIT64(37)
 
 /*
  * We use the following macros to combine with above RTE_ETH_RSS_* for
-- 
2.39.2 (Apple Git-143)




