RE: [PATCH v2 00/17] stop using variadic argument pack extension

2024-02-23 Thread Morten Brørup
> From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> Sent: Friday, 23 February 2024 00.46
> 
> RTE_LOG_LINE cannot be augmented with a prefix format and arguments
> without the user of RTE_LOG_LINE relying on the args... and ## args
> compiler extension to conditionally remove the trailing comma when the
> macro receives only a single argument.
> 
> Provide a new, similar macro, RTE_LOG_LINE_PREFIX, that accepts the
> prefix format and its arguments as separate parameters. This allows them
> to be expanded at the correct locations inside RTE_FMT(), while the rest
> of the (non-prefix) format string and arguments collapse into the
> argument pack, which can be forwarded directly with __VA_ARGS__,
> avoiding the need for conditional comma removal.
> 
> I've done my best to manually check the (preprocessed) macro expansions
> and the compiled printf output of the logs to validate correct output.
> 
> Note: due to the drastic change in this series, I have not carried any
>   series acks forward.
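
As a rough illustration of the mechanism described above (a simplified, hypothetical
sketch, not the actual RTE_FMT()-based implementation): because the user's format
string stays inside the variadic pack, the pack always has at least one element and
can be forwarded with plain __VA_ARGS__, so no ", ## args" comma-removal extension
is needed.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for RTE_LOG_LINE_PREFIX: the prefix format and its
 * (single, for simplicity) argument are separate parameters, while the user
 * format plus user arguments travel in the variadic pack.
 */
#define LOG_LINE_PREFIX(prefix_fmt, prefix_arg, ...) \
	(printf(prefix_fmt, prefix_arg), printf(__VA_ARGS__), printf("\n"))

int main(void)
{
	unsigned int port_id = 3;

	LOG_LINE_PREFIX("[port %u] ", port_id, "link is %s", "up");
	LOG_LINE_PREFIX("[port %u] ", port_id, "started"); /* empty tail: still fine */
	return 0;
}
```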

Series-acked-by: Morten Brørup 



Where to best ack a series

2024-02-23 Thread Morten Brørup
Dear maintainers,

Is it easier for you to spot if we ack a series in patch 0, patch 1, or the 
last patch of the series? Or don't you have any preferences?


Med venlig hilsen / Kind regards,
-Morten Brørup




[PATCH v1] dts: fix smoke tests driver regex

2024-02-23 Thread Juraj Linkeš
Add hyphen to the regex, which is needed for drivers such as vfio-pci.

Fixes: 88489c0501af ("dts: add smoke tests")
Cc: jspew...@iol.unh.edu
Signed-off-by: Juraj Linkeš 
---
 dts/tests/TestSuite_smoke_tests.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dts/tests/TestSuite_smoke_tests.py b/dts/tests/TestSuite_smoke_tests.py
index 5e2bac14bd..1be5c3047e 100644
--- a/dts/tests/TestSuite_smoke_tests.py
+++ b/dts/tests/TestSuite_smoke_tests.py
@@ -130,7 +130,7 @@ def test_device_bound_to_driver(self) -> None:
 # with the address for the nic we are on in the loop and then captures the
 # name of the driver in a group
 devbind_info_for_nic = re.search(
-f"{nic.pci}[^\\n]*drv=([\\d\\w]*) [^\\n]*",
+f"{nic.pci}[^\\n]*drv=([\\d\\w-]*) [^\\n]*",
 all_nics_in_dpdk_devbind,
 )
 self.verify(
-- 
2.34.1



Re: [PATCH v2] app/testpmd: use Tx preparation in txonly engine

2024-02-23 Thread Andrew Rybchenko

On 2/22/24 21:28, Konstantin Ananyev wrote:



+CC: Ethernet API maintainers
+CC: Jerin (commented on another branch of this thread)


From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com]
Sent: Sunday, 11 February 2024 16.04


TSO breaks when MSS spans more than 8 data fragments. Those packets
will be dropped by the Tx preparation API, but it will cause an MDD
event if the txonly forwarding engine does not call the Tx preparation
API before transmitting packets.



txonly is used commonly; adding Tx prepare for a specific case
may impact performance for users.

What happens when the driver throws an MDD (Malicious Driver Detection)
event, can't it be ignored? As you are already OK with dropping the packet,
can the device be configured to drop these packets?


Or, as Jerin suggested, adding a new forwarding engine is a solution,
but that will create code duplication; I prefer not to have it if this
can be handled at the device level.


Actually I agree with the author of the patch - when TX offloads
and/or multisegs are enabled,
the user is supposed to invoke eth_tx_prepare().
Not doing that seems like a bug to me.


I strongly disagree with that statement, Konstantin!
It is not documented anywhere that using TX offloads and/or multisegs
requires calling rte_eth_tx_prepare() before rte_eth_tx_burst().
And none of the examples do it.


In fact, we do use it for test-pmd/csumonly.c.
About other sample apps:
AFAIK, not many other DPDK apps use L4 offloads.
Right now special treatment (pseudo-header cksum calculation) is needed
only for L4 offloads (CKSUM, SEG).
So the majority of our apps that rely on other TX offloads (multi-seg, IPv4
cksum, VLAN insertion) happily run without
calling tx_prepare(), even though it is not the safest way.
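
A minimal sketch of the pattern under discussion (not the csumonly.c code;
port/queue setup, mbuf allocation and the handling of rejected packets are
assumed to exist elsewhere):

```c
#include <rte_ethdev.h>
#include <rte_errno.h>

/* Run the burst through rte_eth_tx_prepare() so that PMD-specific
 * requirements (pseudo-header checksum, segment limits, ...) are applied
 * or the offending packets are rejected, then transmit what was accepted. */
static uint16_t
send_burst(uint16_t port_id, uint16_t queue_id,
	   struct rte_mbuf **pkts, uint16_t nb_pkts)
{
	uint16_t nb_prep = rte_eth_tx_prepare(port_id, queue_id, pkts, nb_pkts);

	if (nb_prep < nb_pkts) {
		/* pkts[nb_prep] violated some device requirement;
		 * rte_errno holds the reason. Drop or fix it here. */
	}

	return rte_eth_tx_burst(port_id, queue_id, pkts, nb_prep);
}
```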



In my opinion:
If some driver has limitations for a feature, e.g. max 8 fragments,
it should be documented for that driver, so the application developer
can make the appropriate decisions when designing the application.
Furthermore, we have APIs for the drivers to expose to the applications
what the driver supports, so the application can configure itself
optimally at startup. Perhaps those APIs need to be expanded.
And if a feature limitation is common across the majority of drivers,
that limitation should be mentioned in the documentation of the
feature itself.


Many of such limitations *are* documented, and in fact we do have an API
to check the max number of segments that each driver supports;
see struct rte_eth_desc_lim.


Yes, this is the kind of API we should provide, so the application can 
configure itself appropriately.


The problem is:
- none of our sample apps does a proper check on these values, so users
don't have a good example of how to do it.


Agreed.
Adding an example showing how to do it properly would be the best solution.
Calling tx_prepare() in the examples is certainly not the solution.
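
For reference, a hedged sketch of what such a startup-time check could look
like (illustrative only; the policy on failure is up to the application):

```c
#include <errno.h>
#include <rte_ethdev.h>

/* Query the Tx descriptor limits advertised by the PMD and refuse (or
 * adapt) a configuration that needs more segments per packet than the
 * device reports it can handle. A reported limit of 0 is treated as
 * "not specified" here. */
static int
check_tx_seg_limits(uint16_t port_id, uint16_t segs_per_pkt_needed)
{
	struct rte_eth_dev_info dev_info;
	int ret = rte_eth_dev_info_get(port_id, &dev_info);

	if (ret != 0)
		return ret;

	if (dev_info.tx_desc_lim.nb_seg_max != 0 &&
	    segs_per_pkt_needed > dev_info.tx_desc_lim.nb_seg_max)
		return -ENOTSUP; /* e.g. disable TSO or coalesce segments */

	return 0;
}
```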


- with the current DPDK API, not all HW/PMD requirements can be
extracted programmatically:
   let's say the majority of Intel PMDs for TCP offloads expect the
pseudo-header cksum to be pre-calculated by the SW.


I hope this requirement is documented somewhere.


   another example: some HW expects pkt_len to be bigger than some
threshold value, otherwise a HW hang may appear.


I hope this requirement is also documented somewhere.


No idea, I found it only in the code.


IMHO Tx burst must check such limitations. If you made your HW simpler
(or just lost it on initial testing), pay in your drivers (or your
HW+driver will be unusable because of such problems).


Generally, if the requirements cannot be extracted programmatically, they must
be prominently documented, like this note to rte_eth_rx_burst():


Obviously, more detailed documentation is always good, but...
Right now we have 50+ different PMDs from different vendors.
Even if each and every one of them carefully documents all possible
limitations and necessary preparation steps,
how is a DPDK app developer supposed to deal with all that?
Do you expect everyone to read carefully through all of them, and handle
all of them properly on his own in each and every DPDK app he is going to write?
That seems unrealistic.
Again, what about backward compatibility: what happens when a new driver
(with new limitations) arises *after* your app is already written and tested?


+1



  * @note
  *   Some drivers using vector instructions require that *nb_pkts* is
  *   divisible by 4 or 8, depending on the driver implementation.


I'm wondering what an application should do if it needs to send just one
packet, and do it now. IMHO, such limitations are not acceptable.





- As new HW and PMDs keep appearing, it is hard to predict what extra
limitations/requirements will arise;
   that's why tx_prepare() was introduced as a driver op.



We don't want to check in the fast path what can be checked at
startup or build time!

If your app is supposed to work with just a few NIC models, known in
advance, then sure, you can do that.
For apps that are supposed to work 'in ge

Re: Where to best ack a series

2024-02-23 Thread Ferruh Yigit
On 2/23/2024 8:15 AM, Morten Brørup wrote:
> Dear maintainers,
> 
> Is it easier for you to spot if we ack a series in patch 0, patch 1, or the 
> last patch of the series? Or don't you have any preferences?
> 

When a patch is ack'ed (not the cover letter, patch 0), patchwork detects it,
shows it in the web interface (A/R/T), and automatically adds it when the
patch is applied from patchwork, so this makes life easy.

But acking each patch in a series one by one is noise for the mailing list
and overhead for the reviewer. For this case I think it is better to ack the
whole series in reply to the cover letter; the maintainer can apply this
manually to each patch.

When there is a patch series without a cover letter, I tend to reply to
patch 1, but I don't think it matters whether it is patch 1 or the last
patch. Only to differentiate whether the ack is for that patch or the
whole series, I am adding:
```
For series,
Acked-by: ...
```

From the maintainers' perspective this manual adding of tags is small enough
work to ignore, but I see authors are impacted too: if a previous version's
cover letter is acked, they are not adding this ack manually to each patch
in the next version, requiring the reviewer to ack the new version again.


I guess the best solution is to add series-ack support to patchwork;
it can be either:
- An ack on the cover letter automatically adds the ack to each patch in the series,
or
- A new "Series-acked-by: " syntax which, if patchwork detects it in any
patch of the series, automatically adds the ack to each patch in the series.



Re: [PATCH v2 5/5] net/cnxk: select optimized LLC transaction type

2024-02-23 Thread Jerin Jacob
On Thu, Feb 22, 2024 at 3:38 PM Rahul Bhansali  wrote:
>
> LLC transaction optimization by using LDWB LDTYPE option
> in SG preparation for Tx. With this, if data is present
> and dirty in LLC then the LLC would mark the data clean.
>
> Signed-off-by: Rahul Bhansali 

Series applied to dpdk-next-net-mrvl/for-main. Thanks



> ---
> Changes in v2: No change
>
>  drivers/net/cnxk/cn10k_tx.h | 16 +---
>  1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/cnxk/cn10k_tx.h b/drivers/net/cnxk/cn10k_tx.h
> index 664e47e1fc..fcd19be77e 100644
> --- a/drivers/net/cnxk/cn10k_tx.h
> +++ b/drivers/net/cnxk/cn10k_tx.h
> @@ -331,9 +331,15 @@ cn10k_nix_tx_skeleton(struct cn10k_eth_txq *txq, 
> uint64_t *cmd,
> else
> cmd[2] = NIX_SUBDC_EXT << 60;
> cmd[3] = 0;
> -   cmd[4] = (NIX_SUBDC_SG << 60) | BIT_ULL(48);
> +   if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F))
> +   cmd[4] = (NIX_SUBDC_SG << 60) | (NIX_SENDLDTYPE_LDWB 
> << 58) | BIT_ULL(48);
> +   else
> +   cmd[4] = (NIX_SUBDC_SG << 60) | BIT_ULL(48);
> } else {
> -   cmd[2] = (NIX_SUBDC_SG << 60) | BIT_ULL(48);
> +   if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F))
> +   cmd[2] = (NIX_SUBDC_SG << 60) | (NIX_SENDLDTYPE_LDWB 
> << 58) | BIT_ULL(48);
> +   else
> +   cmd[2] = (NIX_SUBDC_SG << 60) | BIT_ULL(48);
> }
>  }
>
> @@ -1989,7 +1995,11 @@ cn10k_nix_xmit_pkts_vector(void *tx_queue, uint64_t 
> *ws,
>
> senddesc01_w1 = vdupq_n_u64(0);
> senddesc23_w1 = senddesc01_w1;
> -   sgdesc01_w0 = vdupq_n_u64((NIX_SUBDC_SG << 60) | BIT_ULL(48));
> +   if (!(flags & NIX_TX_OFFLOAD_MBUF_NOFF_F))
> +   sgdesc01_w0 = vdupq_n_u64((NIX_SUBDC_SG << 60) | 
> (NIX_SENDLDTYPE_LDWB << 58) |
> + BIT_ULL(48));
> +   else
> +   sgdesc01_w0 = vdupq_n_u64((NIX_SUBDC_SG << 60) | BIT_ULL(48));
> sgdesc23_w0 = sgdesc01_w0;
>
> if (flags & NIX_TX_NEED_EXT_HDR) {
> --
> 2.25.1
>


Re: [PATCH v2 1/4] ethdev: add function to check representor port

2024-02-23 Thread Ferruh Yigit
On 2/23/2024 2:42 AM, Chaoyong He wrote:
> From: Long Wu 
> 
> Add a function to check if a device is a representor port, and also
> modify the related code in the PMDs.
> 

Thanks Long for the patch.

> Signed-off-by: Long Wu 
> Reviewed-by: Chaoyong He 
> Reviewed-by: Peng Zhang 
> ---
>  doc/guides/rel_notes/release_24_03.rst |  3 +++
>  drivers/net/bnxt/bnxt.h|  3 ---
>  drivers/net/bnxt/bnxt_ethdev.c |  4 ++--
>  drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c | 12 ++--
>  drivers/net/bnxt/tf_ulp/bnxt_ulp.c |  4 ++--
>  drivers/net/bnxt/tf_ulp/ulp_def_rules.c|  4 ++--
>  drivers/net/cpfl/cpfl_representor.c|  2 +-
>  drivers/net/enic/enic.h|  5 -
>  drivers/net/enic/enic_ethdev.c |  2 +-
>  drivers/net/enic/enic_fm_flow.c| 20 ++--
>  drivers/net/enic/enic_main.c   |  4 ++--
>  drivers/net/i40e/i40e_ethdev.c |  2 +-
>  drivers/net/ice/ice_dcf_ethdev.c   |  2 +-
>  drivers/net/ixgbe/ixgbe_ethdev.c   |  2 +-
>  drivers/net/nfp/flower/nfp_flower_flow.c   |  2 +-
>  drivers/net/nfp/nfp_mtr.c  |  2 +-
>  drivers/net/nfp/nfp_net_common.c   |  4 ++--
>  drivers/net/nfp/nfp_net_flow.c |  2 +-
>  lib/ethdev/ethdev_driver.h | 17 +
>  19 files changed, 54 insertions(+), 42 deletions(-)
> 

There are two more instances in 'rte_class_eth.c'

> diff --git a/doc/guides/rel_notes/release_24_03.rst 
> b/doc/guides/rel_notes/release_24_03.rst
> index 879bb4944c..8178417b98 100644
> --- a/doc/guides/rel_notes/release_24_03.rst
> +++ b/doc/guides/rel_notes/release_24_03.rst
> @@ -185,6 +185,9 @@ API Changes
>  * ethdev: Renamed structure ``rte_flow_action_modify_data`` to be
>``rte_flow_field_data`` for more generic usage.
>  
> +* ethdev: Add new function ``rte_eth_dev_is_repr()`` to check if a device is
> +  representor port.
> +
>  

This is not a user-facing API, so there is no need to update the release notes.

<...>

> +/**
> + * @internal
> + * Check if the ethdev is a representor port.
> + *
> + * @param dev
> + *  Pointer to struct rte_eth_dev.
> + *
> + * @return
> + *  false the ethdev is not a representor port.
> + *  true  the ethdev is a representor port.
> + */
> +static inline bool
> +rte_eth_dev_is_repr(const struct rte_eth_dev *dev)
> +{
> + return ((dev->data->dev_flags & RTE_ETH_DEV_REPRESENTOR) != 0);
> +}
> +

There is a 'rte_eth_representor_id_get()' API above, can you move this
new one below it to group them together?



Re: [PATCH] app/dma-perf: add average latency per worker

2024-02-23 Thread fengchengwen
Hi Vipin,

On 2023/12/20 0:40, Vipin Varghese wrote:
> Modify the user display data with total average latency per worker.
> 
> Signed-off-by: Vipin Varghese 
> ---
>  app/test-dma-perf/benchmark.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/app/test-dma-perf/benchmark.c b/app/test-dma-perf/benchmark.c
> index 9b1f58c78c..8b6886af62 100644
> --- a/app/test-dma-perf/benchmark.c
> +++ b/app/test-dma-perf/benchmark.c
> @@ -470,7 +470,8 @@ mem_copy_benchmark(struct test_configure *cfg, bool 
> is_dma)
>   bandwidth_total += bandwidth;
>   avg_cycles_total += avg_cycles;
>   }
> - printf("\nTotal Bandwidth: %.3lf Gbps, Total MOps: %.3lf\n", 
> bandwidth_total, mops_total);
> + printf("\nAverage Cycles/op: %.2lf, Total Bandwidth: %.3lf Gbps, Total 
> MOps: %.3lf\n",
> + (float) avg_cycles_total / nb_workers, bandwidth_total, 
> mops_total);

Because this is a total statistic, I suggest adding a "Total" prefix, e.g.
"Total Average Cycles/op".

I think one-digit precision is enough for the print format. Also please
modify CSV_TOTAL_LINE_FMT to make sure the CSV has the same precision for
Cycles/op.

Thanks

>   snprintf(output_str[MAX_WORKER_NB], MAX_OUTPUT_STR_LEN, 
> CSV_TOTAL_LINE_FMT,
>   cfg->scenario_id, nr_buf, memory * nb_workers,
>   avg_cycles_total / nb_workers, bandwidth_total, 
> mops_total);
> 
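
A possible shape of the suggested change (sketch only, reusing the names from
the snippet above; CSV_TOTAL_LINE_FMT would need the matching precision change
as well):

```c
printf("\nTotal Average Cycles/op: %.1lf, Total Bandwidth: %.3lf Gbps, Total MOps: %.3lf\n",
       (float)avg_cycles_total / nb_workers, bandwidth_total, mops_total);
```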


Re: [RFC v3 1/6] eal: add static per-lcore memory allocation facility

2024-02-23 Thread Mattias Rönnblom

On 2024-02-22 10:22, Morten Brørup wrote:

From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
Sent: Tuesday, 20 February 2024 09.49

Introduce DPDK per-lcore id variables, or lcore variables for short.

An lcore variable has one value for every current and future lcore
id-equipped thread.

The primary  use case is for statically allocating
small chunks of often-used data, which is related logically, but where
there are performance benefits to reap from having updates being local
to an lcore.

Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but decouple the values' lifetime from that of the
threads.

Lcore variables are also similar in terms of functionality provided by
FreeBSD kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its, otherwise seemingly viable, approach.

The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore now is close (spatially, in memory), rather than data used by
the same module, which in turn avoid excessive use of padding,
polluting caches with unused data.

RFC v3:
  * Replace use of GCC-specific alignof() with alignof().
  * Update example to reflect FOREACH macro name change (in RFC v2).

RFC v2:
  * Use alignof to derive alignment requirements. (Morten Brørup)
  * Change name of FOREACH to make it distinct from rte_lcore.h's
*per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
  * Allow user-specified alignment, but limit max to cache line size.

Signed-off-by: Mattias Rönnblom 
---
  config/rte_config.h   |   1 +
  doc/api/doxy-api-index.md |   1 +
  lib/eal/common/eal_common_lcore_var.c |  82 ++
  lib/eal/common/meson.build|   1 +
  lib/eal/include/meson.build   |   1 +
  lib/eal/include/rte_lcore_var.h   | 375 ++
  lib/eal/version.map   |   4 +
  7 files changed, 465 insertions(+)
  create mode 100644 lib/eal/common/eal_common_lcore_var.c
  create mode 100644 lib/eal/include/rte_lcore_var.h

diff --git a/config/rte_config.h b/config/rte_config.h
index da265d7dd2..884482e473 100644
--- a/config/rte_config.h
+++ b/config/rte_config.h
@@ -30,6 +30,7 @@
  /* EAL defines */
  #define RTE_CACHE_GUARD_LINES 1
  #define RTE_MAX_HEAPS 32
+#define RTE_MAX_LCORE_VAR 1048576
  #define RTE_MAX_MEMSEG_LISTS 128
  #define RTE_MAX_MEMSEG_PER_LIST 8192
  #define RTE_MAX_MEM_MB_PER_LIST 32768
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index a6a768bd7c..bb06bb7ca1 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -98,6 +98,7 @@ The public API headers are grouped by topics:
[interrupts](@ref rte_interrupts.h),
[launch](@ref rte_launch.h),
[lcore](@ref rte_lcore.h),
+  [lcore-varible](@ref rte_lcore_var.h),
[per-lcore](@ref rte_per_lcore.h),
[service cores](@ref rte_service.h),
[keepalive](@ref rte_keepalive.h),
diff --git a/lib/eal/common/eal_common_lcore_var.c
b/lib/eal/common/eal_common_lcore_var.c
new file mode 100644
index 00..dfd11cbd0b
--- /dev/null
+++ b/lib/eal/common/eal_common_lcore_var.c
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+
+#include "eal_private.h"
+
+#define WARN_THRESHOLD 75


It's not an error condition, so 75 % seems like a low threshold for WARNING.
Consider increasing it to 95 %, or change the level to NOTICE.
Or both.



I'll make an attempt at a variant which uses the libc heap instead of 
BSS, and does so dynamically. Then one need not worry about a fixed-size 
upper bound, barring heap allocation failures (which you are best off 
making fatal in the lcore variables case).


The glibc heap is available early (as early as the earliest RTE_INIT()).

You also avoid the headache of thinking about what happens if indeed all 
of the rte_lcore_var array is backed by actual memory. That could be due 
to mlockall(), huge page use for BSS, or systems where BSS is not 
on-demand mapped. I have no idea how paging works on Windows NT, for 
example.



+
+/*
+ * Avoid using offset zero, since it would result in a NULL-value
+ * "handle" (offset) pointer, which in principle and per the API
+ * definition shouldn't be an issue, but may confuse some tools and
+ * users.
+ */
+#define INITIAL_OFFSET 1
+
+char rte_lcore_var[RTE_MAX_LCORE][RTE_MAX_LCORE_VAR] __rte_cache_aligned;
+
+static uintptr_t allocated = INITIAL_OFFSET;


Please add an API to get the amount of allocated lcore variable memory.
The easy option is to make the above variable public (with a proper name, e.g. 
rte_lcore_var_allocated).

The total amount of lcore variable memory is al

Re: [RFC v3 5/6] service: keep per-lcore state in lcore variable

2024-02-23 Thread Mattias Rönnblom

On 2024-02-22 10:42, Morten Brørup wrote:

From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
Sent: Tuesday, 20 February 2024 09.49

Replace static array of cache-aligned structs with an lcore variable,
to slightly benefit code simplicity and performance.

Signed-off-by: Mattias Rönnblom 
---




@@ -486,8 +489,7 @@ service_runner_func(void *arg)
  {
RTE_SET_USED(arg);
uint8_t i;
-   const int lcore = rte_lcore_id();
-   struct core_state *cs = &lcore_states[lcore];
+   struct core_state *cs = RTE_LCORE_VAR_PTR(lcore_states);


Typo: TAB -> SPACE.



Will fix.



rte_atomic_store_explicit(&cs->thread_active, 1,
rte_memory_order_seq_cst);

@@ -533,13 +535,16 @@ service_runner_func(void *arg)
  int32_t
  rte_service_lcore_may_be_active(uint32_t lcore)
  {
-   if (lcore >= RTE_MAX_LCORE || !lcore_states[lcore].is_service_core)
+   struct core_state *cs =
+   RTE_LCORE_VAR_LCORE_PTR(lcore, lcore_states);
+
+   if (lcore >= RTE_MAX_LCORE || !cs->is_service_core)
return -EINVAL;


This comment is mostly related to patch 1 in the series...

You are setting cs = RTE_LCORE_VAR_LCORE_PTR(lcore, ...) before validating
that lcore < RTE_MAX_LCORE. I wondered if that potentially was an overrun bug.

It is obvious when looking at the RTE_LCORE_VAR_LCORE_PTR() macro implementation,
but perhaps its description could mention that it is safe to use with an
"invalid" lcore_id, as long as the result is not dereferenced.



I thought about adding something equivalent to an RTE_ASSERT() on 
lcore_id in the dereferencing macros, but then I thought that maybe it 
is a valid use case to pass invalid lcore ids.


Invalid ids being OK or not, I think the above code should do "cs =
/../" *after* the lcore id check. Now it looks strange and forces the
reader to consider whether this is valid or not, for no good reason.


The lcore variable API docs should probably explicitly allow invalid 
core id in the macros.


[PATCH v2] examples/ipsec-secgw: fix cryptodev to SA mapping

2024-02-23 Thread Radu Nicolau
There are use cases where an SA should be able to use different cryptodevs on
different lcores; for example, there can be cryptodevs with just 1 qp per VF.
For this purpose this patch relaxes the check in the create_lookaside_session()
function.
Also add a check to verify that a CQP is available for the current lcore.

Fixes: a8ade12123c3 ("examples/ipsec-secgw: create lookaside sessions at init")
Cc: sta...@dpdk.org
Cc: vfia...@marvell.com

Signed-off-by: Radu Nicolau 
Signed-off-by: Radu Nicolau 
Tested-by: Ting-Kai Ku 
Signed-off-by: Radu Nicolau 
Acked-by: Ciara Power 
Acked-by: Kai Ji 
---
v2: add likely to CQP available branch

 examples/ipsec-secgw/ipsec.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c
index f5cec4a928..7bb9646736 100644
--- a/examples/ipsec-secgw/ipsec.c
+++ b/examples/ipsec-secgw/ipsec.c
@@ -288,10 +288,9 @@ create_lookaside_session(struct ipsec_ctx 
*ipsec_ctx_lcore[],
if (cdev_id == RTE_CRYPTO_MAX_DEVS)
cdev_id = ipsec_ctx->tbl[cdev_id_qp].id;
else if (cdev_id != ipsec_ctx->tbl[cdev_id_qp].id) {
-   RTE_LOG(ERR, IPSEC,
-   "SA mapping to multiple cryptodevs is "
-   "not supported!");
-   return -EINVAL;
+   RTE_LOG(WARNING, IPSEC,
+   "SA mapped to multiple cryptodevs for SPI %d\n",
+   sa->spi);
}
 
/* Store per core queue pair information */
@@ -908,7 +907,11 @@ ipsec_enqueue(ipsec_xform_fn xform_func, struct ipsec_ctx 
*ipsec_ctx,
continue;
}
 
-   enqueue_cop(sa->cqp[ipsec_ctx->lcore_id], &priv->cop);
+   if (likely(sa->cqp[ipsec_ctx->lcore_id]))
+   enqueue_cop(sa->cqp[ipsec_ctx->lcore_id], &priv->cop);
+   else
+   RTE_LOG(ERR, IPSEC, "No CQP available for lcore %d\n",
+   ipsec_ctx->lcore_id);
}
 }
 
-- 
2.34.1



Re: [PATCH v5 1/3] config/arm: avoid mcpu and march conflicts

2024-02-23 Thread Juraj Linkeš
Other than the one point below,
Reviewed-by: Juraj Linkeš 

> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 36f21d2259..d05d54b564 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build

> @@ -695,13 +698,37 @@ if update_flags
>
>  machine_args = [] # Clear previous machine args
>
> -# probe supported archs and their features
> +march_features = []
> +if part_number_config.has_key('march_features')
> +march_features += part_number_config['march_features']
> +endif
> +if soc_config.has_key('extra_march_features')
> +march_features += soc_config['extra_march_features']
> +endif
> +
> +candidate_mcpu = ''
>  candidate_march = ''
> -if part_number_config.has_key('march')
> +
> +if part_number_config.has_key('mcpu') and
> +   cc.has_argument('-mcpu=' + part_number_config['mcpu'])
> +candidate_mcpu = '-mcpu=' + part_number_config['mcpu']
> +foreach feature: march_features
> +if cc.has_argument('+'.join([candidate_mcpu, feature]))
> +candidate_mcpu = '+'.join([candidate_mcpu, feature])
> +else
> +warning('The compiler does not support feature @0@'
> +.format(feature))
> +endif
> +endforeach
> +machine_args += candidate_mcpu
> +elif part_number_config.has_key('march')
> +# probe supported archs and their features
>  if part_number_config.get('force_march', false)
> -candidate_march = part_number_config['march']
> +if cc.has_argument('-march=' +  part_number_config['march'])
> +candidate_march = part_number_config['march']
> +endif

The check was omitted here by design because aarch32 builds with some
compilers don't support -march=armv8-a alone, only with -mfpu= as
well.

>  else
> -supported_marchs = ['armv8.6-a', 'armv8.5-a', 'armv8.4-a', 
> 'armv8.3-a',
> +supported_marchs = ['armv9-a', 'armv8.6-a', 'armv8.5-a', 
> 'armv8.4-a', 'armv8.3-a',
>  'armv8.2-a', 'armv8.1-a', 'armv8-a']
>  check_compiler_support = false
>  foreach supported_march: supported_marchs
> @@ -717,32 +744,31 @@ if update_flags
>  endif
>  endforeach
>  endif
> -if candidate_march == ''
> -error('No suitable armv8 march version found.')
> -endif
> +
>  if candidate_march != part_number_config['march']
> -warning('Configuration march version is ' +
> -'@0@, but the compiler supports only @1@.'
> -.format(part_number_config['march'], candidate_march))
> +warning('Configuration march version is @0@, not supported.'
> +.format(part_number_config['march']))
> +if candidate_march != ''
> +warning('Using march version @0@.'.format(candidate_march))
> +endif
>  endif
> -candidate_march = '-march=' + candidate_march
>
> -march_features = []
> -if part_number_config.has_key('march_features')
> -march_features += part_number_config['march_features']
> -endif
> -if soc_config.has_key('extra_march_features')
> -march_features += soc_config['extra_march_features']
> +if candidate_march != ''
> +candidate_march = '-march=' + candidate_march
> +foreach feature: march_features
> +if cc.has_argument('+'.join([candidate_march, feature]))
> +candidate_march = '+'.join([candidate_march, feature])
> +else
> +warning('The compiler does not support feature @0@'
> +.format(feature))
> +endif
> +endforeach
> +machine_args += candidate_march
>  endif
> -foreach feature: march_features
> -if cc.has_argument('+'.join([candidate_march, feature]))
> -candidate_march = '+'.join([candidate_march, feature])
> -else
> -warning('The compiler does not support feature @0@'
> -.format(feature))
> -endif
> -endforeach
> -machine_args += candidate_march
> +endif
> +
> +if candidate_mcpu == '' and candidate_march == ''
> +error('No suitable ARM march/mcpu version found.')
>  endif
>
>  # apply supported compiler options
> --
> 2.25.1
>


Re: [PATCH v3] net/af_xdp: fix resources leak when xsk configure fails

2024-02-23 Thread Ferruh Yigit
On 2/23/2024 1:45 AM, Yunjian Wang wrote:
> In xdp_umem_configure() we allocate some resources for the
> xsk umem; we should delete them when xsk configuration fails,
> otherwise it will lead to a resource leak.
> 
> Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Yunjian Wang 
> 

Moving from previous version:
Reviewed-by: Ciara Loftus 

Acked-by: Ferruh Yigit 


Applied to dpdk-next-net/main, thanks.


Re: [PATCH v5 2/3] config/arm: add support for fallback march

2024-02-23 Thread Juraj Linkeš
On Thu, Feb 22, 2024 at 1:45 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> Some ARM CPUs have specific march requirements and
> are not compatible with the supported march list.
> Add fallback march in case the mcpu and the march
> advertised in the part_number_config are not supported
> by the compiler.
>
> Example
> mcpu = neoverse-n2
> march = armv9-a
> fallback_march = armv8.5-a
>
> mcpu, march not supported
> machine_args = ['-march=armv8.5-a']
>
> mcpu, march, fallback_march not supported
> least march supported = armv8-a
>
> machine_args = ['-march=armv8-a']
>
> Signed-off-by: Pavan Nikhilesh 
> ---
>  config/arm/meson.build | 14 +++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index d05d54b564..87ff5039f6 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -94,6 +94,7 @@ part_number_config_arm = {
>  '0xd49': {
>  'march': 'armv9-a',
>  'march_features': ['sve2'],
> +'fallback_march': 'armv8.5-a',
>  'mcpu': 'neoverse-n2',
>  'flags': [
>  ['RTE_MACHINE', '"neoverse-n2"'],
> @@ -708,6 +709,7 @@ if update_flags
>
>  candidate_mcpu = ''
>  candidate_march = ''
> +fallback_march = ''
>
>  if part_number_config.has_key('mcpu') and
> cc.has_argument('-mcpu=' + part_number_config['mcpu'])
> @@ -736,16 +738,22 @@ if update_flags
>  # start checking from this version downwards
>  check_compiler_support = true
>  endif
> -if (check_compiler_support and
> +if (check_compiler_support and candidate_march == '' and
>  cc.has_argument('-march=' + supported_march))
>  candidate_march = supported_march
> -# highest supported march version found
> -break
> +endif
> +if (part_number_config.has_key('fallback_march') and
> +supported_march == part_number_config['fallback_march'] 
> and
> +cc.has_argument('-march=' + supported_march))
> +fallback_march = supported_march

If both fallback_march and march are supported, fallback_march is
going to be chosen over march. I think this is what we want instead:
Use march if supported,
then use fallback_march if supported,
then use the other fallback marchs.

If the above is indeed what we want, we could just put this after
endforeach (in the original version):

# at this point, candidate march is either
part_number_config['fallback_march'] or some other lower version
if (part_number_config.has_key('fallback_march') and
candidate_march != part_number_config['march'] and
cc.has_argument('-march=' + part_number_config['fallback_march']))
# this overwrites only the lower version with preferred
fallback_march, if supported
candidate_march = part_number_config['fallback_march']
endif

This way we won't even need the fallback_march variable.

>  endif
>  endforeach
>  endif
>
>  if candidate_march != part_number_config['march']
> +if fallback_march != ''
> +candidate_march = fallback_march
> +endif
>  warning('Configuration march version is @0@, not supported.'
>  .format(part_number_config['march']))
>  if candidate_march != ''
> --
> 2.25.1
>


Re: [PATCH v2 1/2] app/testpmd: fix modify tag typo

2024-02-23 Thread Ferruh Yigit
On 2/23/2024 3:21 AM, Rongwei Liu wrote:
> Update the name to the right one: "src_tag_index"
> 
> Fixes: c23626f27b09 ("ethdev: add MPLS header modification")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Rongwei Liu 
> Acked-by: Dariusz Sosnowski 
> 

Acked-by: Ferruh Yigit 



Re: [PATCH v2 0/2] Fix modify flex item error

2024-02-23 Thread Ferruh Yigit
On 2/23/2024 3:21 AM, Rongwei Liu wrote:
> v2: rebase.
> 
> Rongwei Liu (2):
>   app/testpmd: fix modify tag typo
>   net/mlx5: fix modify flex item error
> 

Series applied to dpdk-next-net/main, thanks.



[PATCH v4 1/2] crypto/ipsec_mb: bump minimum IPsec Multi-buffer version

2024-02-23 Thread Sivaramakrishnan Venkat
SW PMDs increment IPsec Multi-buffer version to 1.4.
A minimum IPsec Multi-buffer version of 1.4 or greater is now required.

Signed-off-by: Sivaramakrishnan Venkat 
Acked-by: Ciara Power 
Acked-by: Pablo de Lara 
---
  v4:
 - 24.03 release notes updated to bump minimum IPSec Multi-buffer
   version to 1.4 for SW PMDs.
  v2:
 - Removed unused macro in ipsec_mb_ops.c
 - set_gcm_job() modified correctly to keep multi_sgl_job line
 - Updated SW PMDs documentation for minimum IPSec Multi-buffer version
 - Updated commit message, and patch title.
---
 doc/guides/cryptodevs/aesni_gcm.rst |   3 +-
 doc/guides/cryptodevs/aesni_mb.rst  |   3 +-
 doc/guides/cryptodevs/chacha20_poly1305.rst |   3 +-
 doc/guides/cryptodevs/kasumi.rst|   3 +-
 doc/guides/cryptodevs/snow3g.rst|   3 +-
 doc/guides/cryptodevs/zuc.rst   |   3 +-
 doc/guides/rel_notes/release_24_03.rst  |   4 +
 drivers/crypto/ipsec_mb/ipsec_mb_ops.c  |  23 ---
 drivers/crypto/ipsec_mb/meson.build |   2 +-
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c  | 165 
 drivers/crypto/ipsec_mb/pmd_aesni_mb_priv.h |   9 --
 11 files changed, 17 insertions(+), 204 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_gcm.rst 
b/doc/guides/cryptodevs/aesni_gcm.rst
index f5773426ee..dc665e536c 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -85,7 +85,8 @@ and the external crypto libraries supported by them:
18.05 - 19.02  Multi-buffer library 0.49 - 0.52
19.05 - 20.08  Multi-buffer library 0.52 - 0.55
20.11 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11+ Multi-buffer library 1.0  - 1.5*
+   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*
+   24.03+ Multi-buffer library 1.4  - 1.5*
=  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/aesni_mb.rst 
b/doc/guides/cryptodevs/aesni_mb.rst
index b2e74ba417..5d670ee237 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -146,7 +146,8 @@ and the Multi-Buffer library version supported by them:
19.05 - 19.08   0.52
19.11 - 20.08   0.52 - 0.55
20.11 - 21.08   0.53 - 1.3*
-   21.11+  1.0  - 1.5*
+   21.11 - 23.11   1.0  - 1.5*
+   24.03+  1.4  - 1.5*
==  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/chacha20_poly1305.rst 
b/doc/guides/cryptodevs/chacha20_poly1305.rst
index 9d4bf86cf1..c32866b301 100644
--- a/doc/guides/cryptodevs/chacha20_poly1305.rst
+++ b/doc/guides/cryptodevs/chacha20_poly1305.rst
@@ -72,7 +72,8 @@ and the external crypto libraries supported by them:
=  
DPDK version   Crypto library version
=  
-   21.11+ Multi-buffer library 1.0-1.5*
+   21.11 - 23.11  Multi-buffer library 1.0-1.5*
+   24.03+ Multi-buffer library 1.4-1.5*
=  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/kasumi.rst b/doc/guides/cryptodevs/kasumi.rst
index 0989054875..a8f4e6b204 100644
--- a/doc/guides/cryptodevs/kasumi.rst
+++ b/doc/guides/cryptodevs/kasumi.rst
@@ -87,7 +87,8 @@ and the external crypto libraries supported by them:
=  
16.11 - 19.11  LibSSO KASUMI
20.02 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11+ Multi-buffer library 1.0  - 1.5*
+   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*
+   24.03+ Multi-buffer library 1.4  - 1.5*
=  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/snow3g.rst b/doc/guides/cryptodevs/snow3g.rst
index 3392932653..46863462e5 100644
--- a/doc/guides/cryptodevs/snow3g.rst
+++ b/doc/guides/cryptodevs/snow3g.rst
@@ -96,7 +96,8 @@ and the external crypto libraries supported by them:
=  
16.04 - 19.11  LibSSO SNOW3G
20.02 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11+ Multi-buffer library 1.0  - 1.5*
+   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*
+   24.03+ Multi-buffer library 1.4  - 1.5*
=  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/zuc.rst b/doc/guides/cryptodevs/zuc.rst
index a414b5ad2c..51867e1a16 100644
--- a/doc/guides/cryptodevs/zuc.rst
+++ b/doc/guides/cryptodevs/zuc.rst
@@ -95,7 +95,8 @@ and the external crypto libraries supported by them:
=  ===

[PATCH v4 2/2] doc: remove outdated version details

2024-02-23 Thread Sivaramakrishnan Venkat
The SW PMDs documentation is updated to remove details of unsupported IPsec
Multi-buffer versions. DPDK older than 20.11 is end of life, so older
DPDK versions are removed from the Crypto library version table.

Signed-off-by: Sivaramakrishnan Venkat 
Acked-by: Pablo de Lara 
---
  v3:
- added second patch for outdated documentation updates.
---
 doc/guides/cryptodevs/aesni_gcm.rst | 19 +++---
 doc/guides/cryptodevs/aesni_mb.rst  | 22 +++--
 doc/guides/cryptodevs/chacha20_poly1305.rst | 12 ++-
 doc/guides/cryptodevs/kasumi.rst| 14 +++--
 doc/guides/cryptodevs/snow3g.rst| 15 +++---
 doc/guides/cryptodevs/zuc.rst   | 15 +++---
 6 files changed, 17 insertions(+), 80 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_gcm.rst 
b/doc/guides/cryptodevs/aesni_gcm.rst
index dc665e536c..e38a03b78f 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -62,12 +62,6 @@ Once it is downloaded, extract it and follow these steps:
 make
 make install
 
-.. note::
-
-   Compilation of the Multi-Buffer library is broken when GCC < 5.0, if 
library <= v0.53.
-   If a lower GCC version than 5.0, the workaround proposed by the following 
link
-   should be used: ``_.
-
 
 As a reference, the following table shows a mapping between the past DPDK 
versions
 and the external crypto libraries supported by them:
@@ -79,18 +73,11 @@ and the external crypto libraries supported by them:
=  
DPDK version   Crypto library version
=  
-   16.04 - 16.11  Multi-buffer library 0.43 - 0.44
-   17.02 - 17.05  ISA-L Crypto v2.18
-   17.08 - 18.02  Multi-buffer library 0.46 - 0.48
-   18.05 - 19.02  Multi-buffer library 0.49 - 0.52
-   19.05 - 20.08  Multi-buffer library 0.52 - 0.55
-   20.11 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*
-   24.03+ Multi-buffer library 1.4  - 1.5*
+   20.11 - 21.08  Multi-buffer library 0.53 - 1.3
+   21.11 - 23.11  Multi-buffer library 1.0  - 1.5
+   24.03+ Multi-buffer library 1.4  - 1.5
=  
 
-\* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
-
 Initialization
 --
 
diff --git a/doc/guides/cryptodevs/aesni_mb.rst 
b/doc/guides/cryptodevs/aesni_mb.rst
index 5d670ee237..bd7c8de07f 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -121,12 +121,6 @@ Once it is downloaded, extract it and follow these steps:
 make
 make install
 
-.. note::
-
-   Compilation of the Multi-Buffer library is broken when GCC < 5.0, if 
library <= v0.53.
-   If a lower GCC version than 5.0, the workaround proposed by the following 
link
-   should be used: ``_.
-
 As a reference, the following table shows a mapping between the past DPDK 
versions
 and the Multi-Buffer library version supported by them:
 
@@ -137,21 +131,11 @@ and the Multi-Buffer library version supported by them:
==  
DPDK versionMulti-buffer library version
==  
-   2.2 - 16.11 0.43 - 0.44
-   17.02   0.44
-   17.05 - 17.08   0.45 - 0.48
-   17.11   0.47 - 0.48
-   18.02   0.48
-   18.05 - 19.02   0.49 - 0.52
-   19.05 - 19.08   0.52
-   19.11 - 20.08   0.52 - 0.55
-   20.11 - 21.08   0.53 - 1.3*
-   21.11 - 23.11   1.0  - 1.5*
-   24.03+  1.4  - 1.5*
+   20.11 - 21.08   0.53 - 1.3
+   21.11 - 23.11   1.0  - 1.5
+   24.03+  1.4  - 1.5
==  
 
-\* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
-
 Initialization
 --
 
diff --git a/doc/guides/cryptodevs/chacha20_poly1305.rst 
b/doc/guides/cryptodevs/chacha20_poly1305.rst
index c32866b301..8e0ee4f835 100644
--- a/doc/guides/cryptodevs/chacha20_poly1305.rst
+++ b/doc/guides/cryptodevs/chacha20_poly1305.rst
@@ -56,12 +56,6 @@ Once it is downloaded, extract it and follow these steps:
 make
 make install
 
-.. note::
-
-   Compilation of the Multi-Buffer library is broken when GCC < 5.0, if 
library <= v0.53.
-   If a lower GCC version than 5.0, the workaround proposed by the following 
link
-   should be used: ``_.
-
 As a reference, the following table shows a mapping between the past DPDK 
versions
 and the external crypto libraries supported by them:
 
@@ -72,12 +66,10 @@ and the external crypto libraries supported by them:
=  
DPDK version   Crypto library version
=  
-   21


Re: [PATCH v4 00/12] improve eventdev API specification/documentation

2024-02-23 Thread Jerin Jacob
On Wed, Feb 21, 2024 at 4:10 PM Bruce Richardson
 wrote:
>
> This patchset makes rewording improvements to the eventdev doxygen
> documentation to try and ensure that it is as clear as possible,
> describes the implementation as accurately as possible, and is
> consistent within itself.
>
> Most changes are just minor rewordings, along with plenty of changes to
> change references into doxygen links/cross-references.
>
> In tightening up the definitions, there may be subtle changes in meaning
> which should be checked for carefully by reviewers. Where there was
> ambiguity, the behaviour of existing code is documented so as to avoid
> breaking existing apps.
>
> V4:
> * additional rework following comments from Jerin and on-list discussion
> * extra 12th patch to clean up some doxygen issues


@Mattias Rönnblom  I would like to merge this for rc2. It would be
great if you can review this version and Ack it.


Re: [PATCH v7 1/4] net/bnx2x: fix warnings about rte_memcpy lengths

2024-02-23 Thread Jerin Jacob
On Thu, Mar 9, 2023 at 9:53 PM Stephen Hemminger
 wrote:
>
> On Thu, 9 Feb 2023 17:49:31 +0100
> Morten Brørup  wrote:
>
> > >  rte_memcpy(old, new, sizeof(struct nig_stats));
> > >
> > > -rte_memcpy(&(estats->rx_stat_ifhcinbadoctets_hi), &(pstats-
> > > >mac_stx[1]),
> > > -  sizeof(struct mac_stx));
> > > +   rte_memcpy(RTE_PTR_ADD(estats,
> > > +   offsetof(struct bnx2x_eth_stats,
> > > rx_stat_ifhcinbadoctets_hi)),
> > > +   &pstats->mac_stx[1], sizeof(struct mac_stx));
>
> Stop using rte_memcpy() in the slow path like this.
> memcpy() is just as fast, the compiler can optimize it, and the checking tools
> are better with it.

+1

@Morten Brørup Could you send the next version? I am marking as Change
requested.


Re: [PATCH v7 2/4] event/dlb2: remove superfluous rte_memcpy

2024-02-23 Thread Jerin Jacob
On Fri, Feb 10, 2023 at 1:13 PM Morten Brørup  
wrote:
>
> > From: Sevincer, Abdullah [mailto:abdullah.sevin...@intel.com]
> > Sent: Thursday, 9 February 2023 19.50
> >
> > Acked: by abdullah.sevin...@intel.com
>
> Thanks.
>
> Patchwork didn't catch it due to formatting, but the point is obvious:
>
> Acked-by: Abdullah Sevincer 

Applied to dpdk-next-eventdev/for-main. Thanks


Re: [PATCH v2 0/3] net/ionic, common/ionic: add vdev support

2024-02-23 Thread Ferruh Yigit
On 2/21/2024 4:36 PM, Ferruh Yigit wrote:
> On 2/20/2024 8:42 PM, Andrew Boyer wrote:
>> This patch series adds support to net/ionic for using UIO platform devices
>> as DPDK vdevs. This is used by client applications which run directly on
>> the AMD Pensando family of devices.
>>
>> The UIO code is implemented in a new common code library so that it can
>> be shared with the upcoming crypto/ionic driver.
>>
>> V2:
>> - Redesign vdev device scan as suggested by review.
>> - Re-sort entries in config/arm/meson.build as suggested by review.
>>
>> Andrew Boyer (3):
>>   common/ionic: create common code library for ionic
>>   net/ionic: remove duplicate barriers
>>   net/ionic: add vdev support for embedded applications
>>
> 
> for series,
> Acked-by: Ferruh Yigit 
>

Series applied to dpdk-next-net/main, thanks.


[PATCH v8] net/bnx2x: fix warnings about rte_memcpy lengths

2024-02-23 Thread Morten Brørup
Bugfix: The vlan in the bulletin does not contain a VLAN header, only the
VLAN ID, so only copy 2 bytes, not 4. The target structure has padding
after the field, so copying 2 bytes too many is effectively harmless.
There is no need to backport this patch.

Use RTE_PTR_ADD where copying arrays to the offset of a first field in a
structure holding multiple fields, to avoid compiler warnings with
decorated rte_memcpy.

Bugzilla ID: 1146

Fixes: 540a211084a7695a1c7bc43068934c140d6989be ("bnx2x: driver core")
Cc: step...@networkplumber.org
Cc: rm...@marvell.com
Cc: shsha...@marvell.com
Cc: pa...@marvell.com

Signed-off-by: Morten Brørup 
Acked-by: Devendra Singh Rawat 
---
v8:
* Use memcpy instead of rte_memcpy in slow path. (Stephen Hemminger)
v7:
* No changes.
v6:
* Add Fixes to patch description.
* Fix checkpatch warnings.
v5:
* No changes.
v4:
* Type casting did not fix the warnings, so use RTE_PTR_ADD instead.
v3:
* First patch in series.
---
 drivers/net/bnx2x/bnx2x_stats.c | 14 --
 drivers/net/bnx2x/bnx2x_vfpf.c  | 14 +++---
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_stats.c b/drivers/net/bnx2x/bnx2x_stats.c
index c07b01510a..8105375d44 100644
--- a/drivers/net/bnx2x/bnx2x_stats.c
+++ b/drivers/net/bnx2x/bnx2x_stats.c
@@ -114,7 +114,7 @@ bnx2x_hw_stats_post(struct bnx2x_softc *sc)
 
/* Update MCP's statistics if possible */
if (sc->func_stx) {
-   rte_memcpy(BNX2X_SP(sc, func_stats), &sc->func_stats,
+   memcpy(BNX2X_SP(sc, func_stats), &sc->func_stats,
sizeof(sc->func_stats));
}
 
@@ -817,10 +817,10 @@ bnx2x_hw_stats_update(struct bnx2x_softc *sc)
  etherstatspktsover1522octets);
 }
 
-rte_memcpy(old, new, sizeof(struct nig_stats));
+memcpy(old, new, sizeof(struct nig_stats));
 
-rte_memcpy(&(estats->rx_stat_ifhcinbadoctets_hi), &(pstats->mac_stx[1]),
-  sizeof(struct mac_stx));
+   memcpy(RTE_PTR_ADD(estats, offsetof(struct bnx2x_eth_stats, 
rx_stat_ifhcinbadoctets_hi)),
+   &pstats->mac_stx[1], sizeof(struct mac_stx));
 estats->brb_drop_hi = pstats->brb_drop_hi;
 estats->brb_drop_lo = pstats->brb_drop_lo;
 
@@ -1492,9 +1492,11 @@ bnx2x_stats_init(struct bnx2x_softc *sc)
REG_RD(sc, NIG_REG_STAT0_BRB_TRUNCATE + port*0x38);
if (!CHIP_IS_E3(sc)) {
REG_RD_DMAE(sc, NIG_REG_STAT0_EGRESS_MAC_PKT0 + port*0x50,
-   &(sc->port.old_nig_stats.egress_mac_pkt0_lo), 
2);
+   RTE_PTR_ADD(&sc->port.old_nig_stats,
+   offsetof(struct nig_stats, 
egress_mac_pkt0_lo)), 2);
REG_RD_DMAE(sc, NIG_REG_STAT0_EGRESS_MAC_PKT1 + port*0x50,
-   &(sc->port.old_nig_stats.egress_mac_pkt1_lo), 
2);
+   RTE_PTR_ADD(&sc->port.old_nig_stats,
+   offsetof(struct nig_stats, 
egress_mac_pkt1_lo)), 2);
}
 
/* function stats */
diff --git a/drivers/net/bnx2x/bnx2x_vfpf.c b/drivers/net/bnx2x/bnx2x_vfpf.c
index 63953c2979..5411df3a38 100644
--- a/drivers/net/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/bnx2x/bnx2x_vfpf.c
@@ -52,9 +52,9 @@ bnx2x_check_bull(struct bnx2x_softc *sc)
 
/* check the mac address and VLAN and allocate memory if valid */
if (valid_bitmap & (1 << MAC_ADDR_VALID) && memcmp(bull->mac, 
sc->old_bulletin.mac, ETH_ALEN))
-   rte_memcpy(&sc->link_params.mac_addr, bull->mac, ETH_ALEN);
+   memcpy(&sc->link_params.mac_addr, bull->mac, ETH_ALEN);
if (valid_bitmap & (1 << VLAN_VALID))
-   rte_memcpy(&bull->vlan, &sc->old_bulletin.vlan, RTE_VLAN_HLEN);
+   memcpy(&bull->vlan, &sc->old_bulletin.vlan, sizeof(bull->vlan));
 
sc->old_bulletin = *bull;
 
@@ -569,7 +569,7 @@ bnx2x_vf_set_mac(struct bnx2x_softc *sc, int set)
 
bnx2x_check_bull(sc);
 
-   rte_memcpy(query->filters[0].mac, sc->link_params.mac_addr, ETH_ALEN);
+   memcpy(query->filters[0].mac, sc->link_params.mac_addr, ETH_ALEN);
 
bnx2x_add_tlv(sc, query, query->first_tlv.tl.length,
  BNX2X_VF_TLV_LIST_END,
@@ -583,9 +583,9 @@ bnx2x_vf_set_mac(struct bnx2x_softc *sc, int set)
while (BNX2X_VF_STATUS_FAILURE == reply->status &&
bnx2x_check_bull(sc)) {
/* A new mac was configured by PF for us */
-   rte_memcpy(sc->link_params.mac_addr, sc->pf2vf_bulletin->mac,
+   memcpy(sc->link_params.mac_addr, sc->pf2vf_bulletin->mac,
ETH_ALEN);
-   rte_memcpy(query->filters[0].mac, sc->pf2vf_bulletin->mac,
+   memcpy(query->filters[0].mac, sc->pf2vf_bulletin->mac,
ETH_ALEN);
 
rc = bnx2x_do_req4pf(sc, sc->vf2pf_mbox_mapping.paddr);
@@ -622,10 +622,

Re: [PATCH] net/hns3: fix Rx packet truncation when KEEP CRC enabled

2024-02-23 Thread Ferruh Yigit
On 2/20/2024 3:58 AM, Jie Hai wrote:
> Hi, Ferruh,
> 
> Thanks for your review.
> 
> On 2024/2/7 22:15, Ferruh Yigit wrote:
>> On 2/6/2024 1:10 AM, Jie Hai wrote:
>>> From: Dengdui Huang 
>>>
>>> When KEEP_CRC offload is enabled, some packets will be truncated and
>>> the CRC is still stripped in the following cases:
>>> 1. For HIP08 hardware, the packet type is TCP and the length
>>>     is less than or equal to 60B.
>>> 2. For other hardwares, the packet type is IP and the length
>>>     is less than or equal to 60B.
>>>
>>
>> If a device doesn't support the offload for some packets, it can be an
>> option to disable the offload for that device, instead of calculating it in
>> software and appending it.
> 
> The KEEP CRC feature of hns3 is faulty only for specific packet types
> and small packets (<60B).
> What's more, such small Ethernet packets are not common.
> 
>> Unless you have a specific usecase, or requirement to support the
>> offload.
> 
> Yes, some users of hns3 are already using this feature.
> So we cannot drop this offload.
> 
>> <...>
>>
>>> @@ -2492,10 +2544,16 @@ hns3_recv_pkts_simple(void *rx_queue,
>>>   goto pkt_err;
>>>     rxm->packet_type = hns3_rx_calc_ptype(rxq, l234_info,
>>> ol_info);
>>> -
>>>   if (rxm->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC)
>>>   rxm->ol_flags |= RTE_MBUF_F_RX_IEEE1588_PTP;
>>>   +    if (unlikely(rxq->crc_len > 0)) {
>>> +    if (hns3_need_recalculate_crc(rxq, rxm))
>>> +    hns3_recalculate_crc(rxq, rxm);
>>> +    rxm->pkt_len -= rxq->crc_len;
>>> +    rxm->data_len -= rxq->crc_len;
>>>
>>
>> Removing 'crc_len' from 'mbuf->pkt_len' & 'mbuf->data_len' is
>> practically the same as stripping the CRC.
>>
>> We don't count CRC length in the statistics, but it should be accessible
>> in the payload by the user.
> Our drivers are behaving exactly as you say.
>

If so, I missed why mbuf 'pkt_len' and 'data_len' are reduced by
'rxq->crc_len'; can you please explain what the above lines do?




Re: Where to best ack a series

2024-02-23 Thread Thomas Monjalon
23/02/2024 09:38, Ferruh Yigit:
> On 2/23/2024 8:15 AM, Morten Brørup wrote:
> > Dear maintainers,
> > 
> > Is it easier for you to spot if we ack a series in patch 0, patch 1, or the 
> > last patch of the series? Or don't you have any preferences?
> 
> When a patch is ack'ed (not the cover letter, patch 0), patchwork detects it,
> shows it in the web interface (A/R/T), and automatically adds it when the
> patch is applied from patchwork, so this makes life easy.
> 
> But acking each patch in a series one by one is noise for the mailing list
> and overhead for the reviewer. For this case I think it is better to ack the
> whole series in reply to the cover letter; the maintainer can apply it
> manually to each patch.
> 
> When there is a patch series but it doesn't have a cover letter, I tend
> to reply to patch 1, but I don't think patch 1 or the last patch matters;
> only to differentiate whether the ack is for that patch or the whole series,
> I am adding:
> ```
> For series,
> Acked-by: ...
> ```
> 
> From the maintainers' perspective, manually adding these tags is small enough
> work to ignore, but I see authors are impacted too: if a previous version's
> cover letter is acked, they do not add this ack manually to each patch in the
> next version, requiring the reviewer to ack the new version again.
> 
> 
> I guess the best solution is to add series-ack support to patchwork;
> it can be either:
> - An ack in the cover letter automatically adds an ack to each patch in the
> series.
> or
> - A new "Series-acked-by: " syntax: if patchwork detects it in any patch of
> the series, it automatically adds the ack to each patch in the series.

I agree with all being said by Ferruh.
+100




[PATCH v9] net/bnx2x: fix warnings about rte_memcpy lengths

2024-02-23 Thread Morten Brørup
Bugfix: The vlan in the bulletin does not contain a VLAN header, only the
VLAN ID, so only copy 2 bytes, not 4. The target structure has padding
after the field, so copying 2 bytes too many is effectively harmless.
There is no need to backport this patch.

Use RTE_PTR_ADD where copying arrays to the offset of a first field in a
structure holding multiple fields, to avoid compiler warnings with
decorated rte_memcpy.
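
For illustration only (not part of the patch), the pattern looks like this,
with hypothetical struct and field names:

#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <rte_common.h>	/* RTE_PTR_ADD */

struct stats {
	uint64_t a;
	uint64_t b;	/* copy starts here and spans into 'c' */
	uint64_t c;
};

static void copy_from_b(struct stats *dst, const uint64_t src[2])
{
	/* memcpy(&dst->b, src, 2 * sizeof(uint64_t)) warns, because the copy
	 * is larger than the single member 'b'; addressing the destination
	 * relative to the whole struct avoids that:
	 */
	memcpy(RTE_PTR_ADD(dst, offsetof(struct stats, b)), src,
	       2 * sizeof(uint64_t));
}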

Bugzilla ID: 1146

Fixes: 540a211084a7695a1c7bc43068934c140d6989be ("bnx2x: driver core")
Cc: step...@networkplumber.org
Cc: rm...@marvell.com
Cc: shsha...@marvell.com
Cc: pa...@marvell.com

Signed-off-by: Morten Brørup 
Acked-by: Devendra Singh Rawat 
---
v9:
* Fix checkpatch warning about spaces.
v8:
* Use memcpy instead of rte_memcpy in slow path. (Stephen Hemminger)
v7:
* No changes.
v6:
* Add Fixes to patch description.
* Fix checkpatch warnings.
v5:
* No changes.
v4:
* Type casting did not fix the warnings, so use RTE_PTR_ADD instead.
v3:
* First patch in series.
---
 drivers/net/bnx2x/bnx2x_stats.c | 14 --
 drivers/net/bnx2x/bnx2x_vfpf.c  | 14 +++---
 2 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/net/bnx2x/bnx2x_stats.c b/drivers/net/bnx2x/bnx2x_stats.c
index c07b01510a..8105375d44 100644
--- a/drivers/net/bnx2x/bnx2x_stats.c
+++ b/drivers/net/bnx2x/bnx2x_stats.c
@@ -114,7 +114,7 @@ bnx2x_hw_stats_post(struct bnx2x_softc *sc)
 
/* Update MCP's statistics if possible */
if (sc->func_stx) {
-   rte_memcpy(BNX2X_SP(sc, func_stats), &sc->func_stats,
+   memcpy(BNX2X_SP(sc, func_stats), &sc->func_stats,
sizeof(sc->func_stats));
}
 
@@ -817,10 +817,10 @@ bnx2x_hw_stats_update(struct bnx2x_softc *sc)
  etherstatspktsover1522octets);
 }
 
-rte_memcpy(old, new, sizeof(struct nig_stats));
+   memcpy(old, new, sizeof(struct nig_stats));
 
-rte_memcpy(&(estats->rx_stat_ifhcinbadoctets_hi), &(pstats->mac_stx[1]),
-  sizeof(struct mac_stx));
+   memcpy(RTE_PTR_ADD(estats, offsetof(struct bnx2x_eth_stats, 
rx_stat_ifhcinbadoctets_hi)),
+   &pstats->mac_stx[1], sizeof(struct mac_stx));
 estats->brb_drop_hi = pstats->brb_drop_hi;
 estats->brb_drop_lo = pstats->brb_drop_lo;
 
@@ -1492,9 +1492,11 @@ bnx2x_stats_init(struct bnx2x_softc *sc)
REG_RD(sc, NIG_REG_STAT0_BRB_TRUNCATE + port*0x38);
if (!CHIP_IS_E3(sc)) {
REG_RD_DMAE(sc, NIG_REG_STAT0_EGRESS_MAC_PKT0 + port*0x50,
-   &(sc->port.old_nig_stats.egress_mac_pkt0_lo), 
2);
+   RTE_PTR_ADD(&sc->port.old_nig_stats,
+   offsetof(struct nig_stats, 
egress_mac_pkt0_lo)), 2);
REG_RD_DMAE(sc, NIG_REG_STAT0_EGRESS_MAC_PKT1 + port*0x50,
-   &(sc->port.old_nig_stats.egress_mac_pkt1_lo), 
2);
+   RTE_PTR_ADD(&sc->port.old_nig_stats,
+   offsetof(struct nig_stats, 
egress_mac_pkt1_lo)), 2);
}
 
/* function stats */
diff --git a/drivers/net/bnx2x/bnx2x_vfpf.c b/drivers/net/bnx2x/bnx2x_vfpf.c
index 63953c2979..5411df3a38 100644
--- a/drivers/net/bnx2x/bnx2x_vfpf.c
+++ b/drivers/net/bnx2x/bnx2x_vfpf.c
@@ -52,9 +52,9 @@ bnx2x_check_bull(struct bnx2x_softc *sc)
 
/* check the mac address and VLAN and allocate memory if valid */
if (valid_bitmap & (1 << MAC_ADDR_VALID) && memcmp(bull->mac, 
sc->old_bulletin.mac, ETH_ALEN))
-   rte_memcpy(&sc->link_params.mac_addr, bull->mac, ETH_ALEN);
+   memcpy(&sc->link_params.mac_addr, bull->mac, ETH_ALEN);
if (valid_bitmap & (1 << VLAN_VALID))
-   rte_memcpy(&bull->vlan, &sc->old_bulletin.vlan, RTE_VLAN_HLEN);
+   memcpy(&bull->vlan, &sc->old_bulletin.vlan, sizeof(bull->vlan));
 
sc->old_bulletin = *bull;
 
@@ -569,7 +569,7 @@ bnx2x_vf_set_mac(struct bnx2x_softc *sc, int set)
 
bnx2x_check_bull(sc);
 
-   rte_memcpy(query->filters[0].mac, sc->link_params.mac_addr, ETH_ALEN);
+   memcpy(query->filters[0].mac, sc->link_params.mac_addr, ETH_ALEN);
 
bnx2x_add_tlv(sc, query, query->first_tlv.tl.length,
  BNX2X_VF_TLV_LIST_END,
@@ -583,9 +583,9 @@ bnx2x_vf_set_mac(struct bnx2x_softc *sc, int set)
while (BNX2X_VF_STATUS_FAILURE == reply->status &&
bnx2x_check_bull(sc)) {
/* A new mac was configured by PF for us */
-   rte_memcpy(sc->link_params.mac_addr, sc->pf2vf_bulletin->mac,
+   memcpy(sc->link_params.mac_addr, sc->pf2vf_bulletin->mac,
ETH_ALEN);
-   rte_memcpy(query->filters[0].mac, sc->pf2vf_bulletin->mac,
+   memcpy(query->filters[0].mac, sc->pf2vf_bulletin->mac,
ETH_ALEN);
 
rc = bnx2x_do_req4pf(sc, s

[PATCH v2 1/4] net/mlx5: fix conntrack action handle representation

2024-02-23 Thread Dariusz Sosnowski
In mlx5 PMD, handles to indirect connection tracking flow actions
are encoded in 32-bit unsigned integers as follows:

- Bits 31-29 - indirect action type.
- Bits 28-25 - port on which connection tracking action was created.
- Bits 24-0 - index of connection tracking object.

The macro defining the bit shift for the owner part in this representation
was incorrectly defined as 22. This patch fixes that and also aligns the
documented limitations.
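
For illustration only, the intended layout and encoding (a type offset of 29
is assumed from the layout comment in the header below):

/* bits 31-29: type, 28-25: owner port, 24-0: CT object index */
#define CT_TYPE_SHIFT	29
#define CT_OWNER_SHIFT	25	/* was wrongly 22, overlapping the index */
#define CT_OWNER_MASK	0xF	/* 16 ports max */
#define CT_IDX_MASK	0x1FFFFFF

static inline uint32_t ct_handle(uint32_t type, uint32_t port, uint32_t idx)
{
	return (type << CT_TYPE_SHIFT) |
	       ((port & CT_OWNER_MASK) << CT_OWNER_SHIFT) |
	       (idx & CT_IDX_MASK);
}
/* With a shift of 22, the owner bits corrupt any index >= 2^22. */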

Fixes: 463170a7c934 ("net/mlx5: support connection tracking with HWS")
Fixes: 48fbb0e93d06 ("net/mlx5: support flow meter mark indirect action with 
HWS")
Cc: sta...@dpdk.org

Signed-off-by: Dariusz Sosnowski 
Acked-by: Ori Kam 
---
 doc/guides/nics/mlx5.rst | 4 ++--
 drivers/net/mlx5/mlx5_flow.h | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 0d2213497a..90ae3f3047 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -783,8 +783,8 @@ Limitations
 
   - Cannot co-exist with ASO meter, ASO age action in a single flow rule.
   - Flow rules insertion rate and memory consumption need more optimization.
-  - 256 ports maximum.
-  - 4M connections maximum with ``dv_flow_en`` 1 mode. 16M with ``dv_flow_en`` 
2.
+  - 16 ports maximum.
+  - 32M connections maximum.
 
 - Multi-thread flow insertion:
 
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index a4d0ff7b13..b4bf96cd64 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -77,7 +77,7 @@ enum mlx5_indirect_type {
 /* Now, the maximal ports will be supported is 16, action number is 32M. */
 #define MLX5_INDIRECT_ACT_CT_MAX_PORT 0x10
 
-#define MLX5_INDIRECT_ACT_CT_OWNER_SHIFT 22
+#define MLX5_INDIRECT_ACT_CT_OWNER_SHIFT 25
 #define MLX5_INDIRECT_ACT_CT_OWNER_MASK (MLX5_INDIRECT_ACT_CT_MAX_PORT - 1)
 
 /* 29-31: type, 25-28: owner port, 0-24: index */
-- 
2.34.1



[PATCH v2 0/4] net/mlx5: connection tracking changes

2024-02-23 Thread Dariusz Sosnowski
Patches 1 and 2 contain fixes for existing implementation of
connection tracking flow actions.

Patch 3 adds support for sharing connection tracking flow actions
between ports when ports' flow engines are configured with
RTE_FLOW_PORT_FLAG_SHARE_INDIRECT flag set.

Patch 4 is based on the previous one and removes the limitation on
number of ports when connection tracking flow actions are used
with HW Steering flow engine.

v2:
- Rebased on top of v24.03-rc1
- Updated mlx5 docs.

Dariusz Sosnowski (3):
  net/mlx5: fix conntrack action handle representation
  net/mlx5: fix connection tracking action validation
  net/mlx5: remove port from conntrack handle representation

Suanming Mou (1):
  net/mlx5: add cross port CT object sharing

 doc/guides/nics/mlx5.rst   |   4 +-
 doc/guides/rel_notes/release_24_03.rst |   2 +
 drivers/net/mlx5/mlx5_flow.h   |  20 ++-
 drivers/net/mlx5/mlx5_flow_dv.c|   9 ++
 drivers/net/mlx5/mlx5_flow_hw.c| 182 +
 5 files changed, 125 insertions(+), 92 deletions(-)

--
2.34.1



[PATCH v2 2/4] net/mlx5: fix connection tracking action validation

2024-02-23 Thread Dariusz Sosnowski
In mlx5 PMD, handles to indirect connection tracking flow actions
are encoded as 32-bit unsigned integers, where port ID is stored
in bits 28-25. Because of this, connection tracking flow actions
cannot be created on ports with IDs higher than 15.
This patch adds missing validation.
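
For illustration only, based on the handle macro fixed in the previous patch:
without this check, a port ID above 15 wraps in the 4-bit owner field, so its
handles silently collide with another port's handles:

/* (16 & 0xF) == (0 & 0xF), hence: */
assert(MLX5_INDIRECT_ACT_CT_GEN_IDX(16, 5) == MLX5_INDIRECT_ACT_CT_GEN_IDX(0, 5));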

Fixes: 463170a7c934 ("net/mlx5: support connection tracking with HWS")
Cc: sta...@dpdk.org

Signed-off-by: Dariusz Sosnowski 
Acked-by: Ori Kam 
---
 drivers/net/mlx5/mlx5_flow_dv.c | 9 +
 drivers/net/mlx5/mlx5_flow_hw.c | 7 +++
 2 files changed, 16 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 23a2388320..c78ef1f616 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -13861,6 +13861,13 @@ flow_dv_translate_create_conntrack(struct rte_eth_dev 
*dev,
return rte_flow_error_set(error, ENOTSUP,
  RTE_FLOW_ERROR_TYPE_ACTION, NULL,
  "Connection is not supported");
+   if (dev->data->port_id >= MLX5_INDIRECT_ACT_CT_MAX_PORT) {
+   rte_flow_error_set(error, EINVAL,
+  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+  "CT supports port indexes up to "
+  RTE_STR(MLX5_ACTION_CTX_CT_MAX_PORT));
+   return 0;
+   }
idx = flow_dv_aso_ct_alloc(dev, error);
if (!idx)
return rte_flow_error_set(error, rte_errno,
@@ -16558,6 +16565,8 @@ flow_dv_action_create(struct rte_eth_dev *dev,
case RTE_FLOW_ACTION_TYPE_CONNTRACK:
ret = flow_dv_translate_create_conntrack(dev, action->conf,
 err);
+   if (!ret)
+   break;
idx = MLX5_INDIRECT_ACT_CT_GEN_IDX(PORT_ID(priv), ret);
break;
default:
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index bcf43f5457..366a6956d2 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -10048,6 +10048,13 @@ flow_hw_conntrack_create(struct rte_eth_dev *dev, 
uint32_t queue,
   "CT is not enabled");
return 0;
}
+   if (dev->data->port_id >= MLX5_INDIRECT_ACT_CT_MAX_PORT) {
+   rte_flow_error_set(error, EINVAL,
+  RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
+  "CT supports port indexes up to "
+  RTE_STR(MLX5_ACTION_CTX_CT_MAX_PORT));
+   return 0;
+   }
ct = mlx5_ipool_zmalloc(pool->cts, &ct_idx);
if (!ct) {
rte_flow_error_set(error, rte_errno,
-- 
2.34.1



[PATCH v2 4/4] net/mlx5: remove port from conntrack handle representation

2024-02-23 Thread Dariusz Sosnowski
This patch removes the owner port index from the integer
representation of indirect conntrack flow action handles in the
mlx5 PMD.
This index is not needed when the HW Steering flow engine is enabled,
because either:

- port references its own indirect actions or,
- port references indirect actions of the host port when sharing
  indirect actions was configured.

In both cases it is explicitly known which port owns the action.
The port index included in the action handle introduced an unnecessary
limitation and caused undefined behavior when an application
used more than the supported number of ports.

This patch removes the port index from indirect conntrack action handle
representation when HW steering flow engine is used.
It does not affect SW Steering flow engine.

Signed-off-by: Dariusz Sosnowski 
Acked-by: Ori Kam 
---
 doc/guides/nics/mlx5.rst|  2 +-
 drivers/net/mlx5/mlx5_flow.h| 18 +++---
 drivers/net/mlx5/mlx5_flow_hw.c | 44 +++--
 3 files changed, 29 insertions(+), 35 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 90ae3f3047..7729fe4151 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -783,7 +783,7 @@ Limitations
 
   - Cannot co-exist with ASO meter, ASO age action in a single flow rule.
   - Flow rules insertion rate and memory consumption need more optimization.
-  - 16 ports maximum.
+  - 16 ports maximum (with ``dv_flow_en=1``).
   - 32M connections maximum.
 
 - Multi-thread flow insertion:
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index b4bf96cd64..187f440893 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -80,7 +80,12 @@ enum mlx5_indirect_type {
 #define MLX5_INDIRECT_ACT_CT_OWNER_SHIFT 25
 #define MLX5_INDIRECT_ACT_CT_OWNER_MASK (MLX5_INDIRECT_ACT_CT_MAX_PORT - 1)
 
-/* 29-31: type, 25-28: owner port, 0-24: index */
+/*
+ * When SW steering flow engine is used, the CT action handles are encoded in 
a following way:
+ * - bits 31:29 - type
+ * - bits 28:25 - port index of the action owner
+ * - bits 24:0 - action index
+ */
 #define MLX5_INDIRECT_ACT_CT_GEN_IDX(owner, index) \
((MLX5_INDIRECT_ACTION_TYPE_CT << MLX5_INDIRECT_ACTION_TYPE_OFFSET) | \
 (((owner) & MLX5_INDIRECT_ACT_CT_OWNER_MASK) << \
@@ -93,9 +98,14 @@ enum mlx5_indirect_type {
 #define MLX5_INDIRECT_ACT_CT_GET_IDX(index) \
((index) & ((1 << MLX5_INDIRECT_ACT_CT_OWNER_SHIFT) - 1))
 
-#define MLX5_ACTION_CTX_CT_GET_IDX  MLX5_INDIRECT_ACT_CT_GET_IDX
-#define MLX5_ACTION_CTX_CT_GET_OWNER MLX5_INDIRECT_ACT_CT_GET_OWNER
-#define MLX5_ACTION_CTX_CT_GEN_IDX MLX5_INDIRECT_ACT_CT_GEN_IDX
+/*
+ * When HW steering flow engine is used, the CT action handles are encoded in 
a following way:
+ * - bits 31:29 - type
+ * - bits 28:0 - action index
+ */
+#define MLX5_INDIRECT_ACT_HWS_CT_GEN_IDX(index) \
+   ((struct rte_flow_action_handle *)(uintptr_t) \
+((MLX5_INDIRECT_ACTION_TYPE_CT << MLX5_INDIRECT_ACTION_TYPE_OFFSET) | 
(index)))
 
 enum mlx5_indirect_list_type {
MLX5_INDIRECT_ACTION_LIST_TYPE_ERR = 0,
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index f53ed1144b..905c10a90c 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -563,7 +563,7 @@ flow_hw_ct_compile(struct rte_eth_dev *dev,
struct mlx5_priv *priv = dev->data->dev_private;
struct mlx5_aso_ct_action *ct;
 
-   ct = mlx5_ipool_get(priv->hws_ctpool->cts, 
MLX5_ACTION_CTX_CT_GET_IDX(idx));
+   ct = mlx5_ipool_get(priv->hws_ctpool->cts, idx);
if (!ct || (!priv->shared_host && mlx5_aso_ct_available(priv->sh, 
queue, ct)))
return -1;
rule_act->action = priv->hws_ctpool->dr_action;
@@ -2455,8 +2455,7 @@ __flow_hw_actions_translate(struct rte_eth_dev *dev,
break;
case RTE_FLOW_ACTION_TYPE_CONNTRACK:
if (masks->conf) {
-   ct_idx = MLX5_ACTION_CTX_CT_GET_IDX
-((uint32_t)(uintptr_t)actions->conf);
+   ct_idx = 
MLX5_INDIRECT_ACTION_IDX_GET(actions->conf);
if (flow_hw_ct_compile(dev, MLX5_HW_INV_QUEUE, 
ct_idx,
   
&acts->rule_acts[dr_pos]))
goto err;
@@ -3172,8 +3171,7 @@ flow_hw_actions_construct(struct rte_eth_dev *dev,
job->flow->cnt_id = act_data->shared_counter.id;
break;
case RTE_FLOW_ACTION_TYPE_CONNTRACK:
-   ct_idx = MLX5_ACTION_CTX_CT_GET_IDX
-((uint32_t)(uintptr_t)action->conf);
+   ct_idx = MLX5_INDIRECT_ACTION_IDX_GET(action->conf);
if (flow_hw_ct_compile(dev, queue, ct_idx,
  

[PATCH v2 3/4] net/mlx5: add cross port CT object sharing

2024-02-23 Thread Dariusz Sosnowski
From: Suanming Mou 

This commit adds cross port CT object sharing.

A shared CT object shares the same DevX objects, but each port allocates
its own action locally. Once the CT object is shared between two flows
on different ports, the two flows use their own local actions with
the same offset index.

The shared CT object can only be created/updated/queried/destroyed
by the host port.
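
For illustration only, a rough usage sketch of the sharing setup; sizing,
queue attributes and error handling are application specific, and the
host_port/guest_port names are hypothetical:

#include <rte_flow.h>

static int
share_ct_between_ports(uint16_t host_port, uint16_t guest_port,
		       uint16_t nb_queue,
		       const struct rte_flow_queue_attr *queue_attr[])
{
	struct rte_flow_error error;
	struct rte_flow_port_attr attr = {
		.nb_conn_tracks = 1024,		/* sizing is application specific */
		.host_port_id = host_port,	/* owner of the CT objects */
		.flags = RTE_FLOW_PORT_FLAG_SHARE_INDIRECT,
	};

	/* Guest port references the host port's indirect actions. */
	return rte_flow_configure(guest_port, &attr, nb_queue, queue_attr, &error);
}

The CT action handle itself is then created, updated, queried and destroyed
only through the host port, and referenced by flow rules on the guest ports,
matching the restriction stated above.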

Signed-off-by: Suanming Mou 
Signed-off-by: Dariusz Sosnowski 
Acked-by: Ori Kam 
---
 doc/guides/rel_notes/release_24_03.rst |   2 +
 drivers/net/mlx5/mlx5_flow_hw.c| 145 ++---
 2 files changed, 85 insertions(+), 62 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 879bb4944c..b660c2c7cf 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -130,6 +130,8 @@ New Features
   * Added support for matching a random value.
   * Added support for comparing result between packet fields or value.
   * Added support for accumulating value of field into another one.
+  * Added support for sharing indirect action objects of type 
``RTE_FLOW_ACTION_TYPE_CONNTRACK``
+with HW steering flow engine.
 
 * **Updated Marvell cnxk crypto driver.**
 
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 366a6956d2..f53ed1144b 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -564,7 +564,7 @@ flow_hw_ct_compile(struct rte_eth_dev *dev,
struct mlx5_aso_ct_action *ct;
 
ct = mlx5_ipool_get(priv->hws_ctpool->cts, 
MLX5_ACTION_CTX_CT_GET_IDX(idx));
-   if (!ct || mlx5_aso_ct_available(priv->sh, queue, ct))
+   if (!ct || (!priv->shared_host && mlx5_aso_ct_available(priv->sh, 
queue, ct)))
return -1;
rule_act->action = priv->hws_ctpool->dr_action;
rule_act->aso_ct.offset = ct->offset;
@@ -3835,9 +3835,11 @@ __flow_hw_pull_indir_action_comp(struct rte_eth_dev *dev,
if (ret_comp < n_res && priv->hws_mpool)
ret_comp += 
mlx5_aso_pull_completion(&priv->hws_mpool->sq[queue],
&res[ret_comp], n_res - ret_comp);
-   if (ret_comp < n_res && priv->hws_ctpool)
-   ret_comp += 
mlx5_aso_pull_completion(&priv->ct_mng->aso_sqs[queue],
-   &res[ret_comp], n_res - ret_comp);
+   if (!priv->shared_host) {
+   if (ret_comp < n_res && priv->hws_ctpool)
+   ret_comp += 
mlx5_aso_pull_completion(&priv->ct_mng->aso_sqs[queue],
+   &res[ret_comp], n_res - ret_comp);
+   }
if (ret_comp < n_res && priv->quota_ctx.sq)
ret_comp += mlx5_aso_pull_completion(&priv->quota_ctx.sq[queue],
 &res[ret_comp],
@@ -8797,15 +8799,19 @@ flow_hw_ct_mng_destroy(struct rte_eth_dev *dev,
 }
 
 static void
-flow_hw_ct_pool_destroy(struct rte_eth_dev *dev __rte_unused,
+flow_hw_ct_pool_destroy(struct rte_eth_dev *dev,
struct mlx5_aso_ct_pool *pool)
 {
+   struct mlx5_priv *priv = dev->data->dev_private;
+
if (pool->dr_action)
mlx5dr_action_destroy(pool->dr_action);
-   if (pool->devx_obj)
-   claim_zero(mlx5_devx_cmd_destroy(pool->devx_obj));
-   if (pool->cts)
-   mlx5_ipool_destroy(pool->cts);
+   if (!priv->shared_host) {
+   if (pool->devx_obj)
+   claim_zero(mlx5_devx_cmd_destroy(pool->devx_obj));
+   if (pool->cts)
+   mlx5_ipool_destroy(pool->cts);
+   }
mlx5_free(pool);
 }
 
@@ -8829,51 +8835,56 @@ flow_hw_ct_pool_create(struct rte_eth_dev *dev,
.type = "mlx5_hw_ct_action",
};
int reg_id;
-   uint32_t flags;
+   uint32_t flags = 0;
 
-   if (port_attr->flags & RTE_FLOW_PORT_FLAG_SHARE_INDIRECT) {
-   DRV_LOG(ERR, "Connection tracking is not supported "
-"in cross vHCA sharing mode");
-   rte_errno = ENOTSUP;
-   return NULL;
-   }
pool = mlx5_malloc(MLX5_MEM_ZERO, sizeof(*pool), 0, SOCKET_ID_ANY);
if (!pool) {
rte_errno = ENOMEM;
return NULL;
}
-   obj = mlx5_devx_cmd_create_conn_track_offload_obj(priv->sh->cdev->ctx,
- priv->sh->cdev->pdn,
- log_obj_size);
-   if (!obj) {
-   rte_errno = ENODATA;
-   DRV_LOG(ERR, "Failed to create conn_track_offload_obj using 
DevX.");
-   goto err;
+   if (!priv->shared_host) {
+   /*
+* No need for local cache if CT number is a small number. Since
+* flow insertion rate will be very limited in that case. Here 
let's
+*

[PATCH v2 0/4] add new QAT gen3 and gen5

2024-02-23 Thread Ciara Power
This patchset adds support for two new QAT devices:
a new GEN3 device and a GEN5 device, both of which have
wireless slice support for algorithms such as ZUC-256.

Symmetric, asymmetric and compression are all supported
for these devices.
 
v2:
  - New patch added for gen5 device that reuses gen4 code,
and new gen3 wireless slice changes.
  - Removed patch to disable asymmetric and compression.
  - Documentation updates added.
  - Fixed ZUC-256 IV modification for raw API path.
  - Fixed setting extended protocol flag bit position.
  - Added check for ZUC-256 wireless slice in slice map.

Ciara Power (4):
  common/qat: add new gen3 device
  common/qat: add zuc256 wireless slice for gen3
  common/qat: add new gen3 CMAC macros
  common/qat: add gen5 device

 doc/guides/compressdevs/qat_comp.rst |   1 +
 doc/guides/cryptodevs/qat.rst|   6 +
 doc/guides/rel_notes/release_24_03.rst   |   7 +
 drivers/common/qat/dev/qat_dev_gen4.c|  31 ++-
 drivers/common/qat/dev/qat_dev_gen5.c|  51 
 drivers/common/qat/dev/qat_dev_gens.h|  54 
 drivers/common/qat/meson.build   |   3 +
 drivers/common/qat/qat_adf/icp_qat_fw.h  |   6 +-
 drivers/common/qat/qat_adf/icp_qat_fw_la.h   |  24 ++
 drivers/common/qat/qat_adf/icp_qat_hw.h  |  26 +-
 drivers/common/qat/qat_common.h  |   1 +
 drivers/common/qat/qat_device.c  |  19 ++
 drivers/common/qat/qat_device.h  |   2 +
 drivers/compress/qat/dev/qat_comp_pmd_gen4.c |   8 +-
 drivers/compress/qat/dev/qat_comp_pmd_gen5.c |  73 +
 drivers/compress/qat/dev/qat_comp_pmd_gens.h |  14 +
 drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c |   7 +-
 drivers/crypto/qat/dev/qat_crypto_pmd_gen3.c |  63 -
 drivers/crypto/qat/dev/qat_crypto_pmd_gen4.c |   4 +-
 drivers/crypto/qat/dev/qat_crypto_pmd_gen5.c | 278 +++
 drivers/crypto/qat/dev/qat_crypto_pmd_gens.h |  40 ++-
 drivers/crypto/qat/dev/qat_sym_pmd_gen1.c|  43 +++
 drivers/crypto/qat/qat_sym_session.c | 177 ++--
 drivers/crypto/qat/qat_sym_session.h |   2 +
 24 files changed, 889 insertions(+), 51 deletions(-)
 create mode 100644 drivers/common/qat/dev/qat_dev_gen5.c
 create mode 100644 drivers/compress/qat/dev/qat_comp_pmd_gen5.c
 create mode 100644 drivers/crypto/qat/dev/qat_crypto_pmd_gen5.c

-- 
2.25.1



[PATCH v2 1/4] common/qat: add new gen3 device

2024-02-23 Thread Ciara Power
Add new gen3 QAT device ID.
This device has a wireless slice, but other gen3 devices do not, so we
must set a flag to indicate this wireless-enabled device.

Capabilities for the device are slightly different from the base gen3
capabilities; some are removed from the list for this device.

Symmetric, asymmetric and compression services are enabled.

Signed-off-by: Ciara Power 
---
v2: Added documentation updates.
---
 doc/guides/compressdevs/qat_comp.rst |  1 +
 doc/guides/cryptodevs/qat.rst|  2 ++
 doc/guides/rel_notes/release_24_03.rst   |  4 
 drivers/common/qat/qat_device.c  | 13 +
 drivers/common/qat/qat_device.h  |  2 ++
 drivers/crypto/qat/dev/qat_crypto_pmd_gen3.c | 11 +++
 6 files changed, 33 insertions(+)

diff --git a/doc/guides/compressdevs/qat_comp.rst 
b/doc/guides/compressdevs/qat_comp.rst
index 475c4a9f9f..338b1bf623 100644
--- a/doc/guides/compressdevs/qat_comp.rst
+++ b/doc/guides/compressdevs/qat_comp.rst
@@ -10,6 +10,7 @@ support for the following hardware accelerator devices:
 * ``Intel QuickAssist Technology C62x``
 * ``Intel QuickAssist Technology C3xxx``
 * ``Intel QuickAssist Technology DH895x``
+* ``Intel QuickAssist Technology 300xx``
 
 
 Features
diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index dc6b95165d..51190e12d6 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -26,6 +26,7 @@ poll mode crypto driver support for the following hardware 
accelerator devices:
 * ``Intel QuickAssist Technology D15xx``
 * ``Intel QuickAssist Technology C4xxx``
 * ``Intel QuickAssist Technology 4xxx``
+* ``Intel QuickAssist Technology 300xx``
 
 
 Features
@@ -177,6 +178,7 @@ poll mode crypto driver support for the following hardware 
accelerator devices:
 * ``Intel QuickAssist Technology C4xxx``
 * ``Intel QuickAssist Technology 4xxx``
 * ``Intel QuickAssist Technology 401xxx``
+* ``Intel QuickAssist Technology 300xx``
 
 The QAT ASYM PMD has support for:
 
diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 879bb4944c..55517eabd8 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -131,6 +131,10 @@ New Features
   * Added support for comparing result between packet fields or value.
   * Added support for accumulating value of field into another one.
 
+* **Updated Intel QuickAssist Technology driver.**
+
+  * Enabled support for new QAT GEN3 (578a) devices in QAT crypto driver.
+
 * **Updated Marvell cnxk crypto driver.**
 
   * Added support for Rx inject in crypto_cn10k.
diff --git a/drivers/common/qat/qat_device.c b/drivers/common/qat/qat_device.c
index f55dc3c6f0..0e7d387d78 100644
--- a/drivers/common/qat/qat_device.c
+++ b/drivers/common/qat/qat_device.c
@@ -53,6 +53,9 @@ static const struct rte_pci_id pci_id_qat_map[] = {
{
RTE_PCI_DEVICE(0x8086, 0x18a1),
},
+   {
+   RTE_PCI_DEVICE(0x8086, 0x578b),
+   },
{
RTE_PCI_DEVICE(0x8086, 0x4941),
},
@@ -194,6 +197,7 @@ pick_gen(const struct rte_pci_device *pci_dev)
case 0x18ef:
return QAT_GEN2;
case 0x18a1:
+   case 0x578b:
return QAT_GEN3;
case 0x4941:
case 0x4943:
@@ -205,6 +209,12 @@ pick_gen(const struct rte_pci_device *pci_dev)
}
 }
 
+static int
+wireless_slice_support(uint16_t pci_dev_id)
+{
+   return pci_dev_id == 0x578b;
+}
+
 struct qat_pci_device *
 qat_pci_device_allocate(struct rte_pci_device *pci_dev,
struct qat_dev_cmd_param *qat_dev_cmd_param)
@@ -282,6 +292,9 @@ qat_pci_device_allocate(struct rte_pci_device *pci_dev,
qat_dev->qat_dev_id = qat_dev_id;
qat_dev->qat_dev_gen = qat_dev_gen;
 
+   if (wireless_slice_support(pci_dev->id.device_id))
+   qat_dev->has_wireless_slice = 1;
+
ops_hw = qat_dev_hw_spec[qat_dev->qat_dev_gen];
NOT_NULL(ops_hw->qat_dev_get_misc_bar, goto error,
"QAT internal error! qat_dev_get_misc_bar function not set");
diff --git a/drivers/common/qat/qat_device.h b/drivers/common/qat/qat_device.h
index aa7988bb74..43e4752812 100644
--- a/drivers/common/qat/qat_device.h
+++ b/drivers/common/qat/qat_device.h
@@ -135,6 +135,8 @@ struct qat_pci_device {
/**< Per generation specific information */
uint32_t slice_map;
/**< Map of the crypto and compression slices */
+   uint16_t has_wireless_slice;
+   /**< Wireless Slices supported */
 };
 
 struct qat_gen_hw_data {
diff --git a/drivers/crypto/qat/dev/qat_crypto_pmd_gen3.c 
b/drivers/crypto/qat/dev/qat_crypto_pmd_gen3.c
index 02bcdb06b1..bc53e2e0f1 100644
--- a/drivers/crypto/qat/dev/qat_crypto_pmd_gen3.c
+++ b/drivers/crypto/qat/dev/qat_crypto_pmd_gen3.c
@@ -255,6 +255,17 @@ qat_sym_crypto_cap

[PATCH v2 2/4] common/qat: add zuc256 wireless slice for gen3

2024-02-23 Thread Ciara Power
The new gen3 device handles wireless algorithms on wireless slices.
Based on the device's wireless slice support, set the required flags for
these algorithms so they are moved to the wireless slice.

One of the algorithms supported on the wireless slices is ZUC-256;
support is added for this, along with modifying the capability for the
device.
The device supports a 24-byte IV for ZUC-256, with iv[20]
being ignored in the register.
A 25-byte IV is compressed into 23 bytes.

Signed-off-by: Ciara Power 
---
v2:
  - Fixed setting extended protocol flag bit position.
  - Added slice map check for ZUC256 wireless slice.
  - Fixed IV modification for ZUC256 in raw datapath.
  - Added increment size for ZUC256 capabilities.
  - Added release note.
---
 doc/guides/rel_notes/release_24_03.rst   |   1 +
 drivers/common/qat/qat_adf/icp_qat_fw.h  |   6 +-
 drivers/common/qat/qat_adf/icp_qat_fw_la.h   |  24 
 drivers/common/qat/qat_adf/icp_qat_hw.h  |  24 +++-
 drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c |   7 +-
 drivers/crypto/qat/dev/qat_crypto_pmd_gen3.c |  52 ++-
 drivers/crypto/qat/dev/qat_crypto_pmd_gens.h |  34 -
 drivers/crypto/qat/dev/qat_sym_pmd_gen1.c|  43 ++
 drivers/crypto/qat/qat_sym_session.c | 142 +--
 drivers/crypto/qat/qat_sym_session.h |   2 +
 10 files changed, 312 insertions(+), 23 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 55517eabd8..0dee1ff104 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -134,6 +134,7 @@ New Features
 * **Updated Intel QuickAssist Technology driver.**
 
   * Enabled support for new QAT GEN3 (578a) devices in QAT crypto driver.
+  * Enabled ZUC256 cipher and auth algorithm for wireless slice enabled GEN3 
device.
 
 * **Updated Marvell cnxk crypto driver.**
 
diff --git a/drivers/common/qat/qat_adf/icp_qat_fw.h 
b/drivers/common/qat/qat_adf/icp_qat_fw.h
index 3aa17ae041..dd7c926140 100644
--- a/drivers/common/qat/qat_adf/icp_qat_fw.h
+++ b/drivers/common/qat/qat_adf/icp_qat_fw.h
@@ -75,7 +75,8 @@ struct icp_qat_fw_comn_req_hdr {
uint8_t service_type;
uint8_t hdr_flags;
uint16_t serv_specif_flags;
-   uint16_t comn_req_flags;
+   uint8_t comn_req_flags;
+   uint8_t ext_flags;
 };
 
 struct icp_qat_fw_comn_req_rqpars {
@@ -176,9 +177,6 @@ struct icp_qat_fw_comn_resp {
 #define QAT_COMN_PTR_TYPE_SGL 0x1
 #define QAT_COMN_CD_FLD_TYPE_64BIT_ADR 0x0
 #define QAT_COMN_CD_FLD_TYPE_16BYTE_DATA 0x1
-#define QAT_COMN_EXT_FLAGS_BITPOS 8
-#define QAT_COMN_EXT_FLAGS_MASK 0x1
-#define QAT_COMN_EXT_FLAGS_USED 0x1
 
 #define ICP_QAT_FW_COMN_FLAGS_BUILD(cdt, ptr) \
cdt) & QAT_COMN_CD_FLD_TYPE_MASK) << QAT_COMN_CD_FLD_TYPE_BITPOS) \
diff --git a/drivers/common/qat/qat_adf/icp_qat_fw_la.h 
b/drivers/common/qat/qat_adf/icp_qat_fw_la.h
index 70f0effa62..134c309355 100644
--- a/drivers/common/qat/qat_adf/icp_qat_fw_la.h
+++ b/drivers/common/qat/qat_adf/icp_qat_fw_la.h
@@ -81,6 +81,15 @@ struct icp_qat_fw_la_bulk_req {
 #define ICP_QAT_FW_LA_PARTIAL_END 2
 #define QAT_LA_PARTIAL_BITPOS 0
 #define QAT_LA_PARTIAL_MASK 0x3
+#define QAT_LA_USE_EXTENDED_PROTOCOL_FLAGS_BITPOS 0
+#define QAT_LA_USE_EXTENDED_PROTOCOL_FLAGS 1
+#define QAT_LA_USE_EXTENDED_PROTOCOL_FLAGS_MASK 0x1
+#define QAT_LA_USE_WCP_SLICE 1
+#define QAT_LA_USE_WCP_SLICE_BITPOS 2
+#define QAT_LA_USE_WCP_SLICE_MASK 0x1
+#define QAT_LA_USE_WAT_SLICE_BITPOS 3
+#define QAT_LA_USE_WAT_SLICE 1
+#define QAT_LA_USE_WAT_SLICE_MASK 0x1
 #define ICP_QAT_FW_LA_FLAGS_BUILD(zuc_proto, gcm_iv_len, auth_rslt, proto, \
cmp_auth, ret_auth, update_state, \
ciph_iv, ciphcfg, partial) \
@@ -188,6 +197,21 @@ struct icp_qat_fw_la_bulk_req {
QAT_FIELD_SET(flags, val, QAT_LA_PARTIAL_BITPOS, \
QAT_LA_PARTIAL_MASK)
 
+#define ICP_QAT_FW_USE_EXTENDED_PROTOCOL_FLAGS_SET(flags, val) \
+   QAT_FIELD_SET(flags, val,   \
+   QAT_LA_USE_EXTENDED_PROTOCOL_FLAGS_BITPOS,  \
+   QAT_LA_USE_EXTENDED_PROTOCOL_FLAGS_MASK)
+
+#define ICP_QAT_FW_USE_WCP_SLICE_SET(flags, val) \
+   QAT_FIELD_SET(flags, val, \
+   QAT_LA_USE_WCP_SLICE_BITPOS, \
+   QAT_LA_USE_WCP_SLICE_MASK)
+
+#define ICP_QAT_FW_USE_WAT_SLICE_SET(flags, val) \
+   QAT_FIELD_SET(flags, val, \
+   QAT_LA_USE_WAT_SLICE_BITPOS, \
+   QAT_LA_USE_WAT_SLICE_MASK)
+
 #define QAT_FW_LA_MODE2 1
 #define QAT_FW_LA_NO_MODE2 0
 #define QAT_FW_LA_MODE2_MASK 0x1
diff --git a/drivers/common/qat/qat_adf/icp_qat_hw.h 
b/drivers/common/qat/qat_adf/icp_qat_hw.h
index 33756d512d..4651fb90bb 100644
--- a/drivers/common/qat/qat_adf/icp_qat_hw.h
+++ b/drivers/common/qat/qat_adf/icp_qat_hw.h
@@ -21,7 +21,8 @@ enum icp_qat_slice_mask {
ICP_ACCEL_MASK_CRYPTO1_SLICE = 0x100,
ICP_ACCEL_MASK_CRYPTO2_SLICE = 0x200,
ICP_ACCEL_MASK_SM3_SLICE = 0x400,
-   ICP_ACCEL_MASK_SM4_SLICE = 0x800
+   ICP_ACCEL_MASK_SM4_

[PATCH v2 3/4] common/qat: add new gen3 CMAC macros

2024-02-23 Thread Ciara Power
The new QAT GEN3 device uses new macros for CMAC values, rather than
using XCBC_MAC ones.

The wireless slice handles CMAC in the new gen3 device, and no key
precomputes are required by SW.

Signed-off-by: Ciara Power 
---
 drivers/common/qat/qat_adf/icp_qat_hw.h |  4 +++-
 drivers/crypto/qat/qat_sym_session.c| 28 +
 2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/drivers/common/qat/qat_adf/icp_qat_hw.h 
b/drivers/common/qat/qat_adf/icp_qat_hw.h
index 4651fb90bb..b99dde2176 100644
--- a/drivers/common/qat/qat_adf/icp_qat_hw.h
+++ b/drivers/common/qat/qat_adf/icp_qat_hw.h
@@ -75,7 +75,7 @@ enum icp_qat_hw_auth_algo {
ICP_QAT_HW_AUTH_ALGO_RESERVED = 20,
ICP_QAT_HW_AUTH_ALGO_RESERVED1 = 21,
ICP_QAT_HW_AUTH_ALGO_RESERVED2 = 22,
-   ICP_QAT_HW_AUTH_ALGO_RESERVED3 = 22,
+   ICP_QAT_HW_AUTH_ALGO_AES_128_CMAC = 22,
ICP_QAT_HW_AUTH_ALGO_RESERVED4 = 23,
ICP_QAT_HW_AUTH_ALGO_RESERVED5 = 24,
ICP_QAT_HW_AUTH_ALGO_ZUC_256_MAC_32 = 25,
@@ -180,6 +180,7 @@ struct icp_qat_hw_auth_setup {
 #define ICP_QAT_HW_ZUC_256_MAC_32_STATE1_SZ 8
 #define ICP_QAT_HW_ZUC_256_MAC_64_STATE1_SZ 8
 #define ICP_QAT_HW_ZUC_256_MAC_128_STATE1_SZ 16
+#define ICP_QAT_HW_AES_CMAC_STATE1_SZ 16
 
 #define ICP_QAT_HW_NULL_STATE2_SZ 32
 #define ICP_QAT_HW_MD5_STATE2_SZ 16
@@ -208,6 +209,7 @@ struct icp_qat_hw_auth_setup {
 #define ICP_QAT_HW_GALOIS_H_SZ 16
 #define ICP_QAT_HW_GALOIS_LEN_A_SZ 8
 #define ICP_QAT_HW_GALOIS_E_CTR0_SZ 16
+#define ICP_QAT_HW_AES_128_CMAC_STATE2_SZ 16
 
 struct icp_qat_hw_auth_sha512 {
struct icp_qat_hw_auth_setup inner_setup;
diff --git a/drivers/crypto/qat/qat_sym_session.c 
b/drivers/crypto/qat/qat_sym_session.c
index ebdad0bd67..b1649b8d18 100644
--- a/drivers/crypto/qat/qat_sym_session.c
+++ b/drivers/crypto/qat/qat_sym_session.c
@@ -922,11 +922,20 @@ qat_sym_session_configure_auth(struct rte_cryptodev *dev,
session->qat_hash_alg = ICP_QAT_HW_AUTH_ALGO_AES_XCBC_MAC;
break;
case RTE_CRYPTO_AUTH_AES_CMAC:
-   session->qat_hash_alg = ICP_QAT_HW_AUTH_ALGO_AES_XCBC_MAC;
session->aes_cmac = 1;
-   if (internals->qat_dev->has_wireless_slice) {
-   is_wireless = 1;
-   session->is_wireless = 1;
+   if (!internals->qat_dev->has_wireless_slice) {
+   session->qat_hash_alg = 
ICP_QAT_HW_AUTH_ALGO_AES_XCBC_MAC;
+   break;
+   }
+   is_wireless = 1;
+   session->is_wireless = 1;
+   switch (key_length) {
+   case ICP_QAT_HW_AES_128_KEY_SZ:
+   session->qat_hash_alg = 
ICP_QAT_HW_AUTH_ALGO_AES_128_CMAC;
+   break;
+   default:
+   QAT_LOG(ERR, "Invalid key length: %d", key_length);
+   return -ENOTSUP;
}
break;
case RTE_CRYPTO_AUTH_AES_GMAC:
@@ -1309,6 +1318,9 @@ static int qat_hash_get_state1_size(enum 
icp_qat_hw_auth_algo qat_hash_alg)
case ICP_QAT_HW_AUTH_ALGO_NULL:
return QAT_HW_ROUND_UP(ICP_QAT_HW_NULL_STATE1_SZ,
QAT_HW_DEFAULT_ALIGNMENT);
+   case ICP_QAT_HW_AUTH_ALGO_AES_128_CMAC:
+   return QAT_HW_ROUND_UP(ICP_QAT_HW_AES_CMAC_STATE1_SZ,
+   QAT_HW_DEFAULT_ALIGNMENT);
case ICP_QAT_HW_AUTH_ALGO_DELIMITER:
/* return maximum state1 size in this case */
return QAT_HW_ROUND_UP(ICP_QAT_HW_SHA512_STATE1_SZ,
@@ -1345,6 +1357,7 @@ static int qat_hash_get_digest_size(enum 
icp_qat_hw_auth_algo qat_hash_alg)
case ICP_QAT_HW_AUTH_ALGO_MD5:
return ICP_QAT_HW_MD5_STATE1_SZ;
case ICP_QAT_HW_AUTH_ALGO_AES_XCBC_MAC:
+   case ICP_QAT_HW_AUTH_ALGO_AES_128_CMAC:
return ICP_QAT_HW_AES_XCBC_MAC_STATE1_SZ;
case ICP_QAT_HW_AUTH_ALGO_DELIMITER:
/* return maximum digest size in this case */
@@ -2353,6 +2366,7 @@ int qat_sym_cd_auth_set(struct qat_sym_session *cdesc,
|| cdesc->qat_hash_alg == ICP_QAT_HW_AUTH_ALGO_ZUC_256_MAC_64
|| cdesc->qat_hash_alg == ICP_QAT_HW_AUTH_ALGO_ZUC_256_MAC_128
|| cdesc->qat_hash_alg == ICP_QAT_HW_AUTH_ALGO_AES_XCBC_MAC
+   || cdesc->qat_hash_alg == ICP_QAT_HW_AUTH_ALGO_AES_128_CMAC
|| cdesc->qat_hash_alg == ICP_QAT_HW_AUTH_ALGO_AES_CBC_MAC
|| cdesc->qat_hash_alg == ICP_QAT_HW_AUTH_ALGO_NULL
|| cdesc->qat_hash_alg == ICP_QAT_HW_AUTH_ALGO_SM3
@@ -2593,6 +2607,12 @@ int qat_sym_cd_auth_set(struct qat_sym_session *cdesc,
return -EFAULT;
}
break;
+   case ICP_QAT_HW_AUTH_ALGO_AES_128_CMAC:
+   state1_size = ICP_QAT_HW_AES_CMAC_STATE1_SZ;
+ 

[PATCH v2 4/4] common/qat: add gen5 device

2024-02-23 Thread Ciara Power
Add new gen5 QAT device ID.
This device has a wireless slice, so we must set a flag to indicate
this wireless-enabled device.
Aside from the wireless slices and some extra capabilities for
wireless algorithms, the device is functionally the same as gen4 and can
reuse most functions and macros.

Symmetric, asymmetric and compression services are enabled.

Signed-off-by: Ciara Power 
---
v2:
  - Fixed setting extended protocol flag bit position.
  - Added slice map check for ZUC256 wireless slice.
  - Fixed IV modification for ZUC256 in raw datapath.
  - Added increment size for ZUC256 capabilities.
  - Added release note.
---
 doc/guides/cryptodevs/qat.rst|   4 +
 doc/guides/rel_notes/release_24_03.rst   |   6 +-
 drivers/common/qat/dev/qat_dev_gen4.c|  31 ++-
 drivers/common/qat/dev/qat_dev_gen5.c|  51 
 drivers/common/qat/dev/qat_dev_gens.h|  54 
 drivers/common/qat/meson.build   |   3 +
 drivers/common/qat/qat_common.h  |   1 +
 drivers/common/qat/qat_device.c  |   8 +-
 drivers/compress/qat/dev/qat_comp_pmd_gen4.c |   8 +-
 drivers/compress/qat/dev/qat_comp_pmd_gen5.c |  73 +
 drivers/compress/qat/dev/qat_comp_pmd_gens.h |  14 +
 drivers/crypto/qat/dev/qat_crypto_pmd_gen4.c |   4 +-
 drivers/crypto/qat/dev/qat_crypto_pmd_gen5.c | 278 +++
 drivers/crypto/qat/dev/qat_crypto_pmd_gens.h |   6 +
 drivers/crypto/qat/qat_sym_session.c |  13 +-
 15 files changed, 524 insertions(+), 30 deletions(-)
 create mode 100644 drivers/common/qat/dev/qat_dev_gen5.c
 create mode 100644 drivers/compress/qat/dev/qat_comp_pmd_gen5.c
 create mode 100644 drivers/crypto/qat/dev/qat_crypto_pmd_gen5.c

diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index 51190e12d6..28945bb5f3 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -27,6 +27,7 @@ poll mode crypto driver support for the following hardware 
accelerator devices:
 * ``Intel QuickAssist Technology C4xxx``
 * ``Intel QuickAssist Technology 4xxx``
 * ``Intel QuickAssist Technology 300xx``
+* ``Intel QuickAssist Technology 420xx``
 
 
 Features
@@ -179,6 +180,7 @@ poll mode crypto driver support for the following hardware 
accelerator devices:
 * ``Intel QuickAssist Technology 4xxx``
 * ``Intel QuickAssist Technology 401xxx``
 * ``Intel QuickAssist Technology 300xx``
+* ``Intel QuickAssist Technology 420xx``
 
 The QAT ASYM PMD has support for:
 
@@ -472,6 +474,8 @@ to see the full table)

+-+-+-+-+--+---+---+++--+++
| Yes | No  | No  | 4   | 402xx| IDZ/ N/A  | qat_4xxx  | 4xxx   
| 4944   | 2| 4945   | 16 |

+-+-+-+-+--+---+---+++--+++
+   | Yes | Yes | Yes | 5   | 420xx| linux/6.8+| qat_420xx | 420xx  
| 4946   | 2| 4947   | 16 |
+   
+-+-+-+-+--+---+---+++--+++
 
 * Note: Symmetric mixed crypto algorithms feature on Gen 2 works only with IDZ 
driver version 4.9.0+
 
diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 0dee1ff104..439d354cd8 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -133,8 +133,10 @@ New Features
 
 * **Updated Intel QuickAssist Technology driver.**
 
-  * Enabled support for new QAT GEN3 (578a) devices in QAT crypto driver.
-  * Enabled ZUC256 cipher and auth algorithm for wireless slice enabled GEN3 
device.
+  * Enabled support for new QAT GEN3 (578a) and QAT GEN5 (4946)
+devices in QAT crypto driver.
+  * Enabled ZUC256 cipher and auth algorithm for wireless slice
+enabled GEN3 and GEN5 devices.
 
 * **Updated Marvell cnxk crypto driver.**
 
diff --git a/drivers/common/qat/dev/qat_dev_gen4.c 
b/drivers/common/qat/dev/qat_dev_gen4.c
index 1ce262f715..2525e1e695 100644
--- a/drivers/common/qat/dev/qat_dev_gen4.c
+++ b/drivers/common/qat/dev/qat_dev_gen4.c
@@ -10,6 +10,7 @@
 #include "adf_transport_access_macros_gen4vf.h"
 #include "adf_pf2vf_msg.h"
 #include "qat_pf2vf.h"
+#include "qat_dev_gens.h"
 
 #include 
 
@@ -60,7 +61,7 @@ qat_select_valid_queue_gen4(struct qat_pci_device *qat_dev, 
int qp_id,
return -1;
 }
 
-static const struct qat_qp_hw_data *
+const struct qat_qp_hw_data *
 qat_qp_get_hw_data_gen4(struct qat_pci_device *qat_dev,
enum qat_service_type service_type, uint16_t qp_id)
 {
@@ -74,7 +75,7 @@ qat_qp_get_hw_data_gen4(struct qat_pci_device *qat_dev,
return &dev_extra->qp_gen4_data[ring_pair][0];
 }
 
-static int
+int
 qat_qp_rings_per_service_gen4(struct qat_pci_device *qat_dev,
enum qat_service_type service)
 {
@@ -103,7 +104,7 @@ gen4_pick_service(uint8_t hw_service)
}
 }
 
-stati

RE: [PATCH 1/4] common/qat: add files specific to GEN5

2024-02-23 Thread Power, Ciara



> -Original Message-
> From: Nayak, Nishikanta 
> Sent: Wednesday, December 20, 2023 1:26 PM
> To: dev@dpdk.org
> Cc: Ji, Kai ; Power, Ciara ; Kusztal,
> ArkadiuszX ; Nayak, Nishikanta
> ; Thomas Monjalon ;
> Burakov, Anatoly 
> Subject: [PATCH 1/4] common/qat: add files specific to GEN5
> 
> Adding GEN5 files for handling GEN5-specific operations.
> These files are inherited from the existing files/APIs, with some changes
> specific to GEN5 requirements. Also updated the mailmap file.
> 
> Signed-off-by: Nishikant Nayak 
> ---

A note on this one:
we will send a v2 of this patchset soon, renaming the device to GEN_LCE instead 
of GEN5.

This will avoid clashing with the patch I have just sent for another QAT 
device, which is named GEN5.
(https://patches.dpdk.org/project/dpdk/patch/20240223151255.3310490-5-ciara.po...@intel.com/)

Thanks,
Ciara


[PATCH v2] net/cnxk: support Tx queue descriptor count

2024-02-23 Thread skoteshwar
From: Satha Rao 

Added CNXK APIs to get used txq descriptor count.
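
For illustration only, the fast-path ethdev API this patch wires up (name
taken from the release note below) would be used roughly as:

int used = rte_eth_tx_queue_count(port_id, tx_queue_id);
if (used >= 0)
	printf("txq %u: %d descriptors in use\n", tx_queue_id, used);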

Signed-off-by: Satha Rao 
---

Depends-on: series-30833 ("ethdev: support Tx queue used count")

v2:
  Updated release notes and fixed API for CPT queues.

 doc/guides/nics/features/cnxk.ini  |  1 +
 doc/guides/rel_notes/release_24_03.rst |  1 +
 drivers/net/cnxk/cn10k_tx_select.c | 22 ++
 drivers/net/cnxk/cn9k_tx_select.c  | 23 +++
 drivers/net/cnxk/cnxk_ethdev.h | 24 
 5 files changed, 71 insertions(+)

diff --git a/doc/guides/nics/features/cnxk.ini 
b/doc/guides/nics/features/cnxk.ini
index 94e7a6a..ab18f38 100644
--- a/doc/guides/nics/features/cnxk.ini
+++ b/doc/guides/nics/features/cnxk.ini
@@ -40,6 +40,7 @@ Timesync = Y
 Timestamp offload= Y
 Rx descriptor status = Y
 Tx descriptor status = Y
+Tx queue count   = Y
 Basic stats  = Y
 Stats per queue  = Y
 Extended stats   = Y
diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 32d0ad8..5f8fb9e 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -110,6 +110,7 @@ New Features
 
   * Added support for ``RTE_FLOW_ITEM_TYPE_PPPOES`` flow item.
   * Added support for ``RTE_FLOW_ACTION_TYPE_SAMPLE`` flow item.
+  * Added support for fast path function ``rte_eth_tx_queue_count``.
 
 * **Updated Marvell OCTEON EP driver.**
 
diff --git a/drivers/net/cnxk/cn10k_tx_select.c 
b/drivers/net/cnxk/cn10k_tx_select.c
index 404f5ba..aa0620e 100644
--- a/drivers/net/cnxk/cn10k_tx_select.c
+++ b/drivers/net/cnxk/cn10k_tx_select.c
@@ -20,6 +20,24 @@
eth_dev->tx_pkt_burst;
 }
 
+#if defined(RTE_ARCH_ARM64)
+static int
+cn10k_nix_tx_queue_count(void *tx_queue)
+{
+   struct cn10k_eth_txq *txq = (struct cn10k_eth_txq *)tx_queue;
+
+   return cnxk_nix_tx_queue_count(txq->fc_mem, txq->sqes_per_sqb_log2);
+}
+
+static int
+cn10k_nix_tx_queue_sec_count(void *tx_queue)
+{
+   struct cn10k_eth_txq *txq = (struct cn10k_eth_txq *)tx_queue;
+
+   return cnxk_nix_tx_queue_sec_count(txq->fc_mem, txq->sqes_per_sqb_log2, 
txq->cpt_fc);
+}
+#endif
+
 void
 cn10k_eth_set_tx_function(struct rte_eth_dev *eth_dev)
 {
@@ -63,6 +81,10 @@
if (dev->tx_offloads & RTE_ETH_TX_OFFLOAD_MULTI_SEGS)
pick_tx_func(eth_dev, nix_eth_tx_vec_burst_mseg);
}
+   if (dev->tx_offloads & RTE_ETH_TX_OFFLOAD_SECURITY)
+   eth_dev->tx_queue_count = cn10k_nix_tx_queue_sec_count;
+   else
+   eth_dev->tx_queue_count = cn10k_nix_tx_queue_count;
 
rte_mb();
 #else
diff --git a/drivers/net/cnxk/cn9k_tx_select.c 
b/drivers/net/cnxk/cn9k_tx_select.c
index e08883f..0e09ed6 100644
--- a/drivers/net/cnxk/cn9k_tx_select.c
+++ b/drivers/net/cnxk/cn9k_tx_select.c
@@ -20,6 +20,24 @@
eth_dev->tx_pkt_burst;
 }
 
+#if defined(RTE_ARCH_ARM64)
+static int
+cn9k_nix_tx_queue_count(void *tx_queue)
+{
+   struct cn9k_eth_txq *txq = (struct cn9k_eth_txq *)tx_queue;
+
+   return cnxk_nix_tx_queue_count(txq->fc_mem, txq->sqes_per_sqb_log2);
+}
+
+static int
+cn9k_nix_tx_queue_sec_count(void *tx_queue)
+{
+   struct cn9k_eth_txq *txq = (struct cn9k_eth_txq *)tx_queue;
+
+   return cnxk_nix_tx_queue_sec_count(txq->fc_mem, txq->sqes_per_sqb_log2, 
txq->cpt_fc);
+}
+#endif
+
 void
 cn9k_eth_set_tx_function(struct rte_eth_dev *eth_dev)
 {
@@ -60,6 +78,11 @@
pick_tx_func(eth_dev, nix_eth_tx_vec_burst_mseg);
}
 
+   if (dev->tx_offloads & RTE_ETH_TX_OFFLOAD_SECURITY)
+   eth_dev->tx_queue_count = cn9k_nix_tx_queue_sec_count;
+   else
+   eth_dev->tx_queue_count = cn9k_nix_tx_queue_count;
+
rte_mb();
 #else
RTE_SET_USED(eth_dev);
diff --git a/drivers/net/cnxk/cnxk_ethdev.h b/drivers/net/cnxk/cnxk_ethdev.h
index 37b6395..f810bb8 100644
--- a/drivers/net/cnxk/cnxk_ethdev.h
+++ b/drivers/net/cnxk/cnxk_ethdev.h
@@ -458,6 +458,30 @@ struct cnxk_eth_txq_sp {
return ((struct cnxk_eth_txq_sp *)__txq) - 1;
 }
 
+static inline int
+cnxk_nix_tx_queue_count(uint64_t *mem, uint16_t sqes_per_sqb_log2)
+{
+   uint64_t val;
+
+   val = rte_atomic_load_explicit(mem, rte_memory_order_relaxed);
+   val = (val << sqes_per_sqb_log2) - val;
+
+   return (val & 0x);
+}
+
+static inline int
+cnxk_nix_tx_queue_sec_count(uint64_t *mem, uint16_t sqes_per_sqb_log2, 
uint64_t *sec_fc)
+{
+   uint64_t sq_cnt, sec_cnt, val;
+
+   sq_cnt = rte_atomic_load_explicit(mem, rte_memory_order_relaxed);
+   sq_cnt = (sq_cnt << sqes_per_sqb_log2) - sq_cnt;
+   sec_cnt = rte_atomic_load_explicit(sec_fc, rte_memory_order_relaxed);
+   val = RTE_MAX(sq_cnt, sec_cnt);
+
+   return (val & 0x);
+}
+
 /* Common ethdev ops */
 extern struct eth_dev_ops cnxk_eth_dev_ops;
 
-- 
1.8.3.1



[PATCH 1/1] net/octeon_ep: use devarg to enable ISM accesses

2024-02-23 Thread Vamsi Attunuru
Adds a devarg option to enable/disable ISM memory accesses
for reading packet count details. This option is disabled
by default, as ISM memory accesses affect the throughput of
bigger-size packets.

Signed-off-by: Vamsi Attunuru 
---
 doc/guides/nics/octeon_ep.rst | 12 
 drivers/net/octeon_ep/cnxk_ep_rx.h| 42 +-
 drivers/net/octeon_ep/cnxk_ep_tx.c| 42 ++
 drivers/net/octeon_ep/cnxk_ep_vf.c|  4 +--
 drivers/net/octeon_ep/otx2_ep_vf.c|  4 +--
 drivers/net/octeon_ep/otx_ep_common.h | 14 +++--
 drivers/net/octeon_ep/otx_ep_ethdev.c | 43 +++
 drivers/net/octeon_ep/otx_ep_rxtx.c   | 15 ++
 drivers/net/octeon_ep/otx_ep_rxtx.h   |  2 ++
 9 files changed, 153 insertions(+), 25 deletions(-)

diff --git a/doc/guides/nics/octeon_ep.rst b/doc/guides/nics/octeon_ep.rst
index b5040aeee2..befa0a4097 100644
--- a/doc/guides/nics/octeon_ep.rst
+++ b/doc/guides/nics/octeon_ep.rst
@@ -11,6 +11,18 @@ and **Cavium OCTEON** families of adapters in SR-IOV context.
 More information can be found at `Marvell Official Website
 
`_.
 
+Runtime Config Options
+--
+
+- ``Rx&Tx ISM memory accesses enable`` (default ``0``)
+
+   PMD supports 2 modes for checking Rx & Tx packet count, PMD may read the 
packet count directly
+   from hardware registers or it may read from ISM memory, this may be 
selected at runtime
+   using ``ism_enable`` ``devargs`` parameter.
+
+   For example::
+
+  -a 0002:02:00.0,ism_enable=1
 
 Prerequisites
 -
diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.h 
b/drivers/net/octeon_ep/cnxk_ep_rx.h
index 61263e651e..ecf95cd961 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.h
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.h
@@ -88,8 +88,9 @@ cnxk_ep_rx_refill(struct otx_ep_droq *droq)
 }
 
 static inline uint32_t
-cnxk_ep_check_rx_pkts(struct otx_ep_droq *droq)
+cnxk_ep_check_rx_ism_mem(void *rx_queue)
 {
+   struct otx_ep_droq *droq = (struct otx_ep_droq *)rx_queue;
uint32_t new_pkts;
uint32_t val;
 
@@ -98,8 +99,9 @@ cnxk_ep_check_rx_pkts(struct otx_ep_droq *droq)
 * number of PCIe writes.
 */
val = __atomic_load_n(droq->pkts_sent_ism, __ATOMIC_RELAXED);
-   new_pkts = val - droq->pkts_sent_ism_prev;
-   droq->pkts_sent_ism_prev = val;
+
+   new_pkts = val - droq->pkts_sent_prev;
+   droq->pkts_sent_prev = val;
 
if (val > RTE_BIT32(31)) {
/* Only subtract the packet count in the HW counter
@@ -113,11 +115,34 @@ cnxk_ep_check_rx_pkts(struct otx_ep_droq *droq)
rte_write64(OTX2_SDP_REQUEST_ISM, droq->pkts_sent_reg);
rte_mb();
}
-
-   droq->pkts_sent_ism_prev = 0;
+   droq->pkts_sent_prev = 0;
}
+
rte_write64(OTX2_SDP_REQUEST_ISM, droq->pkts_sent_reg);
-   droq->pkts_pending += new_pkts;
+
+   return new_pkts;
+}
+
+static inline uint32_t
+cnxk_ep_check_rx_pkt_reg(void *rx_queue)
+{
+   struct otx_ep_droq *droq = (struct otx_ep_droq *)rx_queue;
+   uint32_t new_pkts;
+   uint32_t val;
+
+   val = rte_read32(droq->pkts_sent_reg);
+
+   new_pkts = val - droq->pkts_sent_prev;
+   droq->pkts_sent_prev = val;
+
+   if (val > RTE_BIT32(31)) {
+   /* Only subtract the packet count in the HW counter
+* when count above halfway to saturation.
+*/
+   rte_write64((uint64_t)val, droq->pkts_sent_reg);
+   rte_mb();
+   droq->pkts_sent_prev = 0;
+   }
 
return new_pkts;
 }
@@ -125,8 +150,11 @@ cnxk_ep_check_rx_pkts(struct otx_ep_droq *droq)
 static inline int16_t __rte_hot
 cnxk_ep_rx_pkts_to_process(struct otx_ep_droq *droq, uint16_t nb_pkts)
 {
+   const otx_ep_check_pkt_count_t cnxk_rx_pkt_count[2] = { 
cnxk_ep_check_rx_pkt_reg,
+   
cnxk_ep_check_rx_ism_mem};
+
if (droq->pkts_pending < nb_pkts)
-   cnxk_ep_check_rx_pkts(droq);
+   droq->pkts_pending += cnxk_rx_pkt_count[droq->ism_ena](droq);
 
return RTE_MIN(nb_pkts, droq->pkts_pending);
 }
diff --git a/drivers/net/octeon_ep/cnxk_ep_tx.c 
b/drivers/net/octeon_ep/cnxk_ep_tx.c
index 9f11a2f317..98c0a861c3 100644
--- a/drivers/net/octeon_ep/cnxk_ep_tx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_tx.c
@@ -5,9 +5,10 @@
 #include "cnxk_ep_vf.h"
 #include "otx_ep_rxtx.h"
 
-static uint32_t
-cnxk_vf_update_read_index(struct otx_ep_instr_queue *iq)
+static inline uint32_t
+cnxk_ep_check_tx_ism_mem(void *tx_queue)
 {
+   struct otx_ep_instr_queue *iq = (struct otx_ep_instr_queue *)tx_queue;
uint32_t val;
 
/* Batch subtractions from the HW counter to reduce PCIe traffic
@@ -15,8 +16,8 @@ cnxk_vf_update_

[PATCH v5 02/39] eal: redefine macro to be integer literal for MSVC

2024-02-23 Thread Tyler Retzlaff
MSVC __declspec(align(#)) is limited and accepts only integer literals,
as opposed to constant expressions. Define XMM_SIZE to be 16 instead of
sizeof(xmm_t) and static_assert that sizeof(xmm_t) == 16 for
compatibility.
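
For illustration only, the constraint this works around:

/* Rejected by MSVC (align wants an integer literal, not an expression):
 *	__declspec(align(sizeof(xmm_t))) char buf[XMM_SIZE];
 * Accepted, with the size assumption checked once at compile time:
 */
__declspec(align(16)) char buf[16];
static_assert(sizeof(xmm_t) == 16, "");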

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/eal/x86/include/rte_vect.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/eal/x86/include/rte_vect.h b/lib/eal/x86/include/rte_vect.h
index a1a537e..441f1a0 100644
--- a/lib/eal/x86/include/rte_vect.h
+++ b/lib/eal/x86/include/rte_vect.h
@@ -11,6 +11,7 @@
  * RTE SSE/AVX related header.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -33,9 +34,11 @@
 
 typedef __m128i xmm_t;
 
-#defineXMM_SIZE(sizeof(xmm_t))
+#defineXMM_SIZE16
 #defineXMM_MASK(XMM_SIZE - 1)
 
+static_assert(sizeof(xmm_t) == 16, "");
+
 typedef union rte_xmm {
xmm_tx;
uint8_t  u8[XMM_SIZE / sizeof(uint8_t)];
-- 
1.8.3.1



[PATCH v5 01/39] eal: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Expand __rte_aligned(a) to __declspec(align(a)) when building
  with MSVC.

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).
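
For illustration only, condensing the two rules above with hypothetical
type and field names:

/* Type alignment: attribute placed between the keyword and the tag, which
 * both GCC/LLVM (__attribute__) and MSVC (__declspec) accept:
 */
struct __rte_cache_aligned foo {
	int x;
};	/* previously: struct foo { int x; } __rte_cache_aligned; */

struct bar {
	/* Field alignment: standard C11 alignas() (from <stdalign.h>)
	 * instead of a trailing GNU-style attribute:
	 */
	alignas(RTE_CACHE_LINE_SIZE) int counter;
};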

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/eal/arm/include/rte_vect.h   |  4 ++--
 lib/eal/common/malloc_elem.h |  4 ++--
 lib/eal/common/malloc_heap.h |  4 ++--
 lib/eal/common/rte_keepalive.c   |  3 ++-
 lib/eal/common/rte_random.c  |  4 ++--
 lib/eal/common/rte_service.c |  8 
 lib/eal/include/generic/rte_atomic.h |  4 ++--
 lib/eal/include/rte_common.h | 23 +++
 lib/eal/loongarch/include/rte_vect.h |  8 
 lib/eal/ppc/include/rte_vect.h   |  4 ++--
 lib/eal/riscv/include/rte_vect.h |  4 ++--
 lib/eal/x86/include/rte_vect.h   |  4 ++--
 lib/eal/x86/rte_power_intrinsics.c   | 10 ++
 13 files changed, 47 insertions(+), 37 deletions(-)

diff --git a/lib/eal/arm/include/rte_vect.h b/lib/eal/arm/include/rte_vect.h
index 8cfe4bd..c97d299 100644
--- a/lib/eal/arm/include/rte_vect.h
+++ b/lib/eal/arm/include/rte_vect.h
@@ -24,14 +24,14 @@
 #defineXMM_SIZE(sizeof(xmm_t))
 #defineXMM_MASK(XMM_SIZE - 1)
 
-typedef union rte_xmm {
+typedef union __rte_aligned(16) rte_xmm {
xmm_tx;
uint8_t  u8[XMM_SIZE / sizeof(uint8_t)];
uint16_t u16[XMM_SIZE / sizeof(uint16_t)];
uint32_t u32[XMM_SIZE / sizeof(uint32_t)];
uint64_t u64[XMM_SIZE / sizeof(uint64_t)];
double   pd[XMM_SIZE / sizeof(double)];
-} __rte_aligned(16) rte_xmm_t;
+} rte_xmm_t;
 
 #if defined(RTE_ARCH_ARM) && defined(RTE_ARCH_32)
 /* NEON intrinsic vqtbl1q_u8() is not supported in ARMv7-A(AArch32) */
diff --git a/lib/eal/common/malloc_elem.h b/lib/eal/common/malloc_elem.h
index 952ce73..c7ff671 100644
--- a/lib/eal/common/malloc_elem.h
+++ b/lib/eal/common/malloc_elem.h
@@ -20,7 +20,7 @@ enum elem_state {
ELEM_PAD  /* element is a padding-only header */
 };
 
-struct malloc_elem {
+struct __rte_cache_aligned malloc_elem {
struct malloc_heap *heap;
struct malloc_elem *volatile prev;
/**< points to prev elem in memseg */
@@ -48,7 +48,7 @@ struct malloc_elem {
size_t user_size;
uint64_t asan_cookie[2]; /* must be next to header_cookie */
 #endif
-} __rte_cache_aligned;
+};
 
 static const unsigned int MALLOC_ELEM_HEADER_LEN = sizeof(struct malloc_elem);
 
diff --git a/lib/eal/common/malloc_heap.h b/lib/eal/common/malloc_heap.h
index 8f3ab57..0c49588 100644
--- a/lib/eal/common/malloc_heap.h
+++ b/lib/eal/common/malloc_heap.h
@@ -21,7 +21,7 @@
 /**
  * Structure to hold malloc heap
  */
-struct malloc_heap {
+struct __rte_cache_aligned malloc_heap {
rte_spinlock_t lock;
LIST_HEAD(, malloc_elem) free_head[RTE_HEAP_NUM_FREELISTS];
struct malloc_elem *volatile first;
@@ -31,7 +31,7 @@ struct malloc_heap {
unsigned int socket_id;
size_t total_size;
char name[RTE_HEAP_NAME_MAX_LEN];
-} __rte_cache_aligned;
+};
 
 void *
 malloc_heap_alloc(const char *type, size_t size, int socket, unsigned int 
flags,
diff --git a/lib/eal/common/rte_keepalive.c b/lib/eal/common/rte_keepalive.c
index f6db973..391c1be 100644
--- a/lib/eal/common/rte_keepalive.c
+++ b/lib/eal/common/rte_keepalive.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2015-2016 Intel Corporation
  */
 
+#include <stdalign.h>
 #include 
 
 #include 
@@ -19,7 +20,7 @@ struct rte_keepalive {
/*
 * Each element must be cache aligned to prevent false sharing.
 */
-   enum rte_keepalive_state core_state __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) enum rte_keepalive_state 
core_state;
} live_data[RTE_KEEPALIVE_MAXCORES];
 
/** Last-seen-alive timestamps */
diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 7709b8f..90e91b3 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -13,14 +13,14 @@
 #include 
 #include 
 
-struct rte_rand_state {
+struct __rte_cache_aligned rte_rand_state {

[PATCH v5 03/39] stack: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/stack/rte_stack.h | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/lib/stack/rte_stack.h b/lib/stack/rte_stack.h
index a379300..8ff0659 100644
--- a/lib/stack/rte_stack.h
+++ b/lib/stack/rte_stack.h
@@ -15,6 +15,8 @@
 #ifndef _RTE_STACK_H_
 #define _RTE_STACK_H_
 
+#include <stdalign.h>
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -42,7 +44,7 @@ struct rte_stack_lf_head {
 
 struct rte_stack_lf_list {
/** List head */
-   struct rte_stack_lf_head head __rte_aligned(16);
+   alignas(16) struct rte_stack_lf_head head;
/** List len */
RTE_ATOMIC(uint64_t) len;
 };
@@ -52,11 +54,11 @@ struct rte_stack_lf_list {
  */
 struct rte_stack_lf {
/** LIFO list of elements */
-   struct rte_stack_lf_list used __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_stack_lf_list used;
/** LIFO list of free elements */
-   struct rte_stack_lf_list free __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_stack_lf_list free;
/** LIFO elements */
-   struct rte_stack_lf_elem elems[] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_stack_lf_elem elems[];
 };
 
 /* Structure containing the LIFO, its current length, and a lock for mutual
@@ -71,9 +73,9 @@ struct rte_stack_std {
 /* The RTE stack structure contains the LIFO structure itself, plus metadata
  * such as its name and memzone pointer.
  */
-struct rte_stack {
+struct __rte_cache_aligned rte_stack {
/** Name of the stack. */
-   char name[RTE_STACK_NAMESIZE] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) char name[RTE_STACK_NAMESIZE];
/** Memzone containing the rte_stack structure. */
const struct rte_memzone *memzone;
uint32_t capacity; /**< Usable size of the stack. */
@@ -82,7 +84,7 @@ struct rte_stack {
struct rte_stack_lf stack_lf; /**< Lock-free LIFO structure. */
struct rte_stack_std stack_std; /**< LIFO structure. */
};
-} __rte_cache_aligned;
+};
 
 /**
  * The stack uses lock-free push and pop functions. This flag is only
-- 
1.8.3.1



[PATCH v5 04/39] sched: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

Replace use of __rte_aligned_16 with C11 alignas(16) and garbage collect
the __rte_aligned_16 macro which was only used once.
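
Illustratively (reduced stand-in fields, not the real rte_sched_subport
layout), the replacement amounts to:

#include <stdalign.h>
#include <stdint.h>

/* Before:  uint32_t grinder_base_bmp_pos[N] __rte_aligned_16;
 * After, with the one-off macro gone and standard C11 used directly:
 */
struct sched_example {
	alignas(16) uint32_t grinder_base_bmp_pos[8];
};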

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/sched/rte_sched.c| 21 +++--
 lib/sched/rte_sched_common.h |  2 --
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/lib/sched/rte_sched.c b/lib/sched/rte_sched.c
index d90aa53..bbdb5d1 100644
--- a/lib/sched/rte_sched.c
+++ b/lib/sched/rte_sched.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include <stdalign.h>
 #include 
 #include 
 
@@ -57,7 +58,7 @@ struct rte_sched_pipe_profile {
uint8_t  wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
 };
 
-struct rte_sched_pipe {
+struct __rte_cache_aligned rte_sched_pipe {
/* Token bucket (TB) */
uint64_t tb_time; /* time of last update */
uint64_t tb_credits;
@@ -75,7 +76,7 @@ struct rte_sched_pipe {
/* TC oversubscription */
uint64_t tc_ov_credits;
uint8_t tc_ov_period_id;
-} __rte_cache_aligned;
+};
 
 struct rte_sched_queue {
uint16_t qw;
@@ -145,7 +146,7 @@ struct rte_sched_grinder {
uint8_t wrr_cost[RTE_SCHED_BE_QUEUES_PER_PIPE];
 };
 
-struct rte_sched_subport {
+struct __rte_cache_aligned rte_sched_subport {
/* Token bucket (TB) */
uint64_t tb_time; /* time of last update */
uint64_t tb_credits;
@@ -164,7 +165,7 @@ struct rte_sched_subport {
double tc_ov_rate;
 
/* Statistics */
-   struct rte_sched_subport_stats stats __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_sched_subport_stats stats;
 
/* subport profile */
uint32_t profile;
@@ -193,7 +194,7 @@ struct rte_sched_subport {
 
/* Bitmap */
struct rte_bitmap *bmp;
-   uint32_t grinder_base_bmp_pos[RTE_SCHED_PORT_N_GRINDERS] 
__rte_aligned_16;
+   alignas(16) uint32_t grinder_base_bmp_pos[RTE_SCHED_PORT_N_GRINDERS];
 
/* Grinders */
struct rte_sched_grinder grinder[RTE_SCHED_PORT_N_GRINDERS];
@@ -212,10 +213,10 @@ struct rte_sched_subport {
struct rte_sched_pipe_profile *pipe_profiles;
uint8_t *bmp_array;
struct rte_mbuf **queue_array;
-   uint8_t memory[0] __rte_cache_aligned;
-} __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint8_t memory[0];
+};
 
-struct rte_sched_port {
+struct __rte_cache_aligned rte_sched_port {
/* User parameters */
uint32_t n_subports_per_port;
uint32_t n_pipes_per_subport;
@@ -244,8 +245,8 @@ struct rte_sched_port {
 
/* Large data structures */
struct rte_sched_subport_profile *subport_profiles;
-   struct rte_sched_subport *subports[0] __rte_cache_aligned;
-} __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_sched_subport *subports[0];
+};
 
 enum rte_sched_subport_array {
e_RTE_SCHED_SUBPORT_ARRAY_PIPE = 0,
diff --git a/lib/sched/rte_sched_common.h b/lib/sched/rte_sched_common.h
index 419700b..573d164 100644
--- a/lib/sched/rte_sched_common.h
+++ b/lib/sched/rte_sched_common.h
@@ -12,8 +12,6 @@
 #include 
 #include 
 
-#define __rte_aligned_16 __rte_aligned(16)
-
 #if 0
 static inline uint32_t
 rte_min_pos_4_u16(uint16_t *x)
-- 
1.8.3.1



[PATCH v5 06/39] pipeline: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/pipeline/rte_pipeline.c   |  4 ++--
 lib/pipeline/rte_port_in_action.c |  3 ++-
 lib/pipeline/rte_swx_ipsec.c  |  4 +++-
 lib/pipeline/rte_table_action.c   | 24 
 4 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/lib/pipeline/rte_pipeline.c b/lib/pipeline/rte_pipeline.c
index 945bb02..a09a89f 100644
--- a/lib/pipeline/rte_pipeline.c
+++ b/lib/pipeline/rte_pipeline.c
@@ -104,7 +104,7 @@ struct rte_table {
 
 #define RTE_PIPELINE_MAX_NAME_SZ   124
 
-struct rte_pipeline {
+struct __rte_cache_aligned rte_pipeline {
/* Input parameters */
char name[RTE_PIPELINE_MAX_NAME_SZ];
int socket_id;
@@ -132,7 +132,7 @@ struct rte_pipeline {
uint64_t pkts_mask;
uint64_t n_pkts_ah_drop;
uint64_t pkts_drop_mask;
-} __rte_cache_aligned;
+};
 
 static inline uint32_t
 rte_mask_get_next(uint64_t mask, uint32_t pos)
diff --git a/lib/pipeline/rte_port_in_action.c 
b/lib/pipeline/rte_port_in_action.c
index 5818973..bbacaff 100644
--- a/lib/pipeline/rte_port_in_action.c
+++ b/lib/pipeline/rte_port_in_action.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2010-2018 Intel Corporation
  */
 
+#include <stdalign.h>
 #include 
 #include 
 
@@ -282,7 +283,7 @@ struct rte_port_in_action_profile *
 struct rte_port_in_action {
struct ap_config cfg;
struct ap_data data;
-   uint8_t memory[0] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint8_t memory[0];
 };
 
 static __rte_always_inline void *
diff --git a/lib/pipeline/rte_swx_ipsec.c b/lib/pipeline/rte_swx_ipsec.c
index 28576c2..76b853f 100644
--- a/lib/pipeline/rte_swx_ipsec.c
+++ b/lib/pipeline/rte_swx_ipsec.c
@@ -1,6 +1,8 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2022 Intel Corporation
  */
+
+#include <stdalign.h>
 #include 
 #include 
 #include 
@@ -154,7 +156,7 @@ struct rte_swx_ipsec {
/*
 * Table memory.
 */
-   uint8_t memory[] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint8_t memory[];
 };
 
 static inline struct ipsec_sa *
diff --git a/lib/pipeline/rte_table_action.c b/lib/pipeline/rte_table_action.c
index dfdbc66..87c3e0e 100644
--- a/lib/pipeline/rte_table_action.c
+++ b/lib/pipeline/rte_table_action.c
@@ -465,11 +465,11 @@ struct encap_qinq_data {
uint64_t)(s)) & 0x1LLU) << 8) |\
(((uint64_t)(ttl)) & 0xFFLLU)))
 
-struct encap_mpls_data {
+struct __rte_aligned(2) encap_mpls_data {
struct rte_ether_hdr ether;
uint32_t mpls[RTE_TABLE_ACTION_MPLS_LABELS_MAX];
uint32_t mpls_count;
-} __rte_packed __rte_aligned(2);
+} __rte_packed;
 
 #define PPP_PROTOCOL_IP0x0021
 
@@ -487,42 +487,42 @@ struct encap_pppoe_data {
 
 #define IP_PROTO_UDP   17
 
-struct encap_vxlan_ipv4_data {
+struct __rte_aligned(2) encap_vxlan_ipv4_data {
struct rte_ether_hdr ether;
struct rte_ipv4_hdr ipv4;
struct rte_udp_hdr udp;
struct rte_vxlan_hdr vxlan;
-} __rte_packed __rte_aligned(2);
+} __rte_packed;
 
-struct encap_vxlan_ipv4_vlan_data {
+struct __rte_aligned(2) encap_vxlan_ipv4_vlan_data {
struct rte_ether_hdr ether;
struct rte_vlan_hdr vlan;
struct rte_ipv4_hdr ipv4;
struct rte_udp_hdr udp;
struct rte_vxlan_hdr vxlan;
-} __rte_packed __rte_aligned(2);
+} __rte_packed;
 
-struct encap_vxlan_ipv6_data {
+struct __rte_aligned(2) encap_vxlan_ipv6_data {
struct rte_ether_hdr ether;
struct rte_ipv6_hdr ipv6;
struct rte_udp_hdr udp;
struct rte_vxlan_hdr vxlan;
-} __rte_packed __rte_aligned(2);
+} __rte_packed;
 
-struct encap_vxlan_ipv6_vlan_data {
+struct __rte_aligned(2) encap_vxlan_ipv6_vlan_data {
struct rte_ether_hdr ether;
struct rte_vlan_hdr vlan;
struct rte_ipv6_hdr ipv6;
struct rte_udp_hdr udp;
struct rte_vxlan_hdr vxlan;
-} __rte_packed __rte_aligned(2);

[PATCH v5 05/39] ring: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/ring/rte_ring_core.h| 16 +---
 lib/ring/rte_ring_peek_zc.h |  4 ++--
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/lib/ring/rte_ring_core.h b/lib/ring/rte_ring_core.h
index b770873..497d535 100644
--- a/lib/ring/rte_ring_core.h
+++ b/lib/ring/rte_ring_core.h
@@ -19,6 +19,8 @@
  * instead.
  */
 
+#include <stdalign.h>
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -78,7 +80,7 @@ struct rte_ring_headtail {
 
 union __rte_ring_rts_poscnt {
/** raw 8B value to read/write *cnt* and *pos* as one atomic op */
-   RTE_ATOMIC(uint64_t) raw __rte_aligned(8);
+   alignas(8) RTE_ATOMIC(uint64_t) raw;
struct {
uint32_t cnt; /**< head/tail reference counter */
uint32_t pos; /**< head/tail position */
@@ -94,7 +96,7 @@ struct rte_ring_rts_headtail {
 
 union __rte_ring_hts_pos {
/** raw 8B value to read/write *head* and *tail* as one atomic op */
-   RTE_ATOMIC(uint64_t) raw __rte_aligned(8);
+   alignas(8) RTE_ATOMIC(uint64_t) raw;
struct {
RTE_ATOMIC(uint32_t) head; /**< head position */
RTE_ATOMIC(uint32_t) tail; /**< tail position */
@@ -117,7 +119,7 @@ struct rte_ring_hts_headtail {
  * a problem.
  */
 struct rte_ring {
-   char name[RTE_RING_NAMESIZE] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) char name[RTE_RING_NAMESIZE];
/**< Name of the ring. */
int flags;   /**< Flags supplied at creation. */
const struct rte_memzone *memzone;
@@ -129,20 +131,20 @@ struct rte_ring {
RTE_CACHE_GUARD;
 
/** Ring producer status. */
-   union {
+   union __rte_cache_aligned {
struct rte_ring_headtail prod;
struct rte_ring_hts_headtail hts_prod;
struct rte_ring_rts_headtail rts_prod;
-   }  __rte_cache_aligned;
+   };
 
RTE_CACHE_GUARD;
 
/** Ring consumer status. */
-   union {
+   union __rte_cache_aligned {
struct rte_ring_headtail cons;
struct rte_ring_hts_headtail hts_cons;
struct rte_ring_rts_headtail rts_cons;
-   }  __rte_cache_aligned;
+   };
 
RTE_CACHE_GUARD;
 };
diff --git a/lib/ring/rte_ring_peek_zc.h b/lib/ring/rte_ring_peek_zc.h
index 8fb279c..0b5e34b 100644
--- a/lib/ring/rte_ring_peek_zc.h
+++ b/lib/ring/rte_ring_peek_zc.h
@@ -79,7 +79,7 @@
  * This structure contains the pointers and length of the space
  * reserved on the ring storage.
  */
-struct rte_ring_zc_data {
+struct __rte_cache_aligned rte_ring_zc_data {
/* Pointer to the first space in the ring */
void *ptr1;
/* Pointer to the second space in the ring if there is wrap-around.
@@ -92,7 +92,7 @@ struct rte_ring_zc_data {
 * will give the number of elements available at ptr2.
 */
unsigned int n1;
-} __rte_cache_aligned;
+};
 
 static __rte_always_inline void
 __rte_ring_get_elem_addr(struct rte_ring *r, uint32_t head,
-- 
1.8.3.1



[PATCH v5 08/39] mbuf: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/mbuf/rte_mbuf_core.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/mbuf/rte_mbuf_core.h b/lib/mbuf/rte_mbuf_core.h
index 5688683..917a811 100644
--- a/lib/mbuf/rte_mbuf_core.h
+++ b/lib/mbuf/rte_mbuf_core.h
@@ -463,7 +463,7 @@ enum {
 /**
  * The generic rte_mbuf, containing a packet mbuf.
  */
-struct rte_mbuf {
+struct __rte_cache_aligned rte_mbuf {
RTE_MARKER cacheline0;
 
void *buf_addr;   /**< Virtual address of segment buffer. */
@@ -476,7 +476,7 @@ struct rte_mbuf {
 * same mbuf cacheline0 layout for 32-bit and 64-bit. This makes
 * working on vector drivers easier.
 */
-   rte_iova_t buf_iova __rte_aligned(sizeof(rte_iova_t));
+   alignas(sizeof(rte_iova_t)) rte_iova_t buf_iova;
 #else
/**
 * Next segment of scattered packet.
@@ -662,7 +662,7 @@ struct rte_mbuf {
uint16_t timesync;
 
uint32_t dynfield1[9]; /**< Reserved for dynamic fields. */
-} __rte_cache_aligned;
+};
 
 /**
  * Function typedef of callback to free externally attached buffer.
-- 
1.8.3.1



[PATCH v5 10/39] eventdev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/eventdev/event_timer_adapter_pmd.h  |  4 ++--
 lib/eventdev/eventdev_pmd.h |  8 
 lib/eventdev/rte_event_crypto_adapter.c | 16 
 lib/eventdev/rte_event_dma_adapter.c| 16 
 lib/eventdev/rte_event_eth_rx_adapter.c |  8 
 lib/eventdev/rte_event_eth_tx_adapter.c |  4 ++--
 lib/eventdev/rte_event_timer_adapter.c  |  9 +
 lib/eventdev/rte_event_timer_adapter.h  |  8 
 lib/eventdev/rte_eventdev.h |  8 
 lib/eventdev/rte_eventdev_core.h|  4 ++--
 10 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/lib/eventdev/event_timer_adapter_pmd.h 
b/lib/eventdev/event_timer_adapter_pmd.h
index 65b421b..cd5127f 100644
--- a/lib/eventdev/event_timer_adapter_pmd.h
+++ b/lib/eventdev/event_timer_adapter_pmd.h
@@ -86,7 +86,7 @@ struct event_timer_adapter_ops {
  * @internal Adapter data; structure to be placed in shared memory to be
  * accessible by various processes in a multi-process configuration.
  */
-struct rte_event_timer_adapter_data {
+struct __rte_cache_aligned rte_event_timer_adapter_data {
uint8_t id;
/**< Event timer adapter ID */
uint8_t event_dev_id;
@@ -110,7 +110,7 @@ struct rte_event_timer_adapter_data {
 
uint8_t started : 1;
/**< Flag to indicate adapter started. */
-} __rte_cache_aligned;
+};
 
 #ifdef __cplusplus
 }
diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index c415624..3934d8e 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -107,7 +107,7 @@ struct rte_eventdev_global {
  * This structure is safe to place in shared memory to be common among
  * different processes in a multi-process configuration.
  */
-struct rte_eventdev_data {
+struct __rte_cache_aligned rte_eventdev_data {
int socket_id;
/**< Socket ID where memory is allocated */
uint8_t dev_id;
@@ -146,10 +146,10 @@ struct rte_eventdev_data {
 
uint64_t reserved_64s[4]; /**< Reserved for future fields */
void *reserved_ptrs[4];   /**< Reserved for future fields */
-} __rte_cache_aligned;
+};
 
 /** @internal The data structure associated with each event device. */
-struct rte_eventdev {
+struct __rte_cache_aligned rte_eventdev {
struct rte_eventdev_data *data;
/**< Pointer to device data */
struct eventdev_ops *dev_ops;
@@ -189,7 +189,7 @@ struct rte_eventdev {
 
uint64_t reserved_64s[3]; /**< Reserved for future fields */
void *reserved_ptrs[3];   /**< Reserved for future fields */
-} __rte_cache_aligned;
+};
 
 extern struct rte_eventdev *rte_eventdevs;
 /** @internal The pool of rte_eventdev structures. */
diff --git a/lib/eventdev/rte_event_crypto_adapter.c 
b/lib/eventdev/rte_event_crypto_adapter.c
index d46595d..6bc2769 100644
--- a/lib/eventdev/rte_event_crypto_adapter.c
+++ b/lib/eventdev/rte_event_crypto_adapter.c
@@ -42,7 +42,7 @@
 
 #define ECA_ADAPTER_ARRAY "crypto_adapter_array"
 
-struct crypto_ops_circular_buffer {
+struct __rte_cache_aligned crypto_ops_circular_buffer {
/* index of head element in circular buffer */
uint16_t head;
/* index of tail element in circular buffer */
@@ -53,9 +53,9 @@ struct crypto_ops_circular_buffer {
uint16_t size;
/* Pointer to hold rte_crypto_ops for batching */
struct rte_crypto_op **op_buffer;
-} __rte_cache_aligned;
+};
 
-struct event_crypto_adapter {
+struct __rte_cache_aligned event_crypto_adapter {
/* Event device identifier */
uint8_t eventdev_id;
/* Event port identifier */
@@ -98,10 +98,10 @@ struct event_crypto_adapter {
uint16_t nb_qps;
/* Adapter mode */
enum rte_event_crypto_adapter_mode mode;
-} __rte_cache_aligned;
+};
 
 /* Per crypto device information */
-struct crypto_device_info {
+struct __rte_cache_aligned crypto_device_info {
/* Pointer to cryptodev */
struc

[PATCH v5 11/39] ethdev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/ethdev/ethdev_driver.h   |  8 
 lib/ethdev/rte_ethdev.h  | 16 
 lib/ethdev/rte_ethdev_core.h |  4 ++--
 lib/ethdev/rte_flow_driver.h |  4 ++--
 4 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 0e4c1f0..bab3a8c 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -48,7 +48,7 @@ struct rte_eth_rxtx_callback {
  * memory. This split allows the function pointer and driver data to be per-
  * process, while the actual configuration data for the device is shared.
  */
-struct rte_eth_dev {
+struct __rte_cache_aligned rte_eth_dev {
eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function */
eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function */
 
@@ -93,7 +93,7 @@ struct rte_eth_dev {
 
enum rte_eth_dev_state state; /**< Flag indicating the port state */
void *security_ctx; /**< Context for security ops */
-} __rte_cache_aligned;
+};
 
 struct rte_eth_dev_sriov;
 struct rte_eth_dev_owner;
@@ -104,7 +104,7 @@ struct rte_eth_dev {
  * device. This structure is safe to place in shared memory to be common
  * among different processes in a multi-process configuration.
  */
-struct rte_eth_dev_data {
+struct __rte_cache_aligned rte_eth_dev_data {
char name[RTE_ETH_NAME_MAX_LEN]; /**< Unique identifier name */
 
void **rx_queues; /**< Array of pointers to Rx queues */
@@ -190,7 +190,7 @@ struct rte_eth_dev_data {
uint16_t backer_port_id;
 
pthread_mutex_t flow_ops_mutex; /**< rte_flow ops mutex */
-} __rte_cache_aligned;
+};
 
 /**
  * @internal
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index ed27360..2a92953 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -333,12 +333,12 @@ struct rte_eth_stats {
  * A structure used to retrieve link-level information of an Ethernet port.
  */
 __extension__
-struct rte_eth_link {
+struct __rte_aligned(8) rte_eth_link {
uint32_t link_speed;/**< RTE_ETH_SPEED_NUM_ */
uint16_t link_duplex  : 1;  /**< RTE_ETH_LINK_[HALF/FULL]_DUPLEX */
uint16_t link_autoneg : 1;  /**< RTE_ETH_LINK_[AUTONEG/FIXED] */
uint16_t link_status  : 1;  /**< RTE_ETH_LINK_[DOWN/UP] */
-} __rte_aligned(8);  /**< aligned for atomic64 read/write */
+};  /**< aligned for atomic64 read/write */
 
 /**@{@name Link negotiation
  * Constants used in link management.
@@ -1836,7 +1836,7 @@ struct rte_eth_dev_info {
  * Ethernet device Rx queue information structure.
  * Used to retrieve information about configured queue.
  */
-struct rte_eth_rxq_info {
+struct __rte_cache_min_aligned rte_eth_rxq_info {
struct rte_mempool *mp; /**< mempool used by that queue. */
struct rte_eth_rxconf conf; /**< queue config parameters. */
uint8_t scattered_rx;   /**< scattered packets Rx supported. */
@@ -1850,17 +1850,17 @@ struct rte_eth_rxq_info {
 * Value 0 means that the threshold monitoring is disabled.
 */
uint8_t avail_thresh;
-} __rte_cache_min_aligned;
+};
 
 /**
  * Ethernet device Tx queue information structure.
  * Used to retrieve information about configured queue.
  */
-struct rte_eth_txq_info {
+struct __rte_cache_min_aligned rte_eth_txq_info {
struct rte_eth_txconf conf; /**< queue config parameters. */
uint16_t nb_desc;   /**< configured number of TXDs. */
uint8_t queue_state;/**< one of RTE_ETH_QUEUE_STATE_*. */
-} __rte_cache_min_aligned;
+};
 
 /**
  * @warning
@@ -1870,7 +1870,7 @@ struct rte_eth_txq_info {
  * Used to retrieve Rx queue information when Tx queue reusing mbufs and moving
  * them into Rx mbuf ring.
  */
-struct rte_eth_recycle_rxq_info {
+struct __rte_cache_min_aligned rte_eth_recycle_rxq_info {
struct rte_mbuf **mbuf_ring; /**< mbuf ring of Rx queue. */
struct rte_mempool *m

[PATCH v5 12/39] dmadev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
Acked-by: Chengwen Feng 
---
 lib/dmadev/rte_dmadev_core.h | 4 ++--
 lib/dmadev/rte_dmadev_pmd.h  | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
index e8239c2..29f5251 100644
--- a/lib/dmadev/rte_dmadev_core.h
+++ b/lib/dmadev/rte_dmadev_core.h
@@ -61,7 +61,7 @@ typedef uint16_t (*rte_dma_completed_status_t)(void 
*dev_private,
  * The 'dev_private' field was placed in the first cache line to optimize
  * performance because the PMD mainly depends on this field.
  */
-struct rte_dma_fp_object {
+struct __rte_cache_aligned rte_dma_fp_object {
/** PMD-specific private data. The driver should copy
 * rte_dma_dev.data->dev_private to this field during initialization.
 */
@@ -73,7 +73,7 @@ struct rte_dma_fp_object {
rte_dma_completed_tcompleted;
rte_dma_completed_status_t completed_status;
rte_dma_burst_capacity_t   burst_capacity;
-} __rte_cache_aligned;
+};
 
 extern struct rte_dma_fp_object *rte_dma_fp_objs;
 
diff --git a/lib/dmadev/rte_dmadev_pmd.h b/lib/dmadev/rte_dmadev_pmd.h
index 7f354f6..5872908 100644
--- a/lib/dmadev/rte_dmadev_pmd.h
+++ b/lib/dmadev/rte_dmadev_pmd.h
@@ -94,7 +94,7 @@ struct rte_dma_dev_ops {
  *
  * @see struct rte_dma_dev::data
  */
-struct rte_dma_dev_data {
+struct __rte_cache_aligned rte_dma_dev_data {
char dev_name[RTE_DEV_NAME_MAX_LEN]; /**< Unique identifier name */
int16_t dev_id; /**< Device [external] identifier. */
int16_t numa_node; /**< Local NUMA memory ID. -1 if unknown. */
@@ -103,7 +103,7 @@ struct rte_dma_dev_data {
__extension__
uint8_t dev_started : 1; /**< Device state: STARTED(1)/STOPPED(0). */
uint64_t reserved[2]; /**< Reserved for future fields */
-} __rte_cache_aligned;
+};
 
 /**
  * Possible states of a DMA device.
@@ -122,7 +122,7 @@ enum rte_dma_dev_state {
  * @internal
  * The generic data structure associated with each DMA device.
  */
-struct rte_dma_dev {
+struct __rte_cache_aligned rte_dma_dev {
/** Device info which supplied during device initialization. */
struct rte_device *device;
struct rte_dma_dev_data *data; /**< Pointer to shared device data. */
@@ -132,7 +132,7 @@ struct rte_dma_dev {
const struct rte_dma_dev_ops *dev_ops;
enum rte_dma_dev_state state; /**< Flag indicating the device state. */
uint64_t reserved[2]; /**< Reserved for future fields. */
-} __rte_cache_aligned;
+};
 
 /**
  * @internal
-- 
1.8.3.1



[PATCH v5 09/39] hash: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/hash/rte_cuckoo_hash.h | 16 +---
 lib/hash/rte_thash.c   |  4 +++-
 lib/hash/rte_thash.h   |  8 
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/lib/hash/rte_cuckoo_hash.h b/lib/hash/rte_cuckoo_hash.h
index 8ea793c..a528f1d 100644
--- a/lib/hash/rte_cuckoo_hash.h
+++ b/lib/hash/rte_cuckoo_hash.h
@@ -11,6 +11,8 @@
 #ifndef _RTE_CUCKOO_HASH_H_
 #define _RTE_CUCKOO_HASH_H_
 
+#include <stdalign.h>
+
 #if defined(RTE_ARCH_X86)
 #include "rte_cmp_x86.h"
 #endif
@@ -117,10 +119,10 @@ enum cmp_jump_table_case {
 
 #define RTE_HASH_TSX_MAX_RETRY  10
 
-struct lcore_cache {
+struct __rte_cache_aligned lcore_cache {
unsigned len; /**< Cache len */
uint32_t objs[LCORE_CACHE_SIZE]; /**< Cache objects */
-} __rte_cache_aligned;
+};
 
 /* Structure that stores key-value pair */
 struct rte_hash_key {
@@ -141,7 +143,7 @@ enum rte_hash_sig_compare_function {
 };
 
 /** Bucket structure */
-struct rte_hash_bucket {
+struct __rte_cache_aligned rte_hash_bucket {
uint16_t sig_current[RTE_HASH_BUCKET_ENTRIES];
 
RTE_ATOMIC(uint32_t) key_idx[RTE_HASH_BUCKET_ENTRIES];
@@ -149,10 +151,10 @@ struct rte_hash_bucket {
uint8_t flag[RTE_HASH_BUCKET_ENTRIES];
 
void *next;
-} __rte_cache_aligned;
+};
 
 /** A hash table structure. */
-struct rte_hash {
+struct __rte_cache_aligned rte_hash {
char name[RTE_HASH_NAMESIZE];   /**< Name of the hash. */
uint32_t entries;   /**< Total table entries. */
uint32_t num_buckets;   /**< Number of buckets in table. */
@@ -170,7 +172,7 @@ struct rte_hash {
 
/* Fields used in lookup */
 
-   uint32_t key_len __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint32_t key_len;
/**< Length of hash key. */
uint8_t hw_trans_mem_support;
/**< If hardware transactional memory is used. */
@@ -220,7 +222,7 @@ struct rte_hash {
uint32_t *ext_bkt_to_free;
RTE_ATOMIC(uint32_t) *tbl_chng_cnt;
/**< Indicates if the hash table changed from last read. */
-} __rte_cache_aligned;
+};
 
 struct queue_node {
struct rte_hash_bucket *bkt; /* Current bucket on the bfs search */
diff --git a/lib/hash/rte_thash.c b/lib/hash/rte_thash.c
index e8de071..6464fd3 100644
--- a/lib/hash/rte_thash.c
+++ b/lib/hash/rte_thash.c
@@ -2,6 +2,8 @@
  * Copyright(c) 2021 Intel Corporation
  */
 
+#include <stdalign.h>
+
 #include 
 
 #include 
@@ -80,7 +82,7 @@ struct rte_thash_subtuple_helper {
uint32_ttuple_offset;   /** < Offset in bits of the subtuple */
uint32_ttuple_len;  /** < Length in bits of the subtuple */
uint32_tlsb_msk;/** < (1 << reta_sz_log) - 1 */
-   __extension__ uint32_t  compl_table[0] __rte_cache_aligned;
+   __extension__ alignas(RTE_CACHE_LINE_SIZE) uint32_t compl_table[0];
/** < Complementary table */
 };
 
diff --git a/lib/hash/rte_thash.h b/lib/hash/rte_thash.h
index 2681b1b..30b657e 100644
--- a/lib/hash/rte_thash.h
+++ b/lib/hash/rte_thash.h
@@ -99,14 +99,14 @@ struct rte_ipv6_tuple {
};
 };
 
+#ifdef RTE_ARCH_X86
+union __rte_aligned(XMM_SIZE) rte_thash_tuple {
+#else
 union rte_thash_tuple {
+#endif
struct rte_ipv4_tuple   v4;
struct rte_ipv6_tuple   v6;
-#ifdef RTE_ARCH_X86
-} __rte_aligned(XMM_SIZE);
-#else
 };
-#endif
 
 /**
  * Prepare special converted key to use with rte_softrss_be()
-- 
1.8.3.1



[PATCH v5 15/39] vhost: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
Reviewed-by: Maxime Coquelin 
---
 lib/vhost/vhost.h| 8 
 lib/vhost/vhost_crypto.c | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index f163ff7..af48393 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -272,7 +272,7 @@ struct vhost_async {
 /**
  * Structure contains variables relevant to RX/TX virtqueues.
  */
-struct vhost_virtqueue {
+struct __rte_cache_aligned vhost_virtqueue {
union {
struct vring_desc   *desc;
struct vring_packed_desc   *desc_packed;
@@ -351,7 +351,7 @@ struct vhost_virtqueue {
struct virtqueue_stats  stats;
 
RTE_ATOMIC(bool) irq_pending;
-} __rte_cache_aligned;
+};
 
 /* Virtio device status as per Virtio specification */
 #define VIRTIO_DEVICE_STATUS_RESET 0x00
@@ -479,7 +479,7 @@ struct inflight_mem_info {
  * Device structure contains all configuration information relating
  * to the device.
  */
-struct virtio_net {
+struct __rte_cache_aligned virtio_net {
/* Frontend (QEMU) memory and memory region information */
struct rte_vhost_memory *mem;
uint64_tfeatures;
@@ -538,7 +538,7 @@ struct virtio_net {
struct rte_vhost_user_extern_ops extern_ops;
 
struct vhost_backend_ops *backend_ops;
-} __rte_cache_aligned;
+};
 
 static inline void
 vq_assert_lock__(struct virtio_net *dev, struct vhost_virtqueue *vq, const 
char *func)
diff --git a/lib/vhost/vhost_crypto.c b/lib/vhost/vhost_crypto.c
index 3704fbb..eb4a158 100644
--- a/lib/vhost/vhost_crypto.c
+++ b/lib/vhost/vhost_crypto.c
@@ -190,7 +190,7 @@ static int get_iv_len(enum rte_crypto_cipher_algorithm algo)
  * one DPDK crypto device that deals with all crypto workloads. It is declared
  * here and defined in vhost_crypto.c
  */
-struct vhost_crypto {
+struct __rte_cache_aligned vhost_crypto {
/** Used to lookup DPDK Cryptodev Session based on VIRTIO crypto
 *  session ID.
 */
@@ -213,7 +213,7 @@ struct vhost_crypto {
struct virtio_net *dev;
 
uint8_t option;
-} __rte_cache_aligned;
+};
 
 struct vhost_crypto_writeback_data {
uint8_t *src;
-- 
1.8.3.1



[PATCH v5 14/39] acl: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/acl/acl_run.h | 4 ++--
 lib/acl/acl_run_altivec.h | 6 --
 lib/acl/acl_run_neon.h| 6 --
 3 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/acl/acl_run.h b/lib/acl/acl_run.h
index 7d215de..7f09241 100644
--- a/lib/acl/acl_run.h
+++ b/lib/acl/acl_run.h
@@ -55,12 +55,12 @@ struct acl_flow_data {
  * Structure to maintain running results for
  * a single packet (up to 4 tries).
  */
-struct completion {
+struct __rte_aligned(XMM_SIZE) completion {
uint32_t *results;  /* running results. */
int32_t   priority[RTE_ACL_MAX_CATEGORIES]; /* running priorities. */
uint32_t  count;/* num of remaining tries */
/* true for allocated struct */
-} __rte_aligned(XMM_SIZE);
+};
 
 /*
  * One parms structure for each slot in the search engine.
diff --git a/lib/acl/acl_run_altivec.h b/lib/acl/acl_run_altivec.h
index 3c30466..2d398ff 100644
--- a/lib/acl/acl_run_altivec.h
+++ b/lib/acl/acl_run_altivec.h
@@ -3,15 +3,17 @@
  * Copyright (C) IBM Corporation 2016.
  */
 
+#include <stdalign.h>
+
 #include "acl_run.h"
 #include "acl_vect.h"
 
-struct _altivec_acl_const {
+alignas(RTE_CACHE_LINE_SIZE) struct _altivec_acl_const {
rte_xmm_t xmm_shuffle_input;
rte_xmm_t xmm_index_mask;
rte_xmm_t xmm_ones_16;
rte_xmm_t range_base;
-} altivec_acl_const __rte_cache_aligned = {
+} altivec_acl_const = {
{
.u32 = {0x, 0x04040404, 0x08080808, 0x0c0c0c0c}
},
diff --git a/lib/acl/acl_run_neon.h b/lib/acl/acl_run_neon.h
index 69d1b6d..63074f8 100644
--- a/lib/acl/acl_run_neon.h
+++ b/lib/acl/acl_run_neon.h
@@ -2,14 +2,16 @@
  * Copyright(c) 2015 Cavium, Inc
  */
 
+#include <stdalign.h>
+
 #include "acl_run.h"
 #include "acl_vect.h"
 
-struct _neon_acl_const {
+alignas(RTE_CACHE_LINE_SIZE) struct _neon_acl_const {
rte_xmm_t xmm_shuffle_input;
rte_xmm_t xmm_index_mask;
rte_xmm_t range_base;
-} neon_acl_const __rte_cache_aligned = {
+} neon_acl_const = {
{
.u32 = {0x, 0x04040404, 0x08080808, 0x0c0c0c0c}
},
-- 
1.8.3.1



[PATCH v5 21/39] power: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/power/power_acpi_cpufreq.c   | 4 ++--
 lib/power/power_amd_pstate_cpufreq.c | 4 ++--
 lib/power/power_cppc_cpufreq.c   | 4 ++--
 lib/power/power_intel_uncore.c   | 4 ++--
 lib/power/power_pstate_cpufreq.c | 4 ++--
 lib/power/rte_power_pmd_mgmt.c   | 4 ++--
 6 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/lib/power/power_acpi_cpufreq.c b/lib/power/power_acpi_cpufreq.c
index f8d978d..81996e1 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/lib/power/power_acpi_cpufreq.c
@@ -41,7 +41,7 @@ enum power_state {
 /**
  * Power info per lcore.
  */
-struct acpi_power_info {
+struct __rte_cache_aligned acpi_power_info {
unsigned int lcore_id;   /**< Logical core id */
uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
uint32_t nb_freqs;   /**< number of available freqs */
@@ -51,7 +51,7 @@ struct acpi_power_info {
RTE_ATOMIC(uint32_t) state;  /**< Power in use state */
uint16_t turbo_available;/**< Turbo Boost available */
uint16_t turbo_enable;   /**< Turbo Boost enable/disable */
-} __rte_cache_aligned;
+};
 
 static struct acpi_power_info lcore_power_info[RTE_MAX_LCORE];
 
diff --git a/lib/power/power_amd_pstate_cpufreq.c 
b/lib/power/power_amd_pstate_cpufreq.c
index 028f844..090a0d9 100644
--- a/lib/power/power_amd_pstate_cpufreq.c
+++ b/lib/power/power_amd_pstate_cpufreq.c
@@ -45,7 +45,7 @@ enum power_state {
 /**
  * Power info per lcore.
  */
-struct amd_pstate_power_info {
+struct __rte_cache_aligned amd_pstate_power_info {
uint32_t lcore_id;   /**< Logical core id */
RTE_ATOMIC(uint32_t) state;  /**< Power in use state */
FILE *f; /**< FD of scaling_setspeed */
@@ -58,7 +58,7 @@ struct amd_pstate_power_info {
uint16_t turbo_enable;   /**< Turbo Boost enable/disable */
uint32_t nb_freqs;   /**< number of available freqs */
uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
-} __rte_cache_aligned;
+};
 
 static struct amd_pstate_power_info lcore_power_info[RTE_MAX_LCORE];
 
diff --git a/lib/power/power_cppc_cpufreq.c b/lib/power/power_cppc_cpufreq.c
index 3ddf39b..32aaacb 100644
--- a/lib/power/power_cppc_cpufreq.c
+++ b/lib/power/power_cppc_cpufreq.c
@@ -49,7 +49,7 @@ enum power_state {
 /**
  * Power info per lcore.
  */
-struct cppc_power_info {
+struct __rte_cache_aligned cppc_power_info {
unsigned int lcore_id;   /**< Logical core id */
RTE_ATOMIC(uint32_t) state;  /**< Power in use state */
FILE *f; /**< FD of scaling_setspeed */
@@ -61,7 +61,7 @@ struct cppc_power_info {
uint16_t turbo_enable;   /**< Turbo Boost enable/disable */
uint32_t nb_freqs;   /**< number of available freqs */
uint32_t freqs[RTE_MAX_LCORE_FREQS]; /**< Frequency array */
-} __rte_cache_aligned;
+};
 
 static struct cppc_power_info lcore_power_info[RTE_MAX_LCORE];
 
diff --git a/lib/power/power_intel_uncore.c b/lib/power/power_intel_uncore.c
index 3ce8fcc..9c152e4 100644
--- a/lib/power/power_intel_uncore.c
+++ b/lib/power/power_intel_uncore.c
@@ -29,7 +29,7 @@

"/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/initial_min_freq_khz"
 
 
-struct uncore_power_info {
+struct __rte_cache_aligned uncore_power_info {
unsigned int die;  /* Core die id */
unsigned int pkg;  /* Package id */
uint32_t freqs[MAX_UNCORE_FREQS];  /* Frequency array */
@@ -41,7 +41,7 @@ struct uncore_power_info {
uint32_t org_max_freq; /* Original max freq of uncore */
uint32_t init_max_freq;/* System max uncore freq */
uint32_t init_min_freq;/* System min uncore freq */
-} __rte_cache_aligned;

[PATCH v5 13/39] distributor: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/distributor/distributor_private.h | 34 ++
 lib/distributor/rte_distributor.c |  5 +++--
 2 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/lib/distributor/distributor_private.h 
b/lib/distributor/distributor_private.h
index dfeb9b5..07c2c05 100644
--- a/lib/distributor/distributor_private.h
+++ b/lib/distributor/distributor_private.h
@@ -5,6 +5,8 @@
 #ifndef _DIST_PRIV_H_
 #define _DIST_PRIV_H_
 
+#include <stdalign.h>
+
 /**
  * @file
  * RTE distributor
@@ -51,10 +53,10 @@
  * the next cache line to worker 0, we pad this out to three cache lines.
  * Only 64-bits of the memory is actually used though.
  */
-union rte_distributor_buffer_single {
+union __rte_cache_aligned rte_distributor_buffer_single {
volatile RTE_ATOMIC(int64_t) bufptr64;
char pad[RTE_CACHE_LINE_SIZE*3];
-} __rte_cache_aligned;
+};
 
 /*
  * Transfer up to 8 mbufs at a time to/from workers, and
@@ -62,12 +64,12 @@
  */
 #define RTE_DIST_BURST_SIZE 8
 
-struct rte_distributor_backlog {
+struct __rte_cache_aligned rte_distributor_backlog {
unsigned int start;
unsigned int count;
-   int64_t pkts[RTE_DIST_BURST_SIZE] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) int64_t pkts[RTE_DIST_BURST_SIZE];
uint16_t *tags; /* will point to second cacheline of inflights */
-} __rte_cache_aligned;
+};
 
 
 struct rte_distributor_returned_pkts {
@@ -113,17 +115,17 @@ enum rte_distributor_match_function {
  * There is a separate cacheline for returns in the burst API.
  */
 struct rte_distributor_buffer {
-   volatile RTE_ATOMIC(int64_t) bufptr64[RTE_DIST_BURST_SIZE]
-   __rte_cache_aligned; /* <= outgoing to worker */
+   volatile alignas(RTE_CACHE_LINE_SIZE) RTE_ATOMIC(int64_t) 
bufptr64[RTE_DIST_BURST_SIZE];
+   /* <= outgoing to worker */
 
-   int64_t pad1 __rte_cache_aligned;/* <= one cache line  */
+   alignas(RTE_CACHE_LINE_SIZE) int64_t pad1;/* <= one cache line  */
 
-   volatile RTE_ATOMIC(int64_t) retptr64[RTE_DIST_BURST_SIZE]
-   __rte_cache_aligned; /* <= incoming from worker */
+   volatile alignas(RTE_CACHE_LINE_SIZE) RTE_ATOMIC(int64_t) 
retptr64[RTE_DIST_BURST_SIZE];
+   /* <= incoming from worker */
 
-   int64_t pad2 __rte_cache_aligned;/* <= one cache line  */
+   alignas(RTE_CACHE_LINE_SIZE) int64_t pad2;/* <= one cache line  */
 
-   int count __rte_cache_aligned;   /* <= number of current mbufs */
+   alignas(RTE_CACHE_LINE_SIZE) int count;   /* <= number of current 
mbufs */
 };
 
 struct rte_distributor {
@@ -138,11 +140,11 @@ struct rte_distributor {
 * on the worker core. Second cache line are the backlog
 * that are going to go to the worker core.
 */
-   uint16_t in_flight_tags[RTE_DISTRIB_MAX_WORKERS][RTE_DIST_BURST_SIZE*2]
-   __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint16_t
+   in_flight_tags[RTE_DISTRIB_MAX_WORKERS][RTE_DIST_BURST_SIZE*2];
 
-   struct rte_distributor_backlog backlog[RTE_DISTRIB_MAX_WORKERS]
-   __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_distributor_backlog
+   backlog[RTE_DISTRIB_MAX_WORKERS];
 
struct rte_distributor_buffer bufs[RTE_DISTRIB_MAX_WORKERS];
 
diff --git a/lib/distributor/rte_distributor.c 
b/lib/distributor/rte_distributor.c
index e842dc9..e58727c 100644
--- a/lib/distributor/rte_distributor.c
+++ b/lib/distributor/rte_distributor.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2017 Intel Corporation
  */
 
+#include <stdalign.h>
 #include 
 #include 
 #include 
@@ -447,7 +448,7 @@
struct rte_mbuf *next_mb = NULL;
int64_t next_value = 0;
uint16_t new_tag = 0;
-   uint16_t flows[RTE_DIST_BURST_SIZE] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint16_t flows[RTE_DIST_BURST_SIZE];
unsigned int i, j, w,

[PATCH v5 22/39] rawdev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/rawdev/rte_rawdev.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/rawdev/rte_rawdev.h b/lib/rawdev/rte_rawdev.h
index 7d5764d..640037b 100644
--- a/lib/rawdev/rte_rawdev.h
+++ b/lib/rawdev/rte_rawdev.h
@@ -279,7 +279,7 @@
  * It is a placeholder for PMD specific data, encapsulating only information
  * related to framework.
  */
-struct rte_rawdev {
+struct __rte_cache_aligned rte_rawdev {
/**< Socket ID where memory is allocated */
int socket_id;
/**< Device ID for this instance */
@@ -300,7 +300,7 @@ struct rte_rawdev {
rte_rawdev_obj_t dev_private;
/**< Device name */
char name[RTE_RAWDEV_NAME_MAX_LEN];
-} __rte_cache_aligned;
+};
 
 /** @internal The pool of rte_rawdev structures. */
 extern struct rte_rawdev *rte_rawdevs;
-- 
1.8.3.1



[PATCH v5 16/39] timer: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/timer/rte_timer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/timer/rte_timer.c b/lib/timer/rte_timer.c
index 53ed221..bb8b6a6 100644
--- a/lib/timer/rte_timer.c
+++ b/lib/timer/rte_timer.c
@@ -24,7 +24,7 @@
 /**
  * Per-lcore info for timers.
  */
-struct priv_timer {
+struct __rte_cache_aligned priv_timer {
struct rte_timer pending_head;  /**< dummy timer instance to head up 
list */
rte_spinlock_t list_lock;   /**< lock to protect list access */
 
@@ -44,7 +44,7 @@ struct priv_timer {
/** per-lcore statistics */
struct rte_timer_debug_stats stats;
 #endif
-} __rte_cache_aligned;
+};
 
 #define FL_ALLOCATED   (1 << 0)
 struct rte_timer_data {
-- 
1.8.3.1



[PATCH v5 17/39] table: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/table/rte_swx_table_learner.c | 4 ++--
 lib/table/rte_table_acl.c | 3 ++-
 lib/table/rte_table_array.c   | 7 ---
 lib/table/rte_table_hash_cuckoo.c | 4 +++-
 lib/table/rte_table_hash_ext.c| 3 ++-
 lib/table/rte_table_hash_key16.c  | 4 +++-
 lib/table/rte_table_hash_key32.c  | 4 +++-
 lib/table/rte_table_hash_key8.c   | 4 +++-
 lib/table/rte_table_hash_lru.c| 3 ++-
 lib/table/rte_table_lpm.c | 3 ++-
 lib/table/rte_table_lpm_ipv6.c| 3 ++-
 11 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/lib/table/rte_swx_table_learner.c 
b/lib/table/rte_swx_table_learner.c
index 2b5e6bd..55a3645 100644
--- a/lib/table/rte_swx_table_learner.c
+++ b/lib/table/rte_swx_table_learner.c
@@ -145,13 +145,13 @@ struct table_params {
size_t total_size;
 };
 
-struct table {
+struct __rte_cache_aligned table {
/* Table parameters. */
struct table_params params;
 
/* Table buckets. */
uint8_t buckets[];
-} __rte_cache_aligned;
+};
 
 /* The timeout (in cycles) is stored in the table as a 32-bit value by 
truncating its least
  * significant 32 bits. Therefore, to make sure the time is always advancing 
when adding the timeout
diff --git a/lib/table/rte_table_acl.c b/lib/table/rte_table_acl.c
index 83411d2..2764cda 100644
--- a/lib/table/rte_table_acl.c
+++ b/lib/table/rte_table_acl.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 
@@ -47,7 +48,7 @@ struct rte_table_acl {
uint8_t *acl_rule_memory; /* Memory to store the rules */
 
/* Memory to store the action table and stack of free entries */
-   uint8_t memory[0] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint8_t memory[0];
 };
 
 
diff --git a/lib/table/rte_table_array.c b/lib/table/rte_table_array.c
index 80bc2a7..31a17d5 100644
--- a/lib/table/rte_table_array.c
+++ b/lib/table/rte_table_array.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2010-2014 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 
@@ -27,7 +28,7 @@
 
 #endif
 
-struct rte_table_array {
+struct __rte_cache_aligned rte_table_array {
struct rte_table_stats stats;
 
/* Input parameters */
@@ -39,8 +40,8 @@ struct rte_table_array {
uint32_t entry_pos_mask;
 
/* Internal table */
-   uint8_t array[0] __rte_cache_aligned;
-} __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint8_t array[0];
+};
 
 static void *
 rte_table_array_create(void *params, int socket_id, uint32_t entry_size)
diff --git a/lib/table/rte_table_hash_cuckoo.c 
b/lib/table/rte_table_hash_cuckoo.c
index 0f4900c..d3b60f3 100644
--- a/lib/table/rte_table_hash_cuckoo.c
+++ b/lib/table/rte_table_hash_cuckoo.c
@@ -1,6 +1,8 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2017 Intel Corporation
  */
+
+#include 
 #include 
 #include 
 
@@ -42,7 +44,7 @@ struct rte_table_hash {
struct rte_hash *h_table;
 
/* Lookup table */
-   uint8_t memory[0] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint8_t memory[0];
 };
 
 static int
diff --git a/lib/table/rte_table_hash_ext.c b/lib/table/rte_table_hash_ext.c
index 2148d83..61e3c79 100644
--- a/lib/table/rte_table_hash_ext.c
+++ b/lib/table/rte_table_hash_ext.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2010-2017 Intel Corporation
  */
 
+#include 
 #include 
 #include 
 
@@ -99,7 +100,7 @@ struct rte_table_hash {
uint32_t *bkt_ext_stack;
 
/* Table memory */
-   uint8_t memory[0] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint8_t memory[0];
 };
 
 static int
diff --git a/lib/table/rte_table_hash_key16.c b/lib/table/rte_table_hash_key16.c
index 7734aef..2af34a5 100644
--- a/lib/table/rte_table_hash_key16.c
+++ b/lib/table/rte_table_hash_key16.c
@@ -1,6 +1,8 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2010-2017 Intel Corporation
  */
+
+#include 
 #include 
 #include 
 
@@ -83,7 +85,7 @@ 

[PATCH v5 25/39] node: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/node/node_private.h | 4 ++--
 lib/node/pkt_cls.c  | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/node/node_private.h b/lib/node/node_private.h
index 2b9bad1..ff04659 100644
--- a/lib/node/node_private.h
+++ b/lib/node/node_private.h
@@ -51,9 +51,9 @@ struct node_mbuf_priv1 {
 /**
  * Node mbuf private area 2.
  */
-struct node_mbuf_priv2 {
+struct __rte_cache_aligned node_mbuf_priv2 {
uint64_t priv_data;
-} __rte_cache_aligned;
+};
 
 #define NODE_MBUF_PRIV2_SIZE sizeof(struct node_mbuf_priv2)
 
diff --git a/lib/node/pkt_cls.c b/lib/node/pkt_cls.c
index a8302b8..9d21b7f 100644
--- a/lib/node/pkt_cls.c
+++ b/lib/node/pkt_cls.c
@@ -2,6 +2,8 @@
  * Copyright (C) 2020 Marvell.
  */
 
+#include 
+
 #include 
 #include 
 
@@ -9,7 +11,7 @@
 #include "node_private.h"
 
 /* Next node for each ptype, default is '0' is "pkt_drop" */
-static const uint8_t p_nxt[256] __rte_cache_aligned = {
+static const alignas(RTE_CACHE_LINE_SIZE) uint8_t p_nxt[256] = {
[RTE_PTYPE_L3_IPV4] = PKT_CLS_NEXT_IP4_LOOKUP,
 
[RTE_PTYPE_L3_IPV4_EXT] = PKT_CLS_NEXT_IP4_LOOKUP,
-- 
1.8.3.1



[PATCH v5 18/39] reorder: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/reorder/rte_reorder.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/reorder/rte_reorder.c b/lib/reorder/rte_reorder.c
index c080b2c..ae97e1a 100644
--- a/lib/reorder/rte_reorder.c
+++ b/lib/reorder/rte_reorder.c
@@ -37,16 +37,16 @@
 int rte_reorder_seqn_dynfield_offset = -1;
 
 /* A generic circular buffer */
-struct cir_buffer {
+struct __rte_cache_aligned cir_buffer {
unsigned int size;   /**< Number of entries that can be stored */
unsigned int mask;   /**< [buffer_size - 1]: used for wrap-around */
unsigned int head;   /**< insertion point in buffer */
unsigned int tail;   /**< extraction point in buffer */
struct rte_mbuf **entries;
-} __rte_cache_aligned;
+};
 
 /* The reorder buffer data structure itself */
-struct rte_reorder_buffer {
+struct __rte_cache_aligned rte_reorder_buffer {
char name[RTE_REORDER_NAMESIZE];
uint32_t min_seqn;  /**< Lowest seq. number that can be in the buffer */
unsigned int memsize; /**< memory area size of reorder buffer */
@@ -54,7 +54,7 @@ struct rte_reorder_buffer {
 
struct cir_buffer ready_buf; /**< temp buffer for dequeued entries */
struct cir_buffer order_buf; /**< buffer used to reorder entries */
-} __rte_cache_aligned;
+};
 
 static void
 rte_reorder_free_mbufs(struct rte_reorder_buffer *b);
-- 
1.8.3.1



[PATCH v5 20/39] rcu: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
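[Illustration only -- the struct below is invented, not the rte_rcu_qsbr
layout. It shows what the per-field alignas() in this patch provides: each
annotated member starts on its own cache line, so writer-updated and
reader-updated state do not share a line.]

#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define CACHE_LINE 64

struct __attribute__((aligned(CACHE_LINE))) counters {
	alignas(CACHE_LINE) uint64_t produced;	/* written by the producer */
	alignas(CACHE_LINE) uint64_t consumed;	/* written by consumers */
};

int main(void)
{
	/* Expected output: offsets 0 and 64, sizeof 128. */
	printf("produced at %zu, consumed at %zu, sizeof = %zu\n",
	       offsetof(struct counters, produced),
	       offsetof(struct counters, consumed),
	       sizeof(struct counters));
	return 0;
}
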
 lib/rcu/rte_rcu_qsbr.h | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/lib/rcu/rte_rcu_qsbr.h b/lib/rcu/rte_rcu_qsbr.h
index e7ef788..d8ecf11 100644
--- a/lib/rcu/rte_rcu_qsbr.h
+++ b/lib/rcu/rte_rcu_qsbr.h
@@ -21,6 +21,8 @@
  * entered quiescent state.
  */
 
+#include 
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -69,7 +71,7 @@
 #define RTE_QSBR_THRID_INVALID 0x
 
 /* Worker thread counter */
-struct rte_rcu_qsbr_cnt {
+struct __rte_cache_aligned rte_rcu_qsbr_cnt {
RTE_ATOMIC(uint64_t) cnt;
/**< Quiescent state counter. Value 0 indicates the thread is offline
 *   64b counter is used to avoid adding more code to address
@@ -78,7 +80,7 @@ struct rte_rcu_qsbr_cnt {
 */
RTE_ATOMIC(uint32_t) lock_cnt;
/**< Lock counter. Used when RTE_LIBRTE_RCU_DEBUG is enabled */
-} __rte_cache_aligned;
+};
 
 #define __RTE_QSBR_CNT_THR_OFFLINE 0
 #define __RTE_QSBR_CNT_INIT 1
@@ -91,28 +93,28 @@ struct rte_rcu_qsbr_cnt {
  * 1) Quiescent state counter array
  * 2) Register thread ID array
  */
-struct rte_rcu_qsbr {
-   RTE_ATOMIC(uint64_t) token __rte_cache_aligned;
+struct __rte_cache_aligned rte_rcu_qsbr {
+   alignas(RTE_CACHE_LINE_SIZE) RTE_ATOMIC(uint64_t) token;
/**< Counter to allow for multiple concurrent quiescent state queries */
RTE_ATOMIC(uint64_t) acked_token;
/**< Least token acked by all the threads in the last call to
 *   rte_rcu_qsbr_check API.
 */
 
-   uint32_t num_elems __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) uint32_t num_elems;
/**< Number of elements in the thread ID array */
RTE_ATOMIC(uint32_t) num_threads;
/**< Number of threads currently using this QS variable */
uint32_t max_threads;
/**< Maximum number of threads using this QS variable */
 
-   struct rte_rcu_qsbr_cnt qsbr_cnt[0] __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_rcu_qsbr_cnt qsbr_cnt[0];
/**< Quiescent state counter array of 'max_threads' elements */
 
/**< Registered thread IDs are stored in a bitmap array,
 *   after the quiescent state counter array.
 */
-} __rte_cache_aligned;
+};
 
 /**
  * Call back function called to free the resources.
-- 
1.8.3.1



[PATCH v5 27/39] mempool: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/mempool/rte_mempool.h | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 6fa4d48..23fd5c8 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -34,6 +34,7 @@
  * user cache created with rte_mempool_cache_create().
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -66,7 +67,7 @@
  * captured since they can be calculated from other stats.
  * For example: put_cache_objs = put_objs - put_common_pool_objs.
  */
-struct rte_mempool_debug_stats {
+struct __rte_cache_aligned rte_mempool_debug_stats {
uint64_t put_bulk; /**< Number of puts. */
uint64_t put_objs; /**< Number of objects successfully put. 
*/
uint64_t put_common_pool_bulk; /**< Number of bulks enqueued in common 
pool. */
@@ -80,13 +81,13 @@ struct rte_mempool_debug_stats {
uint64_t get_success_blks; /**< Successful allocation number of 
contiguous blocks. */
uint64_t get_fail_blks;/**< Failed allocation number of 
contiguous blocks. */
RTE_CACHE_GUARD;
-} __rte_cache_aligned;
+};
 #endif
 
 /**
  * A structure that stores a per-core object cache.
  */
-struct rte_mempool_cache {
+struct __rte_cache_aligned rte_mempool_cache {
uint32_t size;/**< Size of the cache */
uint32_t flushthresh; /**< Threshold before we flush excess elements */
uint32_t len; /**< Current cache count */
@@ -109,8 +110,8 @@ struct rte_mempool_cache {
 * Cache is allocated to this size to allow it to overflow in certain
 * cases to avoid needless emptying of cache.
 */
-   void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 2] __rte_cache_aligned;
-} __rte_cache_aligned;
+   alignas(RTE_CACHE_LINE_SIZE) void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 2];
+};
 
 /**
  * A structure that stores the size of mempool elements.
@@ -218,15 +219,15 @@ struct rte_mempool_memhdr {
  * The structure is cache-line aligned to avoid ABI breakages in
  * a number of cases when something small is added.
  */
-struct rte_mempool_info {
+struct __rte_cache_aligned rte_mempool_info {
/** Number of objects in the contiguous block */
unsigned int contig_block_size;
-} __rte_cache_aligned;
+};
 
 /**
  * The RTE mempool structure.
  */
-struct rte_mempool {
+struct __rte_cache_aligned rte_mempool {
char name[RTE_MEMPOOL_NAMESIZE]; /**< Name of mempool. */
union {
void *pool_data; /**< Ring or pool to store objects. */
@@ -268,7 +269,7 @@ struct rte_mempool {
 */
struct rte_mempool_debug_stats stats[RTE_MAX_LCORE + 1];
 #endif
-}  __rte_cache_aligned;
+};
 
 /** Spreading among memory channels not required. */
 #define RTE_MEMPOOL_F_NO_SPREAD0x0001
@@ -688,7 +689,7 @@ typedef int (*rte_mempool_get_info_t)(const struct 
rte_mempool *mp,
 
 
 /** Structure defining mempool operations structure */
-struct rte_mempool_ops {
+struct __rte_cache_aligned rte_mempool_ops {
char name[RTE_MEMPOOL_OPS_NAMESIZE]; /**< Name of mempool ops struct. */
rte_mempool_alloc_t alloc;   /**< Allocate private data. */
rte_mempool_free_t free; /**< Free the external pool. */
@@ -713,7 +714,7 @@ struct rte_mempool_ops {
 * Dequeue a number of contiguous object blocks.
 */
rte_mempool_dequeue_contig_blocks_t dequeue_contig_blocks;
-} __rte_cache_aligned;
+};
 
 #define RTE_MEMPOOL_MAX_OPS_IDX 16  /**< Max registered ops structs */
 
@@ -726,14 +727,14 @@ struct rte_mempool_ops {
  * any function pointers stored directly in the mempool struct would not be.
  * This results in us simply having "ops_index" in the mempool struct.
  */
-struct rte_mempool_ops_table {
+struct __rte_cache_aligned rte_mempool_ops_table {
rte_spinlock_t sl; /**< Spinlock for add/delete. */
uint32_t num_ops; 

[PATCH v5 31/39] jobstats: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/jobstats/rte_jobstats.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/jobstats/rte_jobstats.h b/lib/jobstats/rte_jobstats.h
index 45b460e..bdd85fe 100644
--- a/lib/jobstats/rte_jobstats.h
+++ b/lib/jobstats/rte_jobstats.h
@@ -32,7 +32,7 @@
 typedef void (*rte_job_update_period_cb_t)(struct rte_jobstats *job,
int64_t job_result);
 
-struct rte_jobstats {
+struct __rte_cache_aligned rte_jobstats {
uint64_t period;
/**< Estimated period of execution. */
 
@@ -65,9 +65,9 @@ struct rte_jobstats {
 
struct rte_jobstats_context *context;
/**< Job stats context object that is executing this job. */
-} __rte_cache_aligned;
+};
 
-struct rte_jobstats_context {
+struct __rte_cache_aligned rte_jobstats_context {
/** Variable holding time at different points:
 * -# loop start time if loop was started but no job executed yet.
 * -# job start time if job is currently executing.
@@ -111,7 +111,7 @@ struct rte_jobstats_context {
 
uint64_t loop_cnt;
/**< Total count of executed loops with at least one executed job. */
-} __rte_cache_aligned;
+};
 
 /**
  * Initialize given context object with default values.
-- 
1.8.3.1



[PATCH v5 32/39] bpf: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/bpf/bpf_pkt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/bpf/bpf_pkt.c b/lib/bpf/bpf_pkt.c
index 793a75d..aaca935 100644
--- a/lib/bpf/bpf_pkt.c
+++ b/lib/bpf/bpf_pkt.c
@@ -23,7 +23,7 @@
  * information about installed BPF rx/tx callback
  */
 
-struct bpf_eth_cbi {
+struct __rte_cache_aligned bpf_eth_cbi {
/* used by both data & control path */
RTE_ATOMIC(uint32_t) use;/*usage counter */
const struct rte_eth_rxtx_callback *cb;  /* callback handle */
@@ -33,7 +33,7 @@ struct bpf_eth_cbi {
LIST_ENTRY(bpf_eth_cbi) link;
uint16_t port;
uint16_t queue;
-} __rte_cache_aligned;
+};
 
 /*
  * Odd number means that callback is used by datapath.
-- 
1.8.3.1



[PATCH v5 23/39] port: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/port/rte_port_frag.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/port/rte_port_frag.c b/lib/port/rte_port_frag.c
index 883601a..0940f94 100644
--- a/lib/port/rte_port_frag.c
+++ b/lib/port/rte_port_frag.c
@@ -34,7 +34,7 @@
struct rte_mempool *pool_direct,
struct rte_mempool *pool_indirect);
 
-struct rte_port_ring_reader_frag {
+struct __rte_cache_aligned rte_port_ring_reader_frag {
struct rte_port_in_stats stats;
 
/* Input parameters */
@@ -53,7 +53,7 @@ struct rte_port_ring_reader_frag {
uint32_t pos_frags;
 
frag_op f_frag;
-} __rte_cache_aligned;
+};
 
 static void *
 rte_port_ring_reader_frag_create(void *params, int socket_id, int is_ipv4)
-- 
1.8.3.1



[PATCH v5 24/39] pdcp: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/pdcp/rte_pdcp.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/pdcp/rte_pdcp.h b/lib/pdcp/rte_pdcp.h
index dd8b6e4..f74524f 100644
--- a/lib/pdcp/rte_pdcp.h
+++ b/lib/pdcp/rte_pdcp.h
@@ -49,7 +49,7 @@ typedef uint16_t (*rte_pdcp_post_p_t)(const struct 
rte_pdcp_entity *entity,
  * A PDCP entity is associated either to the control plane or the user plane
  * depending on which radio bearer it is carrying data for.
  */
-struct rte_pdcp_entity {
+struct __rte_cache_aligned rte_pdcp_entity {
/** Entity specific pre-process handle. */
rte_pdcp_pre_p_t pre_process;
/** Entity specific post-process handle. */
@@ -66,7 +66,7 @@ struct rte_pdcp_entity {
 * hold additionally 'max_pkt_cache' number of packets.
 */
uint32_t max_pkt_cache;
-} __rte_cache_aligned;
+};
 
 /**
  * Callback function type for t-Reordering timer start, set during PDCP entity 
establish.
-- 
1.8.3.1



[PATCH v5 28/39] member: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/member/rte_member.h| 8 
 lib/member/rte_member_ht.h | 4 ++--
 lib/member/rte_member_sketch.c | 4 ++--
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/lib/member/rte_member.h b/lib/member/rte_member.h
index 3278bbb..aec192e 100644
--- a/lib/member/rte_member.h
+++ b/lib/member/rte_member.h
@@ -139,7 +139,7 @@ typedef void (*sketch_delete_fn_t)(const struct 
rte_member_setsum *ss,
   const void *key);
 
 /** @internal setsummary structure. */
-struct rte_member_setsum {
+struct __rte_cache_aligned rte_member_setsum {
enum rte_member_setsum_type type; /* Type of the set summary. */
uint32_t key_len;   /* Length of key. */
uint32_t prim_hash_seed;/* Primary hash function seed. */
@@ -185,14 +185,14 @@ struct rte_member_setsum {
 #ifdef RTE_ARCH_X86
bool use_avx512;
 #endif
-} __rte_cache_aligned;
+};
 
 /**
  * Parameters used when create the set summary table. Currently user can
  * specify two types of setsummary: HT based and vBF. For HT based, user can
  * specify cache or non-cache mode. Here is a table to describe some 
differences
  */
-struct rte_member_parameters {
+struct __rte_cache_aligned rte_member_parameters {
const char *name;   /**< Name of the hash. */
 
/**
@@ -326,7 +326,7 @@ struct rte_member_parameters {
uint32_t extra_flag;
 
int socket_id;  /**< NUMA Socket ID for memory. */
-} __rte_cache_aligned;
+};
 
 /**
  * Find an existing set-summary and return a pointer to it.
diff --git a/lib/member/rte_member_ht.h b/lib/member/rte_member_ht.h
index 9e24ccd..c9673e3 100644
--- a/lib/member/rte_member_ht.h
+++ b/lib/member/rte_member_ht.h
@@ -15,10 +15,10 @@
 typedef uint16_t member_sig_t; /* signature size is 16 bit */
 
 /* The bucket struct for ht setsum */
-struct member_ht_bucket {
+struct __rte_cache_aligned member_ht_bucket {
member_sig_t sigs[RTE_MEMBER_BUCKET_ENTRIES];   /* 2-byte signature */
member_set_t sets[RTE_MEMBER_BUCKET_ENTRIES];   /* 2-byte set */
-} __rte_cache_aligned;
+};
 
 int
 rte_member_create_ht(struct rte_member_setsum *ss,
diff --git a/lib/member/rte_member_sketch.c b/lib/member/rte_member_sketch.c
index e006e83..15af678 100644
--- a/lib/member/rte_member_sketch.c
+++ b/lib/member/rte_member_sketch.c
@@ -23,7 +23,7 @@
 #include "rte_member_sketch_avx512.h"
 #endif /* CC_AVX512_SUPPORT */
 
-struct sketch_runtime {
+struct __rte_cache_aligned sketch_runtime {
uint64_t pkt_cnt;
uint32_t until_next;
int converged;
@@ -31,7 +31,7 @@ struct sketch_runtime {
struct node *report_array;
void *key_slots;
struct rte_ring *free_key_slots;
-} __rte_cache_aligned;
+};
 
 /*
  * Geometric sampling to calculate how many packets needs to be
-- 
1.8.3.1



[PATCH v5 33/39] compressdev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/compressdev/rte_comp.h | 4 ++--
 lib/compressdev/rte_compressdev_internal.h | 8 
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/compressdev/rte_comp.h b/lib/compressdev/rte_comp.h
index 3606ebf..830a240 100644
--- a/lib/compressdev/rte_comp.h
+++ b/lib/compressdev/rte_comp.h
@@ -356,7 +356,7 @@ struct rte_comp_xform {
  * Comp operations are enqueued and dequeued in comp PMDs using the
  * rte_compressdev_enqueue_burst() / rte_compressdev_dequeue_burst() APIs
  */
-struct rte_comp_op {
+struct __rte_cache_aligned rte_comp_op {
enum rte_comp_op_type op_type;
union {
void *private_xform;
@@ -478,7 +478,7 @@ struct rte_comp_op {
 * will be set to RTE_COMP_OP_STATUS_SUCCESS after operation
 * is successfully processed by a PMD
 */
-} __rte_cache_aligned;
+};
 
 /**
  * Creates an operation pool
diff --git a/lib/compressdev/rte_compressdev_internal.h 
b/lib/compressdev/rte_compressdev_internal.h
index 01b7764..8a626d3 100644
--- a/lib/compressdev/rte_compressdev_internal.h
+++ b/lib/compressdev/rte_compressdev_internal.h
@@ -69,7 +69,7 @@ typedef uint16_t (*compressdev_enqueue_pkt_burst_t)(void *qp,
struct rte_comp_op **ops, uint16_t nb_ops);
 
 /** The data structure associated with each comp device. */
-struct rte_compressdev {
+struct __rte_cache_aligned rte_compressdev {
compressdev_dequeue_pkt_burst_t dequeue_burst;
/**< Pointer to PMD receive function */
compressdev_enqueue_pkt_burst_t enqueue_burst;
@@ -87,7 +87,7 @@ struct rte_compressdev {
__extension__
uint8_t attached : 1;
/**< Flag indicating the device is attached */
-} __rte_cache_aligned;
+};
 
 /**
  *
@@ -96,7 +96,7 @@ struct rte_compressdev {
  * This structure is safe to place in shared memory to be common among
  * different processes in a multi-process configuration.
  */
-struct rte_compressdev_data {
+struct __rte_cache_aligned rte_compressdev_data {
uint8_t dev_id;
/**< Compress device identifier */
int socket_id;
@@ -115,7 +115,7 @@ struct rte_compressdev_data {
 
void *dev_private;
/**< PMD-specific private data */
-} __rte_cache_aligned;
+};
 
 #ifdef __cplusplus
 }
-- 
1.8.3.1



[PATCH v5 34/39] cryptodev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/cryptodev/cryptodev_pmd.h  | 8 
 lib/cryptodev/rte_cryptodev_core.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/cryptodev/cryptodev_pmd.h b/lib/cryptodev/cryptodev_pmd.h
index 0732b35..6229ad4 100644
--- a/lib/cryptodev/cryptodev_pmd.h
+++ b/lib/cryptodev/cryptodev_pmd.h
@@ -61,7 +61,7 @@ struct rte_cryptodev_pmd_init_params {
  * This structure is safe to place in shared memory to be common among
  * different processes in a multi-process configuration.
  */
-struct rte_cryptodev_data {
+struct __rte_cache_aligned rte_cryptodev_data {
/** Device ID for this instance */
uint8_t dev_id;
/** Socket ID where memory is allocated */
@@ -82,10 +82,10 @@ struct rte_cryptodev_data {
 
/** PMD-specific private data */
void *dev_private;
-} __rte_cache_aligned;
+};
 
 /** @internal The data structure associated with each crypto device. */
-struct rte_cryptodev {
+struct __rte_cache_aligned rte_cryptodev {
/** Pointer to PMD dequeue function. */
dequeue_pkt_burst_t dequeue_burst;
/** Pointer to PMD enqueue function. */
@@ -117,7 +117,7 @@ struct rte_cryptodev {
struct rte_cryptodev_cb_rcu *enq_cbs;
/** User application callback for post dequeue processing */
struct rte_cryptodev_cb_rcu *deq_cbs;
-} __rte_cache_aligned;
+};
 
 /** Global structure used for maintaining state of allocated crypto devices */
 struct rte_cryptodev_global {
diff --git a/lib/cryptodev/rte_cryptodev_core.h 
b/lib/cryptodev/rte_cryptodev_core.h
index 5de89d0..8d7e58d 100644
--- a/lib/cryptodev/rte_cryptodev_core.h
+++ b/lib/cryptodev/rte_cryptodev_core.h
@@ -40,7 +40,7 @@ struct rte_cryptodev_qpdata {
struct rte_cryptodev_cb_rcu *deq_cb;
 };
 
-struct rte_crypto_fp_ops {
+struct __rte_cache_aligned rte_crypto_fp_ops {
/** PMD enqueue burst function. */
enqueue_pkt_burst_t enqueue_burst;
/** PMD dequeue burst function. */
@@ -49,7 +49,7 @@ struct rte_crypto_fp_ops {
struct rte_cryptodev_qpdata qp;
/** Reserved for future ops. */
uintptr_t reserved[3];
-} __rte_cache_aligned;
+};
 
 extern struct rte_crypto_fp_ops rte_crypto_fp_ops[RTE_CRYPTO_MAX_DEVS];
 
-- 
1.8.3.1



[PATCH v5 19/39] regexdev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/regexdev/rte_regexdev_core.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/regexdev/rte_regexdev_core.h b/lib/regexdev/rte_regexdev_core.h
index 15ba712..32eef6e 100644
--- a/lib/regexdev/rte_regexdev_core.h
+++ b/lib/regexdev/rte_regexdev_core.h
@@ -144,13 +144,13 @@ enum rte_regexdev_state {
  * This structure is safe to place in shared memory to be common among 
different
  * processes in a multi-process configuration.
  */
-struct rte_regexdev_data {
+struct __rte_cache_aligned rte_regexdev_data {
void *dev_private; /**< PMD-specific private data. */
char dev_name[RTE_REGEXDEV_NAME_MAX_LEN]; /**< Unique identifier name */
uint16_t dev_id; /**< Device [external]  identifier. */
struct rte_regexdev_config dev_conf; /**< RegEx configuration. */
uint8_t dev_started : 1; /**< Device started to work. */
-} __rte_cache_aligned;
+};
 
 /**
  * @internal
@@ -162,7 +162,7 @@ struct rte_regexdev_data {
  * memory. This split allows the function pointer and driver data to be per-
  * process, while the actual configuration data for the device is shared.
  */
-struct rte_regexdev {
+struct __rte_cache_aligned rte_regexdev {
regexdev_enqueue_t enqueue;
regexdev_dequeue_t dequeue;
const struct rte_regexdev_ops *dev_ops;
@@ -170,7 +170,7 @@ struct rte_regexdev {
struct rte_device *device; /**< Backing device */
enum rte_regexdev_state state; /**< The device state. */
struct rte_regexdev_data *data;  /**< Pointer to device data. */
-} __rte_cache_aligned;
+};
 
 /**
  * @internal
-- 
1.8.3.1



[PATCH v5 30/39] ipsec: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/ipsec/rte_ipsec.h | 4 ++--
 lib/ipsec/sa.h| 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/ipsec/rte_ipsec.h b/lib/ipsec/rte_ipsec.h
index 44cecab..f15f6f2 100644
--- a/lib/ipsec/rte_ipsec.h
+++ b/lib/ipsec/rte_ipsec.h
@@ -55,7 +55,7 @@ struct rte_ipsec_sa_pkt_func {
  * - pointer to security/crypto session, plus other related data
  * - session/device specific functions to prepare/process IPsec packets.
  */
-struct rte_ipsec_session {
+struct __rte_cache_aligned rte_ipsec_session {
/**
 * SA that session belongs to.
 * Note that multiple sessions can belong to the same SA.
@@ -77,7 +77,7 @@ struct rte_ipsec_session {
};
/** functions to prepare/process IPsec packets */
struct rte_ipsec_sa_pkt_func pkt_func;
-} __rte_cache_aligned;
+};
 
 /**
  * Checks that inside given rte_ipsec_session crypto/security fields
diff --git a/lib/ipsec/sa.h b/lib/ipsec/sa.h
index 4b30bea..2560d33 100644
--- a/lib/ipsec/sa.h
+++ b/lib/ipsec/sa.h
@@ -75,7 +75,7 @@ enum sa_algo_type {
ALGO_TYPE_MAX
 };
 
-struct rte_ipsec_sa {
+struct __rte_cache_aligned rte_ipsec_sa {
 
uint64_t type; /* type of given SA */
uint64_t udata;/* user defined */
@@ -141,7 +141,7 @@ struct rte_ipsec_sa {
} errors;
} statistics;
 
-} __rte_cache_aligned;
+};
 
 int
 ipsec_sa_pkt_func_select(const struct rte_ipsec_session *ss,
-- 
1.8.3.1



[PATCH v5 29/39] lpm: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/lpm/rte_lpm.h  | 5 +++--
 lib/lpm/rte_lpm6.c | 8 
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/lib/lpm/rte_lpm.h b/lib/lpm/rte_lpm.h
index f57977b..f311fd9 100644
--- a/lib/lpm/rte_lpm.h
+++ b/lib/lpm/rte_lpm.h
@@ -11,6 +11,7 @@
  * RTE Longest Prefix Match (LPM)
  */
 
+#include 
 #include 
 #include 
 
@@ -118,8 +119,8 @@ struct rte_lpm_config {
 /** @internal LPM structure. */
 struct rte_lpm {
/* LPM Tables. */
-   struct rte_lpm_tbl_entry tbl24[RTE_LPM_TBL24_NUM_ENTRIES]
-   __rte_cache_aligned; /**< LPM tbl24 table. */
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_lpm_tbl_entry 
tbl24[RTE_LPM_TBL24_NUM_ENTRIES];
+   /**< LPM tbl24 table. */
struct rte_lpm_tbl_entry *tbl8; /**< LPM tbl8 table. */
 };
 
diff --git a/lib/lpm/rte_lpm6.c b/lib/lpm/rte_lpm6.c
index 271bc48..ed5970c 100644
--- a/lib/lpm/rte_lpm6.c
+++ b/lib/lpm/rte_lpm6.c
@@ -98,16 +98,16 @@ struct rte_lpm6 {
 
/* LPM Tables. */
struct rte_hash *rules_tbl; /**< LPM rules. */
-   struct rte_lpm6_tbl_entry tbl24[RTE_LPM6_TBL24_NUM_ENTRIES]
-   __rte_cache_aligned; /**< LPM tbl24 table. */
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_lpm6_tbl_entry 
tbl24[RTE_LPM6_TBL24_NUM_ENTRIES];
+   /**< LPM tbl24 table. */
 
uint32_t *tbl8_pool; /**< pool of indexes of free tbl8s */
uint32_t tbl8_pool_pos; /**< current position in the tbl8 pool */
 
struct rte_lpm_tbl8_hdr *tbl8_hdrs; /* array of tbl8 headers */
 
-   struct rte_lpm6_tbl_entry tbl8[0]
-   __rte_cache_aligned; /**< LPM tbl8 table. */
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_lpm6_tbl_entry tbl8[0];
+   /**< LPM tbl8 table. */
 };
 
 /*
-- 
1.8.3.1



[PATCH v5 26/39] mldev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/mldev/rte_mldev.h  | 4 ++--
 lib/mldev/rte_mldev_core.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/mldev/rte_mldev.h b/lib/mldev/rte_mldev.h
index 27e372f..02913f3 100644
--- a/lib/mldev/rte_mldev.h
+++ b/lib/mldev/rte_mldev.h
@@ -421,7 +421,7 @@ struct rte_ml_buff_seg {
  * This structure contains data related to performing an ML operation on the 
buffers using
  * the model specified through model_id.
  */
-struct rte_ml_op {
+struct __rte_cache_aligned rte_ml_op {
uint16_t model_id;
/**< Model ID to be used for the operation. */
uint16_t nb_batches;
@@ -469,7 +469,7 @@ struct rte_ml_op {
 * dequeue and enqueue operation.
 * The application should not modify this field.
 */
-} __rte_cache_aligned;
+};
 
 /* Enqueue/Dequeue operations */
 
diff --git a/lib/mldev/rte_mldev_core.h b/lib/mldev/rte_mldev_core.h
index 2279b1d..b3bd281 100644
--- a/lib/mldev/rte_mldev_core.h
+++ b/lib/mldev/rte_mldev_core.h
@@ -626,7 +626,7 @@ struct rte_ml_dev_data {
  *
  * The data structure associated with each ML device.
  */
-struct rte_ml_dev {
+struct __rte_cache_aligned rte_ml_dev {
/** Pointer to PMD enqueue function. */
mldev_enqueue_t enqueue_burst;
 
@@ -647,7 +647,7 @@ struct rte_ml_dev {
 
/** Flag indicating the device is attached. */
__extension__ uint8_t attached : 1;
-} __rte_cache_aligned;
+};
 
 /**
  * @internal
-- 
1.8.3.1



[PATCH v5 35/39] dispatcher: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/dispatcher/rte_dispatcher.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/dispatcher/rte_dispatcher.c b/lib/dispatcher/rte_dispatcher.c
index f546d75..7934917 100644
--- a/lib/dispatcher/rte_dispatcher.c
+++ b/lib/dispatcher/rte_dispatcher.c
@@ -41,7 +41,7 @@ struct rte_dispatcher_finalizer {
void *finalize_data;
 };
 
-struct rte_dispatcher_lcore {
+struct __rte_cache_aligned rte_dispatcher_lcore {
uint8_t num_ports;
uint16_t num_handlers;
int32_t prio_count;
@@ -49,7 +49,7 @@ struct rte_dispatcher_lcore {
struct rte_dispatcher_handler handlers[EVD_MAX_HANDLERS];
struct rte_dispatcher_stats stats;
RTE_CACHE_GUARD;
-} __rte_cache_aligned;
+};
 
 struct rte_dispatcher {
uint8_t event_dev_id;
-- 
1.8.3.1



[PATCH v5 37/39] gpudev: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/gpudev/gpudev_driver.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/gpudev/gpudev_driver.h b/lib/gpudev/gpudev_driver.h
index 0b1e7f2..37b6ae3 100644
--- a/lib/gpudev/gpudev_driver.h
+++ b/lib/gpudev/gpudev_driver.h
@@ -72,7 +72,7 @@ struct rte_gpu_mpshared {
RTE_ATOMIC(uint16_t) process_refcnt; /* Updated by this library. */
 };
 
-struct rte_gpu {
+struct __rte_cache_aligned rte_gpu {
/* Backing device. */
struct rte_device *device;
/* Data shared between processes. */
@@ -85,7 +85,7 @@ struct rte_gpu {
enum rte_gpu_state process_state; /* Updated by this library. */
/* Driver-specific private data for the running process. */
void *process_private;
-} __rte_cache_aligned;
+};
 
 __rte_internal
 struct rte_gpu *rte_gpu_get_by_name(const char *name);
-- 
1.8.3.1



[PATCH v5 39/39] ip_frag: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/ip_frag/ip_reassembly.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/ip_frag/ip_reassembly.h b/lib/ip_frag/ip_reassembly.h
index a9f97ae..5443c73 100644
--- a/lib/ip_frag/ip_reassembly.h
+++ b/lib/ip_frag/ip_reassembly.h
@@ -47,7 +47,7 @@ struct ip_frag_key {
  * Fragmented packet to reassemble.
  * First two entries in the frags[] array are for the last and first fragments.
  */
-struct ip_frag_pkt {
+struct __rte_cache_aligned ip_frag_pkt {
RTE_TAILQ_ENTRY(ip_frag_pkt) lru;  /* LRU list */
struct ip_frag_key key;/* fragmentation key */
uint64_t start;/* creation timestamp */
@@ -55,20 +55,20 @@ struct ip_frag_pkt {
uint32_t frag_size;/* size of fragments received */
uint32_t last_idx; /* index of next entry to fill */
struct ip_frag frags[IP_MAX_FRAG_NUM]; /* fragments */
-} __rte_cache_aligned;
+};
 
  /* fragments tailq */
 RTE_TAILQ_HEAD(ip_pkt_list, ip_frag_pkt);
 
 /* fragmentation table statistics */
-struct ip_frag_tbl_stat {
+struct __rte_cache_aligned ip_frag_tbl_stat {
uint64_t find_num; /* total # of find/insert attempts. */
uint64_t add_num;  /* # of add ops. */
uint64_t del_num;  /* # of del ops. */
uint64_t reuse_num;/* # of reuse (del/add) ops. */
uint64_t fail_total;   /* total # of add failures. */
uint64_t fail_nospace; /* # of 'no space' add failures. */
-} __rte_cache_aligned;
+};
 
 /* fragmentation table */
 struct rte_ip_frag_tbl {
-- 
1.8.3.1



[PATCH v5 36/39] fib: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/fib/dir24_8.h | 4 +++-
 lib/fib/trie.h| 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/fib/dir24_8.h b/lib/fib/dir24_8.h
index b0d1a40..6d350f7 100644
--- a/lib/fib/dir24_8.h
+++ b/lib/fib/dir24_8.h
@@ -6,6 +6,8 @@
 #ifndef _DIR24_8_H_
 #define _DIR24_8_H_
 
+#include 
+
 #include 
 #include 
 
@@ -32,7 +34,7 @@ struct dir24_8_tbl {
uint64_t*tbl8;  /**< tbl8 table. */
uint64_t*tbl8_idxes;/**< bitmap containing free tbl8 idxes*/
/* tbl24 table. */
-   __extension__ uint64_t  tbl24[0] __rte_cache_aligned;
+   __extension__ alignas(RTE_CACHE_LINE_SIZE) uint64_t tbl24[0];
 };
 
 static inline void *
diff --git a/lib/fib/trie.h b/lib/fib/trie.h
index 3cf161a..36ce1fd 100644
--- a/lib/fib/trie.h
+++ b/lib/fib/trie.h
@@ -6,6 +6,8 @@
 #ifndef _TRIE_H_
 #define _TRIE_H_
 
+#include 
+
 /**
  * @file
  * RTE IPv6 Longest Prefix Match (LPM)
@@ -36,7 +38,7 @@ struct rte_trie_tbl {
uint32_t*tbl8_pool; /**< bitmap containing free tbl8 idxes*/
uint32_ttbl8_pool_pos;
/* tbl24 table. */
-   __extension__ uint64_t  tbl24[0] __rte_cache_aligned;
+   __extension__ alignas(RTE_CACHE_LINE_SIZE) uint64_t tbl24[0];
 };
 
 static inline uint32_t
-- 
1.8.3.1



[PATCH v5 38/39] graph: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/graph/graph_private.h   |  4 ++--
 lib/graph/graph_stats.c |  4 ++--
 lib/graph/rte_graph.h   |  4 ++--
 lib/graph/rte_graph_worker_common.h | 17 ++---
 4 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index fb88d4b..7e4d9f8 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -71,11 +71,11 @@ struct node {
  * Structure that holds the graph scheduling workqueue node stream.
  * Used for mcore dispatch model.
  */
-struct graph_mcore_dispatch_wq_node {
+struct __rte_cache_aligned graph_mcore_dispatch_wq_node {
rte_graph_off_t node_off;
uint16_t nb_objs;
void *objs[RTE_GRAPH_BURST_SIZE];
-} __rte_cache_aligned;
+};
 
 /**
  * @internal
diff --git a/lib/graph/graph_stats.c b/lib/graph/graph_stats.c
index cc32245..2fb808b 100644
--- a/lib/graph/graph_stats.c
+++ b/lib/graph/graph_stats.c
@@ -28,7 +28,7 @@ struct cluster_node {
struct rte_node *nodes[];
 };
 
-struct rte_graph_cluster_stats {
+struct __rte_cache_aligned rte_graph_cluster_stats {
/* Header */
rte_graph_cluster_stats_cb_t fn;
uint32_t cluster_node_size; /* Size of struct cluster_node */
@@ -38,7 +38,7 @@ struct rte_graph_cluster_stats {
size_t sz;
 
struct cluster_node clusters[];
-} __rte_cache_aligned;
+};
 
 #define boarder_model_dispatch()   
   \
fprintf(f, "+---+---+" \
diff --git a/lib/graph/rte_graph.h b/lib/graph/rte_graph.h
index 2d37d5e..ecfec20 100644
--- a/lib/graph/rte_graph.h
+++ b/lib/graph/rte_graph.h
@@ -200,7 +200,7 @@ struct rte_graph_cluster_stats_param {
  *
  * @see struct rte_graph_cluster_stats_param::fn
  */
-struct rte_graph_cluster_node_stats {
+struct __rte_cache_aligned rte_graph_cluster_node_stats {
uint64_t ts;/**< Current timestamp. */
uint64_t calls; /**< Current number of calls made. */
uint64_t objs;  /**< Current number of objs processed. */
@@ -225,7 +225,7 @@ struct rte_graph_cluster_node_stats {
rte_node_t id;  /**< Node identifier of stats. */
uint64_t hz;/**< Cycles per seconds. */
char name[RTE_NODE_NAMESIZE];   /**< Name of the node. */
-} __rte_cache_aligned;
+};
 
 /**
  * Create Graph.
diff --git a/lib/graph/rte_graph_worker_common.h 
b/lib/graph/rte_graph_worker_common.h
index 4045a7a..36d864e 100644
--- a/lib/graph/rte_graph_worker_common.h
+++ b/lib/graph/rte_graph_worker_common.h
@@ -12,6 +12,8 @@
  * process, enqueue and move streams of objects to the next nodes.
  */
 
+#include 
+
 #include 
 #include 
 #include 
@@ -43,7 +45,7 @@
  *
  * Data structure to hold graph data.
  */
-struct rte_graph {
+struct __rte_cache_aligned rte_graph {
/* Fast path area. */
uint32_t tail;   /**< Tail of circular buffer. */
uint32_t head;   /**< Head of circular buffer. */
@@ -57,7 +59,8 @@ struct rte_graph {
union {
/* Fast schedule area for mcore dispatch model */
struct {
-   struct rte_graph_rq_head *rq __rte_cache_aligned; /* The run-queue */
+   alignas(RTE_CACHE_LINE_SIZE) struct rte_graph_rq_head *rq;
+   /* The run-queue */
    struct rte_graph_rq_head rq_head; /* The head for run-queue list */
 
unsigned int lcore_id;  /**< The graph running Lcore. */
@@ -77,14 +80,14 @@ struct rte_graph {
uint64_t nb_pkt_to_capture;
char pcap_filename[RTE_GRAPH_PCAP_FILE_SZ];  /**< Pcap filename. */
uint64_t fence; /**< Fence. */
-} __rte_cache_aligned;
+};
 
 /**
  * @internal
  *
  * Data structure to hold node data.
  */
-struct rte_node {
+struct __rte_cache_aligned rte_node {

[PATCH v5 07/39] net: use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

To allow alignment for both compilers do the following:

* Move __rte_aligned from the end of {struct,union} definitions to
  be between {struct,union} and tag.

  The placement between {struct,union} and the tag allows the desired
  alignment to be imparted on the type regardless of the toolchain being
  used for all of GCC, LLVM, MSVC compilers building both C and C++.

* Replace use of __rte_aligned(a) on variables/fields with alignas(a).

Signed-off-by: Tyler Retzlaff 
Acked-by: Morten Brørup 
---
 lib/net/net_crc_avx512.c | 14 --
 lib/net/net_crc_neon.c   | 11 ++-
 lib/net/net_crc_sse.c| 17 +
 lib/net/rte_arp.h|  8 
 lib/net/rte_ether.h  |  8 
 5 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/lib/net/net_crc_avx512.c b/lib/net/net_crc_avx512.c
index f6a3ce9..0f48ca0 100644
--- a/lib/net/net_crc_avx512.c
+++ b/lib/net/net_crc_avx512.c
@@ -3,6 +3,8 @@
  */
 
 
+#include 
+
 #include 
 
 #include "net_crc.h"
@@ -20,8 +22,8 @@ struct crc_vpclmulqdq_ctx {
__m128i fold_1x128b;
 };
 
-static struct crc_vpclmulqdq_ctx crc32_eth __rte_aligned(64);
-static struct crc_vpclmulqdq_ctx crc16_ccitt __rte_aligned(64);
+static alignas(64) struct crc_vpclmulqdq_ctx crc32_eth;
+static alignas(64) struct crc_vpclmulqdq_ctx crc16_ccitt;
 
 static uint16_t byte_len_to_mask_table[] = {
0x, 0x0001, 0x0003, 0x0007,
@@ -30,18 +32,18 @@ struct crc_vpclmulqdq_ctx {
0x0fff, 0x1fff, 0x3fff, 0x7fff,
0x};
 
-static const uint8_t shf_table[32] __rte_aligned(16) = {
+static const alignas(16) uint8_t shf_table[32] = {
0x00, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87,
0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f,
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
 };
 
-static const uint32_t mask[4] __rte_aligned(16) = {
+static const alignas(16) uint32_t mask[4] = {
0x, 0x, 0x, 0x
 };
 
-static const uint32_t mask2[4] __rte_aligned(16) = {
+static const alignas(16) uint32_t mask2[4] = {
0x, 0x, 0x, 0x
 };
 
@@ -93,7 +95,7 @@ struct crc_vpclmulqdq_ctx {
uint32_t offset;
__m128i res2, res3, res4, pshufb_shf;
 
-   const uint32_t mask3[4] __rte_aligned(16) = {
+   const alignas(16) uint32_t mask3[4] = {
   0x80808080, 0x80808080, 0x80808080, 0x80808080
};
 
diff --git a/lib/net/net_crc_neon.c b/lib/net/net_crc_neon.c
index f61d75a..cee75dd 100644
--- a/lib/net/net_crc_neon.c
+++ b/lib/net/net_crc_neon.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2017 Cavium, Inc
  */
 
+#include 
 #include 
 
 #include 
@@ -19,8 +20,8 @@ struct crc_pmull_ctx {
uint64x2_t rk7_rk8;
 };
 
-struct crc_pmull_ctx crc32_eth_pmull __rte_aligned(16);
-struct crc_pmull_ctx crc16_ccitt_pmull __rte_aligned(16);
+alignas(16) struct crc_pmull_ctx crc32_eth_pmull;
+alignas(16) struct crc_pmull_ctx crc16_ccitt_pmull;
 
 /**
  * @brief Performs one folding round
@@ -96,10 +97,10 @@ struct crc_pmull_ctx {
 crcr32_reduce_64_to_32(uint64x2_t data64,
uint64x2_t precomp)
 {
-   static uint32_t mask1[4] __rte_aligned(16) = {
+   static alignas(16) uint32_t mask1[4] = {
0x, 0x, 0x, 0x
};
-   static uint32_t mask2[4] __rte_aligned(16) = {
+   static alignas(16) uint32_t mask2[4] = {
0x, 0x, 0x, 0x
};
uint64x2_t tmp0, tmp1, tmp2;
@@ -148,7 +149,7 @@ struct crc_pmull_ctx {
 
if (unlikely(data_len < 16)) {
/* 0 to 15 bytes */
-   uint8_t buffer[16] __rte_aligned(16);
+   alignas(16) uint8_t buffer[16];
 
memset(buffer, 0, sizeof(buffer));
memcpy(buffer, data, data_len);
diff --git a/lib/net/net_crc_sse.c b/lib/net/net_crc_sse.c
index dd75845..d673ae3 100644
--- a/lib/net/net_crc_sse.c
+++ b/lib/net/net_crc_sse.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2017-2020 Intel Corporation
  */
 
+#include 
 #include 
 
 #include 
@@ -18,8 +19,8 @@ struct crc_pclmulqdq_ctx {
__m128i rk7_rk8;
 };
 
-static struct crc_pclmulqdq_ctx crc32_eth_pclmulqdq __rte_aligned(16);
-static struct crc_pclmulqdq_ctx crc16_ccitt_pclmulqdq __rte_aligned(16);
+static alignas(16) struct crc_pclmulqdq_ctx crc32_e

[PATCH v5 00/39] use C11 alignas

2024-02-23 Thread Tyler Retzlaff
The current location used for __rte_aligned(a) for alignment of types
and variables is not compatible with MSVC. There is only a single
location accepted by both toolchains.

For variables standard C11 offers alignas(a) supported by conformant
compilers i.e. both MSVC and GCC.

For types the standard offers no alignment facility that compatibly
interoperates with C and C++ but may be achieved by relocating the
placement of __rte_aligned(a) to the aforementioned location accepted
by all currently supported toolchains.

** NOTE **

Finally, In the interest of not creating more API (internal or not) the
series does not introduce a wrapper for C11 alignas. If we don't introduce
a macro an application can't take a dependency.

v5:
  * rebase series.
  * reword all commit messages with why the change is necessary.
  * document guidance for the usage of __rte_aligned macro indicating
that it should be used for type alignment only and advising that for
variable alignment standard C11 alignas(a) should be preferred.

v4:
  * restore explicit alignment of 8-byte integers in mbuf and
ring patches, natural alignment may be 4-bytes on 32-bit
targets.
v3:
  * add missing patches for __rte_cache_aligned and
__rte_cache_min_aligned
v2:
  * add missing #include  for alignas macro.

Tyler Retzlaff (39):
  eal: use C11 alignas
  eal: redefine macro to be integer literal for MSVC
  stack: use C11 alignas
  sched: use C11 alignas
  ring: use C11 alignas
  pipeline: use C11 alignas
  net: use C11 alignas
  mbuf: use C11 alignas
  hash: use C11 alignas
  eventdev: use C11 alignas
  ethdev: use C11 alignas
  dmadev: use C11 alignas
  distributor: use C11 alignas
  acl: use C11 alignas
  vhost: use C11 alignas
  timer: use C11 alignas
  table: use C11 alignas
  reorder: use C11 alignas
  regexdev: use C11 alignas
  rcu: use C11 alignas
  power: use C11 alignas
  rawdev: use C11 alignas
  port: use C11 alignas
  pdcp: use C11 alignas
  node: use C11 alignas
  mldev: use C11 alignas
  mempool: use C11 alignas
  member: use C11 alignas
  lpm: use C11 alignas
  ipsec: use C11 alignas
  jobstats: use C11 alignas
  bpf: use C11 alignas
  compressdev: use C11 alignas
  cryptodev: use C11 alignas
  dispatcher: use C11 alignas
  fib: use C11 alignas
  gpudev: use C11 alignas
  graph: use C11 alignas
  ip_frag: use C11 alignas

 lib/acl/acl_run.h  |  4 ++--
 lib/acl/acl_run_altivec.h  |  6 --
 lib/acl/acl_run_neon.h |  6 --
 lib/bpf/bpf_pkt.c  |  4 ++--
 lib/compressdev/rte_comp.h |  4 ++--
 lib/compressdev/rte_compressdev_internal.h |  8 +++
 lib/cryptodev/cryptodev_pmd.h  |  8 +++
 lib/cryptodev/rte_cryptodev_core.h |  4 ++--
 lib/dispatcher/rte_dispatcher.c|  4 ++--
 lib/distributor/distributor_private.h  | 34 --
 lib/distributor/rte_distributor.c  |  5 +++--
 lib/dmadev/rte_dmadev_core.h   |  4 ++--
 lib/dmadev/rte_dmadev_pmd.h|  8 +++
 lib/eal/arm/include/rte_vect.h |  4 ++--
 lib/eal/common/malloc_elem.h   |  4 ++--
 lib/eal/common/malloc_heap.h   |  4 ++--
 lib/eal/common/rte_keepalive.c |  3 ++-
 lib/eal/common/rte_random.c|  4 ++--
 lib/eal/common/rte_service.c   |  8 +++
 lib/eal/include/generic/rte_atomic.h   |  4 ++--
 lib/eal/include/rte_common.h   | 23 +---
 lib/eal/loongarch/include/rte_vect.h   |  8 +++
 lib/eal/ppc/include/rte_vect.h |  4 ++--
 lib/eal/riscv/include/rte_vect.h   |  4 ++--
 lib/eal/x86/include/rte_vect.h |  9 +---
 lib/eal/x86/rte_power_intrinsics.c | 10 +
 lib/ethdev/ethdev_driver.h |  8 +++
 lib/ethdev/rte_ethdev.h| 16 +++---
 lib/ethdev/rte_ethdev_core.h   |  4 ++--
 lib/ethdev/rte_flow_driver.h   |  4 ++--
 lib/eventdev/event_timer_adapter_pmd.h |  4 ++--
 lib/eventdev/eventdev_pmd.h|  8 +++
 lib/eventdev/rte_event_crypto_adapter.c| 16 +++---
 lib/eventdev/rte_event_dma_adapter.c   | 16 +++---
 lib/eventdev/rte_event_eth_rx_adapter.c|  8 +++
 lib/eventdev/rte_event_eth_tx_adapter.c|  4 ++--
 lib/eventdev/rte_event_timer_adapter.c |  9 
 lib/eventdev/rte_event_timer_adapter.h |  8 +++
 lib/eventdev/rte_eventdev.h|  8 +++
 lib/eventdev/rte_eventdev_core.h   |  4 ++--
 lib/fib/dir24_8.h  |  4 +++-
 lib/fib/trie.h |  4 +++-
 lib/gpudev/gpudev_driver.h |  4 ++--
 lib/graph/graph_private.h  |  4 ++--
 lib/graph/graph_stats.c|  4 ++--
 lib/graph/rte_graph.h  |  4 ++--
 lib/graph/rte_graph_worker_common.h 

Re: [PATCH 1/4] dts: constrain DPDK source flag

2024-02-23 Thread Luca Vizzarro

Hi Juraj,

Thank you for your review!

On 29/01/2024 11:47, Juraj Linkeš wrote:

I didn't see the mutual exclusion being enforced in the code. From
what I can tell, I could pass both --tarball FILEPATH and --revision
and the former would be used without checking the latter.


Yep, it is enforced in the code; you may have missed it. The two 
arguments are under the same mutually exclusive group, on line 220:


dpdk_source = parser.add_mutually_exclusive_group(required=True)

When using both arguments, `argparse` will automatically complain that 
you can only use one or the other.



whether `filepath` is valid
Even though private methods don't get included in the API docs, I like
to be consistent. In this case, it doesn't really detract (but maybe
some disability would prove this wrong) while adding a bit (the fact
that we're referencing the argument).


Yes, it is a good idea, especially since this will work within IDEs.


I think the name should either be _validate_tarball or
_parse_tarball_path. The argument name is two words, so let's put an
underscore between them.


Ack.


I think this would read better as one sentence.


Ack.


Since this is a patch with improvements, maybe we could add metavars
to other arguments as well. It looks pretty good.


Sure, no problem!


This removes the support for environment variables. It's possible we
don't want the support for these two arguments. Maybe we don't need
the support for variables at all. I'm leaning towards supporting the
env variables, but we probably should refactor how they're done. The
easiest would be to not do them with action, but just modifying the
default value if set. That would be a worthwhile improvement.


I've tried to find a way to still keep them. But with arguments done 
this way, it is somewhat hard to understand the provenance of the value 
(whether it's set in the arguments, an environment variable or just the 
default value), and therefore hard to give a useful error message to the 
user when something invalid is used. I'll try to come up with something 
as you suggested, although I am not entirely convinced it'll be ideal.
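
For the record, a rough sketch of the approach you're suggesting (the 
DTS_DPDK_TARBALL variable name is only an illustration, and the mutually 
exclusive group is left out for brevity):

    import argparse
    import os

    parser = argparse.ArgumentParser(prog="dts")
    # Use the environment variable only to seed the default value...
    env_tarball = os.environ.get("DTS_DPDK_TARBALL")
    parser.add_argument(
        "--tarball",
        default=env_tarball,
        # ...and only require the option when the variable is unset.
        required=env_tarball is None,
        help="DPDK source tarball (defaults to $DTS_DPDK_TARBALL if set).")
    args = parser.parse_args()

A command-line value still overrides the environment, but argparse alone 
won't tell us which of the two was actually used, which is exactly the 
provenance problem above.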



This would also probably read better as one sentence.


Ack.


We shuffled the order of operations a bit and now the error message
doesn't correspond.


Sorry, I don't think I am understanding what you are referring to 
exactly. What do you mean?


Best,
Luca


Re: [PATCH 2/4] dts: customise argparse error message

2024-02-23 Thread Luca Vizzarro

On 29/01/2024 13:04, Juraj Linkeš wrote:

I'm curious, what exactly is confusing about the message?


Unfortunately a bit too much time has passed... but if I remember 
correctly, I think that given the great number of arguments, whenever the 
message is printed a bit too much information is given to the user. So, 
bottom line: it's too crowded


Re: [PATCH 4/4] dts: log stderr with failed remote commands

2024-02-23 Thread Luca Vizzarro

On 29/01/2024 13:10, Juraj Linkeš wrote:

Here I'd add "logged additionally as an error", as this sounds as if
we're changing debug to error


That is also a way of doing this, but an error is an error. If we wanted 
to log the same thing in both debug and error, then when we go to read the 
debug log we'd get duplicates... making it less readable. What do you say?



I'd change the order here (and all other places) so that stderr is
before the return code.

Ack.


We should mention that the last string is the stderr output. Maybe we
just add 'Stderr:' before {self._command_stderr}. And maybe we should
put quotes around {self._command_stderr}.


Since you mentioned "quotes", I'd think it'd be even better to indent it 
as if it were a quote. With logs as busy as the ones DTS prints, adding 
quotes may not change much, as it's all already very crowded. I can prefix 
it with 'Stderr: ' though.
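
Something along these lines is what I have in mind (the class and 
attribute names other than _command_stderr are illustrative, and the 
exact wording/ordering can follow whatever we settle on above):

    class CommandExecutionError(Exception):
        """Raised when a remote command returns a non-zero exit code."""

        def __init__(self, command: str, stderr: str, return_code: int):
            self._command = command
            self._command_stderr = stderr
            self._command_return_code = return_code

        def __str__(self) -> str:
            # Indent stderr so it reads like a quoted block in the logs.
            indented = "\n".join(
                "    " + line for line in self._command_stderr.splitlines())
            return (
                f"Command '{self._command}' returned a non-zero exit code: "
                f"{self._command_return_code}\n"
                f"Stderr:\n{indented}")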


Community CI Meeting Minutes - February 22, 2024

2024-02-23 Thread Patrick Robb
February 22, 2024

#
Attendees
1. Patrick Robb
2. Ali Alnubani
3. Paul Szczepanek
4. Juraj Linkeš
5. Luca Vizzarro

#
Minutes

=
General Announcements
* Aaron is polling the tech board for feedback on what server hardware is
most needed in the community lab going forward. Some ideas are:
   * AMD CPUs
   * RISC-V CPUs
   * Better PCI-E generation slots which will allow us to use newer NICs
   * He is visiting UNH today so we can work on starting a proposal and plan
* 24.03 RC1 has been released this morning
* Retest framework: Email has been sent out with the proposed syntax and
approach for retests in which users want to request their patch be
re-applied on tip of branch

=
CI Status

-
UNH-IOL Community Lab
* Hardware Refresh:
   * NVIDIA CX7:
  * Had some minor improvements in the performance results, but still
debugging with Ali and NVIDIA
  * Can possibly replace the CX5 with another CX7, and do forwarding
between 2 NICs, solving 1 bandwidth bottleneck
  * Will work on resolving the lower speeds seen on the CX6 first -
this could be related to using a different board and CPU with different
clock rate etc.
 * This server has a Broadwell CPU
 * Patrick Robb: Make sure that Ali and Gal are included in the
initial feedback thread for the server refresh and what is most needed
* Bringing the NVIDIA testbed offline today for a few hours so Ali and Bing
can do some debugging on the mlx5 failure on the CX5 yesterday
* Arm IPSEC-MB Library: Had to move to running from tip of main on the ARM
ipsec repo to enable v1.4 - just storing the commit hash right now, but ARM
will publish a new tag for a known good version soon.
   * Wathsala will be doing the new tag soon

-
Intel Lab
* None

-
Github Actions
* We plan to have a maintenance window either Thursday the 22nd, or
sometime next week, to cover migrating to the original server. This will
involve upgrading the base OS for both the host and the VM. Michael will
send out the notice on the day it happens, letting everyone know of the downtime.
   * We don't expect that the downtime will last too long, less than a
day.  We will likely recover the workflows shortly after that.

-
Loongarch Lab
* Patrick pinged Zhoumin about adding retest framework support to the
Loongson lab
   * UNH is willing to assist - not sure right now what is possible/not
possible based on how the Loongson lab stood up their automation. They do
use the dpdk-ci repo tools.

=
DTS Improvements & Test Development
* Dockerfile patch can be merged - Thomas has been pinged about this on
Slack
* Scattered packets patch:
   * Patrick tested this on a bnxt_en NIC, and it worked fine
   * When gathering device capabilities, the scatter capability is always
off on Mellanox NICs, even when including the scatter offload flag with testpmd
  * Juraj is going to send Patrick and Jeremy a summary of what he
has learned about passing the scatter flag and how DPDK handles it, and
what he has learned about querying for this capability.
   * For now, not including the scatter offload flag testcase with this
testsuite, only submitting the testcase which is a direct port from old dts
   * Luca is going to review this today. He is also running it on a MLNX
NIC.
   * Juraj: Interestingly, on the Intel NICs it is on whether you include
the --enable-scatter flag or not, but the Mellanox NICs don't have it set
to on in either case
   * Ali: Have you tried "--enable-scatter --tx-offloads=0x8000"?
  * Jeremy Spewock will try this
* For next Wednesday, We are going to have to have discussions for putting
together the 24.07 DTS roadmap since Juraj will be on vacation in March and
we won’t be able to have the conversation then.
   * Patrick will put together some ethdev suite ideas which the new DTS
employees at UNH can start on
   * We will also review bugzilla tickets then, assign more tickets if
needed
* Patch for the testcase blocking:
   * There is a bug (it does not include the smoke tests in the list of
suites to run), so Juraj will be sending a new version
* Gregory reached out to see whether his framework’s approach could be used
for simple DTS cases. Juraj and Patrick read the code. There are some good
ideas we are bringing into the framework, but not the phase-based YAML
approach which translates Scapy and testpmd commands to test suites.
* XMLRPC Server: Jeremy found that there is a python d