Re: [RFC PATCH 3/3] ethdev: rename parameter in API to get FEC

2023-05-04 Thread Denis Pryazhennikov

I think my patch isn't correct.
I would say that
- fec_capa is a set of bits produced by 'RTE_ETH_FEC_MODE_TO_CAPA()';
- fec_mode is an element of 'enum rte_eth_fec_mode'.

Based on this definition, replacing 'fec_capa' with 'fec_mode' should 
involve changing the parameter type.
Probably I shouldn't change the name, but I should add a more detailed 
comment.


> Independent from being single FEC mode or not, I think both
> 'rte_eth_fec_get()' & 'rte_eth_fec_set()' should get 'fec_mode' as
> param, what do you think?


Andrew Rybchenko asked to replace 'mode' with 'fec_capa' for 
'rte_eth_fec_set()' in

https://inbox.dpdk.org/dev/aa745bd1-a564-fa8c-c77b-2d99c9769...@solarflare.com/
I don't think we need to change it for rte_eth_fec_set().

On 5/2/23 7:02 PM, Ferruh Yigit wrote:

On 4/28/2023 11:27 AM, Denis Pryazhennikov wrote:

Only one valid FEC mode can be returned by rte_eth_fec_get().
The previous name implied that more than one FEC mode
can be obtained.

+1 and patch looks good.

But isn't this also valid for 'rte_eth_fec_set()'? It takes 'fec_mode'. FEC
capability has its own type, "struct rte_eth_fec_capa".

Independent from being single FEC mode or not, I think both
'rte_eth_fec_get()' & 'rte_eth_fec_set()' should get 'fec_mode' as
param, what do you think?


Documentation was updated accordingly.

Signed-off-by: Denis Pryazhennikov 
Acked-by: Ivan Malov 
Acked-by: Viacheslav Galaktionov 

<...>



Re: [PATCH V3] lib: set/get max memzone segments

2023-05-04 Thread David Marchand
Hello Ophir,

Good thing someone is looking into this.
Thanks.

I have a few comments.


This commitlog is a bit compact.
Separating it with some empty lines would help digest it.


On Wed, May 3, 2023 at 9:27 AM Ophir Munk  wrote:
>
> In current DPDK the RTE_MAX_MEMZONE definition is unconditionally hard
> coded as 2560.
> For applications requiring a different value of this
> parameter, it is more convenient to set the max value via an rte API
> rather than changing the DPDK source code per application.  In many
> organizations, the possibility to compile a private DPDK library for a
> particular application does not exist at all.  With this option there is
> no need to recompile DPDK and it allows using an in-box packaged DPDK.
> An example of a use case for updating the max memzone count would be an
> application that uses the DPDK mempool library, which is based on the DPDK
> memzone library.  The application may need to create a number of
> steering tables, each of which will require its own mempool allocation.
> This commit is not about how to optimize the application usage of
> mempool nor about how to improve the mempool implementation based on
> memzone.  It is about how to make the max memzone definition - run-time
> customized.

The code was using a rather short name "RTE_MAX_MEMZONE".
But I prefer we spell this as "max memzones count" (or a better
wording), in the descriptions/comments.


> This commit adds an API which must be called before rte_eal_init():
> rte_memzone_max_set(int max).  If not called, the default memzone

Afaics, this prototype got unaligned with the patch content, as a
size_t is now taken as input.
You can simply mention rte_memzone_max_set().


> (RTE_MAX_MEMZONE) is used.  There is also an API to query the effective

After the patch RTE_MAX_MEMZONE does not exist anymore.


> max memzone: rte_memzone_max_get().
>
> Signed-off-by: Ophir Munk 


A global comment on the patch:

rte_calloc provides what you want in all cases below: an array of objects.
I prefer rte_calloc(count, sizeof elem) to rte_zmalloc(count * sizeof elem).

This also avoids a temporary variable to compute the total size of
such an array.


> ---
>  app/test/test_func_reentrancy.c |  2 +-
>  app/test/test_malloc_perf.c |  2 +-
>  app/test/test_memzone.c | 43 
> -
>  config/rte_config.h |  1 -
>  drivers/net/qede/base/bcm_osal.c| 30 --
>  drivers/net/qede/base/bcm_osal.h|  3 +++
>  drivers/net/qede/qede_main.c|  7 ++
>  lib/eal/common/eal_common_memzone.c | 28 +---
>  lib/eal/include/rte_memzone.h   | 20 +
>  lib/eal/version.map |  4 
>  10 files changed, 112 insertions(+), 28 deletions(-)
>
> diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
> index d1ed5d4..ae9de6f 100644
> --- a/app/test/test_func_reentrancy.c
> +++ b/app/test/test_func_reentrancy.c
> @@ -51,7 +51,7 @@ typedef void (*case_clean_t)(unsigned lcore_id);
>  #define MEMPOOL_ELT_SIZE (sizeof(uint32_t))
>  #define MEMPOOL_SIZE (4)
>
> -#define MAX_LCORES (RTE_MAX_MEMZONE / (MAX_ITER_MULTI * 4U))
> +#define MAX_LCORES (rte_memzone_max_get() / (MAX_ITER_MULTI * 4U))
>
>  static uint32_t obj_count;
>  static uint32_t synchro;
> diff --git a/app/test/test_malloc_perf.c b/app/test/test_malloc_perf.c
> index ccec43a..9bd1662 100644
> --- a/app/test/test_malloc_perf.c
> +++ b/app/test/test_malloc_perf.c
> @@ -165,7 +165,7 @@ test_malloc_perf(void)
> return -1;
>
> if (test_alloc_perf("rte_memzone_reserve", memzone_alloc, 
> memzone_free,
> -   NULL, memset_us_gb, RTE_MAX_MEMZONE - 1) < 0)
> +   NULL, memset_us_gb, rte_memzone_max_get() - 1) < 0)
> return -1;
>
> return 0;
> diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
> index c9255e5..36b9790 100644
> --- a/app/test/test_memzone.c
> +++ b/app/test/test_memzone.c
> @@ -871,9 +871,18 @@ test_memzone_bounded(void)
>  static int
>  test_memzone_free(void)
>  {
> -   const struct rte_memzone *mz[RTE_MAX_MEMZONE + 1];
> +   size_t mz_size;
> +   const struct rte_memzone **mz;
> int i;
> char name[20];
> +   int rc = -1;
> +
> +   mz_size = (rte_memzone_max_get() + 1) * sizeof(struct rte_memzone *);
> +   mz = rte_zmalloc("memzone_test", mz_size, 0);
> +   if (!mz) {
> +   printf("Fail allocating memzone test array\n");
> +   return rc;
> +   }
>
> mz[0] = rte_memzone_reserve(TEST_MEMZONE_NAME("tempzone0"), 2000,
> SOCKET_ID_ANY, 0);
> @@ -881,42 +890,42 @@ test_memzone_free(void)
> SOCKET_ID_ANY, 0);
>
> if (mz[0] > mz[1])
> -   return -1;
> +   goto exit_test;
> if (!rte_memzone_lookup(TEST_

Re: [v2] [RFC] dpaa2: replace system("echo ...") with file i/o

2023-05-04 Thread Mattias Rönnblom

On 2023-05-02 11:54, Sachin Saxena (OSS) wrote:

From: Stephen Hemminger 

  Using system() is a bad idea in driver code because it introduces
  a number of potential security issues. The codeql analysis tool
  flags this a potential security issue.

  Instead just use normal stdio to do the same thing.

  Compile test only, do not have this hardware and therefore can
  not test this.

Signed-off-by: Stephen Hemminger 
Reviewed-by: Sachin Saxena 
---
  drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 45 
  1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c 
b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index 4aec7b2cd8..990cfc5d3b 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -125,14 +125,21 @@ static void
  dpaa2_affine_dpio_intr_to_respective_core(int32_t dpio_id, int cpu_id)
  {
#define STRING_LEN 28
-#define COMMAND_LEN 50
+#define AFFINITY_LEN 128
+#define CMD_LEN 300
uint32_t cpu_mask = 1;
-   int ret;
-   size_t len = 0;
-   char *temp = NULL, *token = NULL;
-   char string[STRING_LEN], command[COMMAND_LEN];
+   size_t len = CMD_LEN;
+   char *temp, *token = NULL;
+   char string[STRING_LEN];
+   char smp_affinity[AFFINITY_LEN];
FILE *file;
  
+	temp = (char *)malloc(len * sizeof(char));


No cast is necessary to go from void * to any other pointer.

sizeof(char) is by definition one. What you allocate from the heap is 
in units of chars, so multiplying by sizeof(char) doesn't make sense.


How is pre-allocating the buffer an improvement over the old code 
(deferring memory allocation to getline())?



+   if (temp == NULL) {
+   DPAA2_BUS_WARN("Unable to allocate temp buffer");
+   return;
+   }
+
snprintf(string, STRING_LEN, "dpio.%d", dpio_id);
file = fopen("/proc/interrupts", "r");
if (!file) {
@@ -155,17 +162,27 @@ dpaa2_affine_dpio_intr_to_respective_core(int32_t 
dpio_id, int cpu_id)
}
  
  	cpu_mask = cpu_mask << cpu_id;

-   snprintf(command, COMMAND_LEN, "echo %X > /proc/irq/%s/smp_affinity",
-cpu_mask, token);
-   ret = system(command);
-   if (ret < 0)
-   DPAA2_BUS_DEBUG(
-   "Failed to affine interrupts on respective core");
-   else
-   DPAA2_BUS_DEBUG(" %s command is executed", command);
-
+   snprintf(smp_affinity, AFFINITY_LEN,
+"/proc/irq/%s/smp_affinity", token);
+   /* Free 'temp' memory after using the substring 'token' */
free(temp);
fclose(file);
+
+   file = fopen(smp_affinity, "w");
+   if (file == NULL) {
+   DPAA2_BUS_WARN("Failed to open %s", smp_affinity);
+   return;
+   }
+   fprintf(file, "%X\n", cpu_mask);
+   fflush(file);
+
+   if (ferror(file)) {
+   fclose(file);
+   DPAA2_BUS_WARN("Failed to write to %s", smp_affinity);
+   return;
+   }
+
+   fclose(file);
  }
  
  static int dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev)


Re: [PATCH 1/2] config/arm: Do not require processor information

2023-05-04 Thread Akihiko Odaki

On 2023/04/20 16:12, Akihiko Odaki wrote:

On 2023/04/20 16:10, Ruifeng Wang wrote:

-Original Message-
From: Akihiko Odaki 
Sent: Thursday, April 20, 2023 9:40 AM
To: Ruifeng Wang ; Bruce Richardson 
;

Juraj Linkeš 
Cc: dev@dpdk.org; nd 
Subject: Re: [PATCH 1/2] config/arm: Do not require processor 
information


On 2023/04/17 16:41, Ruifeng Wang wrote:

-Original Message-
From: Akihiko Odaki 
Sent: Friday, April 14, 2023 8:42 PM
To: Ruifeng Wang ; Bruce Richardson

Cc: dev@dpdk.org; Akihiko Odaki 
Subject: [PATCH 1/2] config/arm: Do not require processor information

DPDK can be built even without exact processor information for x86
and ppc, so allow building for Arm even if the targeted processor is
unknown.


Hi Akihiko,

The design idea was to require an explicit generic build.
Default/native build doesn't fall back to generic build when SoC 
info is not on the list.
So the user has less chance to generate a suboptimal binary by 
accident.


Hi,

It is true that a suboptimal binary can result, but the rationale 
here is that we tolerate that for x86 and ppc, so it should not really 
matter for Arm either. On x86 and ppc you don't need to modify 
meson.build just to run DTS on a development machine.


What modification do you need for a development machine?
I suppose "meson setup build -Dplatform=generic" will generate a 
binary that can run

on your development machine.


I didn't describe the situation well. I use the DPDK Test Suite for 
testing, and it determines what flags are passed to Meson. You need to 
modify DPDK's meson.build or DTS to get it built.


Regards,
Akihiko Odaki


Hi,

Can you have a look at this again?

Regards,
Akihiko Odaki


Re: [RFC PATCH 3/3] ethdev: rename parameter in API to get FEC

2023-05-04 Thread Ferruh Yigit
On 5/4/2023 8:13 AM, Denis Pryazhennikov wrote:
> I think my patch isn't correct.
> I would say that
> - fec_capa is a set of bits produced by 'RTE_ETH_FEC_MODE_TO_CAPA()';
> - fec_mode is an element of 'enum rte_eth_fec_mode'.
> 
> Based on this definition, replacing 'fec_capa' with 'fec_mode' should
> involve changing the parameter type.
> Probably I shouldn't change the name, but I should add a more detailed
> comment.
> 
>> Independent from being single FEC mode or not, I think both
> 'rte_eth_fec_get()' & 'rte_eth_fec_set()' should get 'fec_mode' as
> param, what do you think?
> 
> Andrew Rybchenko asked to replace 'mode' with 'fec_capa' for
> 'rte_eth_fec_set()' in
> https://inbox.dpdk.org/dev/aa745bd1-a564-fa8c-c77b-2d99c9769...@solarflare.com/
> I don't think we need to change it for rte_eth_fec_set().
> 

OK

> On 5/2/23 7:02 PM, Ferruh Yigit wrote:
>> On 4/28/2023 11:27 AM, Denis Pryazhennikov wrote:
>>> Only one valid FEC mode can be returned by rte_eth_fec_get().
>>> The previous name implied that more than one FEC mode
>>> can be obtained.
>> +1 and patch looks good.
>>
>> But isn't this also valid for 'rte_eth_fec_set()'? It takes 'fec_mode'. FEC
>> capability has its own type, "struct rte_eth_fec_capa".
>>
>> Independent from being single FEC mode or not, I think both
>> 'rte_eth_fec_get()' & 'rte_eth_fec_set()' should get 'fec_mode' as
>> param, what do you think?
>>
>>> Documentation was updated accordingly.
>>>
>>> Signed-off-by: Denis Pryazhennikov 
>>> Acked-by: Ivan Malov 
>>> Acked-by: Viacheslav Galaktionov 
>> <...>
>>



Re: [PATCH v2] build: announce requirement for C11

2023-05-04 Thread Mattias Rönnblom

On 2023-05-03 19:30, Bruce Richardson wrote:

Add a deprecation notice informing users that we will require a C11
compiler from 23.11 release onwards. This requirement was agreed by
technical board to enable use of newer C language features, e.g.
standard atomics. [1]

[1] 
http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/

Signed-off-by: Bruce Richardson 
Acked-by: Tyler Retzlaff 

---

V2:
- add requirement for stdatomics
- fix sphinx formatting
---
  doc/guides/rel_notes/deprecation.rst | 9 +
  1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index dcc1ca1696..70c6019d26 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -11,6 +11,15 @@ here.
  Deprecation Notices
  ---
  
+* C Compiler: From DPDK 23.11 onwards,

+  building DPDK will require a C compiler which supports the C11 standard,
+  including support for C11 standard atomics.


The whole of C11, or just the mandatory parts (+atomics)?


+
+  Please note:
+
+  * C11 is supported from GCC version 5 onwards, and is the default language 
version in that release
+  * C11 is the default compilation mode in Clang from version 3.6
+
  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
for returning key match count. It will ease handling of no-match case.
  


RE: DPDK 22.11 Troubleshooting

2023-05-04 Thread Gilbert Carrillo
Hi Bruce, 

Thank you for the response.

There are no errors when I run the makefile; however, I do see a difference in 
the programs. I don't believe the makefile is linking all the libraries 
together as intended.

For example, when I run the ethtool sample program and compile it using meson, 
it works fine and rte_eth_dev_count_avail() returns the correct amount. 
However, when I compile ethtool with the makefile and run it 
rte_eth_dev_count_avail() returns 0.

pkg-config --path libdpdk returns: /usr/lib64/pkgconfig/libdpdk.pc (I am using 
Red Hat 8.7)


V/r,
- 
Name       : Gilbert Carrillo | Software Engineer
Company : Ampex Data Systems
Address    : 26460 Corporate Ave, Ste 200
   Hayward, CA 94545
Office        : +1-650-367-2011
Mobile      : +1-575-312-7477
Website   : ampex.com 



-Original Message-
From: Bruce Richardson  
Sent: Wednesday, May 3, 2023 2:28 AM
To: Gilbert Carrillo 
Cc: dev@dpdk.org
Subject: Re: DPDK 22.11 Troubleshooting

On Mon, May 01, 2023 at 10:27:05PM +, Gilbert Carrillo wrote:
>Hello,
> 
> 
>I installed DPDK version 22.11 and the QDMA DPDK driver. However, I am
>having trouble compiling the test applications.
> 
> 
>I have a c++ program that has an external buffer and my end goal is to
>attach/map an mbuf to my external buffer for zero-copy DMA.
> 
> 
>Currently I use CMAKE to compile my program, so I was curious if this
>application has to be run on meson/ninja or is there a cmake option? I
>tried compiling the test applications with the makefile but had no
>luck, I could only compile using meson/ninja.
> 

If you have installed DPDK on your system, then a pkg-config file for it should 
be available, allowing applications to be built against it using any build 
system. The makefiles for the sample applications demonstrate how this can be 
done.

If building the example applications with make is failing, can you share the 
error messages got here, as it should work ok, once DPDK is correctly installed 
on the system. An additional test you can run is "pkg-config --path libdpdk" to 
check where DPDK is installed [though not all versions of pkg-config support 
--path, I think].

Regards,
/Bruce


RE: DPDK 22.11 Troubleshooting

2023-05-04 Thread Gilbert Carrillo
Make static returns an error (see attached).

v/R,
Gilbert

-Original Message-
From: Bruce Richardson  
Sent: Wednesday, May 3, 2023 10:35 AM
To: Gilbert Carrillo 
Cc: dev@dpdk.org
Subject: Re: DPDK 22.11 Troubleshooting

On Wed, May 03, 2023 at 04:22:05PM +, Gilbert Carrillo wrote:
> Hi Bruce,
> 
> Thank you for the response.
> 
> There is no errors when I run the makefile, however I do see a difference in 
> the programs. I don't believe the makefile is linking all the libraries 
> together as intended.
> 
> For example, when I run the ethtool sample program and compile it using 
> meson, it works fine and rte_eth_dev_count_avail() returns the correct 
> amount. However, when I compile ethtool with the makefile and run it 
> rte_eth_dev_count_avail() returns 0.
> 

Note that by default the meson build will statically link the examples, while 
the makefile will dynamically load the drivers at runtime. That may explain the 
difference. Can you try building and running using "make static" rather than 
just "make".

/Bruce


RE: DPDK 22.11 Troubleshooting

2023-05-04 Thread Gilbert Carrillo



-Original Message-
From: Bruce Richardson  
Sent: Wednesday, May 3, 2023 11:18 AM
To: Gilbert Carrillo 
Cc: dev@dpdk.org
Subject: Re: DPDK 22.11 Troubleshooting

On Wed, May 03, 2023 at 04:53:20PM +, Gilbert Carrillo wrote:
> Make static returns an error (see attached).
> 
> v/R,
> Gilbert
>

To help investigate this issue, can you perhaps include the text of the full 
build output when you run "make static". On my system I see libelf listed on 
the linker flags when I run "make static", and things link properly. I'm 
wondering how my setup may differ from yours.

/Bruce
 
> -Original Message-
> From: Bruce Richardson 
> Sent: Wednesday, May 3, 2023 10:35 AM
> To: Gilbert Carrillo 
> Cc: dev@dpdk.org
> Subject: Re: DPDK 22.11 Troubleshooting
> 
> On Wed, May 03, 2023 at 04:22:05PM +, Gilbert Carrillo wrote:
> > Hi Bruce,
> > 
> > Thank you for the response.
> > 
> > There is no errors when I run the makefile, however I do see a difference 
> > in the programs. I don't believe the makefile is linking all the libraries 
> > together as intended.
> > 
> > For example, when I run the ethtool sample program and compile it using 
> > meson, it works fine and rte_eth_dev_count_avail() returns the correct 
> > amount. However, when I compile ethtool with the makefile and run it 
> > rte_eth_dev_count_avail() returns 0.
> > 
> 
> Note that by default the meson build will statically link the examples, while 
> the makefile will dynamically load the drivers at runtime. That may explain 
> the difference. Can you try building and running using "make static" rather 
> than just "make".
> 
> /Bruce

Below is the full output after running make static from the ethtool folder in 
examples.
[root@localhost ethtool]# make static
make -C lib static
make[1]: Entering directory 
'/home/ampex/dpdk-23.03/dpdk-23.03/examples/ethtool/lib'
cc -O3 -fPIC -DALLOW_EXPERIMENTAL_API -O3 -I/usr/include/dpdk -include 
rte_config.h -march=corei7 -mno-avx512f  -c rte_ethtool.c -o 
build/rte_ethtool.c.o
In file included from rte_ethtool.c:7:
rte_ethtool.c: In function 'rte_ethtool_get_drvinfo':
rte_ethtool.c:52:29: warning: implicit declaration of function 'rte_dev_name'; 
did you mean 'rte_dev_remove'? [-Wimplicit-function-declaration]
  strlcpy(drvinfo->bus_info, rte_dev_name(dev_info.device),
 ^~~~
/usr/include/dpdk/rte_string_fns.h:90:50: note: in definition of macro 'strlcpy'
 #define strlcpy(dst, src, size) rte_strlcpy(dst, src, size)
  ^~~
rte_ethtool.c:52:29: warning: passing argument 2 of 'rte_strlcpy' makes pointer 
from integer without a cast [-Wint-conversion]
  strlcpy(drvinfo->bus_info, rte_dev_name(dev_info.device),
 ^
/usr/include/dpdk/rte_string_fns.h:90:50: note: in definition of macro 'strlcpy'
 #define strlcpy(dst, src, size) rte_strlcpy(dst, src, size)
  ^~~
/usr/include/dpdk/rte_string_fns.h:59:36: note: expected 'const char *' but 
argument is of type 'int'
 rte_strlcpy(char *dst, const char *src, size_t size)
^~~
ar -cr build/librte_ethtool.a build/*.o
make[1]: Leaving directory 
'/home/ampex/dpdk-23.03/dpdk-23.03/examples/ethtool/lib'
make -C ethtool-app static
make[1]: Entering directory 
'/home/ampex/dpdk-23.03/dpdk-23.03/examples/ethtool/ethtool-app'
cc -I../lib -O3 -I/usr/include/dpdk -include rte_config.h -march=corei7 
-mno-avx512f  -DALLOW_EXPERIMENTAL_API main.c ethapp.c -o build/ethtool-static 
-L../lib/build -l:librte_ethtool.a -Wl,--whole-archive -l:librte_common_iavf.a 
-l:librte_bus_auxiliary.a -l:librte_bus_pci.a -l:librte_bus_vdev.a 
-l:librte_bus_vmbus.a -l:librte_common_mlx5.a -l:librte_mempool_ring.a 
-l:librte_net_bnxt.a -l:librte_net_e1000.a -l:librte_net_enic.a 
-l:librte_net_failsafe.a -l:librte_net_i40e.a -l:librte_net_iavf.a 
-l:librte_net_ice.a -l:librte_net_ixgbe.a -l:librte_net_mlx4.a 
-l:librte_net_mlx5.a -l:librte_net_netvsc.a -l:librte_net_nfp.a 
-l:librte_net_qede.a -l:librte_net_ring.a -l:librte_net_tap.a 
-l:librte_net_vdev_netvsc.a -l:librte_net_vhost.a -l:librte_net_virtio.a 
-l:librte_node.a -l:librte_graph.a -l:librte_flow_classify.a 
-l:librte_pipeline.a -l:librte_table.a -l:librte_pdump.a -l:librte_port.a 
-l:librte_fib.a -l:librte_ipsec.a -l:librte_vhost.a -l:librte_stack.a 
-l:librte_security.a -l:librte_sched.a -l:librte_reorder.a -l:librte_rib.a 
-l:librte_dmadev.a -l:librte_regexdev.a -l:librte_rawdev.a -l:librte_pcapng.a 
-l:librte_member.a -l:librte_lpm.a -l:librte_latencystats.a -l:librte_ip_frag.a 
-l:librte_gso.a -l:librte_gro.a -l:librte_eventdev.a -l:librte_efd.a 
-l:librte_distributor.a -l:librte_cryptodev.a -l:librte_compressdev.a 
-l:librte_cfgfile.a -l:librte_bpf.a -l:librte_bitratestats.a -l:librte_bbdev.a 
-l:librte_acl.a -l:librte_timer.a -l:librte_hash.a -l:librte_metrics.a 
-l:librte_cmdline.a -l:librte_pci.a -l:l

Re: [PATCH v11 0/4] add support for self monitoring

2023-05-04 Thread David Marchand
Hello Tomasz,

On Thu, Feb 16, 2023 at 6:55 PM Tomasz Duszynski  wrote:
>
> This series adds self monitoring support i.e allows to configure and
> read performance measurement unit (PMU) counters in runtime without
> using perf utility. This has certain advantages when application runs on
> isolated cores running dedicated tasks.
>
> Events can be read directly using rte_pmu_read() or using dedicated
> tracepoint rte_eal_trace_pmu_read(). The latter will cause events to be
> stored inside CTF file.
>
> By design, all enabled events are grouped together and the same group
> is attached to lcores that use self monitoring functionality.
>
> Events are enabled by names, which need to be read from standard
> location under sysfs i.e
>
> /sys/bus/event_source/devices/PMU/events
>
> where PMU is a core PMU, i.e. one measuring CPU events. As of today
> raw events are not supported.
>
> Tomasz Duszynski (4):
>   lib: add generic support for reading PMU events
>   pmu: support reading ARM PMU events in runtime
>   pmu: support reading Intel x86_64 PMU events in runtime
>   eal: add PMU support to tracing library

There are still some pending comments on this series and it can't be
merged until they get sorted out.

I noted two points :
- Konstantin asked for better explanations in the implementation.
- He also pointed out at using this feature with non EAL lcores.

Could you work on this so we can consider this series for v23.07?

Thank you.


-- 
David Marchand



Re: [PATCH] net/gve: add struct members and typedefs for DQO

2023-05-04 Thread Ferruh Yigit
On 4/10/2023 7:47 AM, Junfeng Guo wrote:
> Add struct members for gve_tx_queue and gve_rx_queue.
> Add typedefs for little endians.
> 
> Signed-off-by: Junfeng Guo 
> Signed-off-by: Rushil Gupta 
> Signed-off-by: Joshua Washington 
> Signed-off-by: Jeroen de Borst 

Applied to dpdk-next-net/main, thanks.


Re: [PATCH 1/1] net/gve: update base code for DQO

2023-05-04 Thread Ferruh Yigit
On 4/11/2023 5:59 AM, Rushil Gupta wrote:
> Update gve base code to support DQO.
> 
> This patch is based on this:
> https://patchwork.dpdk.org/project/dpdk/list/?series=27647&state=*
> 
> Signed-off-by: Rushil Gupta 
> Signed-off-by: Junfeng Guo 

Applied to dpdk-next-net/main, thanks.


RE: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13

2023-05-04 Thread Guo, Junfeng



> -Original Message-
> From: jer...@marvell.com 
> Sent: Tuesday, May 2, 2023 21:51
> To: dev@dpdk.org; Wu, Jingjing ; Guo, Junfeng
> ; Li, Xiaoyun 
> Cc: tho...@monjalon.net; david.march...@redhat.com;
> ferruh.yi...@xilinx.com; step...@networkplumber.org; Jerin Jacob
> ; sta...@dpdk.org
> Subject: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13
> 
> From: Jerin Jacob 
> 
> Fix the following build issue by not allowing nb_ids to be zero.
> nb_ids can be zero based on rte_rawdev_xstats_get() API
> documentation but it is not valid for the case when second
> argument is NULL.

Is this the new standard for GCC 13?

> 
> examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
> examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get'
> accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
>   945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
>   |
> ^~
> 
> examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3
> of type 'uint64_t[0]' {aka 'long unsigned int[]'}
> In file included from ../examples/ntb/ntb_fwd.c:17:
> lib/rawdev/rte_rawdev.h:504:1: note: in a call to function
> 'rte_rawdev_xstats_get'
>   504 | rte_rawdev_xstats_get(uint16_t dev_id,
> 
> Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
> Cc: sta...@dpdk.org
> Signed-off-by: Jerin Jacob 
> ---
>  examples/ntb/ntb_fwd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
> index f9abed28e4..5489c3b3cd 100644
> --- a/examples/ntb/ntb_fwd.c
> +++ b/examples/ntb/ntb_fwd.c
> @@ -923,7 +923,7 @@ ntb_stats_display(void)
> 
>   /* Get NTB dev stats and stats names */
>   nb_ids = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
> - if (nb_ids  < 0) {
> + if (nb_ids <= 0) {

Should the check in ntb_stats_clear() also be updated to reject zero?

>   printf("Error: Cannot get count of xstats\n");
>   return;
>   }
> --
> 2.40.1



Re: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13

2023-05-04 Thread Jerin Jacob
On Thu, May 4, 2023 at 2:06 PM Guo, Junfeng  wrote:
>
>
>
> > -Original Message-
> > From: jer...@marvell.com 
> > Sent: Tuesday, May 2, 2023 21:51
> > To: dev@dpdk.org; Wu, Jingjing ; Guo, Junfeng
> > ; Li, Xiaoyun 
> > Cc: tho...@monjalon.net; david.march...@redhat.com;
> > ferruh.yi...@xilinx.com; step...@networkplumber.org; Jerin Jacob
> > ; sta...@dpdk.org
> > Subject: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13
> >
> > From: Jerin Jacob 
> >
> > Fix the following build issue by not allowing nb_ids to be zero.
> > nb_ids can be zero based on rte_rawdev_xstats_get() API
> > documentation but it is not valid for the case when second
> > argument is NULL.
>
> Is this the new standard for GCC 13?

No. It looks like a compiler optimization: it is able to detect the
case of a zero-size memory allocation.

>
> >
> > examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
> > examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get'
> > accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
> >   945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
> >   |
> > ^~
> >
> > examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3
> > of type 'uint64_t[0]' {aka 'long unsigned int[]'}
> > In file included from ../examples/ntb/ntb_fwd.c:17:
> > lib/rawdev/rte_rawdev.h:504:1: note: in a call to function
> > 'rte_rawdev_xstats_get'
> >   504 | rte_rawdev_xstats_get(uint16_t dev_id,
> >
> > Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
> > Cc: sta...@dpdk.org
> > Signed-off-by: Jerin Jacob 
> > ---
> >  examples/ntb/ntb_fwd.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
> > index f9abed28e4..5489c3b3cd 100644
> > --- a/examples/ntb/ntb_fwd.c
> > +++ b/examples/ntb/ntb_fwd.c
> > @@ -923,7 +923,7 @@ ntb_stats_display(void)
> >
> >   /* Get NTB dev stats and stats names */
> >   nb_ids = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
> > - if (nb_ids  < 0) {
> > + if (nb_ids <= 0) {
>
> Should the one in func ntb_stats_clear also be updated non-zero?

For some reason the compiler is not detecting it. I will fix it in the next version.


>
> >   printf("Error: Cannot get count of xstats\n");
> >   return;
> >   }
> > --
> > 2.40.1
>


Re: [PATCH v2] build: announce requirement for C11

2023-05-04 Thread Bruce Richardson
On Thu, May 04, 2023 at 09:50:09AM +0200, Mattias Rönnblom wrote:
> On 2023-05-03 19:30, Bruce Richardson wrote:
> > Add a deprecation notice informing users that we will require a C11
> > compiler from 23.11 release onwards. This requirement was agreed by
> > technical board to enable use of newer C language features, e.g.
> > standard atomics. [1]
> > 
> > [1] 
> > http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> > 
> > Signed-off-by: Bruce Richardson 
> > Acked-by: Tyler Retzlaff 
> > 
> > ---
> > 
> > V2:
> > - add requirement for stdatomics
> > - fix sphinx formatting
> > ---
> >   doc/guides/rel_notes/deprecation.rst | 9 +
> >   1 file changed, 9 insertions(+)
> > 
> > diff --git a/doc/guides/rel_notes/deprecation.rst 
> > b/doc/guides/rel_notes/deprecation.rst
> > index dcc1ca1696..70c6019d26 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -11,6 +11,15 @@ here.
> >   Deprecation Notices
> >   ---
> > +* C Compiler: From DPDK 23.11 onwards,
> > +  building DPDK will require a C compiler which supports the C11 standard,
> > +  including support for C11 standard atomics.
> 
> The whole of C11, or just the mandatory parts (+atomics)?
> 
I assume we only need mandatory + atomics; however, perhaps someone more
knowledgeable about the optional parts of the spec can correct me
on this. Once clarified, I should perhaps reword this yet again to call it
out even more specifically.

/Bruce


[dpdk-dev] [PATCH v2] examples/ntb: fix build issue with GCC 13

2023-05-04 Thread jerinj
From: Jerin Jacob 

Fix the following build issue by not allowing nb_ids to be zero.
nb_ids can be zero based on rte_rawdev_xstats_get() API
documentation but it is not valid for the case when second
argument is NULL.

examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get'
accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
  945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
  |   ^~

examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3
of type 'uint64_t[0]' {aka 'long unsigned int[]'}
In file included from ../examples/ntb/ntb_fwd.c:17:
lib/rawdev/rte_rawdev.h:504:1: note: in a call to function
'rte_rawdev_xstats_get'
  504 | rte_rawdev_xstats_get(uint16_t dev_id,

Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
Cc: sta...@dpdk.org
Signed-off-by: Jerin Jacob 
Tested-by: Ali Alnubani 
---
v2:
- Apply the same fix for ntb_stats_clear()(Junfeng)

 examples/ntb/ntb_fwd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
index f9abed28e4..585aad9d70 100644
--- a/examples/ntb/ntb_fwd.c
+++ b/examples/ntb/ntb_fwd.c
@@ -865,7 +865,7 @@ ntb_stats_clear(void)
 
/* Clear NTB dev stats */
nb_ids = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
-   if (nb_ids  < 0) {
+   if (nb_ids <= 0) {
printf("Error: Cannot get count of xstats\n");
return;
}
@@ -923,7 +923,7 @@ ntb_stats_display(void)
 
/* Get NTB dev stats and stats names */
nb_ids = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
-   if (nb_ids  < 0) {
+   if (nb_ids <= 0) {
printf("Error: Cannot get count of xstats\n");
return;
}
-- 
2.40.1



RE: [dpdk-dev] [PATCH v2] examples/ntb: fix build issue with GCC 13

2023-05-04 Thread Guo, Junfeng



> -Original Message-
> From: jer...@marvell.com 
> Sent: Thursday, May 4, 2023 16:54
> To: dev@dpdk.org; Wu, Jingjing ; Guo, Junfeng
> ; Li, Xiaoyun 
> Cc: tho...@monjalon.net; david.march...@redhat.com;
> ferruh.yi...@amd.com; Jerin Jacob ;
> sta...@dpdk.org; Ali Alnubani 
> Subject: [dpdk-dev] [PATCH v2] examples/ntb: fix build issue with GCC 13
> 
> From: Jerin Jacob 
> 
> Fix the following build issue by not allowing nb_ids to be zero.
> nb_ids can be zero based on rte_rawdev_xstats_get() API
> documentation but it is not valid for the case when second
> argument is NULL.
> 
> examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
> examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get'
> accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
>   945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
>   |
> ^~
> 
> examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3
> of type 'uint64_t[0]' {aka 'long unsigned int[]'}
> In file included from ../examples/ntb/ntb_fwd.c:17:
> lib/rawdev/rte_rawdev.h:504:1: note: in a call to function
> 'rte_rawdev_xstats_get'
>   504 | rte_rawdev_xstats_get(uint16_t dev_id,
> 
> Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
> Cc: sta...@dpdk.org
> Signed-off-by: Jerin Jacob 
> Tested-by: Ali Alnubani 
> ---
> v2:
> - Apply the same fix for ntb_stats_clear()(Junfeng)
> 
>  examples/ntb/ntb_fwd.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
> index f9abed28e4..585aad9d70 100644
> --- a/examples/ntb/ntb_fwd.c
> +++ b/examples/ntb/ntb_fwd.c
> @@ -865,7 +865,7 @@ ntb_stats_clear(void)
> --
> 2.40.1

LGTM, it makes sense to avoid zero memory allocation. Thanks!

Acked-by: Junfeng Guo 

Regards,
Junfeng Guo


RE: [PATCH 1/2] config/arm: Do not require processor information

2023-05-04 Thread Ruifeng Wang
> -Original Message-
> From: Akihiko Odaki 
> Sent: Thursday, May 4, 2023 3:47 PM
> To: Ruifeng Wang ; Bruce Richardson 
> ;
> Juraj Linkeš 
> Cc: dev@dpdk.org; nd 
> Subject: Re: [PATCH 1/2] config/arm: Do not require processor information
> 
> On 2023/04/20 16:12, Akihiko Odaki wrote:
> > On 2023/04/20 16:10, Ruifeng Wang wrote:
> >>> -Original Message-
> >>> From: Akihiko Odaki 
> >>> Sent: Thursday, April 20, 2023 9:40 AM
> >>> To: Ruifeng Wang ; Bruce Richardson
> >>> ; Juraj Linkeš
> >>> 
> >>> Cc: dev@dpdk.org; nd 
> >>> Subject: Re: [PATCH 1/2] config/arm: Do not require processor
> >>> information
> >>>
> >>> On 2023/04/17 16:41, Ruifeng Wang wrote:
> > -Original Message-
> > From: Akihiko Odaki 
> > Sent: Friday, April 14, 2023 8:42 PM
> > To: Ruifeng Wang ; Bruce Richardson
> > 
> > Cc: dev@dpdk.org; Akihiko Odaki 
> > Subject: [PATCH 1/2] config/arm: Do not require processor
> > information
> >
> > DPDK can be built even without exact processor information for x86
> > and ppc, so allow building for Arm even if the targeted
> > processor is unknown.
> 
>  Hi Akihiko,
> 
>  The design idea was to require an explicit generic build.
>  Default/native build doesn't fall back to generic build when SoC
>  info is not on the list.
>  So the user has less chance to generate a suboptimal binary by
>  accident.
> >>>
> >>> Hi,
> >>>
> >>> It is true that the suboptimal binary can result, but the rationale
> >>> here is that we tolerate that for x86 and ppc so it should not
> >>> really matter for Arm too. On x86 and ppc you don't need to modify
> >>> meson.build just to run dts on a development machine.
> >>
> >> What modification do you need for a development machine?
> >> I suppose "meson setup build -Dplatform=generic" will generate a
> >> binary that can run on your development machine.
> >
> > I didn't describe the situation well. I use DPDK Test Suite for
> > testing and it determines what flags to be passed to Meson. You need
> > to modify DPDK's meson.build or DTS to get it built.
> >
> >>
> >>>
> >>> Regards,
> >>> Akihiko Odaki
> 
> Hi,
> 
> Can you have a look at this again?

Thanks for the clarification of your use case.
Changes to DTS are in planning. It will allow the user to choose
the type of the build.
Your use case will be fulfilled then.

Regards,
Ruifeng
> 
> Regards,
> Akihiko Odaki


Re: 21.11.4 patches review and test

2023-05-04 Thread Kevin Traynor

On 04/05/2023 03:13, Xu, HailinX wrote:

-Original Message-
From: Kevin Traynor 
Sent: Tuesday, May 2, 2023 5:35 PM
To: Xu, HailinX ; sta...@dpdk.org
Cc: Stokes, Ian ; Mcnamara, John
; Luca Boccassi ; Xu, Qian Q
; Thomas Monjalon ; Peng,
Yuan ; Chen, Zhaoyan ;
dev@dpdk.org
Subject: Re: 21.11.4 patches review and test

On 20/04/2023 11:32, Kevin Traynor wrote:

On 20/04/2023 03:40, Xu, HailinX wrote:

-Original Message-
From: Xu, HailinX 
Sent: Thursday, April 13, 2023 2:13 PM
To: Kevin Traynor ; sta...@dpdk.org
Cc: dev@dpdk.org; Abhishek Marathe

;

Ali Alnubani ; Walker, Benjamin
; David Christensen
; Hemant Agrawal

;

Stokes, Ian ; Jerin Jacob
; Mcnamara, John ;
Ju-Hyoung Lee ; Luca Boccassi
; Pei Zhang ; Xu, Qian Q
; Raslan Darawsheh ;

Thomas

Monjalon ; yangh...@redhat.com; Peng, Yuan
; Chen, Zhaoyan 
Subject: RE: 21.11.4 patches review and test


-Original Message-
From: Kevin Traynor 
Sent: Thursday, April 6, 2023 7:38 PM
To: sta...@dpdk.org
Cc: dev@dpdk.org; Abhishek Marathe
; Ali Alnubani
; Walker, Benjamin ;
David Christensen ; Hemant Agrawal
; Stokes, Ian ; Jerin
Jacob ; Mcnamara, John
; Ju-Hyoung Lee ;
Kevin Traynor ; Luca Boccassi
; Pei Zhang ; Xu, Qian Q
; Raslan Darawsheh ;

Thomas

Monjalon ; yangh...@redhat.com; Peng, Yuan
; Chen, Zhaoyan 
Subject: 21.11.4 patches review and test

Hi all,

Here is a list of patches targeted for stable release 21.11.4.

The planned date for the final release is 25th April.

Please help with testing and validation of your use cases and
report any issues/results with reply-all to this mail. For the
final release the fixes and reported validations will be added to the

release notes.


A release candidate tarball can be found at:

   https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.4-rc1

These patches are located at branch 21.11 of dpdk-stable repo:
   https://dpdk.org/browse/dpdk-stable/

Thanks.

Kevin


HI All,

Update the test status for Intel part. So far the dpdk 21.11.4-rc1
validation test completion rate is 85%. No critical issue is found.
2 new bugs are found, 1 new issue is being confirmed by Intel dev.
New bugs:   --20.11.8-rc1 also has these two issues
 1.

pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac:

performance drop of about 23.5% when sending small packets
https://bugs.dpdk.org/show_bug.cgi?id=1212 -- no fix yet
 2. some of the virtio tests are failing -- Intel dev team is
investigating

# Basic Intel(R) NIC testing
* Build & CFLAG compile: cover the build test combination with
latest GCC/Clang version and the popular OS revision such as
 Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4,
FreeBSD13.1, SUSE15, CentOS7.9, etc.
- All test done. No new dpdk issue is found.
* PF(i40e, ixgbe): test scenarios including
RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
- All test done. No new dpdk issue is found.
* VF(i40e, ixgbe): test scenarios including
VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
- All test done. No new dpdk issue is found.
* PF/VF(ice): test scenarios including Switch features/Package
Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible
Descriptor, etc.
- All test done. No new dpdk issue is found.
* Intel NIC single core/NIC performance: test scenarios including
PF/VF single core performance test, etc.
- All test done. No new dpdk issue is found.
* IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic
test - QAT&SW/FIB library, etc.
- On going.

# Basic cryptodev and virtio testing
* Virtio: both function and performance test are covered. Such as
PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf
testing/VMAWARE ESXI 8.0, etc.
- All test done. found bug1.
* Cryptodev:
 *Function test: test scenarios including Cryptodev API
testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc.
   - Execution rate is 90%. found bug2.
 *Performance test: test scenarios including Throughput
Performance/Cryptodev Latency, etc.
   - All test done. No new dpdk issue is found.

Regards,
Xu, Hailin

Update the test status for Intel part. Completed all dpdk 21.11.4-rc1
validation. No critical issue is found.


Hi. Thanks for testing.


2 new bugs are found, 1 new issue is under confirming by Intel Dev.
New bugs: --20.11.8-rc1 also has these two issues
 1.

pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac:
performance drop of about 23.5% when sending small packets

https://bugs.dpdk.org/show_bug.cgi?id=1212  -- not fixed yet; occurs only
on the specified platform


Do you know which patch caused the regression? I'm not fully clear
from the Bz for 20.11. The backported patch ID'd as root cause [0] in
20.11 is in the previous releases of 20.11 (and 21.11).

Trying to understand, because then it would have shown up in testing for
previous releases. Or is this a new test introduced for the latest LTS
releases? And if so, what is the baseline performance based on?

[0]
commit 1c9a7fba5c90e0422b517404

DPDK Release Status Meeting 2023-05-04

2023-05-04 Thread Mcnamara, John
Release status meeting minutes 2023-05-04
=========================================

Agenda:
* Release Dates
* Subtrees
* Roadmaps
* LTS
* Defects
* Opens

Participants:
* AMD
* ARM
* Intel
* Marvell
* Nvidia
* Red Hat


Release Dates
-------------

The following are the proposed current dates for 23.07:

* V1:  22 April 2023
* RC1: 31   May 2023
* RC2: 21  June 2023
* RC3: 28  June 2023
* Release: 12  July 2023


Subtrees
--------

* next-net
  * Under review.
  * ~120 patches in backlog.
  * GVE license update

* next-net-intel
  * Starting to apply patches, about 20 patches are waiting for merge.

* next-net-mlx
  * No update

* next-net-mvl
  * Some patches merged.
  * ~30 patches remaining.

* next-eventdev
  * Some patches merged.
  * AMD patches under review

* next-baseband
  * No major features so far.

* next-virtio
  * Intel series for port mirroring
* Some concerns on code duplication and API complexity
  * Report of performance degradation in vhost driver
  * New series on guest notifications in slow path
  * VDUSE - Vhost VDPA in userspace. https://lwn.net/Articles/841054/
* Under review

* next-crypto
  * 60-70 patches in backlog
  * Patches to add new algorithms
  * Reviewing PDCP patches
  * Reviews and merges ongoing

* main
  * Starting to merge patches to improve the Windows port
  * Reviewing ethdev patches
  * Patches from ARM for PMU
* Postponed from last release
* For tracing
* Awaiting feedback on comments
  * Power Management feature from AMD
  * Memarea feature
  * CDX bus patches from AMD
* Virtual PCI like bus
  * New checkpatch checks for atomic functions
  * Issues with Fedora 38 and GCC 13
  * Issue with Coverity automated analysis
  * Call for updates to the external Roadmap:
https://core.dpdk.org/roadmap/


Proposed Schedule for 2023
--------------------------

See also http://core.dpdk.org/roadmap/#dates

23.07
  * Proposal deadline (RFC/v1 patches): 22 April 2023
  * API freeze (-rc1): 31 May 2023
  * PMD features freeze (-rc2): 21 June 2023
  * Builtin applications features freeze (-rc3): 28 June 2023
  * Release: 12 July 2023

23.11
  * Proposal deadline (RFC/v1 patches): 12 August 2023
  * API freeze (-rc1): 29 September 2023
  * PMD features freeze (-rc2): 20 October 2023
  * Builtin applications features freeze (-rc3): 27 October 2023
  * Release: 15 November 2023


LTS
---

Next LTS releases will be:

* 22.11.1
  * RC1 sent - Scheduled for May 5
* 21.11.4
  * Got test feedback from Nvidia, Red Hat and Intel
* 20.11.8
  * RC1 sent and first test report received
* 19.11.15
  * CVE and critical fixes only.


* Distros
  * v20.11 in Debian 11
  * Ubuntu 22.04-LTS contains 21.11
  * Ubuntu 23.04 contains 22.11

Defects
-------

* Bugzilla links, 'Bugs',  added for hosted projects
  * https://www.dpdk.org/hosted-projects/


Opens
-----

* None


DPDK Release Status Meetings
----------------------------

The DPDK Release Status Meeting is intended for DPDK Committers to discuss the
status of the master tree and sub-trees, and for project managers to track
progress or milestone dates.

The meeting occurs on every Thursday at 9:30 UTC over Jitsi on 
https://meet.jit.si/DPDK

You don't need an invite to join the meeting but if you want a calendar 
reminder just
send an email to "John McNamara john.mcnam...@intel.com" for the invite.



Re: [PATCH 00/10] gve PMD enhancement

2023-05-04 Thread Ferruh Yigit
On 4/13/2023 7:16 AM, Junfeng Guo wrote:
> This patch set includes two main enhancements for gve PMD:
>  - support basic data path with DQO queue format
>  - support jumbo frame with GQI queue format
> 
> This patch set is based on this:
> patchwork.dpdk.org/project/dpdk/list/?series=27653&state=*
> 
> Junfeng Guo (10):
>   net/gve: add Tx queue setup for DQO
>   net/gve: add Rx queue setup for DQO
>   net/gve: support device start and close for DQO
>   net/gve: support queue release and stop for DQO
>   net/gve: support basic Tx data path for DQO
>   net/gve: support basic Rx data path for DQO
>   net/gve: support basic stats for DQO
>   net/gve: enable Tx checksum offload for DQO
>   net/gve: add maintainers for GVE
>   net/gve: support jumbo frame for GQI

Except for 9/10 (maintainers file update), please see the note on the
relevant patch for it.
Series applied to dpdk-next-net/main, thanks.


Re: [PATCH 09/10] net/gve: add maintainers for GVE

2023-05-04 Thread Ferruh Yigit
On 4/13/2023 7:16 AM, Junfeng Guo wrote:
> Add maintainers from Google for GVE.
> 
> Signed-off-by: Junfeng Guo 
> Signed-off-by: Rushil Gupta 
> ---
>  MAINTAINERS | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8df23e5099..08001751b0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -713,6 +713,9 @@ F: doc/guides/nics/features/enic.ini
>  
>  Google Virtual Ethernet
>  M: Junfeng Guo 
> +M: Jeroen de Borst 
> +M: Rushil Gupta 
> +M: Joshua Washington 
>  F: drivers/net/gve/
>  F: doc/guides/nics/gve.rst
>  F: doc/guides/nics/features/gve.ini

Requested ack from the newly added maintainers was not received (on the
other version of this patch [2]). This patch is not a dependency for the
rest of the set, so dropping it from the set.

Also, there is a standalone version of this patch [1]; can you please
send a new version of the standalone one, instead of including the same
patch in various sets?



[1]
https://patches.dpdk.org/project/dpdk/patch/20221109072352.1387300-1-junfeng@intel.com/

[2]
https://patches.dpdk.org/project/dpdk/patch/20230328094512.1796648-4-junfeng@intel.com/


[RFC PATCH v2 0/4] dts: add dts api docs

2023-05-04 Thread Juraj Linkeš
Augment the meson build system with DTS API doc generation. The API docs
are generated from Python docstrings in DTS using Sphinx. The format of
choice is the Google format [0].

The guide html sphinx configuration is used to preserve the same style.

The build requires the same Python version and dependencies as DTS,
because Sphinx imports the Python modules. Dependencies are installed
using Poetry from the dts directory:

poetry install --with docs

After installing, enter the Poetry shell:

poetry shell

And then run the build:
ninja -C  dts/doc

There's only one properly documented module that serves as a
demonstration of the style - framework.testbed_model.node.

[0] https://google.github.io/styleguide/pyguide.html#s3.8.4-comments-in-classes

Juraj Linkeš (4):
  dts: code adjustments for sphinx
  dts: add doc generation dependencies
  dts: add doc generation
  dts: format docstrings to google format

 doc/api/meson.build   |   1 +
 doc/guides/conf.py|  22 +-
 doc/guides/meson.build|   1 +
 doc/guides/tools/dts.rst  |  29 +
 dts/doc/doc-index.rst |  20 +
 dts/doc/meson.build   |  50 ++
 dts/framework/config/__init__.py  |  11 +
 .../{testbed_model/hw => config}/cpu.py   |  13 +
 dts/framework/dts.py  |   8 +-
 dts/framework/remote_session/__init__.py  |   3 +-
 dts/framework/remote_session/linux_session.py |   2 +-
 dts/framework/remote_session/os_session.py|  12 +-
 .../remote_session/remote/__init__.py |  16 -
 .../{remote => }/remote_session.py|   0
 .../{remote => }/ssh_session.py   |   0
 dts/framework/settings.py |  55 +-
 dts/framework/testbed_model/__init__.py   |  10 +-
 dts/framework/testbed_model/hw/__init__.py|  27 -
 dts/framework/testbed_model/node.py   | 164 ++--
 dts/framework/testbed_model/sut_node.py   |   9 +-
 .../testbed_model/{hw => }/virtual_device.py  |   0
 dts/main.py   |   3 +-
 dts/meson.build   |  16 +
 dts/poetry.lock   | 770 --
 dts/pyproject.toml|   7 +
 dts/tests/TestSuite_hello_world.py|   6 +-
 meson.build   |   1 +
 meson_options.txt |   2 +
 28 files changed, 1038 insertions(+), 220 deletions(-)
 create mode 100644 dts/doc/doc-index.rst
 create mode 100644 dts/doc/meson.build
 rename dts/framework/{testbed_model/hw => config}/cpu.py (95%)
 delete mode 100644 dts/framework/remote_session/remote/__init__.py
 rename dts/framework/remote_session/{remote => }/remote_session.py (100%)
 rename dts/framework/remote_session/{remote => }/ssh_session.py (100%)
 delete mode 100644 dts/framework/testbed_model/hw/__init__.py
 rename dts/framework/testbed_model/{hw => }/virtual_device.py (100%)
 create mode 100644 dts/meson.build

-- 
2.30.2



[RFC PATCH v2 1/4] dts: code adjustments for sphinx

2023-05-04 Thread Juraj Linkeš
sphinx-build only imports the Python modules when building the
documentation; it doesn't run DTS. This requires changes that make the
code importable without running it. This means:
* properly guarding argument parsing in the if __name__ == '__main__'
  block.
* the logger used by DTS runner underwent the same treatment so that it
  doesn't create unnecessary log files.
* however, DTS uses the arguments to construct an object holding global
  variables. The defaults for the global variables needed to be moved
  from argument parsing elsewhere.
* importing the remote_session module from framework resulted in
  circular imports because of one module trying to import another
  module. This is fixed by more granular imports.
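
As a minimal sketch, the import-guard pattern described above can look like
the following (the module, option names and defaults here are illustrative
only, not the actual DTS code):

```python
# Sketch of making a CLI entry module importable (e.g. by sphinx-build)
# without side effects.
import argparse

# Module-level default: importers get a sane value even though
# argument parsing has never run.
DEFAULT_VERBOSE = False


def parse_args(argv=None):
    """Parse arguments; argv=None means sys.argv, a list means test input."""
    parser = argparse.ArgumentParser(description="illustrative runner")
    parser.add_argument(
        "--verbose", action="store_true", default=DEFAULT_VERBOSE
    )
    return parser.parse_args(argv)


def run_all(verbose=DEFAULT_VERBOSE):
    """Placeholder for the real entry point; returns an exit code."""
    return 0


if __name__ == "__main__":
    # Only touch sys.argv when executed directly, never on import,
    # so documentation builds can import this module safely.
    args = parse_args()
    raise SystemExit(run_all(args.verbose))
```

Importing this module (as Sphinx does) triggers neither argument parsing nor
log file creation; only direct execution does.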

Signed-off-by: Juraj Linkeš 
---
 dts/framework/config/__init__.py  | 11 
 .../{testbed_model/hw => config}/cpu.py   | 13 +
 dts/framework/dts.py  |  8 ++-
 dts/framework/remote_session/__init__.py  |  3 +-
 dts/framework/remote_session/linux_session.py |  2 +-
 dts/framework/remote_session/os_session.py| 12 +++-
 .../remote_session/remote/__init__.py | 16 --
 .../{remote => }/remote_session.py|  0
 .../{remote => }/ssh_session.py   |  0
 dts/framework/settings.py | 55 ++-
 dts/framework/testbed_model/__init__.py   | 10 +---
 dts/framework/testbed_model/hw/__init__.py| 27 -
 dts/framework/testbed_model/node.py   | 12 ++--
 dts/framework/testbed_model/sut_node.py   |  9 ++-
 .../testbed_model/{hw => }/virtual_device.py  |  0
 dts/main.py   |  3 +-
 dts/tests/TestSuite_hello_world.py|  6 +-
 17 files changed, 88 insertions(+), 99 deletions(-)
 rename dts/framework/{testbed_model/hw => config}/cpu.py (95%)
 delete mode 100644 dts/framework/remote_session/remote/__init__.py
 rename dts/framework/remote_session/{remote => }/remote_session.py (100%)
 rename dts/framework/remote_session/{remote => }/ssh_session.py (100%)
 delete mode 100644 dts/framework/testbed_model/hw/__init__.py
 rename dts/framework/testbed_model/{hw => }/virtual_device.py (100%)

diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py
index ebb0823ff5..293c4cb15b 100644
--- a/dts/framework/config/__init__.py
+++ b/dts/framework/config/__init__.py
@@ -7,6 +7,8 @@
 Yaml config parsing methods
 """
 
+# pylama:ignore=W0611
+
 import json
 import os.path
 import pathlib
@@ -19,6 +21,15 @@
 
 from framework.settings import SETTINGS
 
+from .cpu import (
+LogicalCore,
+LogicalCoreCount,
+LogicalCoreCountFilter,
+LogicalCoreList,
+LogicalCoreListFilter,
+lcore_filter,
+)
+
 
 class StrEnum(Enum):
 @staticmethod
diff --git a/dts/framework/testbed_model/hw/cpu.py b/dts/framework/config/cpu.py
similarity index 95%
rename from dts/framework/testbed_model/hw/cpu.py
rename to dts/framework/config/cpu.py
index d1918a12dc..8fe785dfe4 100644
--- a/dts/framework/testbed_model/hw/cpu.py
+++ b/dts/framework/config/cpu.py
@@ -272,3 +272,16 @@ def filter(self) -> list[LogicalCore]:
 )
 
 return filtered_lcores
+
+
+def lcore_filter(
+core_list: list[LogicalCore],
+filter_specifier: LogicalCoreCount | LogicalCoreList,
+ascending: bool,
+) -> LogicalCoreFilter:
+if isinstance(filter_specifier, LogicalCoreList):
+return LogicalCoreListFilter(core_list, filter_specifier, ascending)
+elif isinstance(filter_specifier, LogicalCoreCount):
+return LogicalCoreCountFilter(core_list, filter_specifier, ascending)
+else:
+raise ValueError(f"Unsupported filter r{filter_specifier}")
diff --git a/dts/framework/dts.py b/dts/framework/dts.py
index 0502284580..22a09b7e34 100644
--- a/dts/framework/dts.py
+++ b/dts/framework/dts.py
@@ -3,6 +3,7 @@
 # Copyright(c) 2022-2023 PANTHEON.tech s.r.o.
 # Copyright(c) 2022-2023 University of New Hampshire
 
+import logging
 import sys
 
 from .config import CONFIGURATION, BuildTargetConfiguration, 
ExecutionConfiguration
@@ -12,7 +13,8 @@
 from .testbed_model import SutNode
 from .utils import check_dts_python_version
 
-dts_logger: DTSLOG = getLogger("DTSRunner")
+# dummy defaults to satisfy linters
+dts_logger: DTSLOG | logging.Logger = logging.getLogger("DTSRunner")
 result: DTSResult = DTSResult(dts_logger)
 
 
@@ -24,6 +26,10 @@ def run_all() -> None:
 global dts_logger
 global result
 
+# create a regular DTS logger and create a new result with it
+dts_logger = getLogger("DTSRunner")
+result = DTSResult(dts_logger)
+
 # check the python version of the server that run dts
 check_dts_python_version()
 
diff --git a/dts/framework/remote_session/__init__.py 
b/dts/framework/remote_session/__init__.py
index ee221503df..17ca1459f7 100644
--- a/dts/framework/remote_session/__init__.py
+++ b/dts/framework/remote_session/__init__.py
@@ -17,7 +17,8 @@
 
 from .linux_session import Lin

[RFC PATCH v2 2/4] dts: add doc generation dependencies

2023-05-04 Thread Juraj Linkeš
Sphinx imports every Python module when generating documentation from
docstrings, meaning all dts dependencies, including Python version,
must be satisfied.
By adding Sphinx to dts dependencies we make sure that the proper
Python version and dependencies are used when Sphinx is executed.

Signed-off-by: Juraj Linkeš 
---
 dts/poetry.lock| 770 +
 dts/pyproject.toml |   7 +
 2 files changed, 710 insertions(+), 67 deletions(-)

diff --git a/dts/poetry.lock b/dts/poetry.lock
index 0b2a007d4d..500f89dac1 100644
--- a/dts/poetry.lock
+++ b/dts/poetry.lock
@@ -1,24 +1,69 @@
+# This file is automatically @generated by Poetry and should not be changed by 
hand.
+
+[[package]]
+name = "alabaster"
+version = "0.7.13"
+description = "A configurable sidebar-enabled Sphinx theme"
+category = "dev"
+optional = false
+python-versions = ">=3.6"
+files = [
+{file = "alabaster-0.7.13-py3-none-any.whl", hash = 
"sha256:1ee19aca801bbabb5ba3f5f258e4422dfa86f82f3e9cefb0859b283cdd7f62a3"},
+{file = "alabaster-0.7.13.tar.gz", hash = 
"sha256:a27a4a084d5e690e16e01e03ad2b2e552c61a65469419b907243193de1a84ae2"},
+]
+
 [[package]]
 name = "attrs"
-version = "22.1.0"
+version = "22.2.0"
 description = "Classes Without Boilerplate"
 category = "main"
 optional = false
-python-versions = ">=3.5"
+python-versions = ">=3.6"
+files = [
+{file = "attrs-22.2.0-py3-none-any.whl", hash = 
"sha256:29e95c7f6778868dbd49170f98f8818f78f3dc5e0e37c0b1f474e3561b240836"},
+{file = "attrs-22.2.0.tar.gz", hash = 
"sha256:c9227bfc2f01993c03f68db37d1d15c9690188323c067c641f1a35ca58185f99"},
+]
 
 [package.extras]
-dev = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest 
(>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "zope.interface", 
"furo", "sphinx", "sphinx-notfound-page", "pre-commit", "cloudpickle"]
-docs = ["furo", "sphinx", "zope.interface", "sphinx-notfound-page"]
-tests = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest 
(>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "zope.interface", 
"cloudpickle"]
-tests_no_zope = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest 
(>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "cloudpickle"]
+cov = ["attrs[tests]", "coverage-enable-subprocess", "coverage[toml] (>=5.3)"]
+dev = ["attrs[docs,tests]"]
+docs = ["furo", "myst-parser", "sphinx", "sphinx-notfound-page", 
"sphinxcontrib-towncrier", "towncrier", "zope.interface"]
+tests = ["attrs[tests-no-zope]", "zope.interface"]
+tests-no-zope = ["cloudpickle", "cloudpickle", "hypothesis", "hypothesis", 
"mypy (>=0.971,<0.990)", "mypy (>=0.971,<0.990)", "pympler", "pympler", "pytest 
(>=4.3.0)", "pytest (>=4.3.0)", "pytest-mypy-plugins", "pytest-mypy-plugins", 
"pytest-xdist[psutil]", "pytest-xdist[psutil]"]
+
+[[package]]
+name = "babel"
+version = "2.12.1"
+description = "Internationalization utilities"
+category = "dev"
+optional = false
+python-versions = ">=3.7"
+files = [
+{file = "Babel-2.12.1-py3-none-any.whl", hash = 
"sha256:b4246fb7677d3b98f501a39d43396d3cafdc8eadb045f4a31be01863f655c610"},
+{file = "Babel-2.12.1.tar.gz", hash = 
"sha256:cc2d9cd01d44420ae725a21c9e3711b3aadc7976d6147f622d8581963455"},
+]
 
 [[package]]
 name = "black"
-version = "22.10.0"
+version = "22.12.0"
 description = "The uncompromising code formatter."
 category = "dev"
 optional = false
 python-versions = ">=3.7"
+files = [
+{file = 
"black-22.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", 
hash = 
"sha256:9eedd20838bd5d75b80c9f5487dbcb06836a43833a37846cf1d8c1cc01cef59d"},
+{file = "black-22.12.0-cp310-cp310-win_amd64.whl", hash = 
"sha256:159a46a4947f73387b4d83e87ea006dbb2337eab6c879620a3ba52699b1f4351"},
+{file = 
"black-22.12.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", 
hash = 
"sha256:d30b212bffeb1e252b31dd269dfae69dd17e06d92b87ad26e23890f3efea366f"},
+{file = "black-22.12.0-cp311-cp311-win_amd64.whl", hash = 
"sha256:7412e75863aa5c5411886804678b7d083c7c28421210180d67dfd8cf1221e1f4"},
+{file = 
"black-22.12.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash 
= "sha256:c116eed0efb9ff870ded8b62fe9f28dd61ef6e9ddd28d83d7d264a38417dcee2"},
+{file = "black-22.12.0-cp37-cp37m-win_amd64.whl", hash = 
"sha256:1f58cbe16dfe8c12b7434e50ff889fa479072096d79f0a7f25e4ab8e94cd8350"},
+{file = 
"black-22.12.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash 
= "sha256:77d86c9f3db9b1bf6761244bc0b3572a546f5fe37917a044e02f3166d5aafa7d"},
+{file = "black-22.12.0-cp38-cp38-win_amd64.whl", hash = 
"sha256:82d9fe8fee3401e02e79767016b4907820a7dc28d70d137eb397b92ef3cc5bfc"},
+{file = 
"black-22.12.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash 
= "sha256:101c69b23df9b44247bd88e1d7e90154336ac4992502d4197bdac35dd7ee3320"},
+{file = "black-22.12.0-cp39-cp39-win_amd64.whl", hash = 
"sha256:559c7a1ba9a006226f09e4916060982fd27334ae1998e

[RFC PATCH v2 3/4] dts: add doc generation

2023-05-04 Thread Juraj Linkeš
The tool used to generate developer docs is sphinx, which is already
used in DPDK. The configuration is kept the same to preserve the style.

Sphinx generates the documentation from Python docstrings. The docstring
format most suitable for DTS seems to be the Google format [0] which
requires the sphinx.ext.napoleon extension.

There are two requirements for building DTS docs:
* The same Python version as DTS or higher, because Sphinx imports the
  code.
* Also the same Python packages as DTS, for the same reason.

[0] https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings

Signed-off-by: Juraj Linkeš 
---
 doc/api/meson.build  |  1 +
 doc/guides/conf.py   | 22 ++
 doc/guides/meson.build   |  1 +
 doc/guides/tools/dts.rst | 29 +++
 dts/doc/doc-index.rst| 20 
 dts/doc/meson.build  | 50 
 dts/meson.build  | 16 +
 meson.build  |  1 +
 meson_options.txt|  2 ++
 9 files changed, 137 insertions(+), 5 deletions(-)
 create mode 100644 dts/doc/doc-index.rst
 create mode 100644 dts/doc/meson.build
 create mode 100644 dts/meson.build

diff --git a/doc/api/meson.build b/doc/api/meson.build
index 2876a78a7e..1f0c725a94 100644
--- a/doc/api/meson.build
+++ b/doc/api/meson.build
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2018 Luca Boccassi 
 
+doc_api_build_dir = meson.current_build_dir()
 doxygen = find_program('doxygen', required: get_option('enable_docs'))
 
 if not doxygen.found()
diff --git a/doc/guides/conf.py b/doc/guides/conf.py
index a55ce38800..04c842b67a 100644
--- a/doc/guides/conf.py
+++ b/doc/guides/conf.py
@@ -7,10 +7,9 @@
 from sphinx import __version__ as sphinx_version
 from os import listdir
 from os import environ
-from os.path import basename
-from os.path import dirname
+from os.path import basename, dirname
 from os.path import join as path_join
-from sys import argv, stderr
+from sys import argv, stderr, path
 
 import configparser
 
@@ -24,6 +23,19 @@
   file=stderr)
 pass
 
+extensions = ['sphinx.ext.napoleon']
+
+# Python docstring options
+autodoc_member_order = 'bysource'
+autodoc_typehints = 'both'
+autodoc_typehints_format = 'short'
+napoleon_numpy_docstring = False
+napoleon_attr_annotations = True
+napoleon_use_ivar = True
+napoleon_use_rtype = False
+add_module_names = False
+toc_object_entries_show_parents = 'hide'
+
 stop_on_error = ('-W' in argv)
 
 project = 'Data Plane Development Kit'
@@ -35,8 +47,8 @@
 html_show_copyright = False
 highlight_language = 'none'
 
-release = environ.setdefault('DPDK_VERSION', "None")
-version = release
+path.append(environ.setdefault('DTS_ROOT', '.'))
+version = environ.setdefault('DPDK_VERSION', "None")
 
 master_doc = 'index'
 
diff --git a/doc/guides/meson.build b/doc/guides/meson.build
index 51f81da2e3..8933d75f6b 100644
--- a/doc/guides/meson.build
+++ b/doc/guides/meson.build
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2018 Intel Corporation
 
+doc_guides_source_dir = meson.current_source_dir()
 sphinx = find_program('sphinx-build', required: get_option('enable_docs'))
 
 if not sphinx.found()
diff --git a/doc/guides/tools/dts.rst b/doc/guides/tools/dts.rst
index ebd6dceb6a..a547da2017 100644
--- a/doc/guides/tools/dts.rst
+++ b/doc/guides/tools/dts.rst
@@ -282,3 +282,32 @@ There are three tools used in DTS to help with code checking, style and formatting
 These three tools are all used in ``devtools/dts-check-format.sh``,
 the DTS code check and format script.
 Refer to the script for usage: ``devtools/dts-check-format.sh -h``.
+
+
+Building DTS API docs
+-
+
+To build DTS API docs, install the dependencies with Poetry, then enter its 
shell:
+
+   .. code-block:: console
+
+   poetry install --with docs
+   poetry shell
+
+
+Build commands
+~~
+
+The documentation is built using the standard DPDK build system.
+
+After entering Poetry's shell, build the documentation with:
+
+   .. code-block:: console
+
+   ninja -C build dts/doc
+
+The output is generated in ``build/doc/api/dts/html``.
+
+.. Note::
+
+   Make sure to fix any Sphinx warnings when adding or updating docstrings.
diff --git a/dts/doc/doc-index.rst b/dts/doc/doc-index.rst
new file mode 100644
index 00..10151c6851
--- /dev/null
+++ b/dts/doc/doc-index.rst
@@ -0,0 +1,20 @@
+.. DPDK Test Suite documentation master file, created by
+   sphinx-quickstart on Tue Mar 14 12:23:52 2023.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Welcome to DPDK Test Suite's documentation!
+===========================================
+
+.. toctree::
+   :maxdepth: 4
+   :caption: Contents:
+
+   modules
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`modindex`
+* :ref:`search`
diff --git a/dts/doc/meson.build b/dts/doc/meson.build

[RFC PATCH v2 4/4] dts: format docstrings to google format

2023-05-04 Thread Juraj Linkeš
WIP: only one module is reformatted to serve as a demonstration.

The google format is documented here [0].

[0]: https://google.github.io/styleguide/pyguide.html

Signed-off-by: Juraj Linkeš 
---
 dts/framework/testbed_model/node.py | 152 +++-
 1 file changed, 103 insertions(+), 49 deletions(-)

diff --git a/dts/framework/testbed_model/node.py 
b/dts/framework/testbed_model/node.py
index 90467981c3..ad8ef442af 100644
--- a/dts/framework/testbed_model/node.py
+++ b/dts/framework/testbed_model/node.py
@@ -3,8 +3,13 @@
 # Copyright(c) 2022-2023 PANTHEON.tech s.r.o.
 # Copyright(c) 2022-2023 University of New Hampshire
 
-"""
-A node is a generic host that DTS connects to and manages.
+"""Common functionality for node management.
+
+There's a base class, Node, that's supposed to be extended by other classes
+with functionality specific to that node type.
+The only part that can be used standalone is the Node.skip_setup static method,
+which is a decorator used to skip method execution
+if skip_setup is passed by the user on the cmdline or in an env variable.
 """
 
 from typing import Any, Callable
@@ -26,10 +31,25 @@
 
 
 class Node(object):
-"""
-Basic class for node management. This class implements methods that
-manage a node, such as information gathering (of CPU/PCI/NIC) and
-environment setup.
+"""The base class for node management.
+
+It shouldn't be instantiated, but rather extended.
+It implements common methods to manage any node:
+
+   * connection to the node
+   * information gathering of CPU
+   * hugepages setup
+
+Arguments:
+node_config: The config from the input configuration file.
+
+Attributes:
+main_session: The primary OS-agnostic remote session used
+to communicate with the node.
+config: The configuration used to create the node.
+name: The name of the node.
+lcores: The list of logical cores that DTS can use on the node.
+It's derived from logical cores present on the node and user 
configuration.
 """
 
 main_session: OSSession
@@ -56,65 +76,89 @@ def __init__(self, node_config: NodeConfiguration):
 self._logger.info(f"Created node: {self.name}")
 
 def set_up_execution(self, execution_config: ExecutionConfiguration) -> 
None:
-"""
-Perform the execution setup that will be done for each execution
-this node is part of.
+"""Execution setup steps.
+
+Configure hugepages and call self._set_up_execution where
+the rest of the configuration steps (if any) are implemented.
+
+Args:
+execution_config: The execution configuration according to which
+the setup steps will be taken.
 """
 self._setup_hugepages()
 self._set_up_execution(execution_config)
 
 def _set_up_execution(self, execution_config: ExecutionConfiguration) -> 
None:
-"""
-This method exists to be optionally overwritten by derived classes and
-is not decorated so that the derived class doesn't have to use the 
decorator.
+"""Optional additional execution setup steps for derived classes.
+
+Derived classes should overwrite this
+if they want to add additional execution setup steps.
 """
 
 def tear_down_execution(self) -> None:
-"""
-Perform the execution teardown that will be done after each execution
-this node is part of concludes.
+"""Execution teardown steps.
+
+There are currently no common execution teardown steps
+common to all DTS node types.
 """
 self._tear_down_execution()
 
 def _tear_down_execution(self) -> None:
-"""
-This method exists to be optionally overwritten by derived classes and
-is not decorated so that the derived class doesn't have to use the 
decorator.
+"""Optional additional execution teardown steps for derived classes.
+
+Derived classes should overwrite this
+if they want to add additional execution teardown steps.
 """
 
 def set_up_build_target(
 self, build_target_config: BuildTargetConfiguration
 ) -> None:
-"""
-Perform the build target setup that will be done for each build target
-tested on this node.
+"""Build target setup steps.
+
+There are currently no common build target setup steps
+common to all DTS node types.
+
+Args:
+build_target_config: The build target configuration according to 
which
+the setup steps will be taken.
 """
 self._set_up_build_target(build_target_config)
 
 def _set_up_build_target(
 self, build_target_config: BuildTargetConfiguration
 ) -> None:
-"""
-This method exists to be optionally overwritten by derived classes and
-is not decorated so that the derived class
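The Google docstring format this patch migrates to can be sketched standalone. A minimal, hypothetical example (the function and its names are illustrative, not part of the DTS framework) of the sections sphinx.ext.napoleon renders:

```python
def filter_lcores(lcores: list[int], max_count: int) -> list[int]:
    """Select up to max_count logical cores.

    This docstring follows the Google format used in the patch above,
    with Args/Returns/Raises sections parsed by sphinx.ext.napoleon.

    Args:
        lcores: The list of logical core IDs to filter.
        max_count: The maximum number of cores to return.

    Returns:
        The first max_count elements of lcores.

    Raises:
        ValueError: If max_count is negative.
    """
    if max_count < 0:
        raise ValueError("max_count must be non-negative")
    return lcores[:max_count]
```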

Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-04 Thread zhoumin

Hi Qi,

Thanks for your review.

On Fri, Apr 28, 2023 at 11:43AM, Zhang, Qi Z wrote:



-Original Message-
From: Min Zhou 
Sent: Monday, April 24, 2023 5:06 PM
To: Yang, Qiming ; Wu, Wenjun1
; zhou...@loongson.cn
Cc: dev@dpdk.org; maob...@loongson.cn
Subject: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx
functions

Segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000
processor which has 64 cores and 4 NUMA nodes.

 From the ixgbe_recv_pkts_lro() function, we found that as long as the first
packet has the EOP bit set, and the length of this packet is less than or equal
to rxq->crc_len, the segmentation fault will definitely happen even though
on the other platforms, such as X86.

Because when processing the first packet the first_seg->next will be NULL, if at
the same time this packet has the EOP bit set and its length is less than or
equal to rxq->crc_len, the following loop will be executed:

 for (lp = first_seg; lp->next != rxm; lp = lp->next)
 ;

We know that the first_seg->next will be NULL under this condition. So the
expression of lp->next->next will cause the segmentation fault.

Normally, the length of the first packet with EOP bit set will be greater than
rxq->crc_len. However, the out-of-order execution of CPU may make the
read ordering of the status and the rest of the descriptor fields in this
function not be correct. The related codes are as following:

 rxdp = &rx_ring[rx_id];
  #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

 if (!(staterr & IXGBE_RXDADV_STAT_DD))
 break;

  #2 rxd = *rxdp;

The sentence #2 may be executed before sentence #1. This action is likely to
make the ready packet zero length. If the packet is the first packet and has
the EOP bit set, the above segmentation fault will happen.

So, we should add rte_rmb() to ensure the read ordering is correct. We also
did the same thing in the ixgbe_recv_pkts() function to make the rxd data be
valid even though we did not find a segmentation fault in this function.

Signed-off-by: Min Zhou 
---
v2:
- Make the calling of rte_rmb() for all platforms
---
  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index c9d6ca9efe..302a5ab7ff 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
**rx_pkts,
staterr = rxdp->wb.upper.status_error;
if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
break;
+
+   rte_rmb();

So "volatile" does not prevent re-order with Loongson compiler?


The LoongArch architecture [1] uses a weak consistency memory model in 
which memory operations can be reordered.


[1] 
https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN#overview-of-memory-consistency



rxd = *rxdp;

/*
@@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct
rte_mbuf **rx_pkts, uint16_t nb_pkts,
if (!(staterr & IXGBE_RXDADV_STAT_DD))
break;

+   rte_rmb();
rxd = *rxdp;

PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.31.1


Best regards,

Min



Re: [RFC PATCH v2 3/4] dts: add doc generation

2023-05-04 Thread Bruce Richardson
On Thu, May 04, 2023 at 02:37:48PM +0200, Juraj Linkeš wrote:
> The tool used to generate developer docs is sphinx, which is already
> used in DPDK. The configuration is kept the same to preserve the style.
> 
> Sphinx generates the documentation from Python docstrings. The docstring
> format most suitable for DTS seems to be the Google format [0] which
> requires the sphinx.ext.napoleon extension.
> 
> There are two requirements for building DTS docs:
> * The same Python version as DTS or higher, because Sphinx imports the
>   code.
> * Also the same Python packages as DTS, for the same reason.
> 
> [0] 
> https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings
> 
> Signed-off-by: Juraj Linkeš 
> ---
>  doc/api/meson.build  |  1 +
>  doc/guides/conf.py   | 22 ++
>  doc/guides/meson.build   |  1 +
>  doc/guides/tools/dts.rst | 29 +++
>  dts/doc/doc-index.rst| 20 
>  dts/doc/meson.build  | 50 
>  dts/meson.build  | 16 +
>  meson.build  |  1 +
>  meson_options.txt|  2 ++
>  9 files changed, 137 insertions(+), 5 deletions(-)
>  create mode 100644 dts/doc/doc-index.rst
>  create mode 100644 dts/doc/meson.build
>  create mode 100644 dts/meson.build
> 



> diff --git a/dts/doc/meson.build b/dts/doc/meson.build
> new file mode 100644
> index 00..db2bb0bed9
> --- /dev/null
> +++ b/dts/doc/meson.build
> @@ -0,0 +1,50 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2023 PANTHEON.tech s.r.o.
> +
> +sphinx = find_program('sphinx-build', required: 
> get_option('enable_dts_docs'))
> +sphinx_apidoc = find_program('sphinx-apidoc', required: 
> get_option('enable_dts_docs'))
> +
> +if sphinx.found() and sphinx_apidoc.found()
> +endif
> +
> +dts_api_framework_dir = join_paths(dts_dir, 'framework')
> +dts_api_build_dir = join_paths(doc_api_build_dir, 'dts')
> +dts_api_src = custom_target('dts_api_src',
> +output: 'modules.rst',
> +command: ['SPHINX_APIDOC_OPTIONS=members,show-inheritance',
> +sphinx_apidoc, '--append-syspath', '--force',
> +'--module-first', '--separate',
> +'--doc-project', 'DTS', '-V', meson.project_version(),
> +'-o', dts_api_build_dir,
> +dts_api_framework_dir],
> +build_by_default: get_option('enable_dts_docs'))
> +doc_targets += dts_api_src
> +doc_target_names += 'DTS_API_sphinx_sources'
> +
> +cp = find_program('cp', required: get_option('enable_docs'))

This should probably be "enable_dts_docs"



Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-04 Thread zhoumin

Hi Morten,

Thanks for your comments.

On Fri, Apr 28, 2023 at 2:27PM, Morten Brørup wrote:

From: Zhang, Qi Z [mailto:qi.z.zh...@intel.com]
Sent: Friday, 28 April 2023 05.44


From: Min Zhou 
Sent: Monday, April 24, 2023 5:06 PM

Segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000
processor which has 64 cores and 4 NUMA nodes.

 From the ixgbe_recv_pkts_lro() function, we found that as long as the first
packet has the EOP bit set, and the length of this packet is less than or equal
to rxq->crc_len, the segmentation fault will definitely happen even though
on the other platforms, such as X86.

Because when processing the first packet the first_seg->next will be NULL, if at
the same time this packet has the EOP bit set and its length is less than or
equal to rxq->crc_len, the following loop will be executed:

 for (lp = first_seg; lp->next != rxm; lp = lp->next)
 ;

We know that the first_seg->next will be NULL under this condition. So the
expression of lp->next->next will cause the segmentation fault.

Normally, the length of the first packet with EOP bit set will be greater than
rxq->crc_len. However, the out-of-order execution of CPU may make the
read ordering of the status and the rest of the descriptor fields in this
function not be correct. The related codes are as following:

 rxdp = &rx_ring[rx_id];
  #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

 if (!(staterr & IXGBE_RXDADV_STAT_DD))
 break;

  #2 rxd = *rxdp;

The sentence #2 may be executed before sentence #1. This action is likely to
make the ready packet zero length. If the packet is the first packet and has
the EOP bit set, the above segmentation fault will happen.

So, we should add rte_rmb() to ensure the read ordering is correct. We also
did the same thing in the ixgbe_recv_pkts() function to make the rxd data be
valid even though we did not find a segmentation fault in this function.

Signed-off-by: Min Zhou 
---
v2:
- Make the calling of rte_rmb() for all platforms
---
  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index c9d6ca9efe..302a5ab7ff 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
**rx_pkts,
staterr = rxdp->wb.upper.status_error;
if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
break;
+
+   rte_rmb();

So "volatile" does not prevent re-order with Loongson compiler?

"Volatile" does not prevent re-ordering on any compiler. "Volatile" only 
prevents caching of the variable marked volatile.

https://wiki.sei.cmu.edu/confluence/display/c/CON02-C.+Do+not+use+volatile+as+a+synchronization+primitive

Thinking out loud: I don't know the performance cost of rte_rmb(); perhaps 
using atomic accesses with the optimal memory ordering would be a better 
solution in the long term.
Yes, rte_rmb() probably has side effects on performance. I will use 
a better solution to solve the problem in the V2 patch.



rxd = *rxdp;

/*
@@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct
rte_mbuf **rx_pkts, uint16_t nb_pkts,
if (!(staterr & IXGBE_RXDADV_STAT_DD))
break;

+   rte_rmb();
rxd = *rxdp;

PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.31.1


Best regards,

Min



Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-04 Thread zhoumin

Hi Konstantin,

Thanks for your comments.

On 2023/5/1 下午9:29, Konstantin Ananyev wrote:

Segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000
processor which has 64 cores and 4 NUMA nodes.

From the ixgbe_recv_pkts_lro() function, we found that as long as the 
first
packet has the EOP bit set, and the length of this packet is less 
than or
equal to rxq->crc_len, the segmentation fault will definitely happen 
even

though on the other platforms, such as X86.

Because when processing the first packet the first_seg->next will be NULL, if
at the same time this packet has the EOP bit set and its length is less
than or equal to rxq->crc_len, the following loop will be executed:

    for (lp = first_seg; lp->next != rxm; lp = lp->next)
    ;

We know that the first_seg->next will be NULL under this condition. 
So the

expression of lp->next->next will cause the segmentation fault.

Normally, the length of the first packet with EOP bit set will be 
greater
than rxq->crc_len. However, the out-of-order execution of CPU may 
make the
read ordering of the status and the rest of the descriptor fields in 
this

function not be correct. The related codes are as following:

    rxdp = &rx_ring[rx_id];
 #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

    if (!(staterr & IXGBE_RXDADV_STAT_DD))
    break;

 #2 rxd = *rxdp;

The sentence #2 may be executed before sentence #1. This action is 
likely
to make the ready packet zero length. If the packet is the first 
packet and

has the EOP bit set, the above segmentation fault will happen.

So, we should add rte_rmb() to ensure the read ordering is correct. We also
did the same thing in the ixgbe_recv_pkts() function to make the rxd data
be valid even though we did not find a segmentation fault in this function.


Signed-off-by: Min Zhou 
---
v2:
- Make the calling of rte_rmb() for all platforms
---
 drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c 
b/drivers/net/ixgbe/ixgbe_rxtx.c

index c9d6ca9efe..302a5ab7ff 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,

 staterr = rxdp->wb.upper.status_error;
 if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
 break;
+
+    rte_rmb();
 rxd = *rxdp;




Indeed, looks like a problem to me on systems with relaxed MO.
Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.
The LoongArch architecture uses a weak consistency model which can 
cause the problem, especially in scenarios with many cores, such as the 
Loongson 3C5000 with four NUMA nodes, which has 64 cores. I cannot 
reproduce it on a Loongson 3C5000 with one NUMA node, which has just 16 cores.

About a fix - looks right, but a bit excessive to me -
as I understand all we need here is to prevent re-ordering by CPU itself.

Yes, thanks for cc-ing.

So rte_smp_rmb() seems enough here.
Or might be just:
staterr = __atomic_load_n(&rxdp->wb.upper.status_error, 
__ATOMIC_ACQUIRE);



Does __atomic_load_n() work on Windows if we use it to solve this problem?



 /*
@@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct 
rte_mbuf **rx_pkts, uint16_t nb_pkts,

 if (!(staterr & IXGBE_RXDADV_STAT_DD))
 break;

+    rte_rmb();
 rxd = *rxdp;

 PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.31.1


Best regards,

Min



RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-04 Thread Morten Brørup
> From: zhoumin [mailto:zhou...@loongson.cn]
> Sent: Thursday, 4 May 2023 15.17
> 
> Hi Konstantin,
> 
> Thanks for your  comments.
> 
> On 2023/5/1 下午9:29, Konstantin Ananyev wrote:
> >> Segmentation fault has been observed while running the
> >> ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000
> >> processor which has 64 cores and 4 NUMA nodes.
> >>
> >> From the ixgbe_recv_pkts_lro() function, we found that as long as the
> >> first
> >> packet has the EOP bit set, and the length of this packet is less
> >> than or
> >> equal to rxq->crc_len, the segmentation fault will definitely happen
> >> even
> >> though on the other platforms, such as X86.
> >>
> >> Because when processing the first packet the first_seg->next will be
> >> NULL, if at the same time this packet has the EOP bit set and its length is less
> >> than or equal to rxq->crc_len, the following loop will be executed:
> >> than or equal to rxq->crc_len, the following loop will be excecuted:
> >>
> >>     for (lp = first_seg; lp->next != rxm; lp = lp->next)
> >>     ;
> >>
> >> We know that the first_seg->next will be NULL under this condition.
> >> So the
> >> expression of lp->next->next will cause the segmentation fault.
> >>
> >> Normally, the length of the first packet with EOP bit set will be
> >> greater
> >> than rxq->crc_len. However, the out-of-order execution of CPU may
> >> make the
> >> read ordering of the status and the rest of the descriptor fields in
> >> this
> >> function not be correct. The related codes are as following:
> >>
> >>     rxdp = &rx_ring[rx_id];
> >>  #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);
> >>
> >>     if (!(staterr & IXGBE_RXDADV_STAT_DD))
> >>     break;
> >>
> >>  #2 rxd = *rxdp;
> >>
> >> The sentence #2 may be executed before sentence #1. This action is
> >> likely
> >> to make the ready packet zero length. If the packet is the first
> >> packet and
> >> has the EOP bit set, the above segmentation fault will happen.
> >>
> >> So, we should add rte_rmb() to ensure the read ordering is correct.
> >> We also did the same thing in the ixgbe_recv_pkts() function to make
> >> the rxd data be valid even though we did not find a segmentation fault
> >> in this function.
> >>
> >> Signed-off-by: Min Zhou 
> >> ---
> >> v2:
> >> - Make the calling of rte_rmb() for all platforms
> >> ---
> >>  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
> >>  1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> >> b/drivers/net/ixgbe/ixgbe_rxtx.c
> >> index c9d6ca9efe..302a5ab7ff 100644
> >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> >> @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
> >> **rx_pkts,
> >>  staterr = rxdp->wb.upper.status_error;
> >>  if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
> >>  break;
> >> +
> >> +    rte_rmb();
> >>  rxd = *rxdp;
> >
> >
> >
> > Indeed, looks like a problem to me on systems with relaxed MO.
> > Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.
> The LoongArch architecture uses the Weak Consistency model which can
> cause the problem, especially in scenario with many cores, such as
> Loongson 3C5000 with four NUMA node, which has 64 cores. I cannot
> reproduce it on Loongson 3C5000 with one NUMA node, which just has 16 cores.
> > About a fix - looks right, but a bit excessive to me -
> > as I understand all we need here is to prevent re-ordering by CPU itself.
> Yes, thanks for cc-ing.
> > So rte_smp_rmb() seems enough here.
> > Or might be just:
> > staterr = __atomic_load_n(&rxdp->wb.upper.status_error,
> > __ATOMIC_ACQUIRE);
> >
> Does __atomic_load_n() work on Windows if we use it to solve this problem ?

Yes, __atomic_load_n() works on Windows too.




RE: [PATCH v2] net/ixgbe: consider DCB/VMDq conf when getting RSS conf

2023-05-04 Thread Zhang, Qi Z



> -Original Message-
> From: Min Zhou 
> Sent: Friday, April 28, 2023 4:43 PM
> To: Zhang, Qi Z ; Yang, Qiming
> ; Wu, Wenjun1 ;
> zhou...@loongson.cn
> Cc: dev@dpdk.org; maob...@loongson.cn
> Subject: [PATCH v2] net/ixgbe: consider DCB/VMDq conf when getting RSS
> conf
> 
> The mrqe field of MRQC register is an enum. From the Intel 82599 datasheet,
> we know that these values below for the mrqe field are all related to RSS
> configuration:
>   b = RSS disabled.
>   0001b = RSS only -- Single set of RSS 16 queues.
>   0010b = DCB enabled and RSS disabled -- 8 TCs, each allocated 1
> queue.
>   0011b = DCB enabled and RSS disabled -- 4 TCs, each allocated 1
> queue.
>   0100b = DCB and RSS -- 8 TCs, each allocated 16 RSS queues.
>   0101b = DCB and RSS -- 4 TCs, each allocated 16 RSS queues.
>   1000b = Virtualization only -- 64 pools, no RSS, each pool allocated
>   2 queues.
>   1010b = Virtualization and RSS -- 32 pools, each allocated 4 RSS
> queues.
>   1011b = Virtualization and RSS -- 64 pools, each allocated 2 RSS
> queues.
> 
> The ixgbe pmd will check whether the rss is enabled or not when getting rss
> conf. So, beside comparing the value of mrqe field with xxx0b and xxx1b, we
> also need to consider the other configurations, such as DCB + RSS or VMDQ +
> RSS. Otherwise, we may not get the correct rss conf in some cases, such as
> when we use DCB and RSS with 8 TCs which corresponds to 0100b for the
> mrqe field.
> 
> Signed-off-by: Min Zhou 

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi



RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-04 Thread Zhang, Qi Z


> -Original Message-
> From: Morten Brørup 
> Sent: Thursday, May 4, 2023 9:22 PM
> To: zhoumin ; Konstantin Ananyev
> 
> Cc: dev@dpdk.org; maob...@loongson.cn; Yang, Qiming
> ; Wu, Wenjun1 ;
> ruifeng.w...@arm.com; d...@linux.vnet.ibm.com; Tyler Retzlaff
> 
> Subject: RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx
> functions
> 
> > From: zhoumin [mailto:zhou...@loongson.cn]
> > Sent: Thursday, 4 May 2023 15.17
> >
> > Hi Konstantin,
> >
> > Thanks for your  comments.
> >
> > On 2023/5/1 下午9:29, Konstantin Ananyev wrote:
> > >> Segmentation fault has been observed while running the
> > >> ixgbe_recv_pkts_lro() function to receive packets on the Loongson
> > >> 3C5000 processor which has 64 cores and 4 NUMA nodes.
> > >>
> > >> From the ixgbe_recv_pkts_lro() function, we found that as long as
> > >> the first packet has the EOP bit set, and the length of this packet
> > >> is less than or equal to rxq->crc_len, the segmentation fault will
> > >> definitely happen even though on the other platforms, such as X86.

Sorry to interrupt, but I am curious why this issue still exists on x86 
architecture. Can volatile be used to instruct the compiler to generate read 
instructions in a specific order, and does x86 guarantee not to reorder load 
operations?

> > >>
> > >> Because when processing the first packet the first_seg->next will be
> > >> NULL, if at the same time this packet has the EOP bit set and its
> > >> length is less than or equal to rxq->crc_len, the following loop
> > >> will be executed:
> > >>
> > >>     for (lp = first_seg; lp->next != rxm; lp = lp->next)
> > >>     ;
> > >>
> > >> We know that the first_seg->next will be NULL under this condition.
> > >> So the
> > >> expression of lp->next->next will cause the segmentation fault.
> > >>
> > >> Normally, the length of the first packet with EOP bit set will be
> > >> greater than rxq->crc_len. However, the out-of-order execution of
> > >> CPU may make the read ordering of the status and the rest of the
> > >> descriptor fields in this function not be correct. The related
> > >> codes are as following:
> > >>
> > >>     rxdp = &rx_ring[rx_id];
> > >>  #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);
> > >>
> > >>     if (!(staterr & IXGBE_RXDADV_STAT_DD))
> > >>     break;
> > >>
> > >>  #2 rxd = *rxdp;
> > >>
> > >> The sentence #2 may be executed before sentence #1. This action is
> > >> likely to make the ready packet zero length. If the packet is the
> > >> first packet and has the EOP bit set, the above segmentation fault
> > >> will happen.
> > >>
> > >> So, we should add rte_rmb() to ensure the read ordering is correct.
> > >> We also did the same thing in the ixgbe_recv_pkts() function to make
> > >> the rxd data be valid even though we did not find a segmentation fault
> > >> in this function.
> > >>
> > >> Signed-off-by: Min Zhou 
> > >> ---
> > >> v2:
> > >> - Make the calling of rte_rmb() for all platforms
> > >> ---
> > >>  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
> > >>  1 file changed, 3 insertions(+)
> > >>
> > >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
> > >> b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff
> > >> 100644
> > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> > >> @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct
> > >> rte_mbuf **rx_pkts,
> > >>  staterr = rxdp->wb.upper.status_error;
> > >>  if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
> > >>  break;
> > >> +
> > >> +    rte_rmb();
> > >>  rxd = *rxdp;
> > >
> > >
> > >
> > > Indeed, looks like a problem to me on systems with relaxed MO.
> > > Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.
> > The LoongArch architecture uses the Weak Consistency model which can
> > cause the problem, especially in scenario with many cores, such as
> > Loongson 3C5000 with four NUMA node, which has 64 cores. I cannot
> > reproduce it on Loongson 3C5000 with one NUMA node, which just has 16
> cores.
> > > About a fix - looks right, but a bit excessive to me - as I
> > > understand all we need here is to prevent re-ordering by CPU itself.
> > Yes, thanks for cc-ing.
> > > So rte_smp_rmb() seems enough here.
> > > Or might be just:
> > > staterr = __atomic_load_n(&rxdp->wb.upper.status_error,
> > > __ATOMIC_ACQUIRE);
> > >
> > Does __atomic_load_n() work on Windows if we use it to solve this
> problem ?
> 
> Yes, __atomic_load_n() works on Windows too.
> 



Re: [PATCH] net/gve: Check whether the driver is compatible with the device presented.

2023-05-04 Thread Ferruh Yigit
On 4/26/2023 10:37 PM, Rushil Gupta wrote:
> Change gve_driver_info fields to report DPDK as OS type and DPDK RTE
> version as OS version, reserving driver_version fields for GVE driver
> version based on features.
> 

'./devtools/check-git-log.sh' is giving some warnings, can you please
check them?

Contribution guide documentation may help to fix reported issues:
https://doc.dpdk.org/guides/contributing/patches.html#commit-messages-subject-line

> Depends-on: series-27687
> 
> Signed-off-by: Rushil Gupta 
> Signed-off-by: Joshua Washington 
> Signed-off-by: Junfeng Guo 
> Signed-off-by: Jeroen de Borst 
> ---
>  drivers/net/gve/base/gve.h|  3 --
>  drivers/net/gve/base/gve_adminq.c | 19 +
>  drivers/net/gve/base/gve_adminq.h | 49 ++-
>  drivers/net/gve/base/gve_osdep.h  | 36 +
>  drivers/net/gve/gve_ethdev.c  | 65 +--
>  drivers/net/gve/gve_ethdev.h  |  2 +-
>  drivers/net/gve/gve_version.c | 14 +++
>  drivers/net/gve/gve_version.h | 25 
>  drivers/net/gve/meson.build   |  1 +
>  9 files changed, 198 insertions(+), 16 deletions(-)
>  create mode 100644 drivers/net/gve/gve_version.c
>  create mode 100644 drivers/net/gve/gve_version.h
> 
> diff --git a/drivers/net/gve/base/gve.h b/drivers/net/gve/base/gve.h
> index 2dc4507acb..89f9654a72 100644
> --- a/drivers/net/gve/base/gve.h
> +++ b/drivers/net/gve/base/gve.h
> @@ -8,9 +8,6 @@
>  
>  #include "gve_desc.h"
>  
> -#define GVE_VERSION  "1.3.0"
> -#define GVE_VERSION_PREFIX   "GVE-"
> -
>  #ifndef GOOGLE_VENDOR_ID
>  #define GOOGLE_VENDOR_ID 0x1ae0
>  #endif
> diff --git a/drivers/net/gve/base/gve_adminq.c 
> b/drivers/net/gve/base/gve_adminq.c
> index e745b709b2..2e5099a5b0 100644
> --- a/drivers/net/gve/base/gve_adminq.c
> +++ b/drivers/net/gve/base/gve_adminq.c
> @@ -401,6 +401,9 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
>   case GVE_ADMINQ_GET_PTYPE_MAP:
>   priv->adminq_get_ptype_map_cnt++;
>   break;
> + case GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY:
> + priv->adminq_verify_driver_compatibility_cnt++;
> + break;
>   default:
>   PMD_DRV_LOG(ERR, "unknown AQ command opcode %d", opcode);
>   }
> @@ -465,6 +468,22 @@ int gve_adminq_configure_device_resources(struct 
> gve_priv *priv,
>   return gve_adminq_execute_cmd(priv, &cmd);
>  }
>  
> +int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
> +u64 driver_info_len,
> +dma_addr_t driver_info_addr)
> +{
> + union gve_adminq_command cmd;
> +
> + memset(&cmd, 0, sizeof(cmd));
> + cmd.opcode = cpu_to_be32(GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY);
> + cmd.verify_driver_compatibility = (struct 
> gve_adminq_verify_driver_compatibility) {
> + .driver_info_len = cpu_to_be64(driver_info_len),
> + .driver_info_addr = cpu_to_be64(driver_info_addr),
> + };
> +
> + return gve_adminq_execute_cmd(priv, &cmd);
> +}
> +
>  int gve_adminq_deconfigure_device_resources(struct gve_priv *priv)
>  {
>   union gve_adminq_command cmd;
> diff --git a/drivers/net/gve/base/gve_adminq.h 
> b/drivers/net/gve/base/gve_adminq.h
> index 05550119de..edac32f031 100644
> --- a/drivers/net/gve/base/gve_adminq.h
> +++ b/drivers/net/gve/base/gve_adminq.h
> @@ -1,6 +1,6 @@
>  /* SPDX-License-Identifier: MIT
>   * Google Virtual Ethernet (gve) driver
> - * Copyright (C) 2015-2022 Google, Inc.
> + * Copyright (C) 2015-2023 Google, Inc.
>   */
>  
>  #ifndef _GVE_ADMINQ_H
> @@ -23,6 +23,7 @@ enum gve_adminq_opcodes {
>   GVE_ADMINQ_REPORT_STATS = 0xC,
>   GVE_ADMINQ_REPORT_LINK_SPEED= 0xD,
>   GVE_ADMINQ_GET_PTYPE_MAP= 0xE,
> + GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY  = 0xF,
>  };
>  
>  /* Admin queue status codes */
> @@ -145,6 +146,47 @@ enum gve_sup_feature_mask {
>  };
>  
>  #define GVE_DEV_OPT_LEN_GQI_RAW_ADDRESSING 0x0
> +enum gve_driver_capbility {
> + gve_driver_capability_gqi_qpl = 0,
> + gve_driver_capability_gqi_rda = 1,
> + gve_driver_capability_dqo_qpl = 2, /* reserved for future use */
> + gve_driver_capability_dqo_rda = 3,
> +};
> +
> +#define GVE_CAP1(a) BIT((int)a)
> +#define GVE_CAP2(a) BIT(((int)a) - 64)
> +#define GVE_CAP3(a) BIT(((int)a) - 128)
> +#define GVE_CAP4(a) BIT(((int)a) - 192)

GVE_CAP2, GVE_CAP3 & GVE_CAP4 seem to be unused, do we need to add them now?
And I guess the '(((int)a) - 64)' usage is to unset some bits, would it be
better as '(a & ~(1 << 6))'?

> +
> +#define GVE_DRIVER_CAPABILITY_FLAGS1 \
> + (GVE_CAP1(gve_driver_capability_gqi_qpl) | \
> +  GVE_CAP1(gve_driver_capability_gqi_rda) | \
> +  GVE_CAP1(gve_driver_capability_dqo_rda))
> +
> +#define GVE_DRIVER_CAPABILITY_FLAGS2 0x0
> +#define GVE_DRIVER_CAPABILITY_FLAGS3 0x0
> +#define GVE_DRIVER_CAPABILITY_FLAGS4 0x

Re: [PATCH 1/2] config/arm: Do not require processor information

2023-05-04 Thread Akihiko Odaki

On 2023/05/04 18:43, Ruifeng Wang wrote:

-Original Message-
From: Akihiko Odaki 
Sent: Thursday, May 4, 2023 3:47 PM
To: Ruifeng Wang ; Bruce Richardson 
;
Juraj Linkeš 
Cc: dev@dpdk.org; nd 
Subject: Re: [PATCH 1/2] config/arm: Do not require processor information

On 2023/04/20 16:12, Akihiko Odaki wrote:

On 2023/04/20 16:10, Ruifeng Wang wrote:

-Original Message-
From: Akihiko Odaki 
Sent: Thursday, April 20, 2023 9:40 AM
To: Ruifeng Wang ; Bruce Richardson
; Juraj Linkeš

Cc: dev@dpdk.org; nd 
Subject: Re: [PATCH 1/2] config/arm: Do not require processor
information

On 2023/04/17 16:41, Ruifeng Wang wrote:

-Original Message-
From: Akihiko Odaki 
Sent: Friday, April 14, 2023 8:42 PM
To: Ruifeng Wang ; Bruce Richardson

Cc: dev@dpdk.org; Akihiko Odaki 
Subject: [PATCH 1/2] config/arm: Do not require processor
information

DPDK can be built even without exact processor information for x86
and ppc, so allow building for Arm even if the targeted processor is
unknown.


Hi Akihiko,

The design idea was to require an explicit generic build.
Default/native build doesn't fall back to generic build when SoC
info is not on the list.
So the user has less chance to generate a suboptimal binary by
accident.


Hi,

It is true that a suboptimal binary can result, but the rationale
here is that we tolerate that for x86 and ppc, so it should not
really matter for Arm either. On x86 and ppc you don't need to modify
meson.build just to run DTS on a development machine.


What modification do you need for a development machine?
I suppose "meson setup build -Dplatform=generic" will generate a
binary that can run on your development machine.


I didn't describe the situation well. I use DPDK Test Suite for
testing, and it determines what flags are passed to Meson. You need
to modify DPDK's meson.build or DTS to get it built.





Regards,
Akihiko Odaki


Hi,

Can you have a look at this again?


Thanks for the clarification of your use case.
Changes to DTS are in planning. It will allow the user to choose
the type of the build.
Your use case will be fulfilled then.


Such a feature indeed satisfies my requirement. Thanks in advance,
Akihiko Odaki


Re: [PATCH v2] build: announce requirement for C11

2023-05-04 Thread Tyler Retzlaff
On Thu, May 04, 2023 at 09:48:59AM +0100, Bruce Richardson wrote:
> On Thu, May 04, 2023 at 09:50:09AM +0200, Mattias Rönnblom wrote:
> > On 2023-05-03 19:30, Bruce Richardson wrote:
> > > Add a deprecation notice informing users that we will require a C11
> > > compiler from 23.11 release onwards. This requirement was agreed by
> > > technical board to enable use of newer C language features, e.g.
> > > standard atomics. [1]
> > > 
> > > [1] 
> > > http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> > > 
> > > Signed-off-by: Bruce Richardson 
> > > Acked-by: Tyler Retzlaff 
> > > 
> > > ---
> > > 
> > > V2:
> > > - add requirement for stdatomics
> > > - fix sphinx formatting
> > > ---
> > >   doc/guides/rel_notes/deprecation.rst | 9 +
> > >   1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/doc/guides/rel_notes/deprecation.rst 
> > > b/doc/guides/rel_notes/deprecation.rst
> > > index dcc1ca1696..70c6019d26 100644
> > > --- a/doc/guides/rel_notes/deprecation.rst
> > > +++ b/doc/guides/rel_notes/deprecation.rst
> > > @@ -11,6 +11,15 @@ here.
> > >   Deprecation Notices
> > >   ---
> > > +* C Compiler: From DPDK 23.11 onwards,
> > > +  building DPDK will require a C compiler which supports the C11 
> > > standard,
> > > +  including support for C11 standard atomics.
> > 
> > The whole of C11, or just the mandatory parts (+atomics)?
> > 
> I assume we only need mandatory + atomics, however perhaps someone more
> knowledgeable about what the optional parts of the spec are can correct me
> on this. Once clarified, maybe I should reword this yet again to call it
> out even more specifically.

mandatory + atomics, no more than that.

so -std=c11 and __STDC_NO_ATOMICS__ is not defined, that is all.

> 
> /Bruce


Re: [PATCH v3 5/5] config/arm: add AMD CDX

2023-05-04 Thread Ferruh Yigit
On 4/21/2023 3:54 PM, Nipun Gupta wrote:
> Adding support for AMD CDX devices
> 
> Signed-off-by: Nipun Gupta 
> ---
>  config/arm/arm64_cdx_linux_gcc | 17 +
>  config/arm/meson.build | 14 ++
>  2 files changed, 31 insertions(+)
>  create mode 100644 config/arm/arm64_cdx_linux_gcc
> 
> diff --git a/config/arm/arm64_cdx_linux_gcc b/config/arm/arm64_cdx_linux_gcc
> new file mode 100644
> index 00..8e6d619dae
> --- /dev/null
> +++ b/config/arm/arm64_cdx_linux_gcc
> @@ -0,0 +1,17 @@
> +[binaries]
> +c = ['ccache', 'aarch64-linux-gnu-gcc']
> +cpp = ['ccache', 'aarch64-linux-gnu-g++']
> +ar = 'aarch64-linux-gnu-ar'
> +as = 'aarch64-linux-gnu-as'
> +strip = 'aarch64-linux-gnu-strip'
> +pkgconfig = 'aarch64-linux-gnu-pkg-config'
> +pcap-config = ''
> +
> +[host_machine]
> +system = 'linux'
> +cpu_family = 'aarch64'
> +cpu = 'armv8-a'
> +endian = 'little'
> +
> +[properties]
> +platform = 'cdx'
> diff --git a/config/arm/meson.build b/config/arm/meson.build
> index 5213434ca4..39b8929534 100644
> --- a/config/arm/meson.build
> +++ b/config/arm/meson.build
> @@ -305,6 +305,18 @@ soc_bluefield = {
>  'numa': false
>  }
>  
> +soc_cdx = {
> +'description': 'AMD CDX',
> +'implementer': '0x41',
> +'part_number': '0xd42',
> +'flags': [
> +['RTE_MACHINE', '"cdx"'],
> +['RTE_MAX_LCORE', 16],
> +['RTE_MAX_NUMA_NODES', 1]
> +],
> +'numa': false
> +}

Hi Nipun,

Why do we need a new arm platform/config? Is it because of the above flags?
If it can work with default values, I think we can drop this patch.


RE: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue

2023-05-04 Thread Coyle, David
Hi Morten

> -Original Message-
> From: Morten Brørup 
> 



> Power saving is important for the environment (to save the planet and all
> that), so everyone should contribute, if they have a good solution. So even if
> our algorithm had a significant degree of innovation, we would probably
> choose to make it public anyway. Open sourcing it also makes it possible for
> chip vendors like Intel to fine tune it more than we can ourselves, which also
> comes back to benefit us. All products need some sort of power saving to
> stay competitive, but power saving algorithms are not an area we want to
> pursue for competitive purposes in our products.
> 
> Our algorithm is too simple to make a library at this point, but I have been
> thinking about how we can make it a generic library when it has matured
> some more. I will take your information about the many customers' need to
> have it invisibly injected into consideration in this regard.
> 
> Our current algorithm works like this:
> 
> while (running) {
> int more = 0;
> more += stage1();
> more += stage2();
> more += stage3();
> if (!more) sleep();
> }
> 
> Each pipeline stage only returns 1 if it processed a full burst. Furthermore, 
> if a
> pipeline stage processed a full burst, but happens to know that no more data
> is readily available for it, it returns 0 instead.
> 
> Obviously, the sleep() duration must be short enough to avoid the NIC
> RX descriptor rings overflowing before the ingress pipeline stage is serviced
> again.
> 
> Changing the algorithm to "more" (1 = more work expected by the pipeline
> stage) from "busy" (1 = some work done by the pipeline stage) has the
> consequence that sleep() is called more often, which has the follow-on
> consequence that the ingress stage is called less often, and thus more often
> has a full burst to process.
> 
> We know from our in-house profiler that processing a full burst provides
> *much* higher execution efficiency (cycles/packet) than processing a few
> packets. This is public knowledge - after all, this is the whole point of 
> DPDK's
> vector packet processing design! Nonetheless, it might surprise some people
> how much the efficiency (cycles/packet) increases when processing a full
> burst compared to processing just a few packets. I will leave it up to the
> readers to make their own experiments. :-)
> 
> Our initial "busy" algorithm behaved like this:
> Process a few packets (at low efficiency), don't sleep, Process a few packets
> (at low efficiency), don't sleep, Process a few packets (at low efficiency),
> don't sleep, Process a few packets (at low efficiency), don't sleep, Process a
> few packets (at low efficiency), don't sleep, Process a few packets (at low
> efficiency), don't sleep, Process a few packets (at low efficiency), don't
> sleep, Process a few packets (at low efficiency), don't sleep, No packets to
> process (we are lucky this time!), sleep briefly, Repeat.
> 
> So we switched to our "more" algorithm, which behaves like this:
> Process a few packets (at low efficiency), sleep briefly, Process a full 
> burst of
> packets (at high efficiency), don't sleep, Repeat.
> 
> Instead of processing e.g. 8 small bursts per sleep, we now process only 2
> bursts per sleep. And the bigger of the two bursts is processed at higher
> efficiency.
> 
> We can improve this algorithm in some areas...
> 
> E.g. some of our pipeline stages also know that they are not going to do
> anymore work for the next X amount of nanoseconds; but we don't use that
> information in our power management algorithm yet. The sleep duration
> could depend on this.
> 
> Also, we don't use the CPU power management states yet. I assume that
> doing some work for 20 us at half clock speed is more power conserving than
> doing the same work at full speed for 10 us and then sleeping for 10 us.
> That's another potential improvement.
> 
> 
> What we need in a generic power management helper library are functions
> to feed it with the application's perception of how much work is being done,
> and functions to tell if we can sleep and/or if we should change the power
> management states of the individual CPU cores.
> 
> Such a unified power management helper (or "busyness") library could
> perhaps also be fed with data directly from the drivers and libraries to
> support the customer use cases you described.

[DC] Thank you for that detailed description, very interesting. There may
well be merit in upstreaming such an algorithm as a library once it has
matured as you said.

Configuration could include specifying what a "full burst"
actually is. Different stages of a pipeline may also have different definitions
of busyness, so that may also need to be considered:
- Some stages may perform an operation (e.g. running an acl rule check) on a
burst of packets and then it is complete
- Other stages may be more asynchronous in nature e.g. enqueuing and
dequeuing to/from a crypto device or a QoS scheduler. The dequeue might
not

Re: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue

2023-05-04 Thread Stephen Hemminger
On Thu, 4 May 2023 16:11:31 +
"Coyle, David"  wrote:

> Hi Morten
> 
> > -Original Message-
> > From: Morten Brørup 
> >   
> 
> 
> 
> > [snip]
> 
> [DC] Thank you for that detailed description, very interesting. There may
> well be merit in upstreaming such an algorithm as a library once it has
> matured as you said.
> 
> Configuration could include specifying what a "full burst"
> actually is. Different stages of a pipeline may also have different 
> definitions
> of busyness, so that 

[Bug 1225] Salesforce Integration

2023-05-04 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1225

Stephen Hemminger (step...@networkplumber.org) changed:

           What|Removed     |Added
----------------------------------------------------------
             CC|            |step...@networkplumber.org
         Status|UNCONFIRMED |RESOLVED
     Resolution|---         |INVALID

--- Comment #1 from Stephen Hemminger (step...@networkplumber.org) ---
Spam

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: DPDK Release Status Meeting 2023-05-04

2023-05-04 Thread Stephen Hemminger
On Thu, 4 May 2023 10:24:45 +
"Mcnamara, John"  wrote:

> Release status meeting minutes 2023-05-04
> =========================================
> 
> Agenda:
> * Release Dates
> * Subtrees
> * Roadmaps
> * LTS
> * Defects
> * Opens
> 
> Participants:
> * AMD
> * ARM
> * Intel
> * Marvell
> * Nvidia
> * Red Hat
> 
> 
> Release Dates
> -------------
> 
> The following are the proposed current dates for 23.07:
> 
> * V1:  22 April 2023
> * RC1: 31   May 2023
> * RC2: 21  June 2023
> * RC3: 28  June 2023
> * Release: 12  July 2023
> 
> 
> Subtrees
> --------
> 
> * next-net
>   * Under review.
>   * ~120 patches in backlog.
>   * GVE license update
> 
> * next-net-intel
>   * Starting to apply patches, about 20 patches are waiting for merge.
> 
> * next-net-mlx
>   * No update
> 
> * next-net-mvl
>   * Some patches merged.
>   * ~30 patches remaining.
> 
> * next-eventdev
>   * Some patches merged.
>   * AMD patches under review
> 
> * next-baseband
>   * No major features so far.
> 
> * next-virtio
>   * Intel series for port mirroring
> * Some concerns on code duplication and API complexity
>   * Report of performance degradation in vhost driver
>   * New series on guest notifications in slow path
>   * VDUSE - Vhost VDPA in userspace. https://lwn.net/Articles/841054/
> * Under review
> 
> * next-crypto
>   * 60-70 patches in backlog
>   * Patches to add new algorithms
>   * Reviewing PDCP patches
>   * Reviews and merges ongoing
> 
> * main
>   * Starting to merge patches to improve the Windows port
>   * Reviewing ethdev patches
>   * Patches from ARM for PMU
> * Postponed from last release
> * For tracing
> * Awaiting feedback on comments
>   * Power Management feature from AMD
>   * Memarea feature
>   * CDX bus patches from AMD
> * Virtual PCI like bus
>   * New checkpatch checks for atomic functions
>   * Issues with Fedora 38 and GCC 13
>   * Issue with Coverity automated analysis
>   * Call for updates to the external Roadmap:
> https://core.dpdk.org/roadmap/
> 
> 
> Proposed Schedule for 2023
> --------------------------
> 
> See also http://core.dpdk.org/roadmap/#dates
> 
> 23.07
>   * Proposal deadline (RFC/v1 patches): 22 April 2023
>   * API freeze (-rc1): 31 May 2023
>   * PMD features freeze (-rc2): 21 June 2023
>   * Builtin applications features freeze (-rc3): 28 June 2023
>   * Release: 12 July 2023
> 
> 23.11
>   * Proposal deadline (RFC/v1 patches): 12 August 2023
>   * API freeze (-rc1): 29 September 2023
>   * PMD features freeze (-rc2): 20 October 2023
>   * Builtin applications features freeze (-rc3): 27 October 2023
>   * Release: 15 November 2023
> 
> 
> LTS
> ---
> 
> Next LTS releases will be:
> 
> * 22.11.1
>   * RC1 sent - Scheduled for May 5
> * 21.11.4
>   * Got test feedback from Nvidia, Red Hat and Intel
> * 20.11.8
>   * RC1 sent and first test report received
> * 19.11.15
>   * CVE and critical fixes only.
> 
> 
> * Distros
>   * v20.11 in Debian 11
>   * Ubuntu 22.04-LTS contains 21.11
>   * Ubuntu 23.04 contains 22.11
> 
> Defects
> -------
> 
> * Bugzilla links, 'Bugs',  added for hosted projects
>   * https://www.dpdk.org/hosted-projects/

Could we add current bugzilla statistics to this status report?


RE: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue

2023-05-04 Thread Morten Brørup
> -Original Message-
> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> 
> On Thu, 4 May 2023 16:11:31 +
> "Coyle, David"  wrote:
> 
> > Hi Morten
> >
> > > -Original Message-
> > > From: Morten Brørup 
> > >
> >
> > 
> >
> > > [snip]
> >
> > [DC] Thank you for that detailed description, very interesting. There may
> > w

[PATCH v3 00/11] sync Truflow support with latest release

2023-05-04 Thread Ajit Khaparde
Update Truflow support to the latest release: deprecating code, updating
the copyright date and hsi structure, syncing the truflow core,
adding ULP shared session support, RSS action support, Queue
action support, rte meter support, and more.

Please apply.

v2->v3:
- update some commit messages
- removed some empty lines in the patches
- removed some dead and unnecessary code
- fixed some checkpatch errors

Version 2 fixes:
- misspellings
- whitespace issues
- signed off issues


Kishore Padmanabha (1):
  net/bnxt: fix multi-root card support

Randy Schacher (8):
  net/bnxt: remove deprecated features
  net/bnxt: update bnxt hsi structure
  net/bnxt: update copyright date and cleanup whitespace
  net/bnxt: update Truflow core
  net/bnxt: update ULP shared session support
  net/bnxt: add RSS and Queue action in TruFLow
  net/bnxt: add support for rte meter
  net/bnxt: add support for eCPRI packet parsing

Shuanglin Wang (1):
  net/bnxt: set RSS config based on RSS mode

Somnath Kotur (1):
  net/bnxt: update PTP support on Thor

 .mailmap  | 1 +
 doc/guides/nics/features/bnxt.ini | 3 +
 drivers/net/bnxt/bnxt.h   |66 +-
 drivers/net/bnxt/bnxt_cpr.c   | 2 +-
 drivers/net/bnxt/bnxt_cpr.h   | 2 +-
 drivers/net/bnxt/bnxt_ethdev.c|   209 +-
 drivers/net/bnxt/bnxt_filter.c| 2 +-
 drivers/net/bnxt/bnxt_filter.h| 6 +-
 drivers/net/bnxt/bnxt_flow.c  |75 +-
 drivers/net/bnxt/bnxt_hwrm.c  |   272 +-
 drivers/net/bnxt/bnxt_hwrm.h  |40 +-
 drivers/net/bnxt/bnxt_irq.c   | 2 +-
 drivers/net/bnxt/bnxt_irq.h   | 3 +-
 drivers/net/bnxt/bnxt_nvm_defs.h  | 3 +-
 drivers/net/bnxt/bnxt_reps.c  | 4 +-
 drivers/net/bnxt/bnxt_reps.h  | 2 +-
 drivers/net/bnxt/bnxt_ring.c  | 7 +-
 drivers/net/bnxt/bnxt_ring.h  | 3 +-
 drivers/net/bnxt/bnxt_rxq.c   |   159 +-
 drivers/net/bnxt/bnxt_rxq.h   | 2 +-
 drivers/net/bnxt/bnxt_rxr.c   |15 +-
 drivers/net/bnxt/bnxt_rxr.h   | 3 +-
 drivers/net/bnxt/bnxt_rxtx_vec_avx2.c | 2 +-
 drivers/net/bnxt/bnxt_rxtx_vec_common.h   | 2 +-
 drivers/net/bnxt/bnxt_rxtx_vec_neon.c | 2 +-
 drivers/net/bnxt/bnxt_rxtx_vec_sse.c  | 2 +-
 drivers/net/bnxt/bnxt_stats.c | 2 +-
 drivers/net/bnxt/bnxt_stats.h | 2 +-
 drivers/net/bnxt/bnxt_txq.c   | 3 +-
 drivers/net/bnxt/bnxt_txq.h   | 2 +-
 drivers/net/bnxt/bnxt_txr.c   |55 +-
 drivers/net/bnxt/bnxt_txr.h   | 4 +-
 drivers/net/bnxt/bnxt_util.c  | 2 +-
 drivers/net/bnxt/bnxt_util.h  | 3 +-
 drivers/net/bnxt/bnxt_vnic.c  |   974 +-
 drivers/net/bnxt/bnxt_vnic.h  |80 +-
 drivers/net/bnxt/hsi_struct_def_dpdk.h|  5723 ++-
 drivers/net/bnxt/meson.build  | 5 +-
 drivers/net/bnxt/rte_pmd_bnxt.c   | 2 +-
 drivers/net/bnxt/rte_pmd_bnxt.h   | 2 +-
 drivers/net/bnxt/tf_core/bitalloc.c   | 3 +-
 drivers/net/bnxt/tf_core/bitalloc.h   | 3 +-
 drivers/net/bnxt/tf_core/cfa_resource_types.h | 5 +-
 drivers/net/bnxt/tf_core/cfa_tcam_mgr.c   |  2116 +
 drivers/net/bnxt/tf_core/cfa_tcam_mgr.h   |   523 +
 .../net/bnxt/tf_core/cfa_tcam_mgr_device.h|   101 +
 .../net/bnxt/tf_core/cfa_tcam_mgr_hwop_msg.c  |   201 +
 .../net/bnxt/tf_core/cfa_tcam_mgr_hwop_msg.h  |28 +
 drivers/net/bnxt/tf_core/cfa_tcam_mgr_p4.c|   921 +
 drivers/net/bnxt/tf_core/cfa_tcam_mgr_p4.h|20 +
 drivers/net/bnxt/tf_core/cfa_tcam_mgr_p58.c   |   926 +
 drivers/net/bnxt/tf_core/cfa_tcam_mgr_p58.h   |20 +
 drivers/net/bnxt/tf_core/cfa_tcam_mgr_sbmp.h  |   126 +
 .../net/bnxt/tf_core/cfa_tcam_mgr_session.c   |   377 +
 .../net/bnxt/tf_core/cfa_tcam_mgr_session.h   |54 +
 drivers/net/bnxt/tf_core/dpool.c  | 3 +-
 drivers/net/bnxt/tf_core/dpool.h  | 3 +-
 drivers/net/bnxt/tf_core/ll.c | 2 +-
 drivers/net/bnxt/tf_core/ll.h | 2 +-
 drivers/net/bnxt/tf_core/lookup3.h| 1 -
 drivers/net/bnxt/tf_core/meson.build  |38 +-
 drivers/net/bnxt/tf_core/rand.c   | 2 +-
 drivers/net/bnxt/tf_core/rand.h   | 3 +-
 drivers/net/bnxt/tf_core/stack.c  | 2 +-
 drivers/net/bnxt/tf_core/stack.h  | 3 +-
 drivers/net/bnxt/tf_core/tf_common.h  | 3 +-
 drivers/net/bnxt/tf_core/tf_core.c|56 +-
 drivers/net/bnxt/tf_core/tf_core.h|   189 +-
 drivers/net/bnxt/tf_core/tf_

[PATCH v3 01/11] net/bnxt: remove deprecated features

2023-05-04 Thread Ajit Khaparde
From: Randy Schacher 

- Deprecate shadow identifier
- Deprecate shadow TCAM
- Remove files which are not needed anymore.

Signed-off-by: Randy Schacher 
Signed-off-by: Kishore Padmanabha 
Reviewed-by: Peter Spreadborough 
Reviewed-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_hwrm.c  |  53 --
 drivers/net/bnxt/bnxt_hwrm.h  |  10 -
 drivers/net/bnxt/tf_core/meson.build  |   2 -
 drivers/net/bnxt/tf_core/tf_core.c|   2 -
 drivers/net/bnxt/tf_core/tf_core.h|  91 +-
 drivers/net/bnxt/tf_core/tf_device.c  |  35 -
 drivers/net/bnxt/tf_core/tf_device.h  |   6 -
 drivers/net/bnxt/tf_core/tf_device_p4.c   |  10 -
 drivers/net/bnxt/tf_core/tf_device_p58.c  |  10 -
 drivers/net/bnxt/tf_core/tf_identifier.c  | 108 ---
 drivers/net/bnxt/tf_core/tf_identifier.h  |   4 -
 drivers/net/bnxt/tf_core/tf_if_tbl.h  |   8 -
 drivers/net/bnxt/tf_core/tf_session.c |   9 +-
 drivers/net/bnxt/tf_core/tf_session.h |  18 +-
 .../net/bnxt/tf_core/tf_shadow_identifier.c   | 190 
 .../net/bnxt/tf_core/tf_shadow_identifier.h   | 229 -
 drivers/net/bnxt/tf_core/tf_shadow_tcam.c | 837 --
 drivers/net/bnxt/tf_core/tf_shadow_tcam.h | 195 
 drivers/net/bnxt/tf_core/tf_tcam.c| 243 -
 drivers/net/bnxt/tf_core/tf_tcam.h|  38 +-
 drivers/net/bnxt/tf_core/tf_util.c|   2 -
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c|   3 -
 22 files changed, 8 insertions(+), 2095 deletions(-)
 delete mode 100644 drivers/net/bnxt/tf_core/tf_shadow_identifier.c
 delete mode 100644 drivers/net/bnxt/tf_core/tf_shadow_identifier.h
 delete mode 100644 drivers/net/bnxt/tf_core/tf_shadow_tcam.c
 delete mode 100644 drivers/net/bnxt/tf_core/tf_shadow_tcam.h

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index d86ac73293..3f273df6f3 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -407,59 +407,6 @@ int bnxt_hwrm_tf_message_direct(struct bnxt *bp,
return rc;
 }
 
-int bnxt_hwrm_tf_message_tunneled(struct bnxt *bp,
- bool use_kong_mb,
- uint16_t tf_type,
- uint16_t tf_subtype,
- uint32_t *tf_response_code,
- void *msg,
- uint32_t msg_len,
- void *response,
- uint32_t response_len)
-{
-   int rc = 0;
-   struct hwrm_cfa_tflib_input req = { .req_type = 0 };
-   struct hwrm_cfa_tflib_output *resp = bp->hwrm_cmd_resp_addr;
-   bool mailbox = BNXT_USE_CHIMP_MB;
-
-   if (msg_len > sizeof(req.tf_req))
-   return -ENOMEM;
-
-   if (use_kong_mb)
-   mailbox = BNXT_USE_KONG(bp);
-
-   HWRM_PREP(&req, HWRM_TF, mailbox);
-   /* Build request using the user supplied request payload.
-* TLV request size is checked at build time against HWRM
-* request max size, thus no checking required.
-*/
-   req.tf_type = tf_type;
-   req.tf_subtype = tf_subtype;
-   memcpy(req.tf_req, msg, msg_len);
-
-   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req), mailbox);
-   HWRM_CHECK_RESULT();
-
-   /* Copy the resp to user provided response buffer */
-   if (response != NULL)
-   /* Post process response data. We need to copy only
-* the 'payload' as the HWRM data structure really is
-* HWRM header + msg header + payload and the TFLIB
-* only provided a payload place holder.
-*/
-   if (response_len != 0) {
-   memcpy(response,
-  resp->tf_resp,
-  response_len);
-   }
-
-   /* Extract the internal tflib response code */
-   *tf_response_code = resp->tf_resp_code;
-   HWRM_UNLOCK();
-
-   return rc;
-}
-
 int bnxt_hwrm_cfa_l2_clear_rx_mask(struct bnxt *bp, struct bnxt_vnic_info *vnic)
 {
int rc = 0;
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index a82d9fb3ef..f9d9fe0ef2 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -79,16 +79,6 @@ struct hwrm_func_qstats_output;
bp->rx_cos_queue[x].profile =   \
resp->queue_id##x##_service_profile
 
-int bnxt_hwrm_tf_message_tunneled(struct bnxt *bp,
- bool use_kong_mb,
- uint16_t tf_type,
- uint16_t tf_subtype,
- uint32_t *tf_response_code,
- void *msg,
- uint32_t msg_len,
- void *response,
- uint32_t response_len);
-

[PATCH v3 03/11] net/bnxt: update copyright date and cleanup whitespace

2023-05-04 Thread Ajit Khaparde
From: Randy Schacher 

Update the copyright year to 2023.
Clean up extra blank lines.
Clean up other whitespace issues.

Signed-off-by: Randy Schacher 
Reviewed-by: Kishore Padmanabha 
Reviewed-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_cpr.c| 2 +-
 drivers/net/bnxt/bnxt_cpr.h| 2 +-
 drivers/net/bnxt/bnxt_filter.c | 2 +-
 drivers/net/bnxt/bnxt_irq.c| 2 +-
 drivers/net/bnxt/bnxt_irq.h| 2 +-
 drivers/net/bnxt/bnxt_nvm_defs.h   | 2 +-
 drivers/net/bnxt/bnxt_reps.h   | 2 +-
 drivers/net/bnxt/bnxt_ring.h   | 2 +-
 drivers/net/bnxt/bnxt_rxq.h| 2 +-
 drivers/net/bnxt/bnxt_rxr.h| 2 +-
 drivers/net/bnxt/bnxt_rxtx_vec_avx2.c  | 2 +-
 drivers/net/bnxt/bnxt_rxtx_vec_common.h| 2 +-
 drivers/net/bnxt/bnxt_rxtx_vec_neon.c  | 2 +-
 drivers/net/bnxt/bnxt_rxtx_vec_sse.c   | 2 +-
 drivers/net/bnxt/bnxt_stats.c  | 2 +-
 drivers/net/bnxt/bnxt_stats.h  | 2 +-
 drivers/net/bnxt/bnxt_txq.h| 2 +-
 drivers/net/bnxt/bnxt_util.c   | 2 +-
 drivers/net/bnxt/bnxt_util.h   | 2 +-
 drivers/net/bnxt/meson.build   | 2 +-
 drivers/net/bnxt/rte_pmd_bnxt.c| 2 +-
 drivers/net/bnxt/rte_pmd_bnxt.h| 2 +-
 drivers/net/bnxt/tf_core/bitalloc.c| 3 +--
 drivers/net/bnxt/tf_core/bitalloc.h| 3 +--
 drivers/net/bnxt/tf_core/cfa_resource_types.h  | 3 +--
 drivers/net/bnxt/tf_core/dpool.c   | 3 ++-
 drivers/net/bnxt/tf_core/dpool.h   | 3 +--
 drivers/net/bnxt/tf_core/ll.c  | 2 +-
 drivers/net/bnxt/tf_core/ll.h  | 2 +-
 drivers/net/bnxt/tf_core/lookup3.h | 1 -
 drivers/net/bnxt/tf_core/rand.c| 2 +-
 drivers/net/bnxt/tf_core/rand.h| 3 +--
 drivers/net/bnxt/tf_core/stack.c   | 2 +-
 drivers/net/bnxt/tf_core/stack.h   | 3 +--
 drivers/net/bnxt/tf_core/tf_common.h   | 3 +--
 drivers/net/bnxt/tf_core/tf_core.h | 1 -
 drivers/net/bnxt/tf_core/tf_device.h   | 1 -
 drivers/net/bnxt/tf_core/tf_device_p4.h| 3 +--
 drivers/net/bnxt/tf_core/tf_device_p58.h   | 2 +-
 drivers/net/bnxt/tf_core/tf_em.h   | 3 +--
 drivers/net/bnxt/tf_core/tf_em_common.c| 8 +---
 drivers/net/bnxt/tf_core/tf_em_common.h| 4 +---
 drivers/net/bnxt/tf_core/tf_em_hash_internal.c | 2 +-
 drivers/net/bnxt/tf_core/tf_em_host.c  | 3 +--
 drivers/net/bnxt/tf_core/tf_em_internal.c  | 3 ++-
 drivers/net/bnxt/tf_core/tf_ext_flow_handle.h  | 4 +---
 drivers/net/bnxt/tf_core/tf_global_cfg.c   | 2 +-
 drivers/net/bnxt/tf_core/tf_global_cfg.h   | 3 +--
 drivers/net/bnxt/tf_core/tf_hash.c | 2 +-
 drivers/net/bnxt/tf_core/tf_hash.h | 3 +--
 drivers/net/bnxt/tf_core/tf_identifier.c   | 2 +-
 drivers/net/bnxt/tf_core/tf_identifier.h   | 3 +--
 drivers/net/bnxt/tf_core/tf_if_tbl.h   | 3 +--
 drivers/net/bnxt/tf_core/tf_msg_common.h   | 3 +--
 drivers/net/bnxt/tf_core/tf_project.h  | 3 +--
 drivers/net/bnxt/tf_core/tf_resources.h| 3 +--
 drivers/net/bnxt/tf_core/tf_rm.h   | 6 +-
 drivers/net/bnxt/tf_core/tf_session.h  | 1 -
 drivers/net/bnxt/tf_core/tf_sram_mgr.h | 1 -
 drivers/net/bnxt/tf_core/tf_tbl.h  | 4 +---
 drivers/net/bnxt/tf_core/tf_tbl_sram.h | 6 +-
 drivers/net/bnxt/tf_core/tf_tcam.h | 3 +--
 drivers/net/bnxt/tf_core/tf_tcam_shared.h  | 1 -
 drivers/net/bnxt/tf_core/tf_util.c | 3 +--
 drivers/net/bnxt/tf_core/tf_util.h | 3 +--
 drivers/net/bnxt/tf_core/tfp.c | 2 +-
 drivers/net/bnxt/tf_core/tfp.h | 4 +---
 drivers/net/bnxt/tf_ulp/bnxt_tf_common.h   | 3 +--
 drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.h | 1 -
 drivers/net/bnxt/tf_ulp/bnxt_ulp.h | 1 -
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.h   | 1 -
 drivers/net/bnxt/tf_ulp/ulp_flow_db.h  | 1 -
 drivers/net/bnxt/tf_ulp/ulp_gen_hash.c | 2 +-
 drivers/net/bnxt/tf_ulp/ulp_gen_hash.h | 3 +--
 drivers/net/bnxt/tf_ulp/ulp_gen_tbl.h  | 1 -
 drivers/net/bnxt/tf_ulp/ulp_ha_mgr.h   | 1 -
 drivers/net/bnxt/tf_ulp/ulp_mapper.h   | 1 -
 drivers/net/bnxt/tf_ulp/ulp_mark_mgr.c | 2 +-
 drivers/net/bnxt/tf_ulp/ulp_mark_mgr.h | 3 +--
 drivers/net/bnxt/tf_ulp/ulp_matcher.h  | 3 +--
 drivers/net/bnxt/tf_ulp/ulp_port_db.h  | 1 -
 drivers/net/bnxt/tf_ulp/ulp_rte_parser.h   | 1 -
 drivers/net/bnxt/tf_ulp/ulp_template_struct.h  | 1 -
 drivers/net/bnxt/tf_ulp/ulp_tun.c  | 2 +-
 drivers/net/bnxt/tf_ulp/ulp_tun.h  | 3 +--
 drivers/net/bnxt/tf_ulp/ulp_utils.c| 2 +-
 drivers/net/bnxt/tf_ulp/ul

[PATCH v3 07/11] net/bnxt: add support for rte meter

2023-05-04 Thread Ajit Khaparde
From: Randy Schacher 

Add RTE meter support into the ULP layer.
Currently:
- Chaining of meters is not supported
- Meter can be shared by multiple flows
- srtcm_rfc2697 type is supported
- Stats are not supported in the implementation yet

Signed-off-by: Randy Schacher 
Signed-off-by: Jay Ding 
Acked-by: Ajit Khaparde 
---
 doc/guides/nics/features/bnxt.ini |   1 +
 drivers/net/bnxt/bnxt.h   |   2 +
 drivers/net/bnxt/bnxt_ethdev.c|   1 +
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c|   8 +
 drivers/net/bnxt/tf_ulp/bnxt_ulp.h|   3 +
 drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c  | 909 ++
 drivers/net/bnxt/tf_ulp/meson.build   |   1 +
 drivers/net/bnxt/tf_ulp/ulp_rte_handler_tbl.c |   4 +-
 drivers/net/bnxt/tf_ulp/ulp_rte_parser.c  |  29 +
 drivers/net/bnxt/tf_ulp/ulp_rte_parser.h  |   5 +
 10 files changed, 961 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c

diff --git a/doc/guides/nics/features/bnxt.ini b/doc/guides/nics/features/bnxt.ini
index 9c456fa863..2809beb629 100644
--- a/doc/guides/nics/features/bnxt.ini
+++ b/doc/guides/nics/features/bnxt.ini
@@ -77,6 +77,7 @@ dec_ttl  = Y
 drop = Y
 jump = Y
 mark = Y
+meter= Y
 of_pop_vlan  = Y
 of_push_vlan = Y
 of_set_vlan_pcp  = Y
diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 6dd3c8b87c..7d508c7c23 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -1014,6 +1014,7 @@ bnxt_udp_tunnel_port_add_op(struct rte_eth_dev *eth_dev,
struct rte_eth_udp_tunnel *udp_tunnel);
 
 extern const struct rte_flow_ops bnxt_flow_ops;
+extern const struct rte_flow_ops bnxt_flow_meter_ops;
 
 #define bnxt_acquire_flow_lock(bp) \
pthread_mutex_lock(&(bp)->flow_lock)
@@ -1065,6 +1066,7 @@ int bnxt_flow_ops_get_op(struct rte_eth_dev *dev,
 int bnxt_dev_start_op(struct rte_eth_dev *eth_dev);
 int bnxt_dev_stop_op(struct rte_eth_dev *eth_dev);
 void bnxt_handle_vf_cfg_change(void *arg);
+int bnxt_flow_meter_ops_get(struct rte_eth_dev *eth_dev, void *arg);
 struct bnxt_vnic_info *bnxt_get_default_vnic(struct bnxt *bp);
 struct tf *bnxt_get_tfp_session(struct bnxt *bp, enum bnxt_session_type type);
 #endif
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 4d84aaee0c..7bceb0524a 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -4091,6 +4091,7 @@ static const struct eth_dev_ops bnxt_dev_ops = {
.timesync_adjust_time = bnxt_timesync_adjust_time,
.timesync_read_rx_timestamp = bnxt_timesync_read_rx_timestamp,
.timesync_read_tx_timestamp = bnxt_timesync_read_tx_timestamp,
+   .mtr_ops_get = bnxt_flow_meter_ops_get,
 };
 
 static uint32_t bnxt_map_reset_regs(struct bnxt *bp, uint32_t reg)
diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
index 08eb0c6063..3459140f18 100644
--- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
+++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
@@ -1681,6 +1681,14 @@ bnxt_ulp_init(struct bnxt *bp,
return rc;
}
 
+   if (ulp_dev_id == BNXT_ULP_DEVICE_ID_THOR) {
+   rc = bnxt_flow_meter_init(bp);
+   if (rc) {
+   BNXT_TF_DBG(ERR, "Failed to config meter\n");
+   goto jump_to_error;
+   }
+   }
+
BNXT_TF_DBG(DEBUG, "ulp ctx has been initialized\n");
return rc;
 
diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.h b/drivers/net/bnxt/tf_ulp/bnxt_ulp.h
index 53d76e1465..a6ad5c1eaa 100644
--- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.h
+++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.h
@@ -371,6 +371,9 @@ bnxt_ulp_vxlan_ip_port_set(struct bnxt_ulp_context *ulp_ctx,
 unsigned int
 bnxt_ulp_vxlan_ip_port_get(struct bnxt_ulp_context *ulp_ctx);
 
+int32_t
+bnxt_flow_meter_init(struct bnxt *bp);
+
 uint32_t
 bnxt_ulp_cntxt_convert_dev_id(uint32_t ulp_dev_id);
 
diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c b/drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c
new file mode 100644
index 00..2461c46f90
--- /dev/null
+++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c
@@ -0,0 +1,909 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2014-2023 Broadcom
+ * All rights reserved.
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "bnxt.h"
+#include "bnxt_filter.h"
+#include "bnxt_hwrm.h"
+#include "bnxt_ring.h"
+#include "bnxt_rxq.h"
+#include "bnxt_rxr.h"
+#include "bnxt_vnic.h"
+#include "hsi_struct_def_dpdk.h"
+
+#include "tfp.h"
+#include "bnxt_tf_common.h"
+#include "ulp_rte_parser.h"
+#include "ulp_matcher.h"
+#include "ulp_flow_db.h"
+#include "ulp_mapper.h"
+#include "ulp_fc_mgr.h"
+#include "ulp_port_db.h"
+#include "ulp_ha_mgr.h"
+#include "ulp_tun.h"
+#

[PATCH v3 08/11] net/bnxt: update PTP support on Thor

2023-05-04 Thread Ajit Khaparde
From: Somnath Kotur 

Add locking and timestamp checks to the PTP feature.

Signed-off-by: Somnath Kotur 
Signed-off-by: Randy Schacher 
Reviewed-by: Kalesh AP 
Reviewed-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h|  5 
 drivers/net/bnxt/bnxt_ethdev.c | 11 +
 drivers/net/bnxt/bnxt_hwrm.c   | 11 -
 drivers/net/bnxt/bnxt_ring.c   |  3 +++
 drivers/net/bnxt/bnxt_rxr.c|  8 --
 drivers/net/bnxt/bnxt_txq.c|  1 +
 drivers/net/bnxt/bnxt_txr.c| 45 +++---
 drivers/net/bnxt/bnxt_txr.h|  1 +
 8 files changed, 79 insertions(+), 6 deletions(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 7d508c7c23..dadd0bd95a 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -349,6 +349,7 @@ struct bnxt_ptp_cfg {
 BNXT_PTP_MSG_PDELAY_RESP)
uint8_t tx_tstamp_en:1;
int rx_filter;
+   uint8_t filter_all;
 
 #define BNXT_PTP_RX_TS_L   0
 #define BNXT_PTP_RX_TS_H   1
@@ -372,6 +373,8 @@ struct bnxt_ptp_cfg {
/* On P5, the Rx timestamp is present in the Rx completion record */
uint64_trx_timestamp;
uint64_tcurrent_time;
+   uint64_told_time;
+   rte_spinlock_t  ptp_lock;
 };
 
 struct bnxt_coal {
@@ -722,6 +725,7 @@ struct bnxt {
 #define BNXT_FW_CAP_LINK_ADMIN BIT(7)
 #define BNXT_FW_CAP_TRUFLOW_EN BIT(8)
 #define BNXT_FW_CAP_VLAN_TX_INSERT BIT(9)
+#define BNXT_FW_CAP_RX_ALL_PKT_TS  BIT(10)
 #define BNXT_TRUFLOW_EN(bp)((bp)->fw_cap & BNXT_FW_CAP_TRUFLOW_EN &&\
 (bp)->app_id != 0xFF)
 
@@ -849,6 +853,7 @@ struct bnxt {
struct bnxt_led_info*leds;
uint8_t ieee_1588;
struct bnxt_ptp_cfg *ptp_cfg;
+   uint8_t ptp_all_rx_tstamp;
uint16_tvf_resv_strategy;
struct bnxt_ctx_mem_info*ctx;
 
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 7bceb0524a..ea817e1fdd 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1441,8 +1441,11 @@ static void bnxt_ptp_get_current_time(void *arg)
if (!ptp)
return;
 
+   rte_spinlock_lock(&ptp->ptp_lock);
+   ptp->old_time = ptp->current_time;
bnxt_hwrm_port_ts_query(bp, BNXT_PTP_FLAGS_CURRENT_TIME,
&ptp->current_time);
+   rte_spinlock_unlock(&ptp->ptp_lock);
rc = rte_eal_alarm_set(US_PER_S, bnxt_ptp_get_current_time, (void *)bp);
if (rc != 0) {
PMD_DRV_LOG(ERR, "Failed to re-schedule PTP alarm\n");
@@ -1458,8 +1461,11 @@ static int bnxt_schedule_ptp_alarm(struct bnxt *bp)
if (bp->flags2 & BNXT_FLAGS2_PTP_ALARM_SCHEDULED)
return 0;
 
+   rte_spinlock_lock(&ptp->ptp_lock);
bnxt_hwrm_port_ts_query(bp, BNXT_PTP_FLAGS_CURRENT_TIME,
&ptp->current_time);
+   ptp->old_time = ptp->current_time;
+   rte_spinlock_unlock(&ptp->ptp_lock);
 
 
rc = rte_eal_alarm_set(US_PER_S, bnxt_ptp_get_current_time, (void *)bp);
@@ -3611,12 +3617,15 @@ bnxt_timesync_enable(struct rte_eth_dev *dev)
 
ptp->rx_filter = 1;
ptp->tx_tstamp_en = 1;
+   ptp->filter_all = 1;
ptp->rxctl = BNXT_PTP_MSG_EVENTS;
 
rc = bnxt_hwrm_ptp_cfg(bp);
if (rc)
return rc;
 
+   rte_spinlock_init(&ptp->ptp_lock);
+   bp->ptp_all_rx_tstamp = 1;
memset(&ptp->tc, 0, sizeof(struct rte_timecounter));
memset(&ptp->rx_tstamp_tc, 0, sizeof(struct rte_timecounter));
memset(&ptp->tx_tstamp_tc, 0, sizeof(struct rte_timecounter));
@@ -3653,9 +3662,11 @@ bnxt_timesync_disable(struct rte_eth_dev *dev)
ptp->rx_filter = 0;
ptp->tx_tstamp_en = 0;
ptp->rxctl = 0;
+   ptp->filter_all = 0;
 
bnxt_hwrm_ptp_cfg(bp);
 
+   bp->ptp_all_rx_tstamp = 0;
if (!BNXT_CHIP_P5(bp))
bnxt_unmap_ptp_regs(bp);
else
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 77588bdf49..82679d1b32 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -669,6 +669,11 @@ int bnxt_hwrm_ptp_cfg(struct bnxt *bp)
flags |=
HWRM_PORT_MAC_CFG_INPUT_FLAGS_PTP_TX_TS_CAPTURE_DISABLE;
 
+   if (ptp->filter_all)
+   flags |= HWRM_PORT_MAC_CFG_INPUT_FLAGS_ALL_RX_TS_CAPTURE_ENABLE;
+   else if (bp->fw_cap & BNXT_FW_CAP_RX_ALL_PKT_TS)
+   flags |= HWRM_PORT_MAC_CFG_INPUT_FLAGS_ALL_RX_TS_CAPTURE_DISABLE;
+
req.flags = rte_cpu_to_le_32(flags);
req.enables = rte_cpu_to_le_32
(HWRM_PORT_MAC_CFG_INPUT_ENABLES_RX_TS_CAPTURE_PTP_MSG_TYPE);
@@ -810,7 +

[PATCH v3 09/11] net/bnxt: fix multi-root card support

2023-05-04 Thread Ajit Khaparde
From: Kishore Padmanabha 

Changed the logic to use the device serial number, instead of the PCI
domain address, to identify that different ports belong to the same
physical card.

Fixes: 34a7ff5a920e ("net/bnxt: support multi root capability")
Cc: sta...@dpdk.org

Signed-off-by: Kishore Padmanabha 
Reviewed-by: Shahaji Bhosle 
Reviewed-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt.h|  3 +++
 drivers/net/bnxt/bnxt_hwrm.c   |  1 +
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c | 11 ---
 drivers/net/bnxt/tf_ulp/bnxt_ulp.h |  2 ++
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index dadd0bd95a..08791b8a17 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -138,6 +138,7 @@
 #define BNXT_NUM_CMPL_DMA_AGGR 36
 #define BNXT_CMPL_AGGR_DMA_TMR_DURING_INT  50
 #define BNXT_NUM_CMPL_DMA_AGGR_DURING_INT  12
+#define BNXT_DEVICE_SERIAL_NUM_SIZE8
 
 #define BNXT_DEFAULT_VNIC_STATE_MASK \
 	HWRM_ASYNC_EVENT_CMPL_DEFAULT_VNIC_CHANGE_EVENT_DATA1_DEF_VNIC_STATE_MASK
@@ -863,6 +864,8 @@ struct bnxt {
uint16_tnum_reps;
struct bnxt_rep_info*rep_info;
uint16_t*cfa_code_map;
+   /* Device Serial Number */
+   uint8_t dsn[BNXT_DEVICE_SERIAL_NUM_SIZE];
/* Struct to hold adapter error recovery related info */
struct bnxt_error_recovery_info *recovery_info;
 #define BNXT_MARK_TABLE_SZ (sizeof(struct bnxt_mark_info)  * 64 * 1024)
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 82679d1b32..edad84c262 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -863,6 +863,7 @@ static int __bnxt_hwrm_func_qcaps(struct bnxt *bp)
bp->max_l2_ctx, bp->max_vnics);
bp->max_stat_ctx = rte_le_to_cpu_16(resp->max_stat_ctx);
bp->max_mcast_addr = rte_le_to_cpu_32(resp->max_mcast_filters);
+   memcpy(bp->dsn, resp->device_serial_number, sizeof(bp->dsn));
 
if (BNXT_PF(bp))
bp->pf->total_vnics = rte_le_to_cpu_16(resp->max_vnics);
diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
index 3459140f18..500c177039 100644
--- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
+++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c
@@ -1318,9 +1318,13 @@ ulp_get_session(struct bnxt *bp, struct rte_pci_addr 
*pci_addr)
 
/* if multi root capability is enabled, then ignore the pci bus id */
STAILQ_FOREACH(session, &bnxt_ulp_session_list, next) {
-   if (session->pci_info.domain == pci_addr->domain &&
-   (BNXT_MULTIROOT_EN(bp) ||
-   session->pci_info.bus == pci_addr->bus)) {
+   if (BNXT_MULTIROOT_EN(bp)) {
+   if (!memcmp(bp->dsn, session->dsn,
+   sizeof(session->dsn))) {
+   return session;
+   }
+   } else if (session->pci_info.domain == pci_addr->domain &&
+  session->pci_info.bus == pci_addr->bus) {
return session;
}
}
@@ -1364,6 +1368,7 @@ ulp_session_init(struct bnxt *bp,
/* Add it to the queue */
session->pci_info.domain = pci_addr->domain;
session->pci_info.bus = pci_addr->bus;
+   memcpy(session->dsn, bp->dsn, sizeof(session->dsn));
rc = pthread_mutex_init(&session->bnxt_ulp_mutex, NULL);
if (rc) {
BNXT_TF_DBG(ERR, "mutex create failed\n");
diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.h b/drivers/net/bnxt/tf_ulp/bnxt_ulp.h
index a6ad5c1eaa..92db7751fe 100644
--- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.h
+++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.h
@@ -131,11 +131,13 @@ struct bnxt_ulp_pci_info {
uint8_t bus;
 };
 
+#define BNXT_ULP_DEVICE_SERIAL_NUM_SIZE 8
 struct bnxt_ulp_session_state {
STAILQ_ENTRY(bnxt_ulp_session_state)next;
boolbnxt_ulp_init;
pthread_mutex_t bnxt_ulp_mutex;
struct bnxt_ulp_pci_infopci_info;
+   uint8_t dsn[BNXT_ULP_DEVICE_SERIAL_NUM_SIZE];
struct bnxt_ulp_data*cfg_data;
struct tf   *g_tfp[BNXT_ULP_SESSION_MAX];
uint32_tsession_opened[BNXT_ULP_SESSION_MAX];
-- 
2.39.2 (Apple Git-143)



smime.p7s
Description: S/MIME Cryptographic Signature


[PATCH v3 10/11] net/bnxt: add support for eCPRI packet parsing

2023-05-04 Thread Ajit Khaparde
From: Randy Schacher 

Add eCPRI parsing and offload support in the TruFlow ULP layer.

Signed-off-by: Randy Schacher 
Signed-off-by: Shahaji Bhosle 
Reviewed-by: Manish Kurup 
Reviewed-by: Ajit Khaparde 
---
 .mailmap  |   1 +
 doc/guides/nics/features/bnxt.ini |   1 +
 drivers/net/bnxt/bnxt.h   |   4 +
 drivers/net/bnxt/bnxt_ethdev.c|  35 +
 drivers/net/bnxt/bnxt_hwrm.c  |  17 +++
 drivers/net/bnxt/bnxt_txr.c   |  10 +-
 drivers/net/bnxt/bnxt_vnic.c  |   5 +-
 drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c|   7 +-
 drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.h|   1 +
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c|  24 
 drivers/net/bnxt/tf_ulp/bnxt_ulp.h|   9 +-
 drivers/net/bnxt/tf_ulp/ulp_mapper.c  |  18 +++
 drivers/net/bnxt/tf_ulp/ulp_rte_handler_tbl.c |   4 +
 drivers/net/bnxt/tf_ulp/ulp_rte_parser.c  | 120 +-
 drivers/net/bnxt/tf_ulp/ulp_rte_parser.h  |   5 +
 drivers/net/bnxt/tf_ulp/ulp_template_struct.h |   2 +
 16 files changed, 256 insertions(+), 7 deletions(-)

diff --git a/.mailmap b/.mailmap
index aded19d181..cc22746f36 100644
--- a/.mailmap
+++ b/.mailmap
@@ -813,6 +813,7 @@ Malvika Gupta 
 Mandal Purna Chandra 
 Mandeep Rohilla 
 Manish Chopra 
+Manish Kurup 
 Manish Tomar 
 Mao Jiang 
 Mao YingMing 
diff --git a/doc/guides/nics/features/bnxt.ini b/doc/guides/nics/features/bnxt.ini
index 2809beb629..7b3030a58c 100644
--- a/doc/guides/nics/features/bnxt.ini
+++ b/doc/guides/nics/features/bnxt.ini
@@ -57,6 +57,7 @@ Perf doc = Y
 
 [rte_flow items]
 any  = Y
+ecpri= Y
 eth  = P
 ipv4 = Y
 ipv6 = Y
diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 08791b8a17..ed21ba7f29 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -844,10 +844,14 @@ struct bnxt {
uint8_t port_cnt;
uint8_t vxlan_port_cnt;
uint8_t geneve_port_cnt;
+   uint8_t ecpri_port_cnt;
uint16_tvxlan_port;
uint16_tgeneve_port;
+   uint16_tecpri_port;
uint16_tvxlan_fw_dst_port_id;
uint16_tgeneve_fw_dst_port_id;
+   uint16_tecpri_fw_dst_port_id;
+   uint16_tecpri_upar_in_use;
uint32_tfw_ver;
uint32_thwrm_spec_code;
 
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index ea817e1fdd..ee1552452a 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -2405,6 +2405,20 @@ bnxt_udp_tunnel_port_add_op(struct rte_eth_dev *eth_dev,
tunnel_type =
HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_GENEVE;
break;
+   case RTE_ETH_TUNNEL_TYPE_ECPRI:
+   if (bp->ecpri_port_cnt) {
+   PMD_DRV_LOG(ERR, "Tunnel Port %d already programmed\n",
+   udp_tunnel->udp_port);
+   if (bp->ecpri_port != udp_tunnel->udp_port) {
+   PMD_DRV_LOG(ERR, "Only one port allowed\n");
+   return -ENOSPC;
+   }
+   bp->ecpri_port_cnt++;
+   return 0;
+   }
+   tunnel_type =
+   HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_ECPRI;
+   break;
default:
PMD_DRV_LOG(ERR, "Tunnel type is not supported\n");
return -ENOTSUP;
@@ -2423,6 +2437,10 @@ bnxt_udp_tunnel_port_add_op(struct rte_eth_dev *eth_dev,
HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_GENEVE)
bp->geneve_port_cnt++;
 
+   if (tunnel_type ==
+   HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_ECPRI)
+   bp->ecpri_port_cnt++;
+
return rc;
 }
 
@@ -2474,6 +2492,23 @@ bnxt_udp_tunnel_port_del_op(struct rte_eth_dev *eth_dev,
HWRM_TUNNEL_DST_PORT_FREE_INPUT_TUNNEL_TYPE_GENEVE;
port = bp->geneve_fw_dst_port_id;
break;
+   case RTE_ETH_TUNNEL_TYPE_ECPRI:
+   if (!bp->ecpri_port_cnt) {
+   PMD_DRV_LOG(ERR, "No Tunnel port configured yet\n");
+   return -EINVAL;
+   }
+   if (bp->ecpri_port != udp_tunnel->udp_port) {
+   PMD_DRV_LOG(ERR, "Req Port: %d. Configured port: %d\n",
+   udp_tunnel->udp_port, bp->ecpri_port);
+   return -EINVAL;
+   }
+   if (--bp->ecpri_port_cnt)
+   return 0;
+
+   tunnel_type =
+

[PATCH v3 11/11] net/bnxt: set RSS config based on RSS mode

2023-05-04 Thread Ajit Khaparde
From: Shuanglin Wang 

Avoid submitting an HWRM RSS request when RSS mode is disabled.
On WH+, if RSS mode is not enabled, there is no RSS context, and
submitting an HWRM_VNIC_RSS_CFG request to the firmware would fail.

The fix is to check the RSS context: if there is no RSS context, do
not submit the HWRM request.

Signed-off-by: Shuanglin Wang 
Signed-off-by: Kishore Padmanabha 
Reviewed-by: Mike Baucom 
Reviewed-by: Ajit Khaparde 
---
 drivers/net/bnxt/bnxt_hwrm.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index b944547656..06f196760f 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -2405,6 +2405,9 @@ bnxt_hwrm_vnic_rss_cfg_non_p5(struct bnxt *bp, struct 
bnxt_vnic_info *vnic)
struct hwrm_vnic_rss_cfg_output *resp = bp->hwrm_cmd_resp_addr;
int rc = 0;
 
+   if (vnic->num_lb_ctxts == 0)
+   return rc;
+
HWRM_PREP(&req, HWRM_VNIC_RSS_CFG, BNXT_USE_CHIMP_MB);
 
req.hash_type = rte_cpu_to_le_32(vnic->hash_type);
-- 
2.39.2 (Apple Git-143)





RE: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release

2023-05-04 Thread wangyunjian


> -Original Message-
> From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> Sent: Thursday, May 4, 2023 7:46 AM
> To: Konstantin Ananyev ; wangyunjian
> 
> Cc: dev@dpdk.org; luyicai ; sta...@dpdk.org; nd
> ; nd 
> Subject: RE: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release
> 
> 
> 
> > >
> > 
> > 
> > 
> > > After the memzone is freed, it is not removed from the 
> > > 'rte_ring_tailq'.
> > > If rte_ring_lookup is called at this time, it will cause a
> > > use-after-free problem. This change prevents that from happening.
> > >
> > > Fixes: 4e32101f9b01 ("ring: support freeing")
> > > Cc: sta...@dpdk.org
> > >
> > > Suggested-by: Honnappa Nagarahalli
> > 
> > > Signed-off-by: Yunjian Wang 
> > > ---
> > > v2: update code suggested by Honnappa Nagarahalli
> > > ---
> > >lib/ring/rte_ring.c | 8 +++-
> > >1 file changed, 3 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c index
> > > 8ed455043d..2755323b8a 100644
> > > --- a/lib/ring/rte_ring.c
> > > +++ b/lib/ring/rte_ring.c
> > > @@ -333,11 +333,6 @@ rte_ring_free(struct rte_ring *r)
> > >   return;
> > >   }
> > >
> > > - if (rte_memzone_free(r->memzone) != 0) {
> > > - RTE_LOG(ERR, RING, "Cannot free memory\n");
> > > - return;
> > > - }
> > > -
> > >   ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head,
> > rte_ring_list);
> > >   rte_mcfg_tailq_write_lock();
> > >
> > > @@ -354,6 +349,9 @@ rte_ring_free(struct rte_ring *r)
> > >
> > >   TAILQ_REMOVE(ring_list, te, next);
> > >
> > > + if (rte_memzone_free(r->memzone) != 0)
> > > + RTE_LOG(ERR, RING, "Cannot free memory\n");
> > > +
> > 
> >  I nit: I think it is a bit better to first release the lock and
> >  then free the memzone.
> > >>> I think both of our suggestions are contradictory. Any reason why
> > >>> you want
> > >> to free outside the locked region?
> > >>
> > >>
> > >> Don't know what you mean by 'both suggestions' here.
> > > I wrote 'both of our suggestions'. Essentially, in v1, freeing the
> > > memzone
> > was outside of the lock. I suggested to move it inside and you are
> > suggesting to move it inside.
> >
> >
> > Ah ok, I missed v1 and your comments for it.
> > As I said before, I don't think that we need to hold qlock here while
> > calling mmezone_free().
> > Though I don't see any harm with it either.
> > I'd personally would move memzone_free() after releasing qlock, but if
> > you guys prefer to keep it as it is - I wouldn't insist.
> I looked at other libraries, stack library is the closest. Stack library 
> frees the
> memzone outside the lock. I think we should keep it consistent.
> I am fine to move the free outside the lock.

Thanks, Konstantin Ananyev and Honnappa Nagarahalli.
I will update the patch in v3 and move the free outside the lock.

> 
> >
> > >
> > >> I think I gave only one - move memzone_free() after tailq_write_unlock().
> > >> To be more precise:
> > >> 1) rte_mcfg_tailq_write_lock();
> > >> ...
> > >> 2) TAILQ_REMOVE(...);
> > >> 3) rte_mcfg_tailq_write_unlock();
> > >> 4) rte_memzone_free(r->memzone);
> > >>
> > >> As I remember, memzones are protected by their own lock (mlock), so
> > >> we don't need to hold qlock to free a memzone, after ring was
> > >> already removed from the ring_list.
> > >>
> > >>>
> > >>> I thought, since it belongs to the ring being freed, it makes
> > >>> sense to free it
> > >> while holding the lock to avoid any race conditions (though, I have
> > >> not checked what those are).
> > >>
> > >>
> > >> As I understand, it is ok with current design to grab mlock while
> > >> holding
> > qlock.
> > >> So, there is nothing wrong with current patch, I just think that in
> > >> that case it is excessive, and can be safely avoided.
> > >>
> > >>>
> >  Apart from that, LGTM.
> >  Acked-by: Konstantin Ananyev 
> > 
> > >   rte_mcfg_tailq_write_unlock();
> > >
> > >   rte_free(te);
> > > --
> > > 2.33.0
> > 
> > >>>
> > >



RE: 21.11.4 patches review and test

2023-05-04 Thread Xu, HailinX
> -Original Message-
> From: Kevin Traynor 
> Sent: Thursday, May 4, 2023 6:11 PM
> To: Xu, HailinX ; sta...@dpdk.org
> Cc: Stokes, Ian ; Mcnamara, John
> ; Luca Boccassi ; Xu, Qian Q
> ; Thomas Monjalon ; Peng,
> Yuan ; Chen, Zhaoyan ;
> dev@dpdk.org
> Subject: Re: 21.11.4 patches review and test
> 
> On 04/05/2023 03:13, Xu, HailinX wrote:
> >> -Original Message-
> >> From: Kevin Traynor 
> >> Sent: Tuesday, May 2, 2023 5:35 PM
> >> To: Xu, HailinX ; sta...@dpdk.org
> >> Cc: Stokes, Ian ; Mcnamara, John
> >> ; Luca Boccassi ; Xu, Qian
> >> Q ; Thomas Monjalon ;
> Peng,
> >> Yuan ; Chen, Zhaoyan ;
> >> dev@dpdk.org
> >> Subject: Re: 21.11.4 patches review and test
> >>
> >> On 20/04/2023 11:32, Kevin Traynor wrote:
> >>> On 20/04/2023 03:40, Xu, HailinX wrote:
> > -Original Message-
> > From: Xu, HailinX 
> > Sent: Thursday, April 13, 2023 2:13 PM
> > To: Kevin Traynor ; sta...@dpdk.org
> > Cc: dev@dpdk.org; Abhishek Marathe
> >> ;
> > Ali Alnubani ; Walker, Benjamin
> > ; David Christensen
> > ; Hemant Agrawal
> >> ;
> > Stokes, Ian ; Jerin Jacob
> > ; Mcnamara, John ;
> > Ju-Hyoung Lee ; Luca Boccassi
> > ; Pei Zhang ; Xu, Qian Q
> > ; Raslan Darawsheh ;
> >> Thomas
> > Monjalon ; yangh...@redhat.com; Peng, Yuan
> > ; Chen, Zhaoyan 
> > Subject: RE: 21.11.4 patches review and test
> >
> >> -Original Message-
> >> From: Kevin Traynor 
> >> Sent: Thursday, April 6, 2023 7:38 PM
> >> To: sta...@dpdk.org
> >> Cc: dev@dpdk.org; Abhishek Marathe
> >> ; Ali Alnubani
> >> ; Walker, Benjamin
> >> ; David Christensen
> >> ; Hemant Agrawal
> >> ; Stokes, Ian ;
> >> Jerin Jacob ; Mcnamara, John
> >> ; Ju-Hyoung Lee
> ;
> >> Kevin Traynor ; Luca Boccassi
> >> ; Pei Zhang ; Xu, Qian Q
> >> ; Raslan Darawsheh ;
> > Thomas
> >> Monjalon ; yangh...@redhat.com; Peng, Yuan
> >> ; Chen, Zhaoyan 
> >> Subject: 21.11.4 patches review and test
> >>
> >> Hi all,
> >>
> >> Here is a list of patches targeted for stable release 21.11.4.
> >>
> >> The planned date for the final release is 25th April.
> >>
> >> Please help with testing and validation of your use cases and
> >> report any issues/results with reply-all to this mail. For the
> >> final release the fixes and reported validations will be added to
> >> the
> >> release notes.
> >>
> >> A release candidate tarball can be found at:
> >>
> >>https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.4-rc1
> >>
> >> These patches are located at branch 21.11 of dpdk-stable repo:
> >>https://dpdk.org/browse/dpdk-stable/
> >>
> >> Thanks.
> >>
> >> Kevin
> >
> > HI All,
> >
> > Update the test status for Intel part. Till now dpdk21.11.4-rc1
> > validation test rate is 85%. No critical issue is found.
> > 2 new bugs are found, 1 new issue is under confirming by Intel Dev.
> > New bugs:   --20.11.8-rc1 also has these two issues
> >  1.
> >> pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac:
> > performance drop about 23.5% when send small packets
> > https://bugs.dpdk.org/show_bug.cgi?id=1212-- no fix yet
> >  2. some of the virtio tests are failing:-- Intel dev is under
> >> investigating
> > # Basic Intel(R) NIC testing
> > * Build & CFLAG compile: cover the build test combination with
> > latest GCC/Clang version and the popular OS revision such as
> >  Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6,
> > RHEL8.4, FreeBSD13.1, SUSE15, CentOS7.9, etc.
> > - All test done. No new dpdk issue is found.
> > * PF(i40e, ixgbe): test scenarios including
> > RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
> > - All test done. No new dpdk issue is found.
> > * VF(i40e, ixgbe): test scenarios including
> > VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
> > - All test done. No new dpdk issue is found.
> > * PF/VF(ice): test scenarios including Switch features/Package
> > Management/Flow Director/Advanced Tx/Advanced
> RSS/ACL/DCF/Flexible
> > Descriptor, etc.
> > - All test done. No new dpdk issue is found.
> > * Intel NIC single core/NIC performance: test scenarios including
> > PF/VF single core performance test, etc.
> > - All test done. No new dpdk issue is found.
> > * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library
> > basic test - QAT&SW/FIB library, etc.
> > - On going.
> >
> > # Basic cryptodev and virtio testing
> > * Virtio: both function and performance test are covered. Such as
> > PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf
> > testing/VMAWARE ESXI 8.0, etc.
> > - All test done. found bug1.
> > * Cryptodev:
> >  *Function test: test scenarios 

Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-04 Thread zhoumin

Hi Ruifeng,

Thanks for your review.

On Thu, May 4, 2023 at 2:13 PM, Ruifeng Wang wrote:

-Original Message-
From: Konstantin Ananyev 
Sent: Monday, May 1, 2023 9:29 PM
To: zhou...@loongson.cn
Cc: dev@dpdk.org; maob...@loongson.cn; qiming.y...@intel.com; 
wenjun1...@intel.com;
Ruifeng Wang ; d...@linux.vnet.ibm.com
Subject: Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx 
functions


A segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson
3C5000 processor, which has 64 cores and 4 NUMA nodes.

From the ixgbe_recv_pkts_lro() function, we found that as long as the
first packet has the EOP bit set, and the length of this packet is
less than or equal to rxq->crc_len, the segmentation fault will
definitely happen, even on other platforms such as x86.

Because when processing the first packet, first_seg->next will be
NULL; if at the same time this packet has the EOP bit set and its
length is less than or equal to rxq->crc_len, the following loop will be
executed:

 for (lp = first_seg; lp->next != rxm; lp = lp->next)
 ;

We know that first_seg->next will be NULL under this condition, so
the expression lp->next->next will cause the segmentation fault.

Normally, the length of the first packet with the EOP bit set will be
greater than rxq->crc_len. However, out-of-order execution by the CPU
may make the read ordering of the status and the rest of the
descriptor fields in this function incorrect. The related code is as
follows:

 rxdp = &rx_ring[rx_id];
  #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

 if (!(staterr & IXGBE_RXDADV_STAT_DD))
 break;

  #2 rxd = *rxdp;

Statement #2 may be executed before statement #1, which is likely to
make the ready packet appear to have zero length. If the packet is the
first packet and has the EOP bit set, the above segmentation fault will happen.

So, we should add rte_rmb() to ensure the read ordering is correct. We
also did the same thing in the ixgbe_recv_pkts() function to make the
rxd data valid, even though we did not find a segmentation fault in that
function.

Signed-off-by: Min Zhou 

"Fixes" tag for backport.

OK, I will add the "Fixes" tag in the V3 patch.
  

---
v2:
- Make the calling of rte_rmb() for all platforms
---
  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
staterr = rxdp->wb.upper.status_error;
if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
break;
+
+   rte_rmb();
rxd = *rxdp;



Indeed, looks like a problem to me on systems with relaxed MO.
Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.

Thanks, Konstantin.


About a fix - looks right, but a bit excessive to me - as I understand all we 
need here is
to prevent re-ordering by CPU itself.
So rte_smp_rmb() seems enough here.

Agree that rte_rmb() is excessive.
rte_smp_rmb() or rte_atomic_thread_fence(__ATOMIC_ACQUIRE) is enough.
Thanks for your advice. I will compare the rte_smp_rmb(), 
__atomic_load_n() and rte_atomic_thread_fence() to choose a better one.

And it is better to add a comment to justify the barrier.

OK, I will add a comment for this change.

Or might be just:
staterr = __atomic_load_n(&rxdp->wb.upper.status_error, __ATOMIC_ACQUIRE);



/*
@@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf 
**rx_pkts,

uint16_t nb_pkts,

With the proper barrier in place, I think the long comments at the beginning of 
this loop can be removed.
Yes, I think the long comments can be simplified when the proper barrier 
is already in place.

if (!(staterr & IXGBE_RXDADV_STAT_DD))
break;

+   rte_rmb();
rxd = *rxdp;

PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.31.1


Best regards,

Min



Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-04 Thread zhoumin

Hi Morten,

On Thur, May 4, 2023 at 9:21PM, Morten Brørup wrote:

From: zhoumin [mailto:zhou...@loongson.cn]
Sent: Thursday, 4 May 2023 15.17

Hi Konstantin,

Thanks for your comments.

On 2023/5/1 下午9:29, Konstantin Ananyev wrote:

A segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000
processor, which has 64 cores and 4 NUMA nodes.

From the ixgbe_recv_pkts_lro() function, we found that as long as the
first packet has the EOP bit set, and the length of this packet is less
than or equal to rxq->crc_len, the segmentation fault will definitely
happen, even on other platforms such as x86.

Because when processing the first packet, first_seg->next will be NULL;
if at the same time this packet has the EOP bit set and its length is
less than or equal to rxq->crc_len, the following loop will be executed:

     for (lp = first_seg; lp->next != rxm; lp = lp->next)
     ;

We know that first_seg->next will be NULL under this condition, so the
expression lp->next->next will cause the segmentation fault.

Normally, the length of the first packet with the EOP bit set will be
greater than rxq->crc_len. However, out-of-order execution by the CPU
may make the read ordering of the status and the rest of the descriptor
fields in this function incorrect. The related code is as following:

     rxdp = &rx_ring[rx_id];
  #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

     if (!(staterr & IXGBE_RXDADV_STAT_DD))
     break;

  #2 rxd = *rxdp;

Statement #2 may be executed before statement #1, which is likely to
make the ready packet appear to have zero length. If the packet is the
first packet and has the EOP bit set, the above segmentation fault will
happen.

So, we should add rte_rmb() to ensure the read ordering is correct. We
also did the same thing in the ixgbe_recv_pkts() function to make the
rxd data valid, even though we did not find a segmentation fault in that
function.

Signed-off-by: Min Zhou 
---
v2:
- Make the calling of rte_rmb() for all platforms
---
  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
b/drivers/net/ixgbe/ixgbe_rxtx.c
index c9d6ca9efe..302a5ab7ff 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf
**rx_pkts,
  staterr = rxdp->wb.upper.status_error;
  if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
  break;
+
+    rte_rmb();
  rxd = *rxdp;



Indeed, looks like a problem to me on systems with relaxed MO.
Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.

The LoongArch architecture uses a weak consistency model, which can
cause the problem, especially in scenarios with many cores, such as the
Loongson 3C5000 with four NUMA nodes, which has 64 cores. I cannot
reproduce it on the Loongson 3C5000 with one NUMA node, which has just 16 cores.

About a fix - looks right, but a bit excessive to me -
as I understand all we need here is to prevent re-ordering by CPU itself.

Yes, thanks for cc-ing.

So rte_smp_rmb() seems enough here.
Or might be just:
staterr = __atomic_load_n(&rxdp->wb.upper.status_error,
__ATOMIC_ACQUIRE);


Does __atomic_load_n() work on Windows if we use it to solve this problem?

Yes, __atomic_load_n() works on Windows too.


Thank you, Morten. I got it.

I will compare those barriers and choose a proper one for this problem.


Best regards,

Min



RE: [EXT] [PATCH v5 02/15] graph: split graph worker into common and default model

2023-05-04 Thread Yan, Zhirun



> -Original Message-
> From: Pavan Nikhilesh Bhagavatula 
> Sent: Thursday, April 27, 2023 10:11 PM
> To: Yan, Zhirun ; dev@dpdk.org; Jerin Jacob
> Kollanukkaran ; Kiran Kumar Kokkilagadda
> ; Nithin Kumar Dabilpuram
> ; step...@networkplumber.org
> Cc: Liang, Cunming ; Wang, Haiyue
> 
> Subject: RE: [EXT] [PATCH v5 02/15] graph: split graph worker into common and
> default model
> 
> 
> 
> > -Original Message-
> > From: Zhirun Yan 
> > Sent: Friday, March 31, 2023 9:33 AM
> > To: dev@dpdk.org; Jerin Jacob Kollanukkaran ;
> > Kiran Kumar Kokkilagadda ; Nithin Kumar
> > Dabilpuram ; step...@networkplumber.org
> > Cc: cunming.li...@intel.com; haiyue.w...@intel.com; Zhirun Yan
> > 
> > Subject: [EXT] [PATCH v5 02/15] graph: split graph worker into common
> > and default model
> >
> > External Email
> >
> > --
> > To support multiple graph worker models, split the graph worker into
> > common and default parts. Name the current walk function
> > rte_graph_model_rtc, as the default model is RTC (run-to-completion).
> >
> > Signed-off-by: Haiyue Wang 
> > Signed-off-by: Cunming Liang 
> > Signed-off-by: Zhirun Yan 
> > ---
> >  lib/graph/graph_pcap.c  |  2 +-
> >  lib/graph/graph_private.h   |  2 +-
> >  lib/graph/meson.build   |  2 +-
> >  lib/graph/rte_graph_model_rtc.h | 61
> > +
> >  lib/graph/rte_graph_worker.h| 34 
> >  lib/graph/rte_graph_worker_common.h | 57 ---
> >  6 files changed, 98 insertions(+), 60 deletions(-)  create mode
> > 100644 lib/graph/rte_graph_model_rtc.h  create mode 100644
> > lib/graph/rte_graph_worker.h
> >
> > diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c index
> > 8a220370fa..6c43330029 100644
> > --- a/lib/graph/graph_pcap.c
> > +++ b/lib/graph/graph_pcap.c
> > @@ -10,7 +10,7 @@
> >  #include 
> >  #include 
> >
> > -#include "rte_graph_worker_common.h"
> > +#include "rte_graph_worker.h"
> >
> >  #include "graph_pcap_private.h"
> >
> > diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
> > index f08dbc7e9d..7d1b30b8ac 100644
> > --- a/lib/graph/graph_private.h
> > +++ b/lib/graph/graph_private.h
> > @@ -12,7 +12,7 @@
> >  #include 
> >
> >  #include "rte_graph.h"
> > -#include "rte_graph_worker_common.h"
> > +#include "rte_graph_worker.h"
> >
> >  extern int rte_graph_logtype;
> >
> > diff --git a/lib/graph/meson.build b/lib/graph/meson.build index
> > 4e2b612ad3..3526d1b5d4 100644
> > --- a/lib/graph/meson.build
> > +++ b/lib/graph/meson.build
> > @@ -16,6 +16,6 @@ sources = files(
> >  'graph_populate.c',
> >  'graph_pcap.c',
> >  )
> > -headers = files('rte_graph.h', 'rte_graph_worker_common.h')
> > +headers = files('rte_graph.h', 'rte_graph_worker.h')
> >
> >  deps += ['eal', 'pcapng']
> > diff --git a/lib/graph/rte_graph_model_rtc.h
> > b/lib/graph/rte_graph_model_rtc.h new file mode 100644 index
> > 00..665560f831
> > --- /dev/null
> > +++ b/lib/graph/rte_graph_model_rtc.h
> > @@ -0,0 +1,61 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(C) 2023 Intel Corporation  */
> > +
> 
> Please retain Marvell copyright too.
> 
Yes, I will do so in the next version. Thanks for reminding me.

> > +#include "rte_graph_worker_common.h"
> > +
> > +/**
> > + * Perform graph walk on the circular buffer and invoke the process
> > function
> > + * of the nodes and collect the stats.
> > + *
> > + * @param graph
> > + *   Graph pointer returned from rte_graph_lookup function.
> > + *
> > + * @see rte_graph_lookup()
> > + */
> > +static inline void
> > +rte_graph_walk_rtc(struct rte_graph *graph) {
> > +   const rte_graph_off_t *cir_start = graph->cir_start;
> > +   const rte_node_t mask = graph->cir_mask;
> > +   uint32_t head = graph->head;
> > +   struct rte_node *node;
> > +   uint64_t start;
> > +   uint16_t rc;
> > +   void **objs;
> > +
> > +   /*
> > +* Walk on the source node(s) ((cir_start - head) -> cir_start) and
> > then
> > +* on the pending streams (cir_start -> (cir_start + mask) -> cir_start)
> > +* in a circular buffer fashion.
> > +*
> > +*  +-+ <= cir_start - head [number of source nodes]
> > +*  | |
> > +*  | ... | <= source nodes
> > +*  | |
> > +*  +-+ <= cir_start [head = 0] [tail = 0]
> > +*  | |
> > +*  | ... | <= pending streams
> > +*  | |
> > +*  +-+ <= cir_start + mask
> > +*/
> > +   while (likely(head != graph->tail)) {
> > +   node = (struct rte_node *)RTE_PTR_ADD(graph,
> > cir_start[(int32_t)head++]);
> > +   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
> > +   objs = node->objs;
> > +   rte_prefetch0(objs);
> > +
> > +   if (rte_graph_has_stats_feature()) {
> > +   start = rte_rdtsc();
> > +   rc = node->process(graph, node

RE: [EXT] [PATCH v5 11/15] graph: introduce graph walk by cross-core dispatch

2023-05-04 Thread Yan, Zhirun



> -Original Message-
> From: Pavan Nikhilesh Bhagavatula 
> Sent: Thursday, April 27, 2023 10:59 PM
> To: Yan, Zhirun ; dev@dpdk.org; Jerin Jacob
> Kollanukkaran ; Kiran Kumar Kokkilagadda
> ; Nithin Kumar Dabilpuram
> ; step...@networkplumber.org
> Cc: Liang, Cunming ; Wang, Haiyue
> 
> Subject: RE: [EXT] [PATCH v5 11/15] graph: introduce graph walk by cross-core
> dispatch
> 
> > This patch introduces the task scheduler mechanism to enable
> > dispatching tasks to other worker cores. Currently, there is only a
> > local work queue for one graph to walk. We introduce a scheduler
> > work queue in each worker core for dispatching tasks. It will
> > perform the walk on the scheduler work queue first, then handle the
> > local work queue.
> >
> > Signed-off-by: Haiyue Wang 
> > Signed-off-by: Cunming Liang 
> > Signed-off-by: Zhirun Yan 
> > ---
> >  lib/graph/rte_graph_model_dispatch.h | 42
> > 
> >  1 file changed, 42 insertions(+)
> >
> > diff --git a/lib/graph/rte_graph_model_dispatch.h
> > b/lib/graph/rte_graph_model_dispatch.h
> > index 18fa7ce0ab..65b2cc6d87 100644
> > --- a/lib/graph/rte_graph_model_dispatch.h
> > +++ b/lib/graph/rte_graph_model_dispatch.h
> > @@ -73,6 +73,48 @@ __rte_experimental
> >  int rte_graph_model_dispatch_lcore_affinity_set(const char *name,
> > unsigned int lcore_id);
> >
> > +/**
> > + * Perform graph walk on the circular buffer and invoke the process
> > function
> > + * of the nodes and collect the stats.
> > + *
> > + * @param graph
> > + *   Graph pointer returned from rte_graph_lookup function.
> > + *
> > + * @see rte_graph_lookup()
> > + */
> > +__rte_experimental
> > +static inline void
> > +rte_graph_walk_mcore_dispatch(struct rte_graph *graph) {
> > +   const rte_graph_off_t *cir_start = graph->cir_start;
> > +   const rte_node_t mask = graph->cir_mask;
> > +   uint32_t head = graph->head;
> > +   struct rte_node *node;
> 
> I think we should add a RTE_ASSERT here to make sure that the graph object is 
> a
> cloned graph.
> 
OK, I will add the RTE_ASSERT in the next version.

> > +
> > +   if (graph->wq != NULL)
> > +   __rte_graph_sched_wq_process(graph);
> > +
> > +   while (likely(head != graph->tail)) {
> > +   node = (struct rte_node *)RTE_PTR_ADD(graph,
> > cir_start[(int32_t)head++]);
> > +
> > +   /* skip the src nodes which not bind with current worker */
> > +   if ((int32_t)head < 0 && node->lcore_id != graph->lcore_id)
> > +   continue;
> > +
> > +   /* Schedule the node until all task/objs are done */
> > +   if (node->lcore_id != RTE_MAX_LCORE &&
> > +   graph->lcore_id != node->lcore_id && graph->rq != NULL
> > &&
> > +   __rte_graph_sched_node_enqueue(node, graph->rq))
> > +   continue;
> > +
> > +   __rte_node_process(graph, node);
> > +
> > +   head = likely((int32_t)head > 0) ? head & mask : head;
> > +   }
> > +
> > +   graph->tail = 0;
> > +}
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > --
> > 2.37.2



RE: [EXT] [PATCH v5 09/15] graph: introduce stream moving cross cores

2023-05-04 Thread Yan, Zhirun



> -Original Message-
> From: Pavan Nikhilesh Bhagavatula 
> Sent: Thursday, April 27, 2023 10:53 PM
> To: Yan, Zhirun ; dev@dpdk.org; Jerin Jacob
> Kollanukkaran ; Kiran Kumar Kokkilagadda
> ; Nithin Kumar Dabilpuram
> ; step...@networkplumber.org
> Cc: Liang, Cunming ; Wang, Haiyue
> 
> Subject: RE: [EXT] [PATCH v5 09/15] graph: introduce stream moving cross cores
> 
> > This patch introduces key functions to allow a worker thread to
> > enqueue and move streams of objects to the next nodes on different
> > cores.
> >
> > Signed-off-by: Haiyue Wang 
> > Signed-off-by: Cunming Liang 
> > Signed-off-by: Zhirun Yan 
> > ---
> >  lib/graph/graph_private.h|  27 +
> >  lib/graph/meson.build|   2 +-
> >  lib/graph/rte_graph_model_dispatch.c | 145
> > +++
> >  lib/graph/rte_graph_model_dispatch.h |  37 +++
> >  lib/graph/version.map|   2 +
> >  5 files changed, 212 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
> > index b66b18ebbc..e1a2a4bfd8 100644
> > --- a/lib/graph/graph_private.h
> > +++ b/lib/graph/graph_private.h
> > @@ -366,4 +366,31 @@ void graph_dump(FILE *f, struct graph *g);
> >   */
> >  void node_dump(FILE *f, struct node *n);
> >
> > +/**
> > + * @internal
> > + *
> > + * Create the graph schedule work queue. And all cloned graphs
> > +attached to
> > the
> > + * parent graph MUST be destroyed together for fast schedule design
> > limitation.
> > + *
> > + * @param _graph
> > + *   The graph object
> > + * @param _parent_graph
> > + *   The parent graph object which holds the run-queue head.
> > + *
> > + * @return
> > + *   - 0: Success.
> > + *   - <0: Graph schedule work queue related error.
> > + */
> > +int graph_sched_wq_create(struct graph *_graph, struct graph
> > *_parent_graph);
> > +
> > +/**
> > + * @internal
> > + *
> > + * Destroy the graph schedule work queue.
> > + *
> > + * @param _graph
> > + *   The graph object
> > + */
> > +void graph_sched_wq_destroy(struct graph *_graph);
> > +
> >  #endif /* _RTE_GRAPH_PRIVATE_H_ */
> > diff --git a/lib/graph/meson.build b/lib/graph/meson.build index
> > c729d984b6..e21affa280 100644
> > --- a/lib/graph/meson.build
> > +++ b/lib/graph/meson.build
> > @@ -20,4 +20,4 @@ sources = files(
> >  )
> >  headers = files('rte_graph.h', 'rte_graph_worker.h')
> >
> > -deps += ['eal', 'pcapng']
> > +deps += ['eal', 'pcapng', 'mempool', 'ring']
> > diff --git a/lib/graph/rte_graph_model_dispatch.c
> > b/lib/graph/rte_graph_model_dispatch.c
> > index 4a2f99496d..a300fefb85 100644
> > --- a/lib/graph/rte_graph_model_dispatch.c
> > +++ b/lib/graph/rte_graph_model_dispatch.c
> > @@ -5,6 +5,151 @@
> >  #include "graph_private.h"
> >  #include "rte_graph_model_dispatch.h"
> >
> > +int
> > +graph_sched_wq_create(struct graph *_graph, struct graph
> > *_parent_graph)
> > +{
> > +   struct rte_graph *parent_graph = _parent_graph->graph;
> > +   struct rte_graph *graph = _graph->graph;
> > +   unsigned int wq_size;
> > +
> > +   wq_size = GRAPH_SCHED_WQ_SIZE(graph->nb_nodes);
> > +   wq_size = rte_align32pow2(wq_size + 1);
> 
> Hi Zhirun,
> 
> We should introduce a new function `rte_graph_configure` which can help
> the application control the ring size and mempool size of the work queue.
> We could fall back to default values if nothing is configured.
> 
> rte_graph_configure should take a
> struct rte_graph_config {
>   struct {
>   u64 rsvd[8];
>   } rtc;
>   struct {
>   u16 wq_size;
>   ...
>   } dispatch;
> };
> 
> This will help future graph models to have their own configuration.
> 
> We can have a rte_graph_config_init() function to initialize the 
> rte_graph_config
> structure.
> 

Hi Pavan,

Thanks for your comments. I agree with you. It would be friendlier for
users and developers.
As for the ring and mempool, there are some size limitations (the size
must be a power of 2). So I prefer to use u16 wq_size_max and u32
mp_size_max for users if they have limited resources.

> 
> > +
> > +   graph->wq = rte_ring_create(graph->name, wq_size, graph->socket,
> > +   RING_F_SC_DEQ);
> > +   if (graph->wq == NULL)
> > +   SET_ERR_JMP(EIO, fail, "Failed to allocate graph WQ");
> > +
> > +   graph->mp = rte_mempool_create(graph->name, wq_size,
> > +  sizeof(struct graph_sched_wq_node),
> > +  0, 0, NULL, NULL, NULL, NULL,
> > +  graph->socket, MEMPOOL_F_SP_PUT);
> > +   if (graph->mp == NULL)
> > +   SET_ERR_JMP(EIO, fail_mp,
> > +   "Failed to allocate graph WQ schedule entry");
> > +
> > +   graph->lcore_id = _graph->lcore_id;
> > +
> > +   if (parent_graph->rq == NULL) {
> > +   parent_graph->rq = &parent_graph->rq_head;
> > +   SLIST_INIT(parent_graph->rq);
> > +   }
> > +
> > +   graph-

RE: [EXT] [PATCH v5 03/15] graph: move node process into inline function

2023-05-04 Thread Yan, Zhirun



> -Original Message-
> From: Pavan Nikhilesh Bhagavatula 
> Sent: Thursday, April 27, 2023 11:03 PM
> To: Yan, Zhirun ; dev@dpdk.org; Jerin Jacob
> Kollanukkaran ; Kiran Kumar Kokkilagadda
> ; Nithin Kumar Dabilpuram
> ; step...@networkplumber.org
> Cc: Liang, Cunming ; Wang, Haiyue
> 
> Subject: RE: [EXT] [PATCH v5 03/15] graph: move node process into inline
> function
> 
> > Node processing is a single, reusable block; move the code into an
> > inline function.
> >
> > Signed-off-by: Haiyue Wang 
> > Signed-off-by: Cunming Liang 
> > Signed-off-by: Zhirun Yan 
> > ---
> >  lib/graph/rte_graph_model_rtc.h | 20 ++---
> >  lib/graph/rte_graph_worker_common.h | 33
> > +
> >  2 files changed, 35 insertions(+), 18 deletions(-)
> >
> > diff --git a/lib/graph/rte_graph_model_rtc.h
> > b/lib/graph/rte_graph_model_rtc.h index 665560f831..0dcb7151e9 100644
> > --- a/lib/graph/rte_graph_model_rtc.h
> > +++ b/lib/graph/rte_graph_model_rtc.h
> > @@ -20,9 +20,6 @@ rte_graph_walk_rtc(struct rte_graph *graph)
> > const rte_node_t mask = graph->cir_mask;
> > uint32_t head = graph->head;
> > struct rte_node *node;
> > -   uint64_t start;
> > -   uint16_t rc;
> > -   void **objs;
> >
> > /*
> >  * Walk on the source node(s) ((cir_start - head) -> cir_start) and
> > then @@ -41,21 +38,8 @@ rte_graph_walk_rtc(struct rte_graph *graph)
> >  */
> > while (likely(head != graph->tail)) {
> > node = (struct rte_node *)RTE_PTR_ADD(graph,
> > cir_start[(int32_t)head++]);
> > -   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
> > -   objs = node->objs;
> > -   rte_prefetch0(objs);
> > -
> > -   if (rte_graph_has_stats_feature()) {
> > -   start = rte_rdtsc();
> 
> Since we are refactoring this function could you change rte_rdtsc() to
> rte_rdtsc_precise().

Sure, I will do so in the next version.

> 
> > -   rc = node->process(graph, node, objs, node->idx);
> > -   node->total_cycles += rte_rdtsc() - start;
> > -   node->total_calls++;
> > -   node->total_objs += rc;
> > -   } else {
> > -   node->process(graph, node, objs, node->idx);
> > -   }
> > -   node->idx = 0;
> > -   head = likely((int32_t)head > 0) ? head & mask :
> > head;
> > +   __rte_node_process(graph, node);
> > +   head = likely((int32_t)head > 0) ? head & mask : head;
> > }
> > graph->tail = 0;
> >  }
> > diff --git a/lib/graph/rte_graph_worker_common.h
> > b/lib/graph/rte_graph_worker_common.h
> > index b58f8f6947..41428974db 100644
> > --- a/lib/graph/rte_graph_worker_common.h
> > +++ b/lib/graph/rte_graph_worker_common.h
> > @@ -130,6 +130,39 @@ void __rte_node_stream_alloc_size(struct
> > rte_graph *graph,
> >
> >  /* Fast path helper functions */
> >
> > +/**
> > + * @internal
> > + *
> > + * Enqueue a given node to the tail of the graph reel.
> > + *
> > + * @param graph
> > + *   Pointer Graph object.
> > + * @param node
> > + *   Pointer to node object to be enqueued.
> > + */
> > +static __rte_always_inline void
> > +__rte_node_process(struct rte_graph *graph, struct rte_node *node) {
> > +   uint64_t start;
> > +   uint16_t rc;
> > +   void **objs;
> > +
> > +   RTE_ASSERT(node->fence == RTE_GRAPH_FENCE);
> > +   objs = node->objs;
> > +   rte_prefetch0(objs);
> > +
> > +   if (rte_graph_has_stats_feature()) {
> > +   start = rte_rdtsc();
> > +   rc = node->process(graph, node, objs, node->idx);
> > +   node->total_cycles += rte_rdtsc() - start;
> > +   node->total_calls++;
> > +   node->total_objs += rc;
> > +   } else {
> > +   node->process(graph, node, objs, node->idx);
> > +   }
> > +   node->idx = 0;
> > +}
> > +
> >  /**
> >   * @internal
> >   *
> > --
> > 2.37.2



RE: [PATCH 09/10] net/gve: add maintainers for GVE

2023-05-04 Thread Guo, Junfeng


> -Original Message-
> From: Ferruh Yigit 
> Sent: Thursday, May 4, 2023 19:01
> To: Guo, Junfeng ; Zhang, Qi Z
> ; Wu, Jingjing ; Xing,
> Beilei 
> Cc: dev@dpdk.org; Rushil Gupta ; Jeroen de Borst
> ; Joshua Washington 
> Subject: Re: [PATCH 09/10] net/gve: add maintainers for GVE
> 
> On 4/13/2023 7:16 AM, Junfeng Guo wrote:
> > Add maintainers from Google for GVE.
> >
> > Signed-off-by: Junfeng Guo 
> > Signed-off-by: Rushil Gupta 
> > ---
> >  MAINTAINERS | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 8df23e5099..08001751b0 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -713,6 +713,9 @@ F: doc/guides/nics/features/enic.ini
> >
> >  Google Virtual Ethernet
> >  M: Junfeng Guo 
> > +M: Jeroen de Borst 
> > +M: Rushil Gupta 
> > +M: Joshua Washington 
> >  F: drivers/net/gve/
> >  F: doc/guides/nics/gve.rst
> >  F: doc/guides/nics/features/gve.ini
> 
> The requested ack from the newly added maintainers was not received (on
> the other version of this patch [2]). This patch is not a dependency for
> the rest of the set, so I am dropping it from the set.
> 
> Also, there is a standalone version of this patch [1]; can you please
> send a new version in reply to the standalone one, instead of including
> the same patch in various sets?
> 

Sure! Will send out a standalone version with in-reply-to [1].

Thanks for the careful review!

> 
> 
> [1]
> https://patches.dpdk.org/project/dpdk/patch/20221109072352.1387300-
> 1-junfeng@intel.com/
> 
> [2]
> https://patches.dpdk.org/project/dpdk/patch/20230328094512.1796648-
> 4-junfeng@intel.com/


[PATCH v2] net/gve: add maintainers for GVE

2023-05-04 Thread Junfeng Guo
Add maintainers from Google for GVE.

Signed-off-by: Junfeng Guo 
Signed-off-by: Jeroen de Borst 
Signed-off-by: Rushil Gupta 
Signed-off-by: Joshua Washington 
---
 MAINTAINERS | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8df23e5099..08001751b0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -713,6 +713,9 @@ F: doc/guides/nics/features/enic.ini
 
 Google Virtual Ethernet
 M: Junfeng Guo 
+M: Jeroen de Borst 
+M: Rushil Gupta 
+M: Joshua Washington 
 F: drivers/net/gve/
 F: doc/guides/nics/gve.rst
 F: doc/guides/nics/features/gve.ini
-- 
2.34.1



[PATCH v2] net/iavf: add devargs to control watchdog

2023-05-04 Thread Zhichao Zeng
This patch enables the watchdog to detect VF FLR when the link state
changes to down, and the default period is 2000us as defined by
IAVF_DEV_WATCHDOG_PERIOD.

In addition, the 'watchdog_period' devarg was added to adjust the watchdog
period (in microseconds); set it to 0 to disable the watchdog.

Signed-off-by: Zhichao Zeng 

---
v2: enables watchdog when link status changes to down
---
 drivers/net/iavf/iavf.h|  5 ++-
 drivers/net/iavf/iavf_ethdev.c | 81 +-
 drivers/net/iavf/iavf_vchnl.c  | 21 +++--
 3 files changed, 81 insertions(+), 26 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index aa18650ffa..98861e4242 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -31,7 +31,7 @@
 
 #define IAVF_NUM_MACADDR_MAX  64
 
-#define IAVF_DEV_WATCHDOG_PERIOD 0
+#define IAVF_DEV_WATCHDOG_PERIOD 2000 /* microseconds, set 0 to disable*/
 
 #define IAVF_DEFAULT_RX_PTHRESH  8
 #define IAVF_DEFAULT_RX_HTHRESH  8
@@ -304,6 +304,7 @@ struct iavf_devargs {
uint8_t proto_xtr_dflt;
uint8_t proto_xtr[IAVF_MAX_QUEUE_NUM];
uint16_t quanta_size;
+   uint32_t watchdog_period;
 };
 
 struct iavf_security_ctx;
@@ -498,4 +499,6 @@ int iavf_flow_unsub(struct iavf_adapter *adapter,
struct iavf_fsub_conf *filter);
 int iavf_flow_sub_check(struct iavf_adapter *adapter,
struct iavf_fsub_conf *filter);
+void iavf_dev_watchdog_enable(struct iavf_adapter *adapter);
+void iavf_dev_watchdog_disable(struct iavf_adapter *adapter);
 #endif /* _IAVF_ETHDEV_H_ */
diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c
index f6d68403ce..4cf4eb20cd 100644
--- a/drivers/net/iavf/iavf_ethdev.c
+++ b/drivers/net/iavf/iavf_ethdev.c
@@ -36,6 +36,7 @@
 /* devargs */
 #define IAVF_PROTO_XTR_ARG "proto_xtr"
 #define IAVF_QUANTA_SIZE_ARG   "quanta_size"
+#define IAVF_RESET_WATCHDOG_ARG"watchdog_period"
 
 uint64_t iavf_timestamp_dynflag;
 int iavf_timestamp_dynfield_offset = -1;
@@ -43,6 +44,7 @@ int iavf_timestamp_dynfield_offset = -1;
 static const char * const iavf_valid_args[] = {
IAVF_PROTO_XTR_ARG,
IAVF_QUANTA_SIZE_ARG,
+   IAVF_RESET_WATCHDOG_ARG,
NULL
 };
 
@@ -301,40 +303,46 @@ iavf_dev_watchdog(void *cb_arg)
 
/* enter reset state with VFLR event */
adapter->vf.vf_reset = true;
+   adapter->vf.link_up = false;
 
rte_eth_dev_callback_process(adapter->vf.eth_dev,
RTE_ETH_EVENT_INTR_RESET, NULL);
}
}
 
-   /* re-alarm watchdog */
-   rc = rte_eal_alarm_set(IAVF_DEV_WATCHDOG_PERIOD,
-   &iavf_dev_watchdog, cb_arg);
+   if (adapter->devargs.watchdog_period) {
+   /* re-alarm watchdog */
+   rc = rte_eal_alarm_set(adapter->devargs.watchdog_period,
+   &iavf_dev_watchdog, cb_arg);
 
-   if (rc)
-   PMD_DRV_LOG(ERR, "Failed \"%s\" to reset device watchdog alarm",
-   adapter->vf.eth_dev->data->name);
+   if (rc)
+   PMD_DRV_LOG(ERR, "Failed \"%s\" to reset device 
watchdog alarm",
+   adapter->vf.eth_dev->data->name);
+   }
 }
 
-static void
-iavf_dev_watchdog_enable(struct iavf_adapter *adapter __rte_unused)
-{
-#if (IAVF_DEV_WATCHDOG_PERIOD > 0)
-   PMD_DRV_LOG(INFO, "Enabling device watchdog");
-   adapter->vf.watchdog_enabled = true;
-   if (rte_eal_alarm_set(IAVF_DEV_WATCHDOG_PERIOD,
-   &iavf_dev_watchdog, (void *)adapter))
-   PMD_DRV_LOG(ERR, "Failed to enabled device watchdog");
-#endif
+void
+iavf_dev_watchdog_enable(struct iavf_adapter *adapter)
+{
+   if (adapter->devargs.watchdog_period && !adapter->vf.watchdog_enabled) {
+   PMD_DRV_LOG(INFO, "Enabling device watchdog, period is %dμs",
+   adapter->devargs.watchdog_period);
+   adapter->vf.watchdog_enabled = true;
+   if (rte_eal_alarm_set(adapter->devargs.watchdog_period,
+   &iavf_dev_watchdog, (void *)adapter))
+   PMD_DRV_LOG(ERR, "Failed to enabled device watchdog");
+   } else {
+   PMD_DRV_LOG(INFO, "Device watchdog is disabled");
+   }
 }
 
-static void
-iavf_dev_watchdog_disable(struct iavf_adapter *adapter __rte_unused)
+void
+iavf_dev_watchdog_disable(struct iavf_adapter *adapter)
 {
-#if (IAVF_DEV_WATCHDOG_PERIOD > 0)
-   PMD_DRV_LOG(INFO, "Disabling device watchdog");
-   adapter->vf.watchdog_enabled = false;
-#endif
+   if (adapter->devargs.watchdog_period && adapter->vf.watchdog_enabled) {
+   PMD_DRV_LOG(INFO, "Disabling device watchdog");
+   adapter->vf.watchdog_enabled = 

RE: [PATCH v3] net/iavf: fix iavf query stats in intr thread

2023-05-04 Thread Deng, KaiwenX


> -Original Message-
> From: Ferruh Yigit 
> Sent: Monday, March 27, 2023 8:38 PM
> To: Deng, KaiwenX ; dev@dpdk.org
> Cc: sta...@dpdk.org; Yang, Qiming ; Zhou, YidingX
> ; Chas Williams ; Min Hu (Connor)
> ; Wu, Jingjing ; Xing, Beilei
> ; Mike Pattrick ; Zhang, Qi Z
> ; Doherty, Declan ;
> Mrzyglod, Daniel T ; Dapeng Yu
> ; Zhang, Helin ;
> Mcnamara, John ; Thomas Monjalon
> 
> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> 
> On 3/27/2023 1:31 PM, Ferruh Yigit wrote:
> > On 3/27/2023 6:31 AM, Deng, KaiwenX wrote:
> >>
> >>
> >>> -Original Message-
> >>> From: Ferruh Yigit 
> >>> Sent: Thursday, March 23, 2023 11:39 PM
> >>> To: Deng, KaiwenX ; dev@dpdk.org
> >>> Cc: sta...@dpdk.org; Yang, Qiming ; Zhou,
> >>> YidingX ; Chas Williams ; Min
> >>> Hu (Connor) ; Wu, Jingjing
> >>> ; Xing, Beilei ; Mike
> >>> Pattrick ; Zhang, Qi Z ;
> >>> Doherty, Declan ; Mrzyglod, Daniel T
> >>> ; Dapeng Yu 
> >>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr
> >>> thread
> >>>
> >>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote:
>  When iavf sends the query-stats command in the eal-intr-thread through
>  the virtual channel, no response will be received from
>  iavf_dev_virtchnl_handler for this command during the blocking wait,
>  because iavf_dev_virtchnl_handler is also registered in the eal-intr-thread.
> 
>  When a VF device is bonded in BONDING_MODE_TLB mode, the slave
>  device update callback is registered in an alarm and called by the
>  eal-intr-thread, which also raises the above issue.
> 
>  This commit add to poll the response for VIRTCHNL_OP_GET_STATS
> when
> >>> it
>  is called by eal-intr-thread to fix this issue.
> 
>  Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands")
>  Fixes: 22b123a36d07 ("net/avf: initialize PMD")
>  Fixes: 7c76a747e68c ("bond: add mode 5")
>  Fixes: 435d523112cc ("net/iavf: fix multi-process shared data")
>  Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks")
> >>>
> >>>
> >>> Hi Kaiwen,
> >>>
> >>> The above commit already seems to address the same issue: it creates
> >>> an "iavf-event-thread" control thread to asynchronously handle the
> >>> interrupts in non-interrupt context. Why is it not working?
> >>>
> >>> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure
> >>> all interrupts are handled in the control thread?
> >>>
> >>> And can you please provide a stack trace in commit log, to describe
> >>> the issue better?
> >> Hi Ferruh,
> >> Sorry for my late reply, And thanks for your review.
> >>
> >> The above commit does not fix this issue when we need to get the
> >> returned data.
> >> If we call iavf_query_stats and wait for the response statistics in the
> >> intr-thread, iavf_handle_virtchnl_msg, which is also registered in the
> >> intr-thread, will not be executed while we are waiting.
> >>
> >
> > Got it; since the return value is required, the API can't be called asynchronously.
> >
> >
> >
> > I think 'rte_thread_is_intr()' checks may cause more trouble for you
> > in long term,
> >
> > - why is 'iavf_query_stats()' called in the interrupt thread, and can it
> > be prevented?
> >
> > - does it make sense to always poll messages from PF (for simplification)?
> >
> >
> > If answer to both are 'No', I am OK to continue with current proposal
> > if you are happy with it.
> >
> 
> 
> btw, how critical is this issue?
> 
> If it is critical, I am OK to get it as it is for this release and 
> investigate it further
> for next release, since only a few days left for this release.
>
Hi Ferruh,

I didn't find a more suitable solution after consideration. If you have
a better suggestion, please let me know. Thanks.
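
The polling approach discussed in this thread can be modeled as a small
standalone toy (all names here are illustrative; this is not the iavf
code): when the caller is the interrupt thread itself, blocking on a
completion would deadlock, because the virtchnl handler that delivers
the response runs in that same thread. The command path must therefore
drive the handler itself by polling.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the deadlock discussed above. */
static bool response_ready;

static void toy_virtchnl_handler(void)
{
	/* Stands in for the real message handler that normally runs
	 * in the interrupt thread and marks the response available. */
	response_ready = true;
}

/* Returns 0 on success. */
static int toy_execute_cmd(bool caller_is_intr_thread)
{
	response_ready = false;
	/* ... send the command to the PF here ... */
	if (caller_is_intr_thread) {
		/* Poll: drive the handler from the calling context,
		 * since the interrupt thread cannot both wait and
		 * run the handler at the same time. */
		while (!response_ready)
			toy_virtchnl_handler();
		return 0;
	}
	/* In the real driver the caller would block on a completion
	 * signalled by the interrupt thread; that is only safe because
	 * the handler runs elsewhere. Modeled here as the handler
	 * having run. */
	toy_virtchnl_handler();
	return response_ready ? 0 : -1;
}
```

The sketch only shows the control-flow split; the actual patch keys
this decision off rte_thread_is_intr().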
> 
> >
> >> In this commit I changed it to poll for replies to commands executed
> >> in the interrupt thread.
> >>
> >> main thread                                        interrupt thread
> >>      |                                                    |
> >>      |                                                    |
> >> iavf_query_stats                                          |
> >> iavf_execute_vf_cmd                                       |
> >> iavf_aq_send_msg_to_pf and wait handle complete           |
> >>      |--------------------------------------------------->|
> >>      |                                                    |

Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions

2023-05-04 Thread zhoumin

Hi Qi,

On Thur, May 4, 2023 at 9:33PM, Zhang, Qi Z wrote:



-Original Message-
From: Morten Brørup 
Sent: Thursday, May 4, 2023 9:22 PM
To: zhoumin ; Konstantin Ananyev

Cc: dev@dpdk.org; maob...@loongson.cn; Yang, Qiming
; Wu, Wenjun1 ;
ruifeng.w...@arm.com; d...@linux.vnet.ibm.com; Tyler Retzlaff

Subject: RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx
functions


From: zhoumin [mailto:zhou...@loongson.cn]
Sent: Thursday, 4 May 2023 15.17

Hi Konstantin,

Thanks for your comments.

On 2023/5/1 下午9:29, Konstantin Ananyev wrote:

Segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson
3C5000 processor which has 64 cores and 4 NUMA nodes.

 From the ixgbe_recv_pkts_lro() function, we found that as long as
the first packet has the EOP bit set, and the length of this packet
is less than or equal to rxq->crc_len, the segmentation fault will
definitely happen, even on other platforms such as X86.

Sorry to interrupt, but I am curious why this issue still exists on x86 
architecture. Can volatile be used to instruct the compiler to generate read 
instructions in a specific order, and does x86 guarantee not to reorder load 
operations?
Actually, I did not see the segmentation fault on X86 myself. I forced
the first packet which had the EOP bit set to have a zero length, and
then the segmentation fault happened on X86 as well. So I thought that
out-of-order access to the descriptor might produce a zero-length
ready packet, and that this case was the most likely one to cause the
segmentation fault.

Because when processing the first packet the first_seg->next will be
NULL; if at the same time this packet has the EOP bit set and its
length is less than or equal to rxq->crc_len, the following loop
will be executed:

     for (lp = first_seg; lp->next != rxm; lp = lp->next)
     ;

We know that the first_seg->next will be NULL under this condition.
So the
expression of lp->next->next will cause the segmentation fault.

Normally, the length of the first packet with EOP bit set will be
greater than rxq->crc_len. However, the out-of-order execution of
CPU may make the read ordering of the status and the rest of the
descriptor fields in this function not be correct. The related
codes are as following:

     rxdp = &rx_ring[rx_id];
  #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

     if (!(staterr & IXGBE_RXDADV_STAT_DD))
     break;

  #2 rxd = *rxdp;

Statement #2 may be executed before statement #1. This reordering is
likely to make the ready packet zero length. If that packet is the
first packet and has the EOP bit set, the above segmentation fault
will happen.

So, we should add rte_rmb() to ensure the read ordering is correct.
We also did the same thing in the ixgbe_recv_pkts() function to make
the rxd data valid, even though we did not observe a segmentation
fault in that function.

Signed-off-by: Min Zhou 
---
v2:
- Make the calling of rte_rmb() for all platforms
---
  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff
100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct
rte_mbuf **rx_pkts,
  staterr = rxdp->wb.upper.status_error;
  if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
  break;
+
+    rte_rmb();
  rxd = *rxdp;



Indeed, looks like a problem to me on systems with relaxed MO.
Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.

The LoongArch architecture uses the Weak Consistency model, which can
cause the problem, especially in scenarios with many cores, such as
the Loongson 3C5000 with four NUMA nodes, which has 64 cores. I cannot
reproduce it on a Loongson 3C5000 with one NUMA node, which has just
16 cores.

About a fix - looks right, but a bit excessive to me - as I
understand all we need here is to prevent re-ordering by CPU itself.

Yes, thanks for cc-ing.

So rte_smp_rmb() seems enough here.
Or might be just:
staterr = __atomic_load_n(&rxdp->wb.upper.status_error,
__ATOMIC_ACQUIRE);


Does __atomic_load_n() work on Windows if we use it to solve this
problem?

Yes, __atomic_load_n() works on Windows too.
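
The ordering pattern under discussion can be shown as a minimal
standalone sketch (the struct and names are illustrative, not the real
ixgbe descriptor layout): an acquire load of the status word orders all
subsequent reads of the descriptor after it, which is the guarantee the
proposed rte_rmb() / __atomic_load_n(..., __ATOMIC_ACQUIRE) change
provides on weakly ordered CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a DMA Rx descriptor: the producer writes the
 * length first, then publishes the DD bit in status with release
 * semantics; the consumer must read status with acquire semantics
 * before trusting the rest of the descriptor. */
struct mock_desc {
	uint16_t pkt_len;
	uint32_t status; /* bit 0 plays the role of IXGBE_RXDADV_STAT_DD */
};

/* Producer side: publish the descriptor. */
static void publish_desc(struct mock_desc *d, uint16_t len)
{
	d->pkt_len = len;
	/* Release store: orders the pkt_len write before the DD bit. */
	__atomic_store_n(&d->status, 1u, __ATOMIC_RELEASE);
}

/* Consumer side. Returns the packet length, or -1 if the descriptor is
 * not ready. A plain (non-acquire) load of status here is what allows
 * the reordering that produced the zero-length packet in this thread. */
static int read_desc(const struct mock_desc *d)
{
	uint32_t staterr = __atomic_load_n(&d->status, __ATOMIC_ACQUIRE);

	if (!(staterr & 1u))
		return -1;
	/* Safe: the acquire above orders this read after the status read. */
	return d->pkt_len;
}
```

In single-threaded use the pattern trivially behaves as expected; the
acquire/release pairing only matters once producer and consumer run on
different cores.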


Best regards,

Min




Re: [dpdk-dev][PATCH] ethdev: add send queue flow matching item

2023-05-04 Thread Jerin Jacob
On Thu, Apr 20, 2023 at 10:59 AM  wrote:
>
> From: Kiran Kumar K 
>
> Adding support for send queue flow matching item.

To be consistent, use Tx queue every where.(git commit subject too)

> This item is valid only for egress rules.
> An example use case would be that application can
> set different vlan insert rules with different PCP values
> based on tx queue number.
>
> Signed-off-by: Kiran Kumar K 
> ---
>  app/test-pmd/cmdline_flow.c | 28 +++
>  doc/guides/prog_guide/rte_flow.rst  |  7 +
>  doc/guides/rel_notes/release_23_07.rst  | 31 ++---
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++
>  lib/ethdev/rte_flow.c   |  1 +
>  lib/ethdev/rte_flow.h   | 26 +
>  6 files changed, 68 insertions(+), 29 deletions(-)
>
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
> index 58939ec321..a68a6080a8 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -496,6 +496,8 @@ enum index {
> ITEM_QUOTA_STATE_NAME,
> ITEM_AGGR_AFFINITY,
> ITEM_AGGR_AFFINITY_VALUE,
> +   ITEM_TX_QUEUE,
> +   ITEM_TX_QUEUE_VALUE,
>
> /* Validate/create actions. */
> ACTIONS,
> @@ -1452,6 +1454,7 @@ static const enum index next_item[] = {
> ITEM_METER,
> ITEM_QUOTA,
> ITEM_AGGR_AFFINITY,
> +   ITEM_TX_QUEUE,
> END_SET,
> ZERO,
>  };
> @@ -1953,6 +1956,12 @@ static const enum index item_aggr_affinity[] = {
> ZERO,
>  };
>
> +static const enum index item_tx_queue[] = {
> +   ITEM_TX_QUEUE_VALUE,
> +   ITEM_NEXT,
> +   ZERO,
> +};
> +
>  static const enum index next_action[] = {
> ACTION_END,
> ACTION_VOID,
> @@ -6945,6 +6954,22 @@ static const struct token token_list[] = {
> .args = ARGS(ARGS_ENTRY(struct rte_flow_item_aggr_affinity,
> affinity)),
> },
> +   [ITEM_TX_QUEUE] = {
> +   .name = "tx_queue",
> +   .help = "match on the tx queue of send packet",
> +   .priv = PRIV_ITEM(TX_QUEUE,
> + sizeof(struct rte_flow_item_tx_queue)),
> +   .next = NEXT(item_tx_queue),
> +   .call = parse_vc,
> +   },
> +   [ITEM_TX_QUEUE_VALUE] = {
> +   .name = "tx_queue_value",
> +   .help = "tx queue value",
> +   .next = NEXT(item_tx_queue, NEXT_ENTRY(COMMON_UNSIGNED),
> +item_param),
> +   .args = ARGS(ARGS_ENTRY(struct rte_flow_item_tx_queue,
> +   tx_queue)),
> +   },
>  };
>
>  /** Remove and return last entry from argument stack. */
> @@ -11849,6 +11874,9 @@ flow_item_default_mask(const struct rte_flow_item 
> *item)
> case RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY:
> mask = &rte_flow_item_aggr_affinity_mask;
> break;
> +   case RTE_FLOW_ITEM_TYPE_TX_QUEUE:
> +   mask = &rte_flow_item_tx_queue_mask;
> +   break;
> default:
> break;
> }
> diff --git a/doc/guides/prog_guide/rte_flow.rst 
> b/doc/guides/prog_guide/rte_flow.rst
> index 32fc45516a..7154b56330 100644
> --- a/doc/guides/prog_guide/rte_flow.rst
> +++ b/doc/guides/prog_guide/rte_flow.rst
> @@ -1486,6 +1486,13 @@ This item is meant to use the same structure as `Item: 
> PORT_REPRESENTOR`_.
>
>  See also `Action: REPRESENTED_PORT`_.
>
> +Item: ``TX_QUEUE``
> +^^^
> +
> +Matches on the tx queue of send packet .

tx -> Tx

> +
> +- ``tx_queue``: Tx queue.
> +
>  Item: ``AGGR_AFFINITY``
>  ^^^
>
> diff --git a/doc/guides/rel_notes/release_23_07.rst 
> b/doc/guides/rel_notes/release_23_07.rst
> index a9b1293689..631cbd2b58 100644
> --- a/doc/guides/rel_notes/release_23_07.rst
> +++ b/doc/guides/rel_notes/release_23_07.rst
> @@ -24,36 +24,9 @@ DPDK Release 23.07
>  New Features
>  
>
> -.. This section should contain new features added in this release.
> -   Sample format:
> +* **Added flow matching of tx queue.**

Same as below. Right?

>
> -   * **Add a title in the past tense with a full stop.**
> -
> - Add a short 1-2 sentence description in the past tense.
> - The description should be enough to allow someone scanning
> - the release notes to understand the new feature.
> -
> - If the feature adds a lot of sub-features you can use a bullet list
> - like this:
> -
> - * Added feature foo to do something.
> - * Enhanced feature bar to do something else.
> -
> - Refer to the previous release notes for examples.
> -
> - Suggested order in release notes items:
> - * Core libs (EAL, mempool, ring, mbuf, buses)
> - * Device abstraction libs and PMDs (ordered alphabetically by vendor 
> name)
> -   - ethdev (lib, PMDs)
> -   - c

RE: [RFC 10/27] vhost: retry translating IOVA after IOTLB miss

2023-05-04 Thread Xia, Chenbo
> -Original Message-
> From: Maxime Coquelin 
> Sent: Friday, March 31, 2023 11:43 PM
> To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
> ; m...@redhat.com; f...@redhat.com;
> jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
> ; echau...@redhat.com; epere...@redhat.com;
> amore...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [RFC 10/27] vhost: retry translating IOVA after IOTLB miss
> 
> Vhost-user backend IOTLB misses and updates are
> asynchronous, so the IOVA address translation function
> just fails after having sent an IOTLB miss update if the
> needed entry was not in the IOTLB cache.
> 
> This is not the case for VDUSE, for which the needed IOTLB
> update is returned directly when sending an IOTLB miss.
> 
> This patch retries the IOTLB cache lookup after the IOTLB
> miss has been sent.
> 
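
As a rough standalone model of the control flow this patch adds (all
names are illustrative toys, not the vhost code): a synchronous backend
fills the cache from within the miss handler, so a second lookup
immediately after the miss can succeed, while an asynchronous backend
still sees the lookup fail as before.

```c
#include <assert.h>
#include <stdint.h>

/* Single-entry toy translation "cache". */
struct toy_iotlb {
	uint64_t iova;
	uint64_t vva;
	int      valid;
};

static uint64_t toy_cache_find(const struct toy_iotlb *tlb, uint64_t iova)
{
	return (tlb->valid && tlb->iova == iova) ? tlb->vva : 0;
}

/* Synchronous miss handler: fills the entry before returning, which is
 * what makes the retry in the patch worthwhile for VDUSE-like backends. */
static void toy_iotlb_miss(struct toy_iotlb *tlb, uint64_t iova)
{
	tlb->iova = iova;
	tlb->vva = iova + 0x1000; /* arbitrary mapping for the demo */
	tlb->valid = 1;
}

static uint64_t toy_iova_to_vva(struct toy_iotlb *tlb, uint64_t iova)
{
	uint64_t vva = toy_cache_find(tlb, iova);

	if (vva)
		return vva;
	toy_iotlb_miss(tlb, iova);
	/* Retry once: succeeds for synchronous backends, and would still
	 * return 0 for asynchronous ones (vhost-user), as before. */
	return toy_cache_find(tlb, iova);
}
```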
> Signed-off-by: Maxime Coquelin 
> ---
>  lib/vhost/vhost.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index d35075b96c..4f16307e4d 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -96,6 +96,12 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct
> vhost_virtqueue *vq,
>   vhost_user_iotlb_rd_lock(vq);
>   }
> 
> + tmp_size = *size;
> + /* Retry in case of VDUSE, as it is synchronous */
> + vva = vhost_user_iotlb_cache_find(dev, iova, &tmp_size, perm);
> + if (tmp_size == *size)
> + return vva;
> +
>   return 0;
>  }
> 
> --
> 2.39.2

Reviewed-by: Chenbo Xia  


RE: [RFC 11/27] vhost: introduce backend ops

2023-05-04 Thread Xia, Chenbo
> -Original Message-
> From: Maxime Coquelin 
> Sent: Friday, March 31, 2023 11:43 PM
> To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
> ; m...@redhat.com; f...@redhat.com;
> jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
> ; echau...@redhat.com; epere...@redhat.com;
> amore...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [RFC 11/27] vhost: introduce backend ops
> 
> This patch introduces backend ops struct, that will enable
> calling backend specifics callbacks (Vhost-user, VDUSE), in
> shared code.
> 
> This is an empty shell for now, it will be filled in later
> patches.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  lib/vhost/socket.c |  2 +-
>  lib/vhost/vhost.c  |  8 +++-
>  lib/vhost/vhost.h  | 10 +-
>  lib/vhost/vhost_user.c |  8 
>  lib/vhost/vhost_user.h |  1 +
>  5 files changed, 26 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> index 669c322e12..ba54263824 100644
> --- a/lib/vhost/socket.c
> +++ b/lib/vhost/socket.c
> @@ -221,7 +221,7 @@ vhost_user_add_connection(int fd, struct
> vhost_user_socket *vsocket)
>   return;
>   }
> 
> - vid = vhost_new_device();
> + vid = vhost_user_new_device();
>   if (vid == -1) {
>   goto err;
>   }
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index 4f16307e4d..41f212315e 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -676,11 +676,16 @@ reset_device(struct virtio_net *dev)
>   * there is a new virtio device being attached).
>   */
>  int
> -vhost_new_device(void)
> +vhost_new_device(struct vhost_backend_ops *ops)
>  {
>   struct virtio_net *dev;
>   int i;
> 
> + if (ops == NULL) {
> + VHOST_LOG_CONFIG("device", ERR, "missing backend ops.\n");
> + return -1;
> + }
> +
>   pthread_mutex_lock(&vhost_dev_lock);
>   for (i = 0; i < RTE_MAX_VHOST_DEVICE; i++) {
>   if (vhost_devices[i] == NULL)
> @@ -708,6 +713,7 @@ vhost_new_device(void)
>   dev->backend_req_fd = -1;
>   dev->postcopy_ufd = -1;
>   rte_spinlock_init(&dev->backend_req_lock);
> + dev->backend_ops = ops;
> 
>   return i;
>  }
> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> index 4ace5ab081..cc5c707205 100644
> --- a/lib/vhost/vhost.h
> +++ b/lib/vhost/vhost.h
> @@ -89,6 +89,12 @@
>   for (iter = val; iter < num; iter++)
>  #endif
> 
> +/**
> + * Structure that contains backend-specific ops.
> + */
> +struct vhost_backend_ops {
> +};
> +
>  /**
>   * Structure contains buffer address, length and descriptor index
>   * from vring to do scatter RX.
> @@ -513,6 +519,8 @@ struct virtio_net {
>   void*extern_data;
>   /* pre and post vhost user message handlers for the device */
>   struct rte_vhost_user_extern_ops extern_ops;
> +
> + struct vhost_backend_ops *backend_ops;
>  } __rte_cache_aligned;
> 
>  static inline void
> @@ -812,7 +820,7 @@ get_device(int vid)
>   return dev;
>  }
> 
> -int vhost_new_device(void);
> +int vhost_new_device(struct vhost_backend_ops *ops);
>  void cleanup_device(struct virtio_net *dev, int destroy);
>  void reset_device(struct virtio_net *dev);
>  void vhost_destroy_device(int);
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index a989f2c46d..2d5dec5bc1 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -3464,3 +3464,11 @@ int rte_vhost_host_notifier_ctrl(int vid, uint16_t
> qid, bool enable)
> 
>   return ret;
>  }
> +
> +static struct vhost_backend_ops vhost_user_backend_ops;
> +
> +int
> +vhost_user_new_device(void)
> +{
> + return vhost_new_device(&vhost_user_backend_ops);
> +}
> diff --git a/lib/vhost/vhost_user.h b/lib/vhost/vhost_user.h
> index a0987a58f9..61456049c8 100644
> --- a/lib/vhost/vhost_user.h
> +++ b/lib/vhost/vhost_user.h
> @@ -185,5 +185,6 @@ int vhost_user_iotlb_miss(struct virtio_net *dev,
> uint64_t iova, uint8_t perm);
>  int read_fd_message(char *ifname, int sockfd, char *buf, int buflen, int
> *fds, int max_fds,
>   int *fd_num);
>  int send_fd_message(char *ifname, int sockfd, char *buf, int buflen, int
> *fds, int fd_num);
> +int vhost_user_new_device(void);
> 
>  #endif
> --
> 2.39.2

Reviewed-by: Chenbo Xia  


RE: [RFC 12/27] vhost: add IOTLB cache entry removal callback

2023-05-04 Thread Xia, Chenbo
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin 
> Sent: Friday, March 31, 2023 11:43 PM
> To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
> ; m...@redhat.com; f...@redhat.com;
> jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
> ; echau...@redhat.com; epere...@redhat.com;
> amore...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [RFC 12/27] vhost: add IOTLB cache entry removal callback
> 
> VDUSE will need to munmap() the IOTLB entry on removal
> from the cache, as it performs mmap() before insertion.
> 
> This patch introduces a callback that VDUSE layer will
> implement to achieve this.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  lib/vhost/iotlb.c | 12 
>  lib/vhost/vhost.h |  4 
>  2 files changed, 16 insertions(+)
> 
> diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c
> index 188dfb8e38..86b0be62b4 100644
> --- a/lib/vhost/iotlb.c
> +++ b/lib/vhost/iotlb.c
> @@ -25,6 +25,15 @@ struct vhost_iotlb_entry {
> 
>  #define IOTLB_CACHE_SIZE 2048
> 
> +static void
> +vhost_user_iotlb_remove_notify(struct virtio_net *dev, struct
> vhost_iotlb_entry *entry)
> +{
> + if (dev->backend_ops->iotlb_remove_notify == NULL)
> + return;
> +
> + dev->backend_ops->iotlb_remove_notify(entry->uaddr, entry->uoffset,
> entry->size);
> +}
> +
>  static bool
>  vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct
> vhost_iotlb_entry *b)
>  {
> @@ -198,6 +207,7 @@ vhost_user_iotlb_cache_remove_all(struct virtio_net
> *dev)
>   vhost_user_iotlb_set_dump(node);
> 
>   TAILQ_REMOVE(&dev->iotlb_list, node, next);
> + vhost_user_iotlb_remove_notify(dev, node);
>   vhost_user_iotlb_pool_put(dev, node);
>   }
> 
> @@ -223,6 +233,7 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net
> *dev)
>   vhost_user_iotlb_clear_dump(node, prev_node, next_node);
> 
>   TAILQ_REMOVE(&dev->iotlb_list, node, next);
> + vhost_user_iotlb_remove_notify(dev, node);
>   vhost_user_iotlb_pool_put(dev, node);
>   dev->iotlb_cache_nr--;
>   break;
> @@ -314,6 +325,7 @@ vhost_user_iotlb_cache_remove(struct virtio_net *dev,
> uint64_t iova, uint64_t si
>   vhost_user_iotlb_clear_dump(node, prev_node, next_node);
> 
>   TAILQ_REMOVE(&dev->iotlb_list, node, next);
> + vhost_user_iotlb_remove_notify(dev, node);
>   vhost_user_iotlb_pool_put(dev, node);
>   dev->iotlb_cache_nr--;
>   } else {
> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> index cc5c707205..2ad26f6951 100644
> --- a/lib/vhost/vhost.h
> +++ b/lib/vhost/vhost.h
> @@ -89,10 +89,14 @@
>   for (iter = val; iter < num; iter++)
>  #endif
> 
> +struct virtio_net;

Adding this in patch 13 could be better since this patch is not using it.

Thanks,
Chenbo

> +typedef void (*vhost_iotlb_remove_notify)(uint64_t addr, uint64_t off,
> uint64_t size);
> +
>  /**
>   * Structure that contains backend-specific ops.
>   */
>  struct vhost_backend_ops {
> + vhost_iotlb_remove_notify iotlb_remove_notify;
>  };
> 
>  /**
> --
> 2.39.2



RE: [RFC 15/27] vhost: add API to set max queue pairs

2023-05-04 Thread Xia, Chenbo
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin 
> Sent: Friday, March 31, 2023 11:43 PM
> To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
> ; m...@redhat.com; f...@redhat.com;
> jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
> ; echau...@redhat.com; epere...@redhat.com;
> amore...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [RFC 15/27] vhost: add API to set max queue pairs
> 
> This patch introduces a new rte_vhost_driver_set_max_queues
> API as preliminary work for multiqueue support with VDUSE.
> 
> Indeed, with VDUSE we need to pre-allocate the vrings at
> device creation time, so we need such an API to avoid always
> allocating the 128 queue pairs supported by the Vhost library.
> 
> Calling the API is optional; 128 queue pairs remains the
> default.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  doc/guides/prog_guide/vhost_lib.rst |  4 
>  lib/vhost/rte_vhost.h   | 17 ++
>  lib/vhost/socket.c  | 36 +++--
>  lib/vhost/version.map   |  3 +++
>  4 files changed, 58 insertions(+), 2 deletions(-)

Also add changes in release notes? Btw: somewhere we should also mention vduse
support is added in release notes.

Thanks,
Chenbo
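
For illustration, the arithmetic the patch applies can be modeled in a
standalone sketch (toy names, not the DPDK implementation): the
requested limit is validated against the library maximum, and the queue
count reported to callers is the minimum of the per-socket limit and
what the vDPA device supports.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_VHOST_MAX_QUEUE_PAIRS 128u

/* Mirrors the check in the proposed rte_vhost_driver_set_max_queue_num():
 * reject requests above the library maximum, otherwise record the limit. */
static int toy_set_max_queue_pairs(uint32_t *socket_limit, uint32_t req)
{
	if (req > TOY_VHOST_MAX_QUEUE_PAIRS)
		return -1; /* rejected, as in the patch */
	*socket_limit = req;
	return 0;
}

/* Mirrors the RTE_MIN() in rte_vhost_driver_get_queue_num(): report the
 * smaller of the socket limit and the vDPA device capability. */
static uint32_t toy_get_queue_num(uint32_t socket_limit, uint32_t vdpa_queue_num)
{
	return socket_limit < vdpa_queue_num ? socket_limit : vdpa_queue_num;
}
```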

> 
> diff --git a/doc/guides/prog_guide/vhost_lib.rst
> b/doc/guides/prog_guide/vhost_lib.rst
> index e8bb8c9b7b..cd4b109139 100644
> --- a/doc/guides/prog_guide/vhost_lib.rst
> +++ b/doc/guides/prog_guide/vhost_lib.rst
> @@ -334,6 +334,10 @@ The following is an overview of some key Vhost API
> functions:
>Clean DMA vChannel finished to use. After this function is called,
>the specified DMA vChannel should no longer be used by the Vhost
> library.
> 
> +* ``rte_vhost_driver_set_max_queue_num(path, max_queue_pairs)``
> +
> +  Set the maximum number of queue pairs supported by the device.
> +
>  Vhost-user Implementations
>  --
> 
> diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h
> index 58a5d4be92..44cbfcb469 100644
> --- a/lib/vhost/rte_vhost.h
> +++ b/lib/vhost/rte_vhost.h
> @@ -588,6 +588,23 @@ rte_vhost_driver_get_protocol_features(const char
> *path,
>  int
>  rte_vhost_driver_get_queue_num(const char *path, uint32_t *queue_num);
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice.
> + *
> + * Set the maximum number of queue pairs supported by the device.
> + *
> + * @param path
> + *  The vhost-user socket file path
> + * @param max_queue_pairs
> + *  The maximum number of queue pairs
> + * @return
> + *  0 on success, -1 on failure
> + */
> +__rte_experimental
> +int
> +rte_vhost_driver_set_max_queue_num(const char *path, uint32_t
> max_queue_pairs);
> +
>  /**
>   * Get the feature bits after negotiation
>   *
> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> index ba54263824..e95c3ffeac 100644
> --- a/lib/vhost/socket.c
> +++ b/lib/vhost/socket.c
> @@ -56,6 +56,8 @@ struct vhost_user_socket {
> 
>   uint64_t protocol_features;
> 
> + uint32_t max_queue_pairs;
> +
>   struct rte_vdpa_device *vdpa_dev;
> 
>   struct rte_vhost_device_ops const *notify_ops;
> @@ -821,7 +823,7 @@ rte_vhost_driver_get_queue_num(const char *path,
> uint32_t *queue_num)
> 
>   vdpa_dev = vsocket->vdpa_dev;
>   if (!vdpa_dev) {
> - *queue_num = VHOST_MAX_QUEUE_PAIRS;
> + *queue_num = vsocket->max_queue_pairs;
>   goto unlock_exit;
>   }
> 
> @@ -831,7 +833,36 @@ rte_vhost_driver_get_queue_num(const char *path,
> uint32_t *queue_num)
>   goto unlock_exit;
>   }
> 
> - *queue_num = RTE_MIN((uint32_t)VHOST_MAX_QUEUE_PAIRS,
> vdpa_queue_num);
> + *queue_num = RTE_MIN(vsocket->max_queue_pairs, vdpa_queue_num);
> +
> +unlock_exit:
> + pthread_mutex_unlock(&vhost_user.mutex);
> + return ret;
> +}
> +
> +int
> +rte_vhost_driver_set_max_queue_num(const char *path, uint32_t
> max_queue_pairs)
> +{
> + struct vhost_user_socket *vsocket;
> + int ret = 0;
> +
> + VHOST_LOG_CONFIG(path, INFO, "Setting max queue pairs to %u\n",
> max_queue_pairs);
> +
> + if (max_queue_pairs > VHOST_MAX_QUEUE_PAIRS) {
> + VHOST_LOG_CONFIG(path, ERR, "Library only supports up to %u
> queue pairs\n",
> + VHOST_MAX_QUEUE_PAIRS);
> + return -1;
> + }
> +
> + pthread_mutex_lock(&vhost_user.mutex);
> + vsocket = find_vhost_user_socket(path);
> + if (!vsocket) {
> + VHOST_LOG_CONFIG(path, ERR, "socket file is not registered
> yet.\n");
> + ret = -1;
> + goto unlock_exit;
> + }
> +
> + vsocket->max_queue_pairs = max_queue_pairs;
> 
>  unlock_exit:
>   pthread_mutex_unlock(&vhost_user.mutex);
> @@ -890,6 +921,7 @@ rte_vhost_driver_register(const char *path, uint64_t
> flags)
>   goto out_free;
>   }
>   vsocket->vdpa_dev = NULL;
> + vsocket->max_queue_pairs = VHOST_MAX_QUEUE_PAIRS;
>   vsocke

RE: [RFC 16/27] net/vhost: use API to set max queue pairs

2023-05-04 Thread Xia, Chenbo
> -Original Message-
> From: Maxime Coquelin 
> Sent: Friday, March 31, 2023 11:43 PM
> To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
> ; m...@redhat.com; f...@redhat.com;
> jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
> ; echau...@redhat.com; epere...@redhat.com;
> amore...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [RFC 16/27] net/vhost: use API to set max queue pairs
> 
> In order to support multiqueue with VDUSE, we need to
> be able to limit the maximum number of queue pairs, to
> avoid unnecessary memory consumption since the maximum
> number of queue pairs needs to be allocated at device
> creation time, as opposed to Vhost-user which allocates
> only when the frontend initializes them.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  drivers/net/vhost/rte_eth_vhost.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/vhost/rte_eth_vhost.c
> b/drivers/net/vhost/rte_eth_vhost.c
> index 62ef955ebc..8d37ec9775 100644
> --- a/drivers/net/vhost/rte_eth_vhost.c
> +++ b/drivers/net/vhost/rte_eth_vhost.c
> @@ -1013,6 +1013,9 @@ vhost_driver_setup(struct rte_eth_dev *eth_dev)
>   goto drv_unreg;
>   }
> 
> + if (rte_vhost_driver_set_max_queue_num(internal->iface_name,
> internal->max_queues))
> + goto drv_unreg;
> +
>   if (rte_vhost_driver_callback_register(internal->iface_name,
>  &vhost_ops) < 0) {
>   VHOST_LOG(ERR, "Can't register callbacks\n");
> --
> 2.39.2

Reviewed-by: Chenbo Xia  


RE: [RFC 14/27] vhost: add helper for interrupt injection

2023-05-04 Thread Xia, Chenbo
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin 
> Sent: Friday, March 31, 2023 11:43 PM
> To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
> ; m...@redhat.com; f...@redhat.com;
> jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
> ; echau...@redhat.com; epere...@redhat.com;
> amore...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [RFC 14/27] vhost: add helper for interrupt injection
> 
> Vhost-user uses eventfd to inject IRQs, but VDUSE uses
> an ioctl.
> 
> This patch prepares vhost_vring_call_split() and
> vhost_vring_call_packed() to support VDUSE by introducing
> a new helper.
> 
> It also adds a new counter to for guest notification

to for -> for?

With this fixed:

Reviewed-by: Chenbo Xia  

> failures, which could happen in case of uninitialized call
> file descriptor for example.
> 
> Signed-off-by: Maxime Coquelin 
> ---
>  lib/vhost/vhost.c  |  6 +
>  lib/vhost/vhost.h  | 54 +++---
>  lib/vhost/vhost_user.c | 10 
>  3 files changed, 46 insertions(+), 24 deletions(-)
> 
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index 790eb06b28..c07028f2b3 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -44,6 +44,7 @@ static const struct vhost_vq_stats_name_off
> vhost_vq_stat_strings[] = {
>   {"size_1024_1518_packets", offsetof(struct vhost_virtqueue,
> stats.size_bins[6])},
>   {"size_1519_max_packets",  offsetof(struct vhost_virtqueue,
> stats.size_bins[7])},
>   {"guest_notifications",offsetof(struct vhost_virtqueue,
> stats.guest_notifications)},
> + {"guest_notifications_error",offsetof(struct vhost_virtqueue,
> stats.guest_notifications_error)},
>   {"iotlb_hits", offsetof(struct vhost_virtqueue,
> stats.iotlb_hits)},
>   {"iotlb_misses",   offsetof(struct vhost_virtqueue,
> stats.iotlb_misses)},
>   {"inflight_submitted", offsetof(struct vhost_virtqueue,
> stats.inflight_submitted)},
> @@ -697,6 +698,11 @@ vhost_new_device(struct vhost_backend_ops *ops)
>   return -1;
>   }
> 
> + if (ops->inject_irq == NULL) {
> + VHOST_LOG_CONFIG("device", ERR, "missing IRQ injection backend
> op.\n");
> + return -1;
> + }
> +
>   pthread_mutex_lock(&vhost_dev_lock);
>   for (i = 0; i < RTE_MAX_VHOST_DEVICE; i++) {
>   if (vhost_devices[i] == NULL)
> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> index ee7640e901..8f0875b4e2 100644
> --- a/lib/vhost/vhost.h
> +++ b/lib/vhost/vhost.h
> @@ -90,16 +90,20 @@
>  #endif
> 
>  struct virtio_net;
> +struct vhost_virtqueue;
> +
>  typedef void (*vhost_iotlb_remove_notify)(uint64_t addr, uint64_t off,
> uint64_t size);
> 
>  typedef int (*vhost_iotlb_miss_cb)(struct virtio_net *dev, uint64_t iova,
> uint8_t perm);
> 
> +typedef int (*vhost_vring_inject_irq_cb)(struct virtio_net *dev, struct
> vhost_virtqueue *vq);
>  /**
>   * Structure that contains backend-specific ops.
>   */
>  struct vhost_backend_ops {
>   vhost_iotlb_remove_notify iotlb_remove_notify;
>   vhost_iotlb_miss_cb iotlb_miss;
> + vhost_vring_inject_irq_cb inject_irq;
>  };
> 
>  /**
> @@ -149,6 +153,7 @@ struct virtqueue_stats {
>   /* Size bins in array as RFC 2819, undersized [0], 64 [1], etc */
>   uint64_t size_bins[8];
>   uint64_t guest_notifications;
> + uint64_t guest_notifications_error;
>   uint64_t iotlb_hits;
>   uint64_t iotlb_misses;
>   uint64_t inflight_submitted;
> @@ -900,6 +905,24 @@ vhost_need_event(uint16_t event_idx, uint16_t new_idx,
> uint16_t old)
>   return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx -
> old);
>  }
> 
> +static __rte_always_inline void
> +vhost_vring_inject_irq(struct virtio_net *dev, struct vhost_virtqueue *vq)
> +{
> + int ret;
> +
> + ret = dev->backend_ops->inject_irq(dev, vq);
> + if (ret) {
> + if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
> + vq->stats.guest_notifications_error++;
> + return;
> + }
> +
> + if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
> + vq->stats.guest_notifications++;
> + if (dev->notify_ops->guest_notified)
> + dev->notify_ops->guest_notified(dev->vid);
> +}
> +
>  static __rte_always_inline void
>  vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq)
>  {
> @@ -919,25 +942,13 @@ vhost_vring_call_split(struct virtio_net *dev,
> struct vhost_virtqueue *vq)
>   "%s: used_event_idx=%d, old=%d, new=%d\n",
>   __func__, vhost_used_event(vq), old, new);
> 
> - if ((vhost_need_event(vhost_used_event(vq), new, old) ||
> - unlikely(!signalled_used_valid)) &&
> - vq->callfd >= 0) {
> - eventfd_write(vq->callfd, (eventfd_t) 1);
> - if (dev->flags & VIRTIO_DEV_STATS_ENABLED)
> - 

RE: [RFC 00/27] Add VDUSE support to Vhost library

2023-05-04 Thread Xia, Chenbo
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin 
> Sent: Friday, March 31, 2023 11:43 PM
> To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo
> ; m...@redhat.com; f...@redhat.com;
> jasow...@redhat.com; Liang, Cunming ; Xie, Yongji
> ; echau...@redhat.com; epere...@redhat.com;
> amore...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [RFC 00/27] Add VDUSE support to Vhost library
> 
> This series introduces a new type of backend, VDUSE,
> to the Vhost library.
> 
> VDUSE stands for vDPA device in Userspace, it enables
> implementing a Virtio device in userspace and have it
> attached to the Kernel vDPA bus.
> 
> Once attached to the vDPA bus, it can be used either by
> Kernel Virtio drivers, like virtio-net in our case, via
> the virtio-vdpa driver. Doing that, the device is visible
> to the Kernel networking stack and is exposed to userspace
> as a regular netdev.
> 
> It can also be exposed to userspace thanks to the
> vhost-vdpa driver, via a vhost-vdpa chardev that can be
> passed to QEMU or Virtio-user PMD.
> 
> While VDUSE support is already available in upstream
> Kernel, a couple of patches are required to support
> network device type:
> 
> https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc
> 
> In order to attach the created VDUSE device to the vDPA
> bus, a recent iproute2 version containing the vdpa tool is
> required.
> --
> 2.39.2
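For reference, attaching a created VDUSE device to the vDPA bus with the iproute2 vdpa tool follows this general shape (a configuration fragment requiring the VDUSE kernel module; the device name `vduse0` is illustrative):

```shell
# List available vDPA management devices; with the VDUSE module loaded,
# a "vduse" management device should appear.
vdpa mgmtdev show

# Instantiate a vDPA device on top of the userspace VDUSE device.
vdpa dev add name vduse0 mgmtdev vduse

# Verify the device is now bound on the vDPA bus.
vdpa dev show vduse0
```

From there the device can be claimed either by virtio-vdpa (kernel netdev) or by vhost-vdpa (chardev for QEMU / Virtio-user PMD), as described above.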

Btw: when this series is merged in the future, will Red Hat run all the
VDUSE test cases in every release?

Thanks,
Chenbo



RE: 22.11.2 patches review and test

2023-05-04 Thread Xu, HailinX
> -Original Message-
> From: Xu, HailinX 
> Sent: Thursday, April 27, 2023 4:10 PM
> To: Xueming Li ; sta...@dpdk.org
> Cc: dev@dpdk.org; Abhishek Marathe ;
> Ali Alnubani ; Walker, Benjamin
> ; David Christensen ;
> Hemant Agrawal ; Stokes, Ian
> ; Jerin Jacob ; Mcnamara, John
> ; Ju-Hyoung Lee ; Kevin
> Traynor ; Luca Boccassi ; Pei
> Zhang ; Xu, Qian Q ; Raslan
> Darawsheh ; Thomas Monjalon
> ; Yanghang Liu ; Peng, Yuan
> ; Chen, Zhaoyan 
> Subject: RE: 22.11.2 patches review and test
> 
> 
> > -Original Message-
> > From: Xueming Li 
> > Sent: Sunday, April 23, 2023 5:34 PM
> > To: sta...@dpdk.org
> > Cc: xuemi...@nvidia.com; dev@dpdk.org; Abhishek Marathe
> > ; Ali Alnubani ;
> > Walker, Benjamin ; David Christensen
> > ; Hemant Agrawal ;
> > Stokes, Ian ; Jerin Jacob ;
> > Mcnamara, John ; Ju-Hyoung Lee
> > ; Kevin Traynor ; Luca
> > Boccassi ; Pei Zhang ; Xu, Qian
> > Q ; Raslan Darawsheh ;
> Thomas
> > Monjalon ; Yanghang Liu ;
> > Peng, Yuan ; Chen, Zhaoyan
> > 
> > Subject: 22.11.2 patches review and test
> >
> > Hi all,
> >
> > Here is a list of patches targeted for stable release 22.11.2.
> >
> > The planned date for the final release is 5th MAY.
> >
> > Please help with testing and validation of your use cases and report
> > any issues/results with reply-all to this mail. For the final release
> > the fixes and reported validations will be added to the release notes.
> >
> > A release candidate tarball can be found at:
> >
> > https://dpdk.org/browse/dpdk-stable/tag/?id=v22.11.2-rc1
> >
> > These patches are located at branch 22.11 of dpdk-stable repo:
> > https://dpdk.org/browse/dpdk-stable/
> >
> > Thanks.
> >
> > Xueming Li 
> 
> Update the test status for the Intel part. So far, the dpdk 22.11.2-rc1
> validation test rate is 85%. No critical issue is found.
> # Basic Intel(R) NIC testing
> * Build & CFLAG compile: cover the build test combination with latest
> GCC/Clang version and the popular OS revision such as
>   Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4,
> FreeBSD13.1, SUSE15, CentOS7.9, openEuler22.03-SP1  etc.
> - All test done. No new dpdk issue is found.
> * PF(i40e, ixgbe): test scenarios including
> RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
> - All test done. No new dpdk issue is found.
> * VF(i40e, ixgbe): test scenarios including
> VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
> - All test done. No new dpdk issue is found.
> * PF/VF(ice): test scenarios including Switch features/Package
> Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible
> Descriptor, etc.
> - All test done. No new dpdk issue is found.
> * Intel NIC single core/NIC performance: test scenarios including PF/VF single
> core performance test, etc.
> - All test done. No new dpdk issue is found.
> * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test -
> QAT&SW/FIB library, etc.
> - Execution rate is 90%. No new dpdk issue is found.
> 
> # Basic cryptodev and virtio testing
> * Virtio: both function and performance test are covered. Such as
> PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf
> testing/VMAWARE ESXI 8.0, etc.
> - All test done. No new dpdk issue is found.
> * Cryptodev:
>   *Function test: test scenarios including Cryptodev API testing/CompressDev
> ISA-L/QAT/ZLIB PMD Testing/FIPS, etc.
> - Execution rate is 75%. No new dpdk issue is found.
>   *Performance test: test scenarios including Throughput
> Performance/Cryptodev Latency, etc.
> - All test done. No new dpdk issue is found.
> 
> Regards,
> Xu, Hailin
> 
Update the test status for the Intel part. All dpdk 22.11.2-rc1 validation tests 
are done. No critical issue is found.
1 new issue is being confirmed by the Intel dev team.
New bug (20.11.8-rc1 and 21.11.4-rc1 also have this issue):
1. Some of the virtio tests are failing -- the Intel dev team is investigating.
# Basic Intel(R) NIC testing
* Build & CFLAG compile: cover the build test combination with latest GCC/Clang 
version and the popular OS revision such as
  Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4, FreeBSD13.1, 
SUSE15, CentOS7.9, openEuler22.03-SP1 etc.
- All test done. No new dpdk issue is found.
* PF(i40e, ixgbe): test scenarios including RTE_FLOW/TSO/Jumboframe/checksum 
offload/VLAN/VXLAN, etc. 
- All test done. No new dpdk issue is found.
* VF(i40e, ixgbe): test scenarios including VF-RTE_FLOW/TSO/Jumboframe/checksum 
offload/VLAN/VXLAN, etc.
- All test done. No new dpdk issue is found.
* PF/VF(ice): test scenarios including Switch features/Package Management/Flow 
Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible Descriptor, etc.
- All test done. No new dpdk issue is found.
* Intel NIC single core/NIC performance: test scenarios including PF/VF single 
core performance test, etc.
- All test done. No new dpdk issue is found.
* IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test - 
QAT&SW/FIB library, etc.
- All t

[dpdk-dev] [PATCH v3] ring: fix use after free in ring release

2023-05-04 Thread Yunjian Wang
After the memzone is freed, it is not removed from the 'rte_ring_tailq'.
If rte_ring_lookup is called at this time, it will cause a use-after-free
problem. This change prevents that from happening.

Fixes: 4e32101f9b01 ("ring: support freeing")
Cc: sta...@dpdk.org

Signed-off-by: Yunjian Wang 
Acked-by: Konstantin Ananyev 
Reviewed-by: Honnappa Nagarahalli 
---
v3:
   move memzone free outside the lock
---
 lib/ring/rte_ring.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c
index 8ed455043d..057d25ff6f 100644
--- a/lib/ring/rte_ring.c
+++ b/lib/ring/rte_ring.c
@@ -333,11 +333,6 @@ rte_ring_free(struct rte_ring *r)
return;
}
 
-   if (rte_memzone_free(r->memzone) != 0) {
-   RTE_LOG(ERR, RING, "Cannot free memory\n");
-   return;
-   }
-
ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list);
rte_mcfg_tailq_write_lock();
 
@@ -356,6 +351,9 @@ rte_ring_free(struct rte_ring *r)
 
rte_mcfg_tailq_write_unlock();
 
+   if (rte_memzone_free(r->memzone) != 0)
+   RTE_LOG(ERR, RING, "Cannot free memory\n");
+
rte_free(te);
 }
 
-- 
2.33.0