Re: [RFC PATCH 3/3] ethdev: rename parameter in API to get FEC
I think my patch isn't correct. I would say that:
- fec_capa is a set of bits produced by 'RTE_ETH_FEC_MODE_TO_CAPA()';
- fec_mode is an element of 'enum rte_eth_fec_mode'.

Based on this definition, replacing 'fec_capa' with 'fec_mode' should involve changing the parameter type. Probably I shouldn't change the name, but I should add a more detailed comment.

> Independent from being single FEC mode or not, I think both 'rte_eth_fec_get()' & 'rte_eth_fec_set()' should get 'fec_mode' as param, what do you think?

Andrew Rybchenko asked to replace 'mode' with 'fec_capa' for 'rte_eth_fec_set()' in
https://inbox.dpdk.org/dev/aa745bd1-a564-fa8c-c77b-2d99c9769...@solarflare.com/
I don't think we need to change it for rte_eth_fec_set().

On 5/2/23 7:02 PM, Ferruh Yigit wrote:
> On 4/28/2023 11:27 AM, Denis Pryazhennikov wrote:
>> Only one valid FEC mode can be obtained by rte_eth_fec_get().
>> The previous name implied that more than one FEC mode
>> can be obtained.
>
> +1 and patch looks good.
>
> But isn't this valid for 'rte_eth_fec_set()', it gets 'fec_mode'. FEC
> capability has its own type "struct rte_eth_fec_capa".
>
> Independent from being single FEC mode or not, I think both
> 'rte_eth_fec_get()' & 'rte_eth_fec_set()' should get 'fec_mode' as
> param, what do you think?
>
>> Documentation was updated accordingly.
>>
>> Signed-off-by: Denis Pryazhennikov
>> Acked-by: Ivan Malov
>> Acked-by: Viacheslav Galaktionov
> <...>
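To make the distinction concrete, a minimal sketch of the two notions being discussed (RTE_ETH_FEC_MODE_TO_CAPA() and enum rte_eth_fec_mode are the real ethdev definitions; the chosen modes are only illustrative):

#include <stdint.h>
#include <rte_ethdev.h>

/* fec_mode: a single element of enum rte_eth_fec_mode. */
static const enum rte_eth_fec_mode fec_mode = RTE_ETH_FEC_RS;

/* fec_capa: a bitmask in which each bit is produced from one mode. */
static const uint32_t fec_capa =
	RTE_ETH_FEC_MODE_TO_CAPA(RTE_ETH_FEC_RS) |
	RTE_ETH_FEC_MODE_TO_CAPA(RTE_ETH_FEC_BASER);

/* Renaming a 'fec_capa' parameter to 'fec_mode' would therefore also
 * imply changing its type from uint32_t to enum rte_eth_fec_mode. */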
Re: [PATCH V3] lib: set/get max memzone segments
Hello Ophir,

Good thing someone is looking into this. Thanks.
I have a few comments.

This commitlog is a bit compact. Separating it with some empty lines would help digest it.

On Wed, May 3, 2023 at 9:27 AM Ophir Munk wrote:
>
> In current DPDK the RTE_MAX_MEMZONE definition is unconditionally hard
> coded as 2560.
> For applications requiring different values of this
> parameter – it is more convenient to set the max value via an rte API -
> rather than changing the dpdk source code per application. In many
> organizations, the possibility to compile a private DPDK library for a
> particular application does not exist at all. With this option there is
> no need to recompile DPDK and it allows using an in-box packaged DPDK.
> An example usage for updating the RTE_MAX_MEMZONE would be of an
> application that uses the DPDK mempool library which is based on DPDK
> memzone library. The application may need to create a number of
> steering tables, each of which will require its own mempool allocation.
> This commit is not about how to optimize the application usage of
> mempool nor about how to improve the mempool implementation based on
> memzone. It is about how to make the max memzone definition - run-time
> customized.

The code was using a rather short name "RTE_MAX_MEMZONE".
But I prefer we spell this as "max memzones count" (or a better wording), in the descriptions/comments.

> This commit adds an API which must be called before rte_eal_init():
> rte_memzone_max_set(int max). If not called, the default memzone

Afaics, this prototype got unaligned with the patch content, as a size_t is now taken as input.
You can simply mention rte_memzone_max_set().

> (RTE_MAX_MEMZONE) is used. There is also an API to query the effective

After the patch RTE_MAX_MEMZONE does not exist anymore.

> max memzone: rte_memzone_max_get().
>
> Signed-off-by: Ophir Munk

A global comment on the patch: rte_calloc provides what you want in all cases below: an array of objects.
I prefer rte_calloc(count, sizeof elem) to rte_zmalloc(count * sizeof elem).
This also avoids a temporary variable to compute the total size of such an array.
> ---
> app/test/test_func_reentrancy.c | 2 +-
> app/test/test_malloc_perf.c | 2 +-
> app/test/test_memzone.c | 43 -
> config/rte_config.h | 1 -
> drivers/net/qede/base/bcm_osal.c | 30 --
> drivers/net/qede/base/bcm_osal.h | 3 +++
> drivers/net/qede/qede_main.c | 7 ++
> lib/eal/common/eal_common_memzone.c | 28 +---
> lib/eal/include/rte_memzone.h | 20 +
> lib/eal/version.map | 4
> 10 files changed, 112 insertions(+), 28 deletions(-)
>
> diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
> index d1ed5d4..ae9de6f 100644
> --- a/app/test/test_func_reentrancy.c
> +++ b/app/test/test_func_reentrancy.c
> @@ -51,7 +51,7 @@ typedef void (*case_clean_t)(unsigned lcore_id);
> #define MEMPOOL_ELT_SIZE (sizeof(uint32_t))
> #define MEMPOOL_SIZE (4)
>
> -#define MAX_LCORES (RTE_MAX_MEMZONE / (MAX_ITER_MULTI * 4U))
> +#define MAX_LCORES (rte_memzone_max_get() / (MAX_ITER_MULTI * 4U))
>
> static uint32_t obj_count;
> static uint32_t synchro;
> diff --git a/app/test/test_malloc_perf.c b/app/test/test_malloc_perf.c
> index ccec43a..9bd1662 100644
> --- a/app/test/test_malloc_perf.c
> +++ b/app/test/test_malloc_perf.c
> @@ -165,7 +165,7 @@ test_malloc_perf(void)
> return -1;
>
> if (test_alloc_perf("rte_memzone_reserve", memzone_alloc,
> memzone_free,
> - NULL, memset_us_gb, RTE_MAX_MEMZONE - 1) < 0)
> + NULL, memset_us_gb, rte_memzone_max_get() - 1) < 0)
> return -1;
>
> return 0;
> diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
> index c9255e5..36b9790 100644
> --- a/app/test/test_memzone.c
> +++ b/app/test/test_memzone.c
> @@ -871,9 +871,18 @@ test_memzone_bounded(void)
> static int
> test_memzone_free(void)
> {
> - const struct rte_memzone *mz[RTE_MAX_MEMZONE + 1];
> + size_t mz_size;
> + const struct rte_memzone **mz;
> int i;
> char name[20];
> + int rc = -1;
> +
> + mz_size = (rte_memzone_max_get() + 1) * sizeof(struct rte_memzone *);
> + mz = rte_zmalloc("memzone_test", mz_size, 0);
> + if (!mz) {
> + printf("Fail allocating memzone test array\n");
> + return rc;
> + }
>
> mz[0] = rte_memzone_reserve(TEST_MEMZONE_NAME("tempzone0"), 2000,
> SOCKET_ID_ANY, 0);
> @@ -881,42 +890,42 @@ test_memzone_free(void)
> SOCKET_ID_ANY, 0);
>
> if (mz[0] > mz[1])
> - return -1;
> + goto exit_test;
> if (!rte_memzone_lookup(TEST_
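Tying the review comments together, a minimal sketch of how the proposed API pair could be used along with rte_calloc() (rte_memzone_max_set()/rte_memzone_max_get() are the APIs under review and their exact prototypes may still change; the count of 4096 is illustrative):

#include <stdio.h>
#include <rte_eal.h>
#include <rte_malloc.h>
#include <rte_memzone.h>

int
main(int argc, char **argv)
{
	const struct rte_memzone **mz;

	/* The API under review: must be called before rte_eal_init(). */
	rte_memzone_max_set(4096);

	if (rte_eal_init(argc, argv) < 0)
		return -1;

	printf("max memzones: %zu\n", (size_t)rte_memzone_max_get());

	/* Reviewer's preference: rte_calloc() takes a count and an element
	 * size, so no temporary total-size variable is needed, unlike
	 * rte_zmalloc("memzone_test",
	 *             (rte_memzone_max_get() + 1) * sizeof(*mz), 0). */
	mz = rte_calloc("memzone_test", rte_memzone_max_get() + 1,
			sizeof(*mz), 0);
	rte_free(mz);

	rte_eal_cleanup();
	return 0;
}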
Re: [v2] [RFC] dpaa2: replace system("echo ...") with file i/o
On 2023-05-02 11:54, Sachin Saxena (OSS) wrote:

From: Stephen Hemminger

Using system() is a bad idea in driver code because it introduces a number of potential security issues. The codeql analysis tool flags this as a potential security issue. Instead just use normal stdio to do the same thing.

Compile test only, do not have this hardware and therefore can not test this.

Signed-off-by: Stephen Hemminger
Reviewed-by: Sachin Saxena

---
drivers/bus/fslmc/portal/dpaa2_hw_dpio.c | 45
1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index 4aec7b2cd8..990cfc5d3b 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -125,14 +125,21 @@ static void
 dpaa2_affine_dpio_intr_to_respective_core(int32_t dpio_id, int cpu_id)
 {
 #define STRING_LEN	28
-#define COMMAND_LEN	50
+#define AFFINITY_LEN	128
+#define CMD_LEN		300
 	uint32_t cpu_mask = 1;
-	int ret;
-	size_t len = 0;
-	char *temp = NULL, *token = NULL;
-	char string[STRING_LEN], command[COMMAND_LEN];
+	size_t len = CMD_LEN;
+	char *temp, *token = NULL;
+	char string[STRING_LEN];
+	char smp_affinity[AFFINITY_LEN];
 	FILE *file;

+	temp = (char *)malloc(len * sizeof(char));

No cast is necessary to go from void * to any other pointer.

sizeof(char) is by definition one. What you allocate from the heap are in units of chars, so multiplying with sizeof(char) doesn't make sense.

How is pre-allocating the buffer an improvement over the old code (deferring memory allocation to getline())?

+	if (temp == NULL) {
+		DPAA2_BUS_WARN("Unable to allocate temp buffer");
+		return;
+	}
+
 	snprintf(string, STRING_LEN, "dpio.%d", dpio_id);
 	file = fopen("/proc/interrupts", "r");
 	if (!file) {
@@ -155,17 +162,27 @@ dpaa2_affine_dpio_intr_to_respective_core(int32_t dpio_id, int cpu_id)
 	}
 	cpu_mask = cpu_mask << cpu_id;
-	snprintf(command, COMMAND_LEN, "echo %X > /proc/irq/%s/smp_affinity",
-		 cpu_mask, token);
-	ret = system(command);
-	if (ret < 0)
-		DPAA2_BUS_DEBUG(
-			"Failed to affine interrupts on respective core");
-	else
-		DPAA2_BUS_DEBUG(" %s command is executed", command);
-
+	snprintf(smp_affinity, AFFINITY_LEN,
+		 "/proc/irq/%s/smp_affinity", token);
+	/* Free 'temp' memory after using the substring 'token' */
 	free(temp);
 	fclose(file);
+
+	file = fopen(smp_affinity, "w");
+	if (file == NULL) {
+		DPAA2_BUS_WARN("Failed to open %s", smp_affinity);
+		return;
+	}
+	fprintf(file, "%X\n", cpu_mask);
+	fflush(file);
+
+	if (ferror(file)) {
+		fclose(file);
+		DPAA2_BUS_WARN("Failed to write to %s", smp_affinity);
+		return;
+	}
+
+	fclose(file);
 }

 static int
 dpaa2_dpio_intr_init(struct dpaa2_dpio_dev *dpio_dev)
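For reference, the getline() pattern the reviewer alludes to, where the first call allocates (and later calls grow) the line buffer itself; a minimal POSIX sketch, not the patch's code:

#include <stdio.h>
#include <stdlib.h>

static void
scan_interrupts(void)
{
	FILE *file = fopen("/proc/interrupts", "r");
	char *line = NULL;   /* getline() allocates when *lineptr is NULL */
	size_t len = 0;      /* ... and *n is 0 */

	if (file == NULL)
		return;
	while (getline(&line, &len, file) != -1) {
		/* parse 'line' here */
	}
	free(line);   /* a single free for the buffer getline() managed */
	fclose(file);
}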
Re: [PATCH 1/2] config/arm: Do not require processor information
On 2023/04/20 16:12, Akihiko Odaki wrote:
On 2023/04/20 16:10, Ruifeng Wang wrote:

-----Original Message-----
From: Akihiko Odaki
Sent: Thursday, April 20, 2023 9:40 AM
To: Ruifeng Wang ; Bruce Richardson ; Juraj Linkeš
Cc: dev@dpdk.org; nd
Subject: Re: [PATCH 1/2] config/arm: Do not require processor information

On 2023/04/17 16:41, Ruifeng Wang wrote:

-----Original Message-----
From: Akihiko Odaki
Sent: Friday, April 14, 2023 8:42 PM
To: Ruifeng Wang ; Bruce Richardson
Cc: dev@dpdk.org; Akihiko Odaki
Subject: [PATCH 1/2] config/arm: Do not require processor information

DPDK can be built even without exact processor information for x86 and ppc, so allow building for Arm even if the targeted processor is unknown.

Hi Akihiko,

The design idea was to require an explicit generic build.
Default/native build doesn't fall back to generic build when SoC info is not on the list.
So the user has less chance to generate a suboptimal binary by accident.

Hi,

It is true that the suboptimal binary can result, but the rationale here is that we tolerate that for x86 and ppc so it should not really matter for Arm too. On x86 and ppc you don't need to modify meson.build just to run dts on a development machine.

What modification do you need for a development machine?
I suppose "meson setup build -Dplatform=generic" will generate a binary that can run on your development machine.

I didn't describe the situation well. I use DPDK Test Suite for testing and it determines what flags to be passed to Meson. You need to modify DPDK's meson.build or DTS to get it built.

Regards,
Akihiko Odaki

Hi,

Can you have a look at this again?

Regards,
Akihiko Odaki
Re: [RFC PATCH 3/3] ethdev: rename parameter in API to get FEC
On 5/4/2023 8:13 AM, Denis Pryazhennikov wrote:
> I think my patch isn't correct.
> I would say that
> - fec_capa is a set of bits produced by 'RTE_ETH_FEC_MODE_TO_CAPA()';
> - fec_mode is an element of 'enum rte_eth_fec_mode'.
>
> Based on this definition, replacing 'fec_capa' with 'fec_mode' should
> involve changing the parameter type.
> Probably I shouldn't change the name, but I should add a more detailed
> comment.
>
>> Independent from being single FEC mode or not, I think both
> 'rte_eth_fec_get()' & 'rte_eth_fec_set()' should get 'fec_mode' as
> param, what do you think?
>
> Andrew Rybchenko asked to replace 'mode' with 'fec_capa' for
> 'rte_eth_fec_set()' in
> https://inbox.dpdk.org/dev/aa745bd1-a564-fa8c-c77b-2d99c9769...@solarflare.com/
> I don't think we need to change it for rte_eth_fec_set().
>

OK

> On 5/2/23 7:02 PM, Ferruh Yigit wrote:
>> On 4/28/2023 11:27 AM, Denis Pryazhennikov wrote:
>>> Only one valid FEC mode can be obtained by rte_eth_fec_get().
>>> The previous name implied that more than one FEC mode
>>> can be obtained.
>>
>> +1 and patch looks good.
>>
>> But isn't this valid for 'rte_eth_fec_set()', it gets 'fec_mode'. FEC
>> capability has its own type "struct rte_eth_fec_capa".
>>
>> Independent from being single FEC mode or not, I think both
>> 'rte_eth_fec_get()' & 'rte_eth_fec_set()' should get 'fec_mode' as
>> param, what do you think?
>>
>>> Documentation was updated accordingly.
>>>
>>> Signed-off-by: Denis Pryazhennikov
>>> Acked-by: Ivan Malov
>>> Acked-by: Viacheslav Galaktionov
>> <...>
>>
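For reference, the prototypes at issue, as declared in rte_ethdev.h around this time (quoted from memory, so worth double-checking against the tree); both get and set take a plain uint32_t, which is what makes the naming debatable, while the capability query uses the separate struct mentioned above:

int rte_eth_fec_get(uint16_t port_id, uint32_t *fec_capa);
int rte_eth_fec_set(uint16_t port_id, uint32_t fec_capa);
int rte_eth_fec_get_capability(uint16_t port_id,
		struct rte_eth_fec_capa *speed_fec_capa, unsigned int num);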
Re: [PATCH v2] build: announce requirement for C11
On 2023-05-03 19:30, Bruce Richardson wrote:

Add a deprecation notice informing users that we will require a C11 compiler from 23.11 release onwards. This requirement was agreed by technical board to enable use of newer C language features, e.g. standard atomics. [1]

[1] http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/

Signed-off-by: Bruce Richardson
Acked-by: Tyler Retzlaff

---
V2:
- add requirement for stdatomics
- fix sphinx formatting
---
doc/guides/rel_notes/deprecation.rst | 9 +
1 file changed, 9 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index dcc1ca1696..70c6019d26 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -11,6 +11,15 @@ here.
 Deprecation Notices
 -------------------

+* C Compiler: From DPDK 23.11 onwards,
+  building DPDK will require a C compiler which supports the C11 standard,
+  including support for C11 standard atomics.

The whole of C11, or just the mandatory parts (+atomics)?

+
+  Please note:
+
+  * C11 is supported from GCC version 5 onwards, and is the default language version in that release
+  * C11 is the default compilation mode in Clang from version 3.6
+
 * kvargs: The function ``rte_kvargs_process`` will get a new parameter
   for returning key match count. It will ease handling of no-match case.
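As a concrete illustration of what the new requirement enables, a minimal C11 standard-atomics snippet (plain ISO C, independent of any DPDK API):

#include <stdatomic.h>
#include <stdint.h>

static atomic_uint_fast64_t packets;

void
count_packet(void)
{
	/* lock-free increment with relaxed ordering, per C11 */
	atomic_fetch_add_explicit(&packets, 1, memory_order_relaxed);
}

uint64_t
read_count(void)
{
	return (uint64_t)atomic_load_explicit(&packets, memory_order_relaxed);
}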
RE: DPDK 22.11 Troubleshooting
Hi Bruce,

Thank you for the response.

There are no errors when I run the makefile; however, I do see a difference in the programs. I don't believe the makefile is linking all the libraries together as intended.

For example, when I run the ethtool sample program and compile it using meson, it works fine and rte_eth_dev_count_avail() returns the correct amount. However, when I compile ethtool with the makefile and run it, rte_eth_dev_count_avail() returns 0.

pkg-config --path libdpdk returns: /usr/lib64/pkgconfig/libdpdk.pc
(I am using Red Hat 8.7)

V/r,

-
Name : Gilbert Carrillo | Software Engineer
Company : Ampex Data Systems
Address : 26460 Corporate Ave, Ste 200 Hayward, CA 94545
Office : +1-650-367-2011
Mobile : +1-575-312-7477
Website : ampex.com

-----Original Message-----
From: Bruce Richardson
Sent: Wednesday, May 3, 2023 2:28 AM
To: Gilbert Carrillo
Cc: dev@dpdk.org
Subject: Re: DPDK 22.11 Troubleshooting

On Mon, May 01, 2023 at 10:27:05PM +0000, Gilbert Carrillo wrote:
>Hello,
>
>I installed DPDK version 22.11 and the QDMA DPDK driver. However, I am
>having trouble compiling the test applications.
>
>I have a c++ program that has an external buffer and my end goal is to
>attach/map an mbuf to my external buffer for zero-copy DMA.
>
>Currently I use CMAKE to compile my program, so I was curious if this
>application has to be run on meson/ninja or is there a cmake option? I
>tried compiling the test applications with the makefile but had no
>luck, I could only compile using meson/ninja.
>

If you have installed DPDK on your system, then a pkg-config file for it should be available, allowing applications to be built against it using any build system. The makefiles for the sample applications demonstrate how this can be done. If building the example applications with make is failing, can you share the error messages got here, as it should work ok, once DPDK is correctly installed on the system.

An additional test you can run is "pkg-config --path libdpdk" to check where DPDK is installed [though not all versions of pkg-config support --path, I think].

Regards,
/Bruce
RE: DPDK 22.11 Troubleshooting
Make static returns an error (see attached).

v/R,
Gilbert

-----Original Message-----
From: Bruce Richardson
Sent: Wednesday, May 3, 2023 10:35 AM
To: Gilbert Carrillo
Cc: dev@dpdk.org
Subject: Re: DPDK 22.11 Troubleshooting

On Wed, May 03, 2023 at 04:22:05PM +0000, Gilbert Carrillo wrote:
> Hi Bruce,
>
> Thank you for the response.
>
> There are no errors when I run the makefile, however I do see a difference in the programs. I don't believe the makefile is linking all the libraries together as intended.
>
> For example, when I run the ethtool sample program and compile it using meson, it works fine and rte_eth_dev_count_avail() returns the correct amount. However, when I compile ethtool with the makefile and run it, rte_eth_dev_count_avail() returns 0.
>

Note that by default the meson build will statically link the examples, while the makefile will dynamically load the drivers at runtime. That may explain the difference. Can you try building and running using "make static" rather than just "make".

/Bruce
RE: DPDK 22.11 Troubleshooting
-----Original Message-----
From: Bruce Richardson
Sent: Wednesday, May 3, 2023 11:18 AM
To: Gilbert Carrillo
Cc: dev@dpdk.org
Subject: Re: DPDK 22.11 Troubleshooting

On Wed, May 03, 2023 at 04:53:20PM +0000, Gilbert Carrillo wrote:
> Make static returns an error (see attached).
>
> v/R,
> Gilbert
>

To help investigate this issue, can you perhaps include the text of the full build output when you run "make static". On my system I see libelf listed on the linker flags when I run "make static", and things link properly. I'm wondering how my setup may differ from yours.

/Bruce

> -----Original Message-----
> From: Bruce Richardson
> Sent: Wednesday, May 3, 2023 10:35 AM
> To: Gilbert Carrillo
> Cc: dev@dpdk.org
> Subject: Re: DPDK 22.11 Troubleshooting
>
> On Wed, May 03, 2023 at 04:22:05PM +0000, Gilbert Carrillo wrote:
> > Hi Bruce,
> >
> > Thank you for the response.
> >
> > There are no errors when I run the makefile, however I do see a difference
> > in the programs. I don't believe the makefile is linking all the libraries
> > together as intended.
> >
> > For example, when I run the ethtool sample program and compile it using
> > meson, it works fine and rte_eth_dev_count_avail() returns the correct
> > amount. However, when I compile ethtool with the makefile and run it
> > rte_eth_dev_count_avail() returns 0.
> >
>
> Note that by default the meson build will statically link the examples, while
> the makefile will dynamically load the drivers at runtime. That may explain
> the difference. Can you try building and running using "make static" rather
> than just "make".
>
> /Bruce

Below is the full output after running make static from the ethtool folder in examples.

[root@localhost ethtool]# make static
make -C lib static
make[1]: Entering directory '/home/ampex/dpdk-23.03/dpdk-23.03/examples/ethtool/lib'
cc -O3 -fPIC -DALLOW_EXPERIMENTAL_API -O3 -I/usr/include/dpdk -include rte_config.h -march=corei7 -mno-avx512f -c rte_ethtool.c -o build/rte_ethtool.c.o
In file included from rte_ethtool.c:7:
rte_ethtool.c: In function 'rte_ethtool_get_drvinfo':
rte_ethtool.c:52:29: warning: implicit declaration of function 'rte_dev_name'; did you mean 'rte_dev_remove'? [-Wimplicit-function-declaration]
 strlcpy(drvinfo->bus_info, rte_dev_name(dev_info.device),
 ^~~~
/usr/include/dpdk/rte_string_fns.h:90:50: note: in definition of macro 'strlcpy'
 #define strlcpy(dst, src, size) rte_strlcpy(dst, src, size)
 ^~~
rte_ethtool.c:52:29: warning: passing argument 2 of 'rte_strlcpy' makes pointer from integer without a cast [-Wint-conversion]
 strlcpy(drvinfo->bus_info, rte_dev_name(dev_info.device),
 ^
/usr/include/dpdk/rte_string_fns.h:90:50: note: in definition of macro 'strlcpy'
 #define strlcpy(dst, src, size) rte_strlcpy(dst, src, size)
 ^~~
/usr/include/dpdk/rte_string_fns.h:59:36: note: expected 'const char *' but argument is of type 'int'
 rte_strlcpy(char *dst, const char *src, size_t size)
 ^~~
ar -cr build/librte_ethtool.a build/*.o
make[1]: Leaving directory '/home/ampex/dpdk-23.03/dpdk-23.03/examples/ethtool/lib'
make -C ethtool-app static
make[1]: Entering directory '/home/ampex/dpdk-23.03/dpdk-23.03/examples/ethtool/ethtool-app'
cc -I../lib -O3 -I/usr/include/dpdk -include rte_config.h -march=corei7 -mno-avx512f -DALLOW_EXPERIMENTAL_API main.c ethapp.c -o build/ethtool-static -L../lib/build -l:librte_ethtool.a -Wl,--whole-archive -l:librte_common_iavf.a -l:librte_bus_auxiliary.a -l:librte_bus_pci.a -l:librte_bus_vdev.a -l:librte_bus_vmbus.a -l:librte_common_mlx5.a -l:librte_mempool_ring.a -l:librte_net_bnxt.a -l:librte_net_e1000.a -l:librte_net_enic.a -l:librte_net_failsafe.a -l:librte_net_i40e.a -l:librte_net_iavf.a -l:librte_net_ice.a -l:librte_net_ixgbe.a -l:librte_net_mlx4.a -l:librte_net_mlx5.a -l:librte_net_netvsc.a -l:librte_net_nfp.a -l:librte_net_qede.a -l:librte_net_ring.a -l:librte_net_tap.a -l:librte_net_vdev_netvsc.a -l:librte_net_vhost.a -l:librte_net_virtio.a -l:librte_node.a -l:librte_graph.a -l:librte_flow_classify.a -l:librte_pipeline.a -l:librte_table.a -l:librte_pdump.a -l:librte_port.a -l:librte_fib.a -l:librte_ipsec.a -l:librte_vhost.a -l:librte_stack.a -l:librte_security.a -l:librte_sched.a -l:librte_reorder.a -l:librte_rib.a -l:librte_dmadev.a -l:librte_regexdev.a -l:librte_rawdev.a -l:librte_pcapng.a -l:librte_member.a -l:librte_lpm.a -l:librte_latencystats.a -l:librte_ip_frag.a -l:librte_gso.a -l:librte_gro.a -l:librte_eventdev.a -l:librte_efd.a -l:librte_distributor.a -l:librte_cryptodev.a -l:librte_compressdev.a -l:librte_cfgfile.a -l:librte_bpf.a -l:librte_bitratestats.a -l:librte_bbdev.a -l:librte_acl.a -l:librte_timer.a -l:librte_hash.a -l:librte_metrics.a -l:librte_cmdline.a -l:librte_pci.a -l:l
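A side note on reading the warnings above: once rte_dev_name() is implicitly declared, C's legacy rules assume it returns int, which is what cascades into the second, int-to-pointer warning. A minimal, deliberately incomplete illustration (name_of() is a hypothetical function left without a prototype on purpose, so this reproduces the diagnostics rather than building cleanly):

#include <string.h>

char dst[32];

void
copy_name(void)
{
	/* no prototype for name_of() in scope: the compiler assumes it
	 * returns int, so its result cannot legally initialize a
	 * 'const char *' parameter, hence -Wint-conversion */
	strncpy(dst, name_of(), sizeof(dst) - 1);
}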
Re: [PATCH v11 0/4] add support for self monitoring
Hello Tomasz,

On Thu, Feb 16, 2023 at 6:55 PM Tomasz Duszynski wrote:
>
> This series adds self monitoring support i.e allows to configure and
> read performance measurement unit (PMU) counters in runtime without
> using perf utility. This has certain advantages when application runs on
> isolated cores running dedicated tasks.
>
> Events can be read directly using rte_pmu_read() or using dedicated
> tracepoint rte_eal_trace_pmu_read(). The latter will cause events to be
> stored inside CTF file.
>
> By design, all enabled events are grouped together and the same group
> is attached to lcores that use self monitoring functionality.
>
> Events are enabled by names, which need to be read from standard
> location under sysfs i.e
>
> /sys/bus/event_source/devices/PMU/events
>
> where PMU is a core pmu i.e one measuring cpu events. As of today
> raw events are not supported.
>
> Tomasz Duszynski (4):
>   lib: add generic support for reading PMU events
>   pmu: support reading ARM PMU events in runtime
>   pmu: support reading Intel x86_64 PMU events in runtime
>   eal: add PMU support to tracing library

There are still some pending comments on this series and it can't be merged until they get sorted out.

I noted two points:
- Konstantin asked for better explanations in the implementation.
- He also pointed out at using this feature with non EAL lcores.

Could you work on this so we can consider this series for v23.07?

Thank you.

--
David Marchand
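For readers new to the series, a hypothetical usage sketch pieced together from the cover letter; only rte_pmu_read() is named above, so the declarations here (rte_pmu_add_event() in particular) are assumptions about the proposed API, not its verbatim form:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int rte_pmu_add_event(const char *name);     /* assumed signature */
uint64_t rte_pmu_read(unsigned int index);   /* named in the cover letter */

void
measure(void)
{
	/* event names come from /sys/bus/event_source/devices/<PMU>/events */
	int idx = rte_pmu_add_event("cpu_cycles");
	uint64_t before, after;

	if (idx < 0)
		return;
	before = rte_pmu_read(idx);
	/* ... code under measurement ... */
	after = rte_pmu_read(idx);
	printf("cycles: %" PRIu64 "\n", after - before);
}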
Re: [PATCH] net/gve: add struct members and typedefs for DQO
On 4/10/2023 7:47 AM, Junfeng Guo wrote:
> Add struct members for gve_tx_queue and gve_rx_queue.
> Add typedefs for little endians.
>
> Signed-off-by: Junfeng Guo
> Signed-off-by: Rushil Gupta
> Signed-off-by: Joshua Washington
> Signed-off-by: Jeroen de Borst

Applied to dpdk-next-net/main, thanks.
Re: [PATCH 1/1] net/gve: update base code for DQO
On 4/11/2023 5:59 AM, Rushil Gupta wrote:
> Update gve base code to support DQO.
>
> This patch is based on this:
> https://patchwork.dpdk.org/project/dpdk/list/?series=27647&state=*
>
> Signed-off-by: Rushil Gupta
> Signed-off-by: Junfeng Guo

Applied to dpdk-next-net/main, thanks.
RE: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13
> -----Original Message-----
> From: jer...@marvell.com
> Sent: Tuesday, May 2, 2023 21:51
> To: dev@dpdk.org; Wu, Jingjing ; Guo, Junfeng ; Li, Xiaoyun
> Cc: tho...@monjalon.net; david.march...@redhat.com; ferruh.yi...@xilinx.com; step...@networkplumber.org; Jerin Jacob ; sta...@dpdk.org
> Subject: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13
>
> From: Jerin Jacob
>
> Fix the following build issue by not allowing nb_ids to be zero.
> nb_ids can be zero based on rte_rawdev_xstats_get() API
> documentation but it is not valid for the case when second
> argument is NULL.

Is this the new standard for GCC 13?

>
> examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
> examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get'
> accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
> 945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
> | ^~
>
> examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3
> of type 'uint64_t[0]' {aka 'long unsigned int[]'}
> In file included from ../examples/ntb/ntb_fwd.c:17:
> lib/rawdev/rte_rawdev.h:504:1: note: in a call to function
> 'rte_rawdev_xstats_get'
> 504 | rte_rawdev_xstats_get(uint16_t dev_id,
>
> Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
> Cc: sta...@dpdk.org
> Signed-off-by: Jerin Jacob
> ---
> examples/ntb/ntb_fwd.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
> index f9abed28e4..5489c3b3cd 100644
> --- a/examples/ntb/ntb_fwd.c
> +++ b/examples/ntb/ntb_fwd.c
> @@ -923,7 +923,7 @@ ntb_stats_display(void)
>
> /* Get NTB dev stats and stats names */
> nb_ids = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
> - if (nb_ids < 0) {
> + if (nb_ids <= 0) {

Should the one in func ntb_stats_clear also be updated to non-zero?

> printf("Error: Cannot get count of xstats\n");
> return;
> }
> --
> 2.40.1
Re: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13
On Thu, May 4, 2023 at 2:06 PM Guo, Junfeng wrote:
>
> > -----Original Message-----
> > From: jer...@marvell.com
> > Sent: Tuesday, May 2, 2023 21:51
> > To: dev@dpdk.org; Wu, Jingjing ; Guo, Junfeng ; Li, Xiaoyun
> > Cc: tho...@monjalon.net; david.march...@redhat.com; ferruh.yi...@xilinx.com; step...@networkplumber.org; Jerin Jacob ; sta...@dpdk.org
> > Subject: [dpdk-dev] [PATCH v1] examples/ntb: fix build issue with GCC 13
> >
> > From: Jerin Jacob
> >
> > Fix the following build issue by not allowing nb_ids to be zero.
> > nb_ids can be zero based on rte_rawdev_xstats_get() API
> > documentation but it is not valid for the case when second
> > argument is NULL.
>
> Is this the new standard for GCC 13?

No. It looks like an optimization in the compiler: it is able to detect this case of zero memory allocation.

>
> > examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
> > examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get'
> > accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
> > 945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
> > | ^~
> >
> > examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3
> > of type 'uint64_t[0]' {aka 'long unsigned int[]'}
> > In file included from ../examples/ntb/ntb_fwd.c:17:
> > lib/rawdev/rte_rawdev.h:504:1: note: in a call to function
> > 'rte_rawdev_xstats_get'
> > 504 | rte_rawdev_xstats_get(uint16_t dev_id,
> >
> > Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
> > Cc: sta...@dpdk.org
> > Signed-off-by: Jerin Jacob
> > ---
> > examples/ntb/ntb_fwd.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
> > index f9abed28e4..5489c3b3cd 100644
> > --- a/examples/ntb/ntb_fwd.c
> > +++ b/examples/ntb/ntb_fwd.c
> > @@ -923,7 +923,7 @@ ntb_stats_display(void)
> >
> > /* Get NTB dev stats and stats names */
> > nb_ids = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
> > - if (nb_ids < 0) {
> > + if (nb_ids <= 0) {
>
> Should the one in func ntb_stats_clear also be updated to non-zero?

For some reason the compiler is not detecting it. I will fix it in the next version.

>
> > printf("Error: Cannot get count of xstats\n");
> > return;
> > }
> > --
> > 2.40.1
>
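To make the compiler's complaint concrete, a self-contained sketch of the pattern (the helpers are stand-ins, not the rawdev API):

#include <stdint.h>
#include <stdlib.h>

static int
get_xstats_count(void)
{
	return 0;   /* a count of zero is legitimate */
}

static void
read_xstats(uint64_t *values, int n)
{
	for (int i = 0; i < n; i++)
		values[i] = 0;
}

void
show_stats(void)
{
	int nb_ids = get_xstats_count();

	/* With only 'nb_ids < 0' here, GCC 13 can see a zero-byte
	 * malloc() flowing into read_xstats() and raises
	 * -Wstringop-overflow; checking '<= 0' removes that path. */
	if (nb_ids <= 0)
		return;

	uint64_t *values = malloc(nb_ids * sizeof(*values));
	if (values == NULL)
		return;
	read_xstats(values, nb_ids);
	free(values);
}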
Re: [PATCH v2] build: announce requirement for C11
On Thu, May 04, 2023 at 09:50:09AM +0200, Mattias Rönnblom wrote:
> On 2023-05-03 19:30, Bruce Richardson wrote:
> > Add a deprecation notice informing users that we will require a C11
> > compiler from 23.11 release onwards. This requirement was agreed by
> > technical board to enable use of newer C language features, e.g.
> > standard atomics. [1]
> >
> > [1] http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/
> >
> > Signed-off-by: Bruce Richardson
> > Acked-by: Tyler Retzlaff
> >
> > ---
> >
> > V2:
> > - add requirement for stdatomics
> > - fix sphinx formatting
> > ---
> > doc/guides/rel_notes/deprecation.rst | 9 +
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> > index dcc1ca1696..70c6019d26 100644
> > --- a/doc/guides/rel_notes/deprecation.rst
> > +++ b/doc/guides/rel_notes/deprecation.rst
> > @@ -11,6 +11,15 @@ here.
> > Deprecation Notices
> > -------------------
> > +* C Compiler: From DPDK 23.11 onwards,
> > + building DPDK will require a C compiler which supports the C11 standard,
> > + including support for C11 standard atomics.
>
> The whole of C11, or just the mandatory parts (+atomics)?
>

I assume we only need mandatory + atomics; however, perhaps someone more knowledgeable about the optional parts of the spec can correct me on this.

Once clarified, I maybe should reword this yet again to call it out even more specifically.

/Bruce
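On the mandatory-versus-optional question: C11 itself makes atomics an optional feature that an implementation may omit by defining __STDC_NO_ATOMICS__, so the proposed requirement can be expressed as a compile-time guard, sketched here:

/* 'C11 plus atomics' expressed as preprocessor checks */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 201112L
#error "a C11 compiler is required"
#endif

#ifdef __STDC_NO_ATOMICS__
#error "C11 standard atomics support is required"
#endif

#include <stdatomic.h>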
[dpdk-dev] [PATCH v2] examples/ntb: fix build issue with GCC 13
From: Jerin Jacob

Fix the following build issue by not allowing nb_ids to be zero.
nb_ids can be zero based on rte_rawdev_xstats_get() API documentation but it is not valid for the case when second argument is NULL.

examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get' accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
 945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
 | ^~

examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3 of type 'uint64_t[0]' {aka 'long unsigned int[]'}
In file included from ../examples/ntb/ntb_fwd.c:17:
lib/rawdev/rte_rawdev.h:504:1: note: in a call to function 'rte_rawdev_xstats_get'
 504 | rte_rawdev_xstats_get(uint16_t dev_id,

Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
Cc: sta...@dpdk.org

Signed-off-by: Jerin Jacob
Tested-by: Ali Alnubani
---
v2:
- Apply the same fix for ntb_stats_clear() (Junfeng)

examples/ntb/ntb_fwd.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
index f9abed28e4..585aad9d70 100644
--- a/examples/ntb/ntb_fwd.c
+++ b/examples/ntb/ntb_fwd.c
@@ -865,7 +865,7 @@ ntb_stats_clear(void)

 	/* Clear NTB dev stats */
 	nb_ids = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
-	if (nb_ids < 0) {
+	if (nb_ids <= 0) {
 		printf("Error: Cannot get count of xstats\n");
 		return;
 	}
@@ -923,7 +923,7 @@ ntb_stats_display(void)

 	/* Get NTB dev stats and stats names */
 	nb_ids = rte_rawdev_xstats_names_get(dev_id, NULL, 0);
-	if (nb_ids < 0) {
+	if (nb_ids <= 0) {
 		printf("Error: Cannot get count of xstats\n");
 		return;
 	}
--
2.40.1
RE: [dpdk-dev] [PATCH v2] examples/ntb: fix build issue with GCC 13
> -----Original Message-----
> From: jer...@marvell.com
> Sent: Thursday, May 4, 2023 16:54
> To: dev@dpdk.org; Wu, Jingjing ; Guo, Junfeng ; Li, Xiaoyun
> Cc: tho...@monjalon.net; david.march...@redhat.com; ferruh.yi...@amd.com; Jerin Jacob ; sta...@dpdk.org; Ali Alnubani
> Subject: [dpdk-dev] [PATCH v2] examples/ntb: fix build issue with GCC 13
>
> From: Jerin Jacob
>
> Fix the following build issue by not allowing nb_ids to be zero.
> nb_ids can be zero based on rte_rawdev_xstats_get() API
> documentation but it is not valid for the case when second
> argument is NULL.
>
> examples/ntb/ntb_fwd.c: In function 'ntb_stats_display':
> examples/ntb/ntb_fwd.c:945:23: error: 'rte_rawdev_xstats_get'
> accessing 8 bytes in a region of size 0 [-Werror=stringop-overflow=]
> 945 | if (nb_ids != rte_rawdev_xstats_get(dev_id, ids, values, nb_ids)) {
> | ^~
>
> examples/ntb/ntb_fwd.c:945:23: note: referencing argument 3
> of type 'uint64_t[0]' {aka 'long unsigned int[]'}
> In file included from ../examples/ntb/ntb_fwd.c:17:
> lib/rawdev/rte_rawdev.h:504:1: note: in a call to function
> 'rte_rawdev_xstats_get'
> 504 | rte_rawdev_xstats_get(uint16_t dev_id,
>
> Fixes: 5194299d6ef5 ("examples/ntb: support more functions")
> Cc: sta...@dpdk.org
> Signed-off-by: Jerin Jacob
> Tested-by: Ali Alnubani
> ---
> v2:
> - Apply the same fix for ntb_stats_clear()(Junfeng)
>
> examples/ntb/ntb_fwd.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/examples/ntb/ntb_fwd.c b/examples/ntb/ntb_fwd.c
> index f9abed28e4..585aad9d70 100644
> --- a/examples/ntb/ntb_fwd.c
> +++ b/examples/ntb/ntb_fwd.c
> @@ -865,7 +865,7 @@ ntb_stats_clear(void)
> --
> 2.40.1

LGTM, it makes sense to avoid zero memory allocation. Thanks!

Acked-by: Junfeng Guo

Regards,
Junfeng Guo
RE: [PATCH 1/2] config/arm: Do not require processor information
> -----Original Message-----
> From: Akihiko Odaki
> Sent: Thursday, May 4, 2023 3:47 PM
> To: Ruifeng Wang ; Bruce Richardson ; Juraj Linkeš
> Cc: dev@dpdk.org; nd
> Subject: Re: [PATCH 1/2] config/arm: Do not require processor information
>
> On 2023/04/20 16:12, Akihiko Odaki wrote:
> > On 2023/04/20 16:10, Ruifeng Wang wrote:
> >>> -----Original Message-----
> >>> From: Akihiko Odaki
> >>> Sent: Thursday, April 20, 2023 9:40 AM
> >>> To: Ruifeng Wang ; Bruce Richardson ; Juraj Linkeš
> >>> Cc: dev@dpdk.org; nd
> >>> Subject: Re: [PATCH 1/2] config/arm: Do not require processor
> >>> information
> >>>
> >>> On 2023/04/17 16:41, Ruifeng Wang wrote:
> >>>>> -----Original Message-----
> >>>>> From: Akihiko Odaki
> >>>>> Sent: Friday, April 14, 2023 8:42 PM
> >>>>> To: Ruifeng Wang ; Bruce Richardson
> >>>>> Cc: dev@dpdk.org; Akihiko Odaki
> >>>>> Subject: [PATCH 1/2] config/arm: Do not require processor
> >>>>> information
> >>>>>
> >>>>> DPDK can be built even without exact processor information for x86
> >>>>> and ppc, so allow building for Arm even if the targeted processor is
> >>>>> unknown.
> >>>>
> >>>> Hi Akihiko,
> >>>>
> >>>> The design idea was to require an explicit generic build.
> >>>> Default/native build doesn't fall back to generic build when SoC
> >>>> info is not on the list.
> >>>> So the user has less chance to generate a suboptimal binary by
> >>>> accident.
> >>>
> >>> Hi,
> >>>
> >>> It is true that the suboptimal binary can result, but the rationale
> >>> here is that we tolerate that for x86 and ppc so it should not
> >>> really matter for Arm too. On x86 and ppc you don't need to modify
> >>> meson.build just to run dts on a development machine.
> >>
> >> What modification do you need for a development machine?
> >> I suppose "meson setup build -Dplatform=generic" will generate a
> >> binary that can run on your development machine.
> >
> > I didn't describe the situation well. I use DPDK Test Suite for
> > testing and it determines what flags to be passed to Meson. You need
> > to modify DPDK's meson.build or DTS to get it built.
>
> Hi,
>
> Can you have a look at this again?

Thanks for the clarification of your use case.
Changes to DTS are in planning. It will allow the user to choose the type of the build.
Your use case will be fulfilled then.

Regards,
Ruifeng

> Regards,
> Akihiko Odaki
Re: 21.11.4 patches review and test
On 04/05/2023 03:13, Xu, HailinX wrote:

-----Original Message-----
From: Kevin Traynor
Sent: Tuesday, May 2, 2023 5:35 PM
To: Xu, HailinX ; sta...@dpdk.org
Cc: Stokes, Ian ; Mcnamara, John ; Luca Boccassi ; Xu, Qian Q ; Thomas Monjalon ; Peng, Yuan ; Chen, Zhaoyan ; dev@dpdk.org
Subject: Re: 21.11.4 patches review and test

On 20/04/2023 11:32, Kevin Traynor wrote:
On 20/04/2023 03:40, Xu, HailinX wrote:

-----Original Message-----
From: Xu, HailinX
Sent: Thursday, April 13, 2023 2:13 PM
To: Kevin Traynor ; sta...@dpdk.org
Cc: dev@dpdk.org; Abhishek Marathe ; Ali Alnubani ; Walker, Benjamin ; David Christensen ; Hemant Agrawal ; Stokes, Ian ; Jerin Jacob ; Mcnamara, John ; Ju-Hyoung Lee ; Luca Boccassi ; Pei Zhang ; Xu, Qian Q ; Raslan Darawsheh ; Thomas Monjalon ; yangh...@redhat.com; Peng, Yuan ; Chen, Zhaoyan
Subject: RE: 21.11.4 patches review and test

-----Original Message-----
From: Kevin Traynor
Sent: Thursday, April 6, 2023 7:38 PM
To: sta...@dpdk.org
Cc: dev@dpdk.org; Abhishek Marathe ; Ali Alnubani ; Walker, Benjamin ; David Christensen ; Hemant Agrawal ; Stokes, Ian ; Jerin Jacob ; Mcnamara, John ; Ju-Hyoung Lee ; Kevin Traynor ; Luca Boccassi ; Pei Zhang ; Xu, Qian Q ; Raslan Darawsheh ; Thomas Monjalon ; yangh...@redhat.com; Peng, Yuan ; Chen, Zhaoyan
Subject: 21.11.4 patches review and test

Hi all,

Here is a list of patches targeted for stable release 21.11.4.
The planned date for the final release is 25th April.

Please help with testing and validation of your use cases and report any issues/results with reply-all to this mail. For the final release the fixes and reported validations will be added to the release notes.

A release candidate tarball can be found at:
https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.4-rc1

These patches are located at branch 21.11 of dpdk-stable repo:
https://dpdk.org/browse/dpdk-stable/

Thanks.
Kevin

Hi All,

Update on the test status for the Intel part. Till now the dpdk 21.11.4-rc1 validation test rate is 85%. No critical issue is found.
2 new bugs are found; 1 new issue is under confirmation by Intel dev.

New bugs (20.11.8-rc1 also has these two issues):
1. pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac: performance drop about 23.5% when sending small packets
   https://bugs.dpdk.org/show_bug.cgi?id=1212 -- no fix yet
2. some of the virtio tests are failing -- Intel dev is investigating

# Basic Intel(R) NIC testing
* Build & CFLAG compile: cover the build test combination with latest GCC/Clang version and the popular OS revision such as Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4, FreeBSD13.1, SUSE15, CentOS7.9, etc.
  - All test done. No new dpdk issue is found.
* PF(i40e, ixgbe): test scenarios including RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
  - All test done. No new dpdk issue is found.
* VF(i40e, ixgbe): test scenarios including VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc.
  - All test done. No new dpdk issue is found.
* PF/VF(ice): test scenarios including Switch features/Package Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible Descriptor, etc.
  - All test done. No new dpdk issue is found.
* Intel NIC single core/NIC performance: test scenarios including PF/VF single core performance test, etc.
  - All test done. No new dpdk issue is found.
* IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test - QAT&SW/FIB library, etc.
  - On going.

# Basic cryptodev and virtio testing
* Virtio: both function and performance test are covered. Such as PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing/VMware ESXi 8.0, etc.
  - All test done. Found bug 1.
* Cryptodev:
  * Function test: test scenarios including Cryptodev API testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc.
    - Execution rate is 90%. Found bug 2.
  * Performance test: test scenarios including Throughput Performance/Cryptodev Latency, etc.
    - All test done. No new dpdk issue is found.

Regards,
Xu, Hailin

Update on the test status for the Intel part: completed all dpdk 21.11.4-rc1 validation. No critical issue is found.

Hi. Thanks for testing.

2 new bugs are found; 1 new issue is under confirmation by Intel dev.

New bugs (20.11.8-rc1 also has these two issues):
1. pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac: performance drop about 23.5% when sending small packets
   https://bugs.dpdk.org/show_bug.cgi?id=1212 -- not fixed yet; only occurs on the specified platform

Do you know which patch caused the regression? I'm not fully clear from the Bz for 20.11. The backported patch identified as root cause [0] in 20.11 is in the previous releases of 20.11 (and 21.11). Trying to understand, because then it would have shown in testing for previous releases. Or is this a new test introduced for the latest LTS releases? And if so, what is the baseline performance based on?

[0] commit 1c9a7fba5c90e0422b517404
DPDK Release Status Meeting 2023-05-04
Release status meeting minutes 2023-05-04
=========================================

Agenda:
* Release Dates
* Subtrees
* Roadmaps
* LTS
* Defects
* Opens

Participants:
* AMD
* ARM
* Intel
* Marvell
* Nvidia
* Red Hat

Release Dates
-------------

The following are the proposed current dates for 23.07:
* V1: 22 April 2023
* RC1: 31 May 2023
* RC2: 21 June 2023
* RC3: 28 June 2023
* Release: 12 July 2023

Subtrees
--------

* next-net
  * Under review.
  * ~120 patches in backlog.
  * GVE license update
* next-net-intel
  * Starting to apply patches, about 20 patches are waiting for merge.
* next-net-mlx
  * No update
* next-net-mvl
  * Some patches merged.
  * ~30 patches remaining.
* next-eventdev
  * Some patches merged.
  * AMD patches under review
* next-baseband
  * No major features so far.
* next-virtio
  * Intel series for port mirroring
    * Some concerns on code duplication and API complexity
  * Report of performance degradation in vhost driver
  * New series on guest notifications in slow path
  * VDUSE - Vhost VDPA in userspace. https://lwn.net/Articles/841054/
    * Under review
* next-crypto
  * 60-70 patches in backlog
  * Patches to add new algorithms
  * Reviewing PDCP patches
  * Reviews and merges ongoing
* main
  * Starting to merge patches to improve the Windows port
  * Reviewing ethdev patches
  * Patches from ARM for PMU
    * Postponed from last release
    * For tracing
    * Awaiting feedback on comments
  * Power Management feature from AMD
  * Memarea feature
  * CDX bus patches from AMD
    * Virtual PCI like bus
  * New checkpatch checks for atomic functions
  * Issues with Fedora 38 and GCC 13
  * Issue with Coverity automated analysis
  * Call for updates to the external Roadmap: https://core.dpdk.org/roadmap/

Proposed Schedule for 2023
--------------------------

See also http://core.dpdk.org/roadmap/#dates

23.07
* Proposal deadline (RFC/v1 patches): 22 April 2023
* API freeze (-rc1): 31 May 2023
* PMD features freeze (-rc2): 21 June 2023
* Builtin applications features freeze (-rc3): 28 June 2023
* Release: 12 July 2023

23.11
* Proposal deadline (RFC/v1 patches): 12 August 2023
* API freeze (-rc1): 29 September 2023
* PMD features freeze (-rc2): 20 October 2023
* Builtin applications features freeze (-rc3): 27 October 2023
* Release: 15 November 2023

LTS
---

Next LTS releases will be:
* 22.11.1
  * RC1 sent - Scheduled for May 5
* 21.11.4
  * Got test feedback from Nvidia, Red Hat and Intel
* 20.11.8
  * RC1 sent and first test report received
* 19.11.15
  * CVE and critical fixes only.

* Distros
  * v20.11 in Debian 11
  * Ubuntu 22.04-LTS contains 21.11
  * Ubuntu 23.04 contains 22.11

Defects
-------

* Bugzilla links, 'Bugs', added for hosted projects
  * https://www.dpdk.org/hosted-projects/

Opens
-----

* None

DPDK Release Status Meetings
----------------------------

The DPDK Release Status Meeting is intended for DPDK Committers to discuss the status of the master tree and sub-trees, and for project managers to track progress or milestone dates.

The meeting occurs on every Thursday at 9:30 UTC over Jitsi on https://meet.jit.si/DPDK

You don't need an invite to join the meeting, but if you want a calendar reminder, just send an email to "John McNamara john.mcnam...@intel.com" for the invite.
Re: [PATCH 00/10] gve PMD enhancement
On 4/13/2023 7:16 AM, Junfeng Guo wrote:
> This patch set includes two main enhancements for gve PMD:
> - support basic data path with DQO queue format
> - support jumbo frame with GQI queue format
>
> This patch set is based on this:
> patchwork.dpdk.org/project/dpdk/list/?series=27653&state=*
>
> Junfeng Guo (10):
>   net/gve: add Tx queue setup for DQO
>   net/gve: add Rx queue setup for DQO
>   net/gve: support device start and close for DQO
>   net/gve: support queue release and stop for DQO
>   net/gve: support basic Tx data path for DQO
>   net/gve: support basic Rx data path for DQO
>   net/gve: support basic stats for DQO
>   net/gve: enable Tx checksum offload for DQO
>   net/gve: add maintainers for GVE
>   net/gve: support jumbo frame for GQI

Except 9/10 (maintainers file update), please see note on the relevant patch for it,
Series applied to dpdk-next-net/main, thanks.
Re: [PATCH 09/10] net/gve: add maintainers for GVE
On 4/13/2023 7:16 AM, Junfeng Guo wrote:
> Add maintainers from Google for GVE.
>
> Signed-off-by: Junfeng Guo
> Signed-off-by: Rushil Gupta
> ---
> MAINTAINERS | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8df23e5099..08001751b0 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -713,6 +713,9 @@ F: doc/guides/nics/features/enic.ini
>
> Google Virtual Ethernet
> M: Junfeng Guo
> +M: Jeroen de Borst
> +M: Rushil Gupta
> +M: Joshua Washington
> F: drivers/net/gve/
> F: doc/guides/nics/gve.rst
> F: doc/guides/nics/features/gve.ini

The ack requested from the newly added maintainers has not been received (on the other version of this patch [2]).

This patch is not a dependency for the rest of the set, so dropping it from the set.

Also, there is a standalone version of this patch [1]; can you please send a new version of the standalone one, instead of including the same patch in various sets?

[1] https://patches.dpdk.org/project/dpdk/patch/20221109072352.1387300-1-junfeng@intel.com/
[2] https://patches.dpdk.org/project/dpdk/patch/20230328094512.1796648-4-junfeng@intel.com/
[RFC PATCH v2 0/4] dts: add dts api docs
Augment the meson build system with dts api generation. The api docs are generated from Python docstrings in DTS using Sphinx. The format of choice is the Google format [0]. The guide html sphinx configuration is used to preserve the same style.

The build requires the same Python version and dependencies as DTS, because Sphinx imports the Python modules. Dependencies are installed using Poetry from the dts directory:

poetry install --with docs

After installing, enter the Poetry shell:

poetry shell

And then run the build:

ninja -C dts/doc

There's only one properly documented module that serves as a demonstration of the style - framework.testbed_model.node.

[0] https://google.github.io/styleguide/pyguide.html#s3.8.4-comments-in-classes

Juraj Linkeš (4):
  dts: code adjustments for sphinx
  dts: add doc generation dependencies
  dts: add doc generation
  dts: format docstrings to google format

doc/api/meson.build | 1 +
doc/guides/conf.py | 22 +-
doc/guides/meson.build | 1 +
doc/guides/tools/dts.rst | 29 +
dts/doc/doc-index.rst | 20 +
dts/doc/meson.build | 50 ++
dts/framework/config/__init__.py | 11 +
.../{testbed_model/hw => config}/cpu.py | 13 +
dts/framework/dts.py | 8 +-
dts/framework/remote_session/__init__.py | 3 +-
dts/framework/remote_session/linux_session.py | 2 +-
dts/framework/remote_session/os_session.py | 12 +-
.../remote_session/remote/__init__.py | 16 -
.../{remote => }/remote_session.py | 0
.../{remote => }/ssh_session.py | 0
dts/framework/settings.py | 55 +-
dts/framework/testbed_model/__init__.py | 10 +-
dts/framework/testbed_model/hw/__init__.py | 27 -
dts/framework/testbed_model/node.py | 164 ++--
dts/framework/testbed_model/sut_node.py | 9 +-
.../testbed_model/{hw => }/virtual_device.py | 0
dts/main.py | 3 +-
dts/meson.build | 16 +
dts/poetry.lock | 770 --
dts/pyproject.toml | 7 +
dts/tests/TestSuite_hello_world.py | 6 +-
meson.build | 1 +
meson_options.txt | 2 +
28 files changed, 1038 insertions(+), 220 deletions(-)
create mode 100644 dts/doc/doc-index.rst
create mode 100644 dts/doc/meson.build
rename dts/framework/{testbed_model/hw => config}/cpu.py (95%)
delete mode 100644 dts/framework/remote_session/remote/__init__.py
rename dts/framework/remote_session/{remote => }/remote_session.py (100%)
rename dts/framework/remote_session/{remote => }/ssh_session.py (100%)
delete mode 100644 dts/framework/testbed_model/hw/__init__.py
rename dts/framework/testbed_model/{hw => }/virtual_device.py (100%)
create mode 100644 dts/meson.build
--
2.30.2
[RFC PATCH v2 1/4] dts: code adjustments for sphinx
sphinx-build only imports the Python modules when building the documentation; it doesn't run DTS. This requires changes that make the code importable without running it. This means: * properly guarding argument parsing in the if __name__ == '__main__' block. * the logger used by DTS runner underwent the same treatment so that it doesn't create unnecessary log files. * however, DTS uses the arguments to construct an object holding global variables. The defaults for the global variables needed to be moved from argument parsing elsewhere. * importing the remote_session module from framework resulted in circular imports because of one module trying to import another module. This is fixed by more granular imports. Signed-off-by: Juraj Linkeš --- dts/framework/config/__init__.py | 11 .../{testbed_model/hw => config}/cpu.py | 13 + dts/framework/dts.py | 8 ++- dts/framework/remote_session/__init__.py | 3 +- dts/framework/remote_session/linux_session.py | 2 +- dts/framework/remote_session/os_session.py| 12 +++- .../remote_session/remote/__init__.py | 16 -- .../{remote => }/remote_session.py| 0 .../{remote => }/ssh_session.py | 0 dts/framework/settings.py | 55 ++- dts/framework/testbed_model/__init__.py | 10 +--- dts/framework/testbed_model/hw/__init__.py| 27 - dts/framework/testbed_model/node.py | 12 ++-- dts/framework/testbed_model/sut_node.py | 9 ++- .../testbed_model/{hw => }/virtual_device.py | 0 dts/main.py | 3 +- dts/tests/TestSuite_hello_world.py| 6 +- 17 files changed, 88 insertions(+), 99 deletions(-) rename dts/framework/{testbed_model/hw => config}/cpu.py (95%) delete mode 100644 dts/framework/remote_session/remote/__init__.py rename dts/framework/remote_session/{remote => }/remote_session.py (100%) rename dts/framework/remote_session/{remote => }/ssh_session.py (100%) delete mode 100644 dts/framework/testbed_model/hw/__init__.py rename dts/framework/testbed_model/{hw => }/virtual_device.py (100%) diff --git a/dts/framework/config/__init__.py b/dts/framework/config/__init__.py index ebb0823ff5..293c4cb15b 100644 --- a/dts/framework/config/__init__.py +++ b/dts/framework/config/__init__.py @@ -7,6 +7,8 @@ Yaml config parsing methods """ +# pylama:ignore=W0611 + import json import os.path import pathlib @@ -19,6 +21,15 @@ from framework.settings import SETTINGS +from .cpu import ( +LogicalCore, +LogicalCoreCount, +LogicalCoreCountFilter, +LogicalCoreList, +LogicalCoreListFilter, +lcore_filter, +) + class StrEnum(Enum): @staticmethod diff --git a/dts/framework/testbed_model/hw/cpu.py b/dts/framework/config/cpu.py similarity index 95% rename from dts/framework/testbed_model/hw/cpu.py rename to dts/framework/config/cpu.py index d1918a12dc..8fe785dfe4 100644 --- a/dts/framework/testbed_model/hw/cpu.py +++ b/dts/framework/config/cpu.py @@ -272,3 +272,16 @@ def filter(self) -> list[LogicalCore]: ) return filtered_lcores + + +def lcore_filter( +core_list: list[LogicalCore], +filter_specifier: LogicalCoreCount | LogicalCoreList, +ascending: bool, +) -> LogicalCoreFilter: +if isinstance(filter_specifier, LogicalCoreList): +return LogicalCoreListFilter(core_list, filter_specifier, ascending) +elif isinstance(filter_specifier, LogicalCoreCount): +return LogicalCoreCountFilter(core_list, filter_specifier, ascending) +else: +raise ValueError(f"Unsupported filter r{filter_specifier}") diff --git a/dts/framework/dts.py b/dts/framework/dts.py index 0502284580..22a09b7e34 100644 --- a/dts/framework/dts.py +++ b/dts/framework/dts.py @@ -3,6 +3,7 @@ # Copyright(c) 2022-2023 PANTHEON.tech s.r.o. 
# Copyright(c) 2022-2023 University of New Hampshire +import logging import sys from .config import CONFIGURATION, BuildTargetConfiguration, ExecutionConfiguration @@ -12,7 +13,8 @@ from .testbed_model import SutNode from .utils import check_dts_python_version -dts_logger: DTSLOG = getLogger("DTSRunner") +# dummy defaults to satisfy linters +dts_logger: DTSLOG | logging.Logger = logging.getLogger("DTSRunner") result: DTSResult = DTSResult(dts_logger) @@ -24,6 +26,10 @@ def run_all() -> None: global dts_logger global result +# create a regular DTS logger and create a new result with it +dts_logger = getLogger("DTSRunner") +result = DTSResult(dts_logger) + # check the python version of the server that run dts check_dts_python_version() diff --git a/dts/framework/remote_session/__init__.py b/dts/framework/remote_session/__init__.py index ee221503df..17ca1459f7 100644 --- a/dts/framework/remote_session/__init__.py +++ b/dts/framework/remote_session/__init__.py @@ -17,7 +17,8 @@ from .linux_session import Lin
[RFC PATCH v2 2/4] dts: add doc generation dependencies
Sphinx imports every Python module when generating documentation from docstrings, meaning all dts dependencies, including Python version, must be satisfied. By adding Sphinx to dts dependencies we make sure that the proper Python version and dependencies are used when Sphinx is executed. Signed-off-by: Juraj Linkeš --- dts/poetry.lock| 770 + dts/pyproject.toml | 7 + 2 files changed, 710 insertions(+), 67 deletions(-) diff --git a/dts/poetry.lock b/dts/poetry.lock index 0b2a007d4d..500f89dac1 100644 --- a/dts/poetry.lock +++ b/dts/poetry.lock @@ -1,24 +1,69 @@ +# This file is automatically @generated by Poetry and should not be changed by hand. + +[[package]] +name = "alabaster" +version = "0.7.13" +description = "A configurable sidebar-enabled Sphinx theme" +category = "dev" +optional = false +python-versions = ">=3.6" +files = [ +{file = "alabaster-0.7.13-py3-none-any.whl", hash = "sha256:1ee19aca801bbabb5ba3f5f258e4422dfa86f82f3e9cefb0859b283cdd7f62a3"}, +{file = "alabaster-0.7.13.tar.gz", hash = "sha256:a27a4a084d5e690e16e01e03ad2b2e552c61a65469419b907243193de1a84ae2"}, +] + [[package]] name = "attrs" -version = "22.1.0" +version = "22.2.0" description = "Classes Without Boilerplate" category = "main" optional = false -python-versions = ">=3.5" +python-versions = ">=3.6" +files = [ +{file = "attrs-22.2.0-py3-none-any.whl", hash = "sha256:29e95c7f6778868dbd49170f98f8818f78f3dc5e0e37c0b1f474e3561b240836"}, +{file = "attrs-22.2.0.tar.gz", hash = "sha256:c9227bfc2f01993c03f68db37d1d15c9690188323c067c641f1a35ca58185f99"}, +] [package.extras] -dev = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest (>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "zope.interface", "furo", "sphinx", "sphinx-notfound-page", "pre-commit", "cloudpickle"] -docs = ["furo", "sphinx", "zope.interface", "sphinx-notfound-page"] -tests = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest (>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "zope.interface", "cloudpickle"] -tests_no_zope = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest (>=4.3.0)", "mypy (>=0.900,!=0.940)", "pytest-mypy-plugins", "cloudpickle"] +cov = ["attrs[tests]", "coverage-enable-subprocess", "coverage[toml] (>=5.3)"] +dev = ["attrs[docs,tests]"] +docs = ["furo", "myst-parser", "sphinx", "sphinx-notfound-page", "sphinxcontrib-towncrier", "towncrier", "zope.interface"] +tests = ["attrs[tests-no-zope]", "zope.interface"] +tests-no-zope = ["cloudpickle", "cloudpickle", "hypothesis", "hypothesis", "mypy (>=0.971,<0.990)", "mypy (>=0.971,<0.990)", "pympler", "pympler", "pytest (>=4.3.0)", "pytest (>=4.3.0)", "pytest-mypy-plugins", "pytest-mypy-plugins", "pytest-xdist[psutil]", "pytest-xdist[psutil]"] + +[[package]] +name = "babel" +version = "2.12.1" +description = "Internationalization utilities" +category = "dev" +optional = false +python-versions = ">=3.7" +files = [ +{file = "Babel-2.12.1-py3-none-any.whl", hash = "sha256:b4246fb7677d3b98f501a39d43396d3cafdc8eadb045f4a31be01863f655c610"}, +{file = "Babel-2.12.1.tar.gz", hash = "sha256:cc2d9cd01d44420ae725a21c9e3711b3aadc7976d6147f622d8581963455"}, +] [[package]] name = "black" -version = "22.10.0" +version = "22.12.0" description = "The uncompromising code formatter." 
category = "dev" optional = false python-versions = ">=3.7" +files = [ +{file = "black-22.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9eedd20838bd5d75b80c9f5487dbcb06836a43833a37846cf1d8c1cc01cef59d"}, +{file = "black-22.12.0-cp310-cp310-win_amd64.whl", hash = "sha256:159a46a4947f73387b4d83e87ea006dbb2337eab6c879620a3ba52699b1f4351"}, +{file = "black-22.12.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:d30b212bffeb1e252b31dd269dfae69dd17e06d92b87ad26e23890f3efea366f"}, +{file = "black-22.12.0-cp311-cp311-win_amd64.whl", hash = "sha256:7412e75863aa5c5411886804678b7d083c7c28421210180d67dfd8cf1221e1f4"}, +{file = "black-22.12.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c116eed0efb9ff870ded8b62fe9f28dd61ef6e9ddd28d83d7d264a38417dcee2"}, +{file = "black-22.12.0-cp37-cp37m-win_amd64.whl", hash = "sha256:1f58cbe16dfe8c12b7434e50ff889fa479072096d79f0a7f25e4ab8e94cd8350"}, +{file = "black-22.12.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:77d86c9f3db9b1bf6761244bc0b3572a546f5fe37917a044e02f3166d5aafa7d"}, +{file = "black-22.12.0-cp38-cp38-win_amd64.whl", hash = "sha256:82d9fe8fee3401e02e79767016b4907820a7dc28d70d137eb397b92ef3cc5bfc"}, +{file = "black-22.12.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:101c69b23df9b44247bd88e1d7e90154336ac4992502d4197bdac35dd7ee3320"}, +{file = "black-22.12.0-cp39-cp39-win_amd64.whl", hash = "sha256:559c7a1ba9a006226f09e4916060982fd27334ae1998e
[RFC PATCH v2 3/4] dts: add doc generation
The tool used to generate developer docs is sphinx, which is already used in DPDK. The configuration is kept the same to preserve the style. Sphinx generates the documentation from Python docstrings. The docstring format most suitable for DTS seems to be the Google format [0] which requires the sphinx.ext.napoleon extension. There are two requirements for building DTS docs: * The same Python version as DTS or higher, because Sphinx imports the code. * Also the same Python packages as DTS, for the same reason. [0] https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings Signed-off-by: Juraj Linkeš --- doc/api/meson.build | 1 + doc/guides/conf.py | 22 ++ doc/guides/meson.build | 1 + doc/guides/tools/dts.rst | 29 +++ dts/doc/doc-index.rst| 20 dts/doc/meson.build | 50 dts/meson.build | 16 + meson.build | 1 + meson_options.txt| 2 ++ 9 files changed, 137 insertions(+), 5 deletions(-) create mode 100644 dts/doc/doc-index.rst create mode 100644 dts/doc/meson.build create mode 100644 dts/meson.build diff --git a/doc/api/meson.build b/doc/api/meson.build index 2876a78a7e..1f0c725a94 100644 --- a/doc/api/meson.build +++ b/doc/api/meson.build @@ -1,6 +1,7 @@ # SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2018 Luca Boccassi +doc_api_build_dir = meson.current_build_dir() doxygen = find_program('doxygen', required: get_option('enable_docs')) if not doxygen.found() diff --git a/doc/guides/conf.py b/doc/guides/conf.py index a55ce38800..04c842b67a 100644 --- a/doc/guides/conf.py +++ b/doc/guides/conf.py @@ -7,10 +7,9 @@ from sphinx import __version__ as sphinx_version from os import listdir from os import environ -from os.path import basename -from os.path import dirname +from os.path import basename, dirname from os.path import join as path_join -from sys import argv, stderr +from sys import argv, stderr, path import configparser @@ -24,6 +23,19 @@ file=stderr) pass +extensions = ['sphinx.ext.napoleon'] + +# Python docstring options +autodoc_member_order = 'bysource' +autodoc_typehints = 'both' +autodoc_typehints_format = 'short' +napoleon_numpy_docstring = False +napoleon_attr_annotations = True +napoleon_use_ivar = True +napoleon_use_rtype = False +add_module_names = False +toc_object_entries_show_parents = 'hide' + stop_on_error = ('-W' in argv) project = 'Data Plane Development Kit' @@ -35,8 +47,8 @@ html_show_copyright = False highlight_language = 'none' -release = environ.setdefault('DPDK_VERSION', "None") -version = release +path.append(environ.setdefault('DTS_ROOT', '.')) +version = environ.setdefault('DPDK_VERSION', "None") master_doc = 'index' diff --git a/doc/guides/meson.build b/doc/guides/meson.build index 51f81da2e3..8933d75f6b 100644 --- a/doc/guides/meson.build +++ b/doc/guides/meson.build @@ -1,6 +1,7 @@ # SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2018 Intel Corporation +doc_guides_source_dir = meson.current_source_dir() sphinx = find_program('sphinx-build', required: get_option('enable_docs')) if not sphinx.found() diff --git a/doc/guides/tools/dts.rst b/doc/guides/tools/dts.rst index ebd6dceb6a..a547da2017 100644 --- a/doc/guides/tools/dts.rst +++ b/doc/guides/tools/dts.rst @@ -282,3 +282,32 @@ There are three tools used in DTS to help with code checking, style and formatti These three tools are all used in ``devtools/dts-check-format.sh``, the DTS code check and format script. Refer to the script for usage: ``devtools/dts-check-format.sh -h``.
+ + +Building DTS API docs +- + +To build DTS API docs, install the dependencies with Poetry, then enter its shell: + + .. code-block:: console + + poetry install --with docs + poetry shell + + +Build commands +~~ + +The documentation is built using the standard DPDK build system. + +After entering Poetry's shell, build the documentation with: + + .. code-block:: console + + ninja -C build dts/doc + +The output is generated in ``build/doc/api/dts/html``. + +.. Note:: + + Make sure to fix any Sphinx warnings when adding or updating docstrings. diff --git a/dts/doc/doc-index.rst b/dts/doc/doc-index.rst new file mode 100644 index 00..10151c6851 --- /dev/null +++ b/dts/doc/doc-index.rst @@ -0,0 +1,20 @@ +.. DPDK Test Suite documentation master file, created by + sphinx-quickstart on Tue Mar 14 12:23:52 2023. + You can adapt this file completely to your liking, but it should at least + contain the root `toctree` directive. + +Welcome to DPDK Test Suite's documentation! +=== + +.. toctree:: + :maxdepth: 4 + :caption: Contents: + + modules + +Indices and tables +== + +* :ref:`genindex` +* :ref:`modindex` +* :ref:`search` diff --git a/dts/doc/meson.build b/dts/doc/meson.build n
[RFC PATCH v2 4/4] dts: format docstrings to google format
WIP: only one module is reformatted to serve as a demonstration. The google format is documented here [0]. [0]: https://google.github.io/styleguide/pyguide.html Signed-off-by: Juraj Linkeš --- dts/framework/testbed_model/node.py | 152 +++- 1 file changed, 103 insertions(+), 49 deletions(-) diff --git a/dts/framework/testbed_model/node.py b/dts/framework/testbed_model/node.py index 90467981c3..ad8ef442af 100644 --- a/dts/framework/testbed_model/node.py +++ b/dts/framework/testbed_model/node.py @@ -3,8 +3,13 @@ # Copyright(c) 2022-2023 PANTHEON.tech s.r.o. # Copyright(c) 2022-2023 University of New Hampshire -""" -A node is a generic host that DTS connects to and manages. +"""Common functionality for node management. + +There's a base class, Node, that's supposed to be extended by other classes +with functionality specific to that node type. +The only part that can be used standalone is the Node.skip_setup static method, +which is a decorator used to skip method execution +if skip_setup is passed by the user on the cmdline or in an env variable. """ from typing import Any, Callable @@ -26,10 +31,25 @@ class Node(object): -""" -Basic class for node management. This class implements methods that -manage a node, such as information gathering (of CPU/PCI/NIC) and -environment setup. +"""The base class for node management. + +It shouldn't be instantiated, but rather extended. +It implements common methods to manage any node: + + * connection to the node + * information gathering of CPU + * hugepages setup + +Arguments: +node_config: The config from the input configuration file. + +Attributes: +main_session: The primary OS-agnostic remote session used +to communicate with the node. +config: The configuration used to create the node. +name: The name of the node. +lcores: The list of logical cores that DTS can use on the node. +It's derived from logical cores present on the node and user configuration. """ main_session: OSSession @@ -56,65 +76,89 @@ def __init__(self, node_config: NodeConfiguration): self._logger.info(f"Created node: {self.name}") def set_up_execution(self, execution_config: ExecutionConfiguration) -> None: -""" -Perform the execution setup that will be done for each execution -this node is part of. +"""Execution setup steps. + +Configure hugepages and call self._set_up_execution where +the rest of the configuration steps (if any) are implemented. + +Args: +execution_config: The execution configuration according to which +the setup steps will be taken. """ self._setup_hugepages() self._set_up_execution(execution_config) def _set_up_execution(self, execution_config: ExecutionConfiguration) -> None: -""" -This method exists to be optionally overwritten by derived classes and -is not decorated so that the derived class doesn't have to use the decorator. +"""Optional additional execution setup steps for derived classes. + +Derived classes should overwrite this +if they want to add additional execution setup steps. """ def tear_down_execution(self) -> None: -""" -Perform the execution teardown that will be done after each execution -this node is part of concludes. +"""Execution teardown steps. + +There are currently no common execution teardown steps +common to all DTS node types. """ self._tear_down_execution() def _tear_down_execution(self) -> None: -""" -This method exists to be optionally overwritten by derived classes and -is not decorated so that the derived class doesn't have to use the decorator. +"""Optional additional execution teardown steps for derived classes. 
+ +Derived classes should overwrite this +if they want to add additional execution teardown steps. """ def set_up_build_target( self, build_target_config: BuildTargetConfiguration ) -> None: -""" -Perform the build target setup that will be done for each build target -tested on this node. +"""Build target setup steps. + +There are currently no common build target setup steps +common to all DTS node types. + +Args: +build_target_config: The build target configuration according to which +the setup steps will be taken. """ self._set_up_build_target(build_target_config) def _set_up_build_target( self, build_target_config: BuildTargetConfiguration ) -> None: -""" -This method exists to be optionally overwritten by derived classes and -is not decorated so that the derived class
Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions
Hi Qi, Thanks for your review. On Fri, Apr 28, 2023 at 11:43 AM, Zhang, Qi Z wrote: -Original Message- From: Min Zhou Sent: Monday, April 24, 2023 5:06 PM To: Yang, Qiming ; Wu, Wenjun1 ; zhou...@loongson.cn Cc: dev@dpdk.org; maob...@loongson.cn Subject: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions Segmentation fault has been observed while running the ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 processor, which has 64 cores and 4 NUMA nodes. From the ixgbe_recv_pkts_lro() function, we found that as long as the first packet has the EOP bit set and the length of this packet is less than or equal to rxq->crc_len, the segmentation fault will definitely happen, even though it does not on other platforms, such as X86. Because when processing the first packet the first_seg->next will be NULL, if at the same time this packet has the EOP bit set and its length is less than or equal to rxq->crc_len, the following loop will be executed: for (lp = first_seg; lp->next != rxm; lp = lp->next) ; We know that the first_seg->next will be NULL under this condition. So the expression lp->next->next will cause the segmentation fault. Normally, the length of the first packet with the EOP bit set will be greater than rxq->crc_len. However, the out-of-order execution of the CPU may make the read ordering of the status and the rest of the descriptor fields in this function incorrect. The related code is as follows: rxdp = &rx_ring[rx_id]; #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; #2 rxd = *rxdp; Statement #2 may be executed before statement #1. This action is likely to make the ready packet zero length. If the packet is the first packet and has the EOP bit set, the above segmentation fault will happen. So, we should add rte_rmb() to ensure the read ordering is correct. We also did the same thing in the ixgbe_recv_pkts() function to make the rxd data be valid even though we did not find a segmentation fault in this function. Signed-off-by: Min Zhou --- v2: - Make the calling of rte_rmb() for all platforms --- drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, staterr = rxdp->wb.upper.status_error; if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) break; + + rte_rmb(); So "volatile" does not prevent re-ordering with the Loongson compiler? The memory consistency model of the LoongArch [1] uses the Weak Consistency model, in which memory operations can be reordered. [1] https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN#overview-of-memory-consistency rxd = *rxdp; /* @@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; + rte_rmb(); rxd = *rxdp; PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u " -- 2.31.1 Best regards, Min
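[Editor's note: for readers reconstructing the crash from the description above, a stripped-down model of the failing list walk may help. This is not the driver code; the struct and function below are illustrative only.]

```c
#include <stddef.h>

struct seg {
	struct seg *next;
};

/*
 * Model of the loop quoted in the commit message. When the first packet
 * carries the EOP bit and its length is <= rxq->crc_len, the driver looks
 * for the predecessor of 'rxm' in a chain where first_seg->next is still
 * NULL, so the walk ends up dereferencing a NULL pointer.
 */
static struct seg *
find_predecessor(struct seg *first_seg, struct seg *rxm)
{
	struct seg *lp;

	for (lp = first_seg; lp->next != rxm; lp = lp->next)
		;	/* faults here once lp has become NULL */
	return lp;
}
```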
Re: [RFC PATCH v2 3/4] dts: add doc generation
On Thu, May 04, 2023 at 02:37:48PM +0200, Juraj Linkeš wrote: > The tool used to generate developer docs is sphinx, which is already > used in DPDK. The configuration is kept the same to preserve the style. > > Sphinx generates the documentation from Python docstrings. The docstring > format most suitable for DTS seems to be the Google format [0] which > requires the sphinx.ext.napoleon extension. > > There are two requirements for building DTS docs: > * The same Python version as DTS or higher, because Sphinx import the > code. > * Also the same Python packages as DTS, for the same reason. > > [0] > https://google.github.io/styleguide/pyguide.html#38-comments-and-docstrings > > Signed-off-by: Juraj Linkeš > --- > doc/api/meson.build | 1 + > doc/guides/conf.py | 22 ++ > doc/guides/meson.build | 1 + > doc/guides/tools/dts.rst | 29 +++ > dts/doc/doc-index.rst| 20 > dts/doc/meson.build | 50 > dts/meson.build | 16 + > meson.build | 1 + > meson_options.txt| 2 ++ > 9 files changed, 137 insertions(+), 5 deletions(-) > create mode 100644 dts/doc/doc-index.rst > create mode 100644 dts/doc/meson.build > create mode 100644 dts/meson.build > > diff --git a/dts/doc/meson.build b/dts/doc/meson.build > new file mode 100644 > index 00..db2bb0bed9 > --- /dev/null > +++ b/dts/doc/meson.build > @@ -0,0 +1,50 @@ > +# SPDX-License-Identifier: BSD-3-Clause > +# Copyright(c) 2023 PANTHEON.tech s.r.o. > + > +sphinx = find_program('sphinx-build', required: > get_option('enable_dts_docs')) > +sphinx_apidoc = find_program('sphinx-apidoc', required: > get_option('enable_dts_docs')) > + > +if sphinx.found() and sphinx_apidoc.found() > +endif > + > +dts_api_framework_dir = join_paths(dts_dir, 'framework') > +dts_api_build_dir = join_paths(doc_api_build_dir, 'dts') > +dts_api_src = custom_target('dts_api_src', > +output: 'modules.rst', > +command: ['SPHINX_APIDOC_OPTIONS=members,show-inheritance', > +sphinx_apidoc, '--append-syspath', '--force', > +'--module-first', '--separate', > +'--doc-project', 'DTS', '-V', meson.project_version(), > +'-o', dts_api_build_dir, > +dts_api_framework_dir], > +build_by_default: get_option('enable_dts_docs')) > +doc_targets += dts_api_src > +doc_target_names += 'DTS_API_sphinx_sources' > + > +cp = find_program('cp', required: get_option('enable_docs')) This should probably be "enable_dts_docs"
Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions
Hi Morten, Thanks for your comments. On Fri, Apr 28, 2023 at 2:27 PM, Morten Brørup wrote: From: Zhang, Qi Z [mailto:qi.z.zh...@intel.com] Sent: Friday, 28 April 2023 05.44 From: Min Zhou Sent: Monday, April 24, 2023 5:06 PM Segmentation fault has been observed while running the ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 processor, which has 64 cores and 4 NUMA nodes. From the ixgbe_recv_pkts_lro() function, we found that as long as the first packet has the EOP bit set and the length of this packet is less than or equal to rxq->crc_len, the segmentation fault will definitely happen, even though it does not on other platforms, such as X86. Because when processing the first packet the first_seg->next will be NULL, if at the same time this packet has the EOP bit set and its length is less than or equal to rxq->crc_len, the following loop will be executed: for (lp = first_seg; lp->next != rxm; lp = lp->next) ; We know that the first_seg->next will be NULL under this condition. So the expression lp->next->next will cause the segmentation fault. Normally, the length of the first packet with the EOP bit set will be greater than rxq->crc_len. However, the out-of-order execution of the CPU may make the read ordering of the status and the rest of the descriptor fields in this function incorrect. The related code is as follows: rxdp = &rx_ring[rx_id]; #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; #2 rxd = *rxdp; Statement #2 may be executed before statement #1. This action is likely to make the ready packet zero length. If the packet is the first packet and has the EOP bit set, the above segmentation fault will happen. So, we should add rte_rmb() to ensure the read ordering is correct. We also did the same thing in the ixgbe_recv_pkts() function to make the rxd data be valid even though we did not find a segmentation fault in this function. Signed-off-by: Min Zhou --- v2: - Make the calling of rte_rmb() for all platforms --- drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, staterr = rxdp->wb.upper.status_error; if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) break; + + rte_rmb(); So "volatile" does not prevent re-ordering with the Loongson compiler? "Volatile" does not prevent re-ordering on any compiler. "Volatile" only prevents caching of the variable marked volatile. https://wiki.sei.cmu.edu/confluence/display/c/CON02-C.+Do+not+use+volatile+as+a+synchronization+primitive Thinking out loud: I don't know the performance cost of rte_rmb(); perhaps using atomic accesses with the optimal memory ordering would be a better solution in the long term. Yes, rte_rmb() probably has side effects on the performance. I will use a better solution to solve the problem in the V2 patch. rxd = *rxdp; /* @@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; + rte_rmb(); rxd = *rxdp; PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u " -- 2.31.1 Best regards, Min
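[Editor's note: to make the ordering hazard concrete, here is a minimal sketch of the polling pattern under discussion, with the barrier placed as in the patch. The descriptor layout and DD_BIT constant are simplified stand-ins; only rte_rmb() and rte_le_to_cpu_32() are real DPDK calls.]

```c
#include <stdbool.h>
#include <stdint.h>

#include <rte_atomic.h>		/* rte_rmb() */
#include <rte_byteorder.h>	/* rte_le_to_cpu_32() */

#define DD_BIT 0x1u	/* illustrative stand-in for IXGBE_RXDADV_STAT_DD */

struct rx_desc {
	uint32_t status_error;	/* written last by the NIC */
	uint32_t pkt_len;	/* written before status_error */
};

/* Copy out the descriptor once the NIC has marked it done. */
static bool
poll_descriptor(volatile struct rx_desc *rxdp, struct rx_desc *out)
{
	uint32_t staterr = rte_le_to_cpu_32(rxdp->status_error);

	if (!(staterr & DD_BIT))
		return false;

	/*
	 * 'volatile' only forces each access to be performed; it does not
	 * stop a weakly ordered CPU from issuing the loads below before the
	 * DD-bit load above, which is how a stale (zero) pkt_len can be
	 * observed even though DD reads as set.
	 */
	rte_rmb();

	out->status_error = staterr;
	out->pkt_len = rxdp->pkt_len;
	return true;
}
```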
Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions
Hi Konstantin, Thanks for your comments. On 2023/5/1 9:29 PM, Konstantin Ananyev wrote: Segmentation fault has been observed while running the ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 processor, which has 64 cores and 4 NUMA nodes. From the ixgbe_recv_pkts_lro() function, we found that as long as the first packet has the EOP bit set and the length of this packet is less than or equal to rxq->crc_len, the segmentation fault will definitely happen, even though it does not on other platforms, such as X86. Because when processing the first packet the first_seg->next will be NULL, if at the same time this packet has the EOP bit set and its length is less than or equal to rxq->crc_len, the following loop will be executed: for (lp = first_seg; lp->next != rxm; lp = lp->next) ; We know that the first_seg->next will be NULL under this condition. So the expression lp->next->next will cause the segmentation fault. Normally, the length of the first packet with the EOP bit set will be greater than rxq->crc_len. However, the out-of-order execution of the CPU may make the read ordering of the status and the rest of the descriptor fields in this function incorrect. The related code is as follows: rxdp = &rx_ring[rx_id]; #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; #2 rxd = *rxdp; Statement #2 may be executed before statement #1. This action is likely to make the ready packet zero length. If the packet is the first packet and has the EOP bit set, the above segmentation fault will happen. So, we should add rte_rmb() to ensure the read ordering is correct. We also did the same thing in the ixgbe_recv_pkts() function to make the rxd data be valid even though we did not find a segmentation fault in this function. Signed-off-by: Min Zhou --- v2: - Make the calling of rte_rmb() for all platforms --- drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, staterr = rxdp->wb.upper.status_error; if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) break; + + rte_rmb(); rxd = *rxdp; Indeed, looks like a problem to me on systems with relaxed MO. Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers. The LoongArch architecture uses the Weak Consistency model which can cause the problem, especially in scenarios with many cores, such as the Loongson 3C5000 with four NUMA nodes, which has 64 cores. I cannot reproduce it on the Loongson 3C5000 with one NUMA node, which has just 16 cores. About a fix - looks right, but a bit excessive to me - as I understand, all we need here is to prevent re-ordering by the CPU itself. Yes, thanks for cc-ing. So rte_smp_rmb() seems enough here. Or might be just: staterr = __atomic_load_n(&rxdp->wb.upper.status_error, __ATOMIC_ACQUIRE); Does __atomic_load_n() work on Windows if we use it to solve this problem? /* @@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; + rte_rmb(); rxd = *rxdp; PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u " -- 2.31.1 Best regards, Min
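[Editor's note: Konstantin's alternative can be sketched as below; __atomic_load_n() is the GCC/clang builtin quoted in the thread, while the descriptor struct is again an illustrative stand-in.]

```c
#include <stdint.h>

struct rx_desc {
	uint32_t status_error;
	uint32_t pkt_len;
};

/*
 * Acquire semantics order only this load against the later descriptor
 * reads, instead of issuing a full read barrier such as rte_rmb().
 */
static inline uint32_t
read_status_acquire(volatile struct rx_desc *rxdp)
{
	return __atomic_load_n(&rxdp->status_error, __ATOMIC_ACQUIRE);
}
```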
RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions
> From: zhoumin [mailto:zhou...@loongson.cn] > Sent: Thursday, 4 May 2023 15.17 > > Hi Konstantin, > > Thanks for your comments. > > On 2023/5/1 9:29 PM, Konstantin Ananyev wrote: > >> Segmentation fault has been observed while running the > >> ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 > >> processor, which has 64 cores and 4 NUMA nodes. > >> > >> From the ixgbe_recv_pkts_lro() function, we found that as long as the > >> first > >> packet has the EOP bit set and the length of this packet is less > >> than or > >> equal to rxq->crc_len, the segmentation fault will definitely happen, > >> even though it does not on other platforms, such as X86. > >> > >> Because when processing the first packet the first_seg->next will be > >> NULL, if > >> at the same time this packet has the EOP bit set and its length is less > >> than or equal to rxq->crc_len, the following loop will be executed: > >> > >> for (lp = first_seg; lp->next != rxm; lp = lp->next) > >> ; > >> > >> We know that the first_seg->next will be NULL under this condition. > >> So the > >> expression lp->next->next will cause the segmentation fault. > >> > >> Normally, the length of the first packet with the EOP bit set will be > >> greater > >> than rxq->crc_len. However, the out-of-order execution of the CPU may > >> make the > >> read ordering of the status and the rest of the descriptor fields in > >> this > >> function incorrect. The related code is as follows: > >> > >> rxdp = &rx_ring[rx_id]; > >> #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); > >> > >> if (!(staterr & IXGBE_RXDADV_STAT_DD)) > >> break; > >> > >> #2 rxd = *rxdp; > >> > >> Statement #2 may be executed before statement #1. This action is > >> likely > >> to make the ready packet zero length. If the packet is the first > >> packet and > >> has the EOP bit set, the above segmentation fault will happen. > >> > >> So, we should add rte_rmb() to ensure the read ordering is correct. > >> We also > >> did the same thing in the ixgbe_recv_pkts() function to make the rxd > >> data > >> be valid even though we did not find a segmentation fault in this > >> function. > >> > >> Signed-off-by: Min Zhou > >> --- > >> v2: > >> - Make the calling of rte_rmb() for all platforms > >> --- > >> drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ > >> 1 file changed, 3 insertions(+) > >> > >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c > >> b/drivers/net/ixgbe/ixgbe_rxtx.c > >> index c9d6ca9efe..302a5ab7ff 100644 > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c > >> @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf > >> **rx_pkts, > >> staterr = rxdp->wb.upper.status_error; > >> if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) > >> break; > >> + > >> + rte_rmb(); > >> rxd = *rxdp; > > > > > > > > Indeed, looks like a problem to me on systems with relaxed MO. > > Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers. > The LoongArch architecture uses the Weak Consistency model which can > cause the problem, especially in scenarios with many cores, such as > the Loongson 3C5000 with four NUMA nodes, which has 64 cores. I cannot > reproduce it on the Loongson 3C5000 with one NUMA node, which has just 16 cores. > > About a fix - looks right, but a bit excessive to me - > > as I understand, all we need here is to prevent re-ordering by the CPU itself. > Yes, thanks for cc-ing. > > So rte_smp_rmb() seems enough here.
> > Or might be just: > > staterr = __atomic_load_n(&rxdp->wb.upper.status_error, > > __ATOMIC_ACQUIRE); > > > Does __atomic_load_n() work on Windows if we use it to solve this problem? Yes, __atomic_load_n() works on Windows too.
RE: [PATCH v2] net/ixgbe: consider DCB/VMDq conf when getting RSS conf
> -Original Message- > From: Min Zhou > Sent: Friday, April 28, 2023 4:43 PM > To: Zhang, Qi Z ; Yang, Qiming > ; Wu, Wenjun1 ; > zhou...@loongson.cn > Cc: dev@dpdk.org; maob...@loongson.cn > Subject: [PATCH v2] net/ixgbe: consider DCB/VMDq conf when getting RSS > conf > > The mrqe field of MRQC register is an enum. From the Intel 82599 datasheet, > we know that these values below for the mrqe field are all related to RSS > configuration: > 0000b = RSS disabled. > 0001b = RSS only -- Single set of RSS 16 queues. > 0010b = DCB enabled and RSS disabled -- 8 TCs, each allocated 1 > queue. > 0011b = DCB enabled and RSS disabled -- 4 TCs, each allocated 1 > queue. > 0100b = DCB and RSS -- 8 TCs, each allocated 16 RSS queues. > 0101b = DCB and RSS -- 4 TCs, each allocated 16 RSS queues. > 1000b = Virtualization only -- 64 pools, no RSS, each pool allocated > 2 queues. > 1010b = Virtualization and RSS -- 32 pools, each allocated 4 RSS > queues. > 1011b = Virtualization and RSS -- 64 pools, each allocated 2 RSS > queues. > > The ixgbe pmd will check whether the rss is enabled or not when getting rss > conf. So, besides comparing the value of the mrqe field with xxx0b and xxx1b, we > also need to consider the other configurations, such as DCB + RSS or VMDQ + > RSS. Otherwise, we may not get the correct rss conf in some cases, such as > when we use DCB and RSS with 8 TCs, which corresponds to 0100b for the > mrqe field. > > Signed-off-by: Min Zhou Acked-by: Qi Zhang Applied to dpdk-next-net-intel. Thanks Qi
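[Editor's note: a rough sketch of the resulting check, with the datasheet values above folded into one predicate. The mask and the constants are written out here for illustration and are not the driver's actual macro names.]

```c
#include <stdbool.h>
#include <stdint.h>

#define MRQC_MRQE_MASK 0x0000000Fu	/* low 4 bits select the queueing mode */

/* True when the mrqe encoding includes RSS, alone or with DCB/VMDq. */
static bool
mrqe_rss_enabled(uint32_t mrqc)
{
	switch (mrqc & MRQC_MRQE_MASK) {
	case 0x1:	/* RSS only */
	case 0x4:	/* DCB (8 TCs) + RSS */
	case 0x5:	/* DCB (4 TCs) + RSS */
	case 0xA:	/* VMDq (32 pools) + RSS */
	case 0xB:	/* VMDq (64 pools) + RSS */
		return true;
	default:	/* 0x0 disabled, 0x2/0x3 DCB only, 0x8 VMDq only */
		return false;
	}
}
```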
RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions
> -Original Message- > From: Morten Brørup > Sent: Thursday, May 4, 2023 9:22 PM > To: zhoumin ; Konstantin Ananyev > > Cc: dev@dpdk.org; maob...@loongson.cn; Yang, Qiming > ; Wu, Wenjun1 ; > ruifeng.w...@arm.com; d...@linux.vnet.ibm.com; Tyler Retzlaff > > Subject: RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx > functions > > > From: zhoumin [mailto:zhou...@loongson.cn] > > Sent: Thursday, 4 May 2023 15.17 > > > > Hi Konstantin, > > > > Thanks for your comments. > > > > On 2023/5/1 9:29 PM, Konstantin Ananyev wrote: > > >> Segmentation fault has been observed while running the > > >> ixgbe_recv_pkts_lro() function to receive packets on the Loongson > > >> 3C5000 processor, which has 64 cores and 4 NUMA nodes. Sorry to interrupt, but I am curious why this issue still exists on x86 architecture. Can volatile be used to instruct the compiler to generate read instructions in a specific order, and does x86 guarantee not to reorder load operations? > > >> > > >> From the ixgbe_recv_pkts_lro() function, we found that as long as > > >> the first packet has the EOP bit set, and the length of this packet > > >> is less than or equal to rxq->crc_len, the segmentation fault will > > >> definitely happen, even though it does not on other platforms, such as X86. > > >> > > >> Because when processing the first packet the first_seg->next will be > > >> NULL, if at the same time this packet has the EOP bit set and its > > >> length is less than or equal to rxq->crc_len, the following loop > > >> will be executed: > > >> > > >> for (lp = first_seg; lp->next != rxm; lp = lp->next) > > >> ; > > >> > > >> We know that the first_seg->next will be NULL under this condition. > > >> So the > > >> expression lp->next->next will cause the segmentation fault. > > >> > > >> Normally, the length of the first packet with the EOP bit set will be > > >> greater than rxq->crc_len. However, the out-of-order execution of > > >> the CPU may make the read ordering of the status and the rest of the > > >> descriptor fields in this function incorrect. The related > > >> code is as follows: > > >> > > >> rxdp = &rx_ring[rx_id]; > > >> #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); > > >> > > >> if (!(staterr & IXGBE_RXDADV_STAT_DD)) > > >> break; > > >> > > >> #2 rxd = *rxdp; > > >> > > >> Statement #2 may be executed before statement #1. This action is > > >> likely to make the ready packet zero length. If the packet is the > > >> first packet and has the EOP bit set, the above segmentation fault > > >> will happen. > > >> > > >> So, we should add rte_rmb() to ensure the read ordering is correct. > > >> We also > > >> did the same thing in the ixgbe_recv_pkts() function to make the > > >> rxd data be valid even though we did not find a segmentation fault > > >> in this function.
> > >> > > >> Signed-off-by: Min Zhou > > >> --- > > >> v2: > > >> - Make the calling of rte_rmb() for all platforms > > >> --- > > >> drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ > > >> 1 file changed, 3 insertions(+) > > >> > > >> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c > > >> b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff > > >> 100644 > > >> --- a/drivers/net/ixgbe/ixgbe_rxtx.c > > >> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c > > >> @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct > > >> rte_mbuf **rx_pkts, > > >> staterr = rxdp->wb.upper.status_error; > > >> if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) > > >> break; > > >> + > > >> + rte_rmb(); > > >> rxd = *rxdp; > > > > > > > > > > > > Indeed, looks like a problem to me on systems with relaxed MO. > > > Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers. > > The LoongArch architecture uses the Weak Consistency model which can > > cause the problem, especially in scenario with many cores, such as > > Loongson 3C5000 with four NUMA node, which has 64 cores. I cannot > > reproduce it on Loongson 3C5000 with one NUMA node, which just has 16 > cores. > > > About a fix - looks right, but a bit excessive to me - as I > > > understand all we need here is to prevent re-ordering by CPU itself. > > Yes, thanks for cc-ing. > > > So rte_smp_rmb() seems enough here. > > > Or might be just: > > > staterr = __atomic_load_n(&rxdp->wb.upper.status_error, > > > __ATOMIC_ACQUIRE); > > > > > Does __atomic_load_n() work on Windows if we use it to solve this > problem ? > > Yes, __atomic_load_n() works on Windows too. >
Re: [PATCH] net/gve: Check whether the driver is compatible with the device presented.
On 4/26/2023 10:37 PM, Rushil Gupta wrote: > Change gve_driver_info fields to report DPDK as OS type and DPDK RTE > version as OS version, reserving driver_version fields for GVE driver > version based on features. > './devtools/check-git-log.sh' is giving some warnings, can you please check them? Contribution guide documentation may help to fix reported issues: https://doc.dpdk.org/guides/contributing/patches.html#commit-messages-subject-line > Depends-on: series-27687 > > Signed-off-by: Rushil Gupta > Signed-off-by: Joshua Washington > Signed-off-by: Junfeng Guo > Signed-off-by: Jeroen de Borst > --- > drivers/net/gve/base/gve.h| 3 -- > drivers/net/gve/base/gve_adminq.c | 19 + > drivers/net/gve/base/gve_adminq.h | 49 ++- > drivers/net/gve/base/gve_osdep.h | 36 + > drivers/net/gve/gve_ethdev.c | 65 +-- > drivers/net/gve/gve_ethdev.h | 2 +- > drivers/net/gve/gve_version.c | 14 +++ > drivers/net/gve/gve_version.h | 25 > drivers/net/gve/meson.build | 1 + > 9 files changed, 198 insertions(+), 16 deletions(-) > create mode 100644 drivers/net/gve/gve_version.c > create mode 100644 drivers/net/gve/gve_version.h > > diff --git a/drivers/net/gve/base/gve.h b/drivers/net/gve/base/gve.h > index 2dc4507acb..89f9654a72 100644 > --- a/drivers/net/gve/base/gve.h > +++ b/drivers/net/gve/base/gve.h > @@ -8,9 +8,6 @@ > > #include "gve_desc.h" > > -#define GVE_VERSION "1.3.0" > -#define GVE_VERSION_PREFIX "GVE-" > - > #ifndef GOOGLE_VENDOR_ID > #define GOOGLE_VENDOR_ID 0x1ae0 > #endif > diff --git a/drivers/net/gve/base/gve_adminq.c > b/drivers/net/gve/base/gve_adminq.c > index e745b709b2..2e5099a5b0 100644 > --- a/drivers/net/gve/base/gve_adminq.c > +++ b/drivers/net/gve/base/gve_adminq.c > @@ -401,6 +401,9 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv, > case GVE_ADMINQ_GET_PTYPE_MAP: > priv->adminq_get_ptype_map_cnt++; > break; > + case GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY: > + priv->adminq_verify_driver_compatibility_cnt++; > + break; > default: > PMD_DRV_LOG(ERR, "unknown AQ command opcode %d", opcode); > } > @@ -465,6 +468,22 @@ int gve_adminq_configure_device_resources(struct > gve_priv *priv, > return gve_adminq_execute_cmd(priv, &cmd); > } > > +int gve_adminq_verify_driver_compatibility(struct gve_priv *priv, > +u64 driver_info_len, > +dma_addr_t driver_info_addr) > +{ > + union gve_adminq_command cmd; > + > + memset(&cmd, 0, sizeof(cmd)); > + cmd.opcode = cpu_to_be32(GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY); > + cmd.verify_driver_compatibility = (struct > gve_adminq_verify_driver_compatibility) { > + .driver_info_len = cpu_to_be64(driver_info_len), > + .driver_info_addr = cpu_to_be64(driver_info_addr), > + }; > + > + return gve_adminq_execute_cmd(priv, &cmd); > +} > + > int gve_adminq_deconfigure_device_resources(struct gve_priv *priv) > { > union gve_adminq_command cmd; > diff --git a/drivers/net/gve/base/gve_adminq.h > b/drivers/net/gve/base/gve_adminq.h > index 05550119de..edac32f031 100644 > --- a/drivers/net/gve/base/gve_adminq.h > +++ b/drivers/net/gve/base/gve_adminq.h > @@ -1,6 +1,6 @@ > /* SPDX-License-Identifier: MIT > * Google Virtual Ethernet (gve) driver > - * Copyright (C) 2015-2022 Google, Inc. > + * Copyright (C) 2015-2023 Google, Inc. 
> */ > > #ifndef _GVE_ADMINQ_H > @@ -23,6 +23,7 @@ enum gve_adminq_opcodes { > GVE_ADMINQ_REPORT_STATS = 0xC, > GVE_ADMINQ_REPORT_LINK_SPEED= 0xD, > GVE_ADMINQ_GET_PTYPE_MAP= 0xE, > + GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY = 0xF, > }; > > /* Admin queue status codes */ > @@ -145,6 +146,47 @@ enum gve_sup_feature_mask { > }; > > #define GVE_DEV_OPT_LEN_GQI_RAW_ADDRESSING 0x0 > +enum gve_driver_capbility { > + gve_driver_capability_gqi_qpl = 0, > + gve_driver_capability_gqi_rda = 1, > + gve_driver_capability_dqo_qpl = 2, /* reserved for future use */ > + gve_driver_capability_dqo_rda = 3, > +}; > + > +#define GVE_CAP1(a) BIT((int)a) > +#define GVE_CAP2(a) BIT(((int)a) - 64) > +#define GVE_CAP3(a) BIT(((int)a) - 128) > +#define GVE_CAP4(a) BIT(((int)a) - 192) GVE_CAP2, GVE_CAP3 & GVE_CAP4 seem unused, do we need to add them now? And I guess '(((int)a) - 64)' usage is to unset some bits, will it be better to '(a & ~(1 << 6))' ? > + > +#define GVE_DRIVER_CAPABILITY_FLAGS1 \ > + (GVE_CAP1(gve_driver_capability_gqi_qpl) | \ > + GVE_CAP1(gve_driver_capability_gqi_rda) | \ > + GVE_CAP1(gve_driver_capability_dqo_rda)) > + > +#define GVE_DRIVER_CAPABILITY_FLAGS2 0x0 > +#define GVE_DRIVER_CAPABILITY_FLAGS3 0x0 > +#define GVE_DRIVER_CAPABILITY_FLAGS4 0x0
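[Editor's note: the capability flags discussed here form four 64-bit windows covering a 256-bit capability space; capability n maps to bit (n % 64) of flags word (n / 64). A generic sketch of that mapping, with helper names that are mine, not the driver's:]

```c
#include <stdint.h>

/* Capability n lives at bit (n % 64) of 64-bit flags word (n / 64). */
static inline void
cap_set(uint64_t flags[4], unsigned int cap)
{
	flags[cap / 64] |= UINT64_C(1) << (cap % 64);
}

static inline int
cap_isset(const uint64_t flags[4], unsigned int cap)
{
	return (flags[cap / 64] >> (cap % 64)) & 1;
}
```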
Re: [PATCH 1/2] config/arm: Do not require processor information
On 2023/05/04 18:43, Ruifeng Wang wrote: -Original Message- From: Akihiko Odaki Sent: Thursday, May 4, 2023 3:47 PM To: Ruifeng Wang ; Bruce Richardson ; Juraj Linkeš Cc: dev@dpdk.org; nd Subject: Re: [PATCH 1/2] config/arm: Do not require processor information On 2023/04/20 16:12, Akihiko Odaki wrote: On 2023/04/20 16:10, Ruifeng Wang wrote: -Original Message- From: Akihiko Odaki Sent: Thursday, April 20, 2023 9:40 AM To: Ruifeng Wang ; Bruce Richardson ; Juraj Linkeš Cc: dev@dpdk.org; nd Subject: Re: [PATCH 1/2] config/arm: Do not require processor information On 2023/04/17 16:41, Ruifeng Wang wrote: -Original Message- From: Akihiko Odaki Sent: Friday, April 14, 2023 8:42 PM To: Ruifeng Wang ; Bruce Richardson Cc: dev@dpdk.org; Akihiko Odaki Subject: [PATCH 1/2] config/arm: Do not require processor information DPDK can be built even without exact processor information for x86 and ppc, so allow building for Arm even if the targeted processor is unknown. Hi Akihiko, The design idea was to require an explicit generic build. Default/native build doesn't fall back to generic build when SoC info is not on the list. So the user has less chance to generate a suboptimal binary by accident. Hi, It is true that the suboptimal binary can result, but the rationale here is that we tolerate that for x86 and ppc, so it should not really matter for Arm either. On x86 and ppc you don't need to modify meson.build just to run dts on a development machine. What modification do you need for a development machine? I suppose "meson setup build -Dplatform=generic" will generate a binary that can run on your development machine. I didn't describe the situation well. I use DPDK Test Suite for testing and it determines what flags are passed to Meson. You need to modify DPDK's meson.build or DTS to get it built. Regards, Akihiko Odaki Hi, Can you have a look at this again? Thanks for the clarification of your use case. Changes to DTS are in planning. It will allow the user to choose the type of the build. Your use case will be fulfilled then. Such a feature indeed satisfies my requirement. Thanks in advance, Akihiko Odaki
Re: [PATCH v2] build: announce requirement for C11
On Thu, May 04, 2023 at 09:48:59AM +0100, Bruce Richardson wrote: > On Thu, May 04, 2023 at 09:50:09AM +0200, Mattias Rönnblom wrote: > > On 2023-05-03 19:30, Bruce Richardson wrote: > > > Add a deprecation notice informing users that we will require a C11 > > > compiler from 23.11 release onwards. This requirement was agreed by > > > technical board to enable use of newer C language features, e.g. > > > standard atomics. [1] > > > > > > [1] > > > http://inbox.dpdk.org/dev/dbapr08mb58148cec3e1454e8848a938998...@dbapr08mb5814.eurprd08.prod.outlook.com/ > > > > > > Signed-off-by: Bruce Richardson > > > Acked-by: Tyler Retzlaff > > > > > > --- > > > > > > V2: > > > - add requirement for stdatomics > > > - fix sphinx formatting > > > --- > > > doc/guides/rel_notes/deprecation.rst | 9 + > > > 1 file changed, 9 insertions(+) > > > > > > diff --git a/doc/guides/rel_notes/deprecation.rst > > > b/doc/guides/rel_notes/deprecation.rst > > > index dcc1ca1696..70c6019d26 100644 > > > --- a/doc/guides/rel_notes/deprecation.rst > > > +++ b/doc/guides/rel_notes/deprecation.rst > > > @@ -11,6 +11,15 @@ here. > > > Deprecation Notices > > > --- > > > +* C Compiler: From DPDK 23.11 onwards, > > > + building DPDK will require a C compiler which supports the C11 > > > standard, > > > + including support for C11 standard atomics. > > > > The whole of C11, or just the mandatory parts (+atomics)? > > > I assume we only need mandatory + atomics, however perhaps someone more > knowledgeable about what the optional parts of the spec are, can correct me > on this. Once clarified, maybe I should reword this yet again to call it > out even more specifically. mandatory + atomics, no more than that. so -std=c11 and __STDC_NO_ATOMICS__ is not defined, that is all. > > /Bruce
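[Editor's note: for reference, the kind of code this notice enables looks like the following; a minimal C11 sketch using only <stdatomic.h>, which any conforming compiler that does not define __STDC_NO_ATOMICS__ must accept.]

```c
#include <stdatomic.h>
#include <stdint.h>

#ifdef __STDC_NO_ATOMICS__
#error "DPDK 23.11+ will require C11 with standard atomics"
#endif

static _Atomic uint32_t ready;

/* Release store publishes 'v'... */
static void
publish(uint32_t v)
{
	atomic_store_explicit(&ready, v, memory_order_release);
}

/* ...and the paired acquire load orders later reads after it - the
 * standard equivalent of the __atomic_load_n() builtin discussed
 * elsewhere in these threads. */
static uint32_t
wait_until_ready(void)
{
	uint32_t v;

	while ((v = atomic_load_explicit(&ready, memory_order_acquire)) == 0)
		;	/* spin; a real caller would pause or yield */
	return v;
}
```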
Re: [PATCH v3 5/5] config/arm: add AMD CDX
On 4/21/2023 3:54 PM, Nipun Gupta wrote: > Adding support for AMD CDX devices > > Signed-off-by: Nipun Gupta > --- > config/arm/arm64_cdx_linux_gcc | 17 + > config/arm/meson.build | 14 ++ > 2 files changed, 31 insertions(+) > create mode 100644 config/arm/arm64_cdx_linux_gcc > > diff --git a/config/arm/arm64_cdx_linux_gcc b/config/arm/arm64_cdx_linux_gcc > new file mode 100644 > index 00..8e6d619dae > --- /dev/null > +++ b/config/arm/arm64_cdx_linux_gcc > @@ -0,0 +1,17 @@ > +[binaries] > +c = ['ccache', 'aarch64-linux-gnu-gcc'] > +cpp = ['ccache', 'aarch64-linux-gnu-g++'] > +ar = 'aarch64-linux-gnu-ar' > +as = 'aarch64-linux-gnu-as' > +strip = 'aarch64-linux-gnu-strip' > +pkgconfig = 'aarch64-linux-gnu-pkg-config' > +pcap-config = '' > + > +[host_machine] > +system = 'linux' > +cpu_family = 'aarch64' > +cpu = 'armv8-a' > +endian = 'little' > + > +[properties] > +platform = 'cdx' > diff --git a/config/arm/meson.build b/config/arm/meson.build > index 5213434ca4..39b8929534 100644 > --- a/config/arm/meson.build > +++ b/config/arm/meson.build > @@ -305,6 +305,18 @@ soc_bluefield = { > 'numa': false > } > > +soc_cdx = { > +'description': 'AMD CDX', > +'implementer': '0x41', > +'part_number': '0xd42', > +'flags': [ > +['RTE_MACHINE', '"cdx"'], > +['RTE_MAX_LCORE', 16], > +['RTE_MAX_NUMA_NODES', 1] > +], > +'numa': false > +} Hi Nipun, Why do we need a new arm platform/config? Is it because of the above flags? If it can work with default values, I think we can drop this patch.
RE: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue
Hi Morten > -Original Message- > From: Morten Brørup > > Power saving is important for the environment (to save the planet and all > that), so everyone should contribute, if they have a good solution. So even if > our algorithm had a significant degree of innovation, we would probably > choose to make it public anyway. Open sourcing it also makes it possible for > chip vendors like Intel to fine tune it more than we can ourselves, which also > comes back to benefit us. All products need some sort of power saving to > stay competitive, but power saving algorithms are not an area we want to > pursue for competitive purposes in our products. > > Our algorithm is too simple to make a library at this point, but I have been > thinking about how we can make it a generic library when it has matured > some more. I will take your information about the many customers' need to > have it invisibly injected into consideration in this regard. > > Our current algorithm works like this: > > while (running) { > int more = 0; > more += stage1(); > more += stage2(); > more += stage3(); > if (!more) sleep(); > } > > Each pipeline stage only returns 1 if it processed a full burst. Furthermore, if a > pipeline stage processed a full burst, but happens to know that no more data > is readily available for it, it returns 0 instead. > > Obviously, the sleep() duration must be short enough to avoid the NIC > RX descriptor rings overflowing before the ingress pipeline stage is serviced > again. > > Changing the algorithm to "more" (1 = more work expected by the pipeline > stage) from "busy" (1 = some work done by the pipeline stage) has the > consequence that sleep() is called more often, which has the follow-on > consequence that the ingress stage is called less often, and thus more often > has a full burst to process. > > We know from our in-house profiler that processing a full burst provides > *much* higher execution efficiency (cycles/packet) than processing a few > packets. This is public knowledge - after all, this is the whole point of > DPDK's > vector packet processing design! Nonetheless, it might surprise some people > how much the efficiency (cycles/packet) increases when processing a full > burst compared to processing just a few packets. I will leave it up to the > readers to make their own experiments. :-) > > Our initial "busy" algorithm behaved like this: > Process a few packets (at low efficiency), don't sleep, Process a few packets > (at low efficiency), don't sleep, Process a few packets (at low efficiency), > don't sleep, Process a few packets (at low efficiency), don't sleep, Process a > few packets (at low efficiency), don't sleep, Process a few packets (at low > efficiency), don't sleep, Process a few packets (at low efficiency), don't > sleep, Process a few packets (at low efficiency), don't sleep, No packets to > process (we are lucky this time!), sleep briefly, Repeat. > > So we switched to our "more" algorithm, which behaves like this: > Process a few packets (at low efficiency), sleep briefly, Process a full > burst of > packets (at high efficiency), don't sleep, Repeat. > > Instead of processing e.g. 8 small bursts per sleep, we now process only 2 > bursts per sleep. And the bigger of the two bursts is processed at higher > efficiency. > > We can improve this algorithm in some areas... > > E.g.
some of our pipeline stages also know that they are not going to do > any more work for the next X amount of nanoseconds; but we don't use that > information in our power management algorithm yet. The sleep duration > could depend on this. > > Also, we don't use the CPU power management states yet. I assume that > doing some work for 20 us at half clock speed is more power conserving than > doing the same work at full speed for 10 us and then sleeping for 10 us. > That's another potential improvement. > > > What we need in a generic power management helper library are functions > to feed it with the application's perception of how much work is being done, > and functions to tell if we can sleep and/or if we should change the power > management states of the individual CPU cores. > > Such a unified power management helper (or "busyness") library could > perhaps also be fed with data directly from the drivers and libraries to > support the customer use cases you described. [DC] Thank you for that detailed description, very interesting. There may well be merit in upstreaming such an algorithm as a library once it has matured as you said. Configuration could include specifying what a "full burst" actually is. Different stages of a pipeline may also have different definitions of busyness, so that may also need to be considered: - Some stages may perform an operation (e.g. running an acl rule check) on a burst of packets and then it is complete - Other stages may be more asynchronous in nature e.g. enqueuing and dequeuing to/from a crypto device or a QoS scheduler. The dequeue might not
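[Editor's note: under the stated assumption that the sleep stays short enough for the NIC RX rings not to overflow, the "more" loop described above can be sketched as plain C; the stage functions and the sleep duration are placeholders.]

```c
#include <stdbool.h>
#include <unistd.h>	/* usleep() */

/*
 * Placeholder stages: each returns 1 only if it processed a full burst
 * and does not know that its input is exhausted; otherwise 0.
 */
extern int stage1(void);
extern int stage2(void);
extern int stage3(void);

static volatile bool running = true;

static void
pipeline_loop(void)
{
	while (running) {
		int more = 0;

		more += stage1();
		more += stage2();
		more += stage3();

		if (!more)
			usleep(10);	/* illustrative duration only */
	}
}
```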
Re: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue
On Thu, 4 May 2023 16:11:31 + "Coyle, David" wrote: > Hi Morten > > > -Original Message- > > From: Morten Brørup > > > > > > > Power saving is important for the environment (to save the planet and all > > that), so everyone should contribute, if they have a good solution. So even > > if > > our algorithm had a significant degree of innovation, we would probably > > choose to make it public anyway. Open sourcing it also makes it possible for > > chip vendors like Intel to fine tune it more than we can ourselves, which > > also > > comes back to benefit us. All products need some sort of power saving in to > > stay competitive, but power saving algorithms is not an area we want to > > pursue for competitive purposes in our products. > > > > Our algorithm is too simple to make a library at this point, but I have been > > thinking about how we can make it a generic library when it has matured > > some more. I will take your information about the many customers' need to > > have it invisibly injected into consideration in this regard. > > > > Our current algorithm works like this: > > > > while (running) { > > int more = 0; > > more += stage1(); > > more += stage2(); > > more += stage3(); > > if (!more) sleep(); > > } > > > > Each pipeline stage only returns 1 if it processed a full burst. > > Furthermore, if a > > pipeline stage processed a full burst, but happens to know that no more data > > is readily available for it, it returns 0 instead. > > > > Obviously, the sleep() duration must be short enough to avoid that the NIC > > RX descriptor rings overflow before the ingress pipeline stage is serviced > > again. > > > > Changing the algorithm to "more" (1 = more work expected by the pipeline > > stage) from "busy" (1 = some work done by the pipeline stage) has the > > consequence that sleep() is called more often, which has the follow-on > > consequence that the ingress stage is called less often, and thus more often > > has a full burst to process. > > > > We know from our in-house profiler that processing a full burst provides > > *much* higher execution efficiency (cycles/packet) than processing a few > > packets. This is public knowledge - after all, this is the whole point of > > DPDK's > > vector packet processing design! Nonetheless, it might surprise some people > > how much the efficiency (cycles/packet) increases when processing a full > > burst compared to processing just a few packets. I will leave it up to the > > readers to make their own experiments. :-) > > > > Our initial "busy" algorithm behaved like this: > > Process a few packets (at low efficiency), don't sleep, Process a few > > packets > > (at low efficiency), don't sleep, Process a few packets (at low efficiency), > > don't sleep, Process a few packets (at low efficiency), don't sleep, > > Process a > > few packets (at low efficiency), don't sleep, Process a few packets (at low > > efficiency), don't sleep, Process a few packets (at low efficiency), don't > > sleep, Process a few packets (at low efficiency), don't sleep, No packets to > > process (we are lucky this time!), sleep briefly, Repeat. > > > > So we switched to our "more" algorithm, which behaves like this: > > Process a few packets (at low efficiency), sleep briefly, Process a full > > burst of > > packets (at high efficiency), don't sleep, Repeat. > > > > Instead of processing e.g. 8 small bursts per sleep, we now process only 2 > > bursts per sleep. And the big of the two bursts is processed at higher > > efficiency. 
> > > > We can improve this algorithm in some areas... > > > > E.g. some of our pipeline stages also know that they are not going to do > > anymore work for the next X amount of nanoseconds; but we don't use that > > information in our power management algorithm yet. The sleep duration > > could depend on this. > > > > Also, we don't use the CPU power management states yet. I assume that > > doing some work for 20 us at half clock speed is more power conserving than > > doing the same work at full speed for 10 us and then sleeping for 10 us. > > That's another potential improvement. > > > > > > What we need in generic a power management helper library are functions > > to feed it with the application's perception of how much work is being done, > > and functions to tell if we can sleep and/or if we should change the power > > management states of the individual CPU cores. > > > > Such a unified power management helper (or "busyness") library could > > perhaps also be fed with data directly from the drivers and libraries to > > support the customer use cases you described. > > [DC] Thank you for that detailed description, very interesting. There may > well be merit in upstreaming such an algorithm as a library once it has > matured as you said. > > Configuration could include specifying what a "full burst" > actually is. Different stages of a pipeline may also have different > definitions > of busyness, so that
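[Editor's note: to make the library idea in this thread more tangible, here is one purely hypothetical shape such a helper could take. None of these names exist in DPDK; they merely illustrate feeding per-stage busyness in and getting a sleep verdict out.]

```c
#include <stdint.h>

struct pm_hint {
	uint16_t burst_size;	/* what this stage considers a full burst */
	uint16_t processed;	/* packets handled in this iteration */
	uint64_t idle_ns_hint;	/* 0 if unknown: no work expected for this long */
};

/* Hypothetical API: stages report hints, the core asks for a verdict. */
void pm_stage_report(unsigned int stage_id, const struct pm_hint *hint);
int pm_core_may_sleep(unsigned int lcore_id, uint64_t *sleep_ns);
```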
[Bug 1225] Salesforce Integration
https://bugs.dpdk.org/show_bug.cgi?id=1225

Stephen Hemminger (step...@networkplumber.org) changed:

           What      |Removed     |Added
   CC                |            |step...@networkplumber.org
   Status            |UNCONFIRMED |RESOLVED
   Resolution        |---         |INVALID

--- Comment #1 from Stephen Hemminger (step...@networkplumber.org) ---
Spam

--
You are receiving this mail because:
You are the assignee for the bug.
Re: DPDK Release Status Meeting 2023-05-04
On Thu, 4 May 2023 10:24:45 + "Mcnamara, John" wrote: > Release status meeting minutes 2023-05-04 > = > > Agenda: > * Release Dates > * Subtrees > * Roadmaps > * LTS > * Defects > * Opens > > Participants: > * AMD > * ARM > * Intel > * Marvell > * Nvidia > * Red Hat > > > Release Dates > - > > The following are the proposed current dates for 23.07: > > * V1: 22 April 2023 > * RC1: 31 May 2023 > * RC2: 21 June 2023 > * RC3: 28 June 2023 > * Release: 12 July 2023 > > > Subtrees > > > * next-net > * Under review. > * ~120 patches in backlog. > * GVE license update > > * next-net-intel > * Starting to apply patches, about 20 patches are waiting for merge. > > * next-net-mlx > * No update > > * next-net-mvl > * Some patches merged. > * ~30 patches remaining. > > * next-eventdev > * Some patches merged. > * AMD patches under review > > * next-baseband > * No major features so far. > > * next-virtio > * Intel series for port mirroring > * Some concerns on code duplication and API complexity > * Report of performance degradation in vhost driver > * New series on guest notifications in slow path > * VDUSE - Vhost VDPA in userspace. https://lwn.net/Articles/841054/ > * Under review > > * next-crypto > * 60-70 patches in backlog > * Patches to add new algorithms > * Reviewing PDCP patches > * Reviews and merges ongoing > > * main > * Starting to merge patches to improve the Windows port > * Reviewing ethdev patches > * Patches from ARM for PMU > * Postponed from last release > * For tracing > * Awaiting feedback on comments > * Power Management feature from AMD > * Memarea feature > * CDX bus patches from AMD > * Virtual PCI like bus > * New checkpatch checks for atomic functions > * Issues with Fedora 38 and GCC 13 > * Issue with Coverity automated analysis > * Call for updates to the external Roadmap: > https://core.dpdk.org/roadmap/ > > > Proposed Schedule for 2023 > -- > > See also http://core.dpdk.org/roadmap/#dates > > 23.07 > * Proposal deadline (RFC/v1 patches): 22 April 2023 > * API freeze (-rc1): 31 May 2023 > * PMD features freeze (-rc2): 21 June 2023 > * Builtin applications features freeze (-rc3): 28 June 2023 > * Release: 12 July 2023 > > 23.11 > * Proposal deadline (RFC/v1 patches): 12 August 2023 > * API freeze (-rc1): 29 September 2023 > * PMD features freeze (-rc2): 20 October 2023 > * Builtin applications features freeze (-rc3): 27 October 2023 > * Release: 15 November 2023 > > > LTS > --- > > Next LTS releases will be: > > * 22.11.1 > * RC1 sent - Scheduled for May 5 > * 21.11.4 > * Got test feedback from Nvidia, Red Hat and Intel > * 20.11.8 > * RC1 sent and first test report received > * 19.11.15 > * CVE and critical fixes only. > > > * Distros > * v20.11 in Debian 11 > * Ubuntu 22.04-LTS contains 21.11 > * Ubuntu 23.04 contains 22.11 > > Defects > --- > > * Bugzilla links, 'Bugs', added for hosted projects > * https://www.dpdk.org/hosted-projects/ Could we add current bugzilla statistics to this status report?
RE: [RFC PATCH] ring: adding TPAUSE instruction to ring dequeue
> -Original Message- > From: Stephen Hemminger [mailto:step...@networkplumber.org] > > On Thu, 4 May 2023 16:11:31 + > "Coyle, David" wrote: > > > Hi Morten > > > > > -Original Message- > > > From: Morten Brørup > > > > > > > > > > > > Power saving is important for the environment (to save the planet and all > > > that), so everyone should contribute, if they have a good solution. So > even if > > > our algorithm had a significant degree of innovation, we would probably > > > choose to make it public anyway. Open sourcing it also makes it possible > for > > > chip vendors like Intel to fine tune it more than we can ourselves, which > also > > > comes back to benefit us. All products need some sort of power saving in > to > > > stay competitive, but power saving algorithms is not an area we want to > > > pursue for competitive purposes in our products. > > > > > > Our algorithm is too simple to make a library at this point, but I have > been > > > thinking about how we can make it a generic library when it has matured > > > some more. I will take your information about the many customers' need to > > > have it invisibly injected into consideration in this regard. > > > > > > Our current algorithm works like this: > > > > > > while (running) { > > > int more = 0; > > > more += stage1(); > > > more += stage2(); > > > more += stage3(); > > > if (!more) sleep(); > > > } > > > > > > Each pipeline stage only returns 1 if it processed a full burst. > Furthermore, if a > > > pipeline stage processed a full burst, but happens to know that no more > data > > > is readily available for it, it returns 0 instead. > > > > > > Obviously, the sleep() duration must be short enough to avoid that the NIC > > > RX descriptor rings overflow before the ingress pipeline stage is serviced > > > again. > > > > > > Changing the algorithm to "more" (1 = more work expected by the pipeline > > > stage) from "busy" (1 = some work done by the pipeline stage) has the > > > consequence that sleep() is called more often, which has the follow-on > > > consequence that the ingress stage is called less often, and thus more > often > > > has a full burst to process. > > > > > > We know from our in-house profiler that processing a full burst provides > > > *much* higher execution efficiency (cycles/packet) than processing a few > > > packets. This is public knowledge - after all, this is the whole point of > DPDK's > > > vector packet processing design! Nonetheless, it might surprise some > people > > > how much the efficiency (cycles/packet) increases when processing a full > > > burst compared to processing just a few packets. I will leave it up to the > > > readers to make their own experiments. :-) > > > > > > Our initial "busy" algorithm behaved like this: > > > Process a few packets (at low efficiency), don't sleep, Process a few > packets > > > (at low efficiency), don't sleep, Process a few packets (at low > efficiency), > > > don't sleep, Process a few packets (at low efficiency), don't sleep, > Process a > > > few packets (at low efficiency), don't sleep, Process a few packets (at > low > > > efficiency), don't sleep, Process a few packets (at low efficiency), don't > > > sleep, Process a few packets (at low efficiency), don't sleep, No packets > to > > > process (we are lucky this time!), sleep briefly, Repeat. 
> > > > > > So we switched to our "more" algorithm, which behaves like this: > > > Process a few packets (at low efficiency), sleep briefly, Process a full > burst of > > > packets (at high efficiency), don't sleep, Repeat. > > > > > > Instead of processing e.g. 8 small bursts per sleep, we now process only 2 > > > bursts per sleep. And the bigger of the two bursts is processed at higher > > > efficiency. > > > > > > We can improve this algorithm in some areas... > > > > > > E.g. some of our pipeline stages also know that they are not going to do > > > any more work for the next X amount of nanoseconds; but we don't use that > > > information in our power management algorithm yet. The sleep duration > > > could depend on this. > > > > > > Also, we don't use the CPU power management states yet. I assume that > > > doing some work for 20 us at half clock speed is more power conserving > than > > > doing the same work at full speed for 10 us and then sleeping for 10 us. > > > That's another potential improvement. > > > > > > > > > What we need in a generic power management helper library are functions > > > to feed it with the application's perception of how much work is being > done, > > > and functions to tell if we can sleep and/or if we should change the power > > > management states of the individual CPU cores. > > > > > > Such a unified power management helper (or "busyness") library could > > > perhaps also be fed with data directly from the drivers and libraries to > > > support the customer use cases you described. > > > > [DC] Thank you for that detailed description, very interesting. There may > > w
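For concreteness, a minimal sketch of the "more" loop described above. The stage handlers and the sleep budget are illustrative assumptions, not part of the original mail: each stage is assumed to return 1 only when it processed a full burst and expects more input, and SLEEP_US must be sized so the NIC RX rings cannot overflow while the ingress stage is idle.

/*
 * Minimal sketch of the "more"-based sleep loop, under the assumptions
 * stated above. stage1()..stage3() are hypothetical pipeline stages.
 */
#include <rte_cycles.h>

#define SLEEP_US 10 /* illustrative sleep budget */

extern int stage1(void); /* hypothetical: ingress, e.g. NIC RX */
extern int stage2(void); /* hypothetical: processing */
extern int stage3(void); /* hypothetical: egress, e.g. NIC TX */

static volatile int running = 1;

static void
pipeline_loop(void)
{
        while (running) {
                int more = 0;

                more += stage1();
                more += stage2();
                more += stage3();

                /* Sleep only when no stage expects more work. */
                if (!more)
                        rte_delay_us_sleep(SLEEP_US);
        }
}

With TPAUSE-capable CPUs, the rte_delay_us_sleep() call above is the natural place to substitute a lighter-weight wait primitive.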
[PATCH v3 00/11] sync Truflow support with latest release
Update Truflow support to latest release, deprecating code, updating the copyright date and hsi structure, syncing the truflow core, adding ULP shared session support, RSS action support, Queue action support, rte meter support, and more. Please apply. v2->v3: - update some commit messages - removed some empty lines in the patches - removed some dead and unnecessary code - fixed some checkpatch errors Version 2 fixes: - misspellings - whitespace issues - signed off issues Kishore Padmanabha (1): net/bnxt: fix multi-root card support Randy Schacher (8): net/bnxt: remove deprecated features net/bnxt: update bnxt hsi structure net/bnxt: update copyright date and cleanup whitespace net/bnxt: update Truflow core net/bnxt: update ULP shared session support net/bnxt: add RSS and Queue action in TruFLow net/bnxt: add support for rte meter net/bnxt: add support for eCPRI packet parsing Shuanglin Wang (1): net/bnxt: set RSS config based on RSS mode Somnath Kotur (1): net/bnxt: update PTP support on Thor .mailmap | 1 + doc/guides/nics/features/bnxt.ini | 3 + drivers/net/bnxt/bnxt.h |66 +- drivers/net/bnxt/bnxt_cpr.c | 2 +- drivers/net/bnxt/bnxt_cpr.h | 2 +- drivers/net/bnxt/bnxt_ethdev.c| 209 +- drivers/net/bnxt/bnxt_filter.c| 2 +- drivers/net/bnxt/bnxt_filter.h| 6 +- drivers/net/bnxt/bnxt_flow.c |75 +- drivers/net/bnxt/bnxt_hwrm.c | 272 +- drivers/net/bnxt/bnxt_hwrm.h |40 +- drivers/net/bnxt/bnxt_irq.c | 2 +- drivers/net/bnxt/bnxt_irq.h | 3 +- drivers/net/bnxt/bnxt_nvm_defs.h | 3 +- drivers/net/bnxt/bnxt_reps.c | 4 +- drivers/net/bnxt/bnxt_reps.h | 2 +- drivers/net/bnxt/bnxt_ring.c | 7 +- drivers/net/bnxt/bnxt_ring.h | 3 +- drivers/net/bnxt/bnxt_rxq.c | 159 +- drivers/net/bnxt/bnxt_rxq.h | 2 +- drivers/net/bnxt/bnxt_rxr.c |15 +- drivers/net/bnxt/bnxt_rxr.h | 3 +- drivers/net/bnxt/bnxt_rxtx_vec_avx2.c | 2 +- drivers/net/bnxt/bnxt_rxtx_vec_common.h | 2 +- drivers/net/bnxt/bnxt_rxtx_vec_neon.c | 2 +- drivers/net/bnxt/bnxt_rxtx_vec_sse.c | 2 +- drivers/net/bnxt/bnxt_stats.c | 2 +- drivers/net/bnxt/bnxt_stats.h | 2 +- drivers/net/bnxt/bnxt_txq.c | 3 +- drivers/net/bnxt/bnxt_txq.h | 2 +- drivers/net/bnxt/bnxt_txr.c |55 +- drivers/net/bnxt/bnxt_txr.h | 4 +- drivers/net/bnxt/bnxt_util.c | 2 +- drivers/net/bnxt/bnxt_util.h | 3 +- drivers/net/bnxt/bnxt_vnic.c | 974 +- drivers/net/bnxt/bnxt_vnic.h |80 +- drivers/net/bnxt/hsi_struct_def_dpdk.h| 5723 ++- drivers/net/bnxt/meson.build | 5 +- drivers/net/bnxt/rte_pmd_bnxt.c | 2 +- drivers/net/bnxt/rte_pmd_bnxt.h | 2 +- drivers/net/bnxt/tf_core/bitalloc.c | 3 +- drivers/net/bnxt/tf_core/bitalloc.h | 3 +- drivers/net/bnxt/tf_core/cfa_resource_types.h | 5 +- drivers/net/bnxt/tf_core/cfa_tcam_mgr.c | 2116 + drivers/net/bnxt/tf_core/cfa_tcam_mgr.h | 523 + .../net/bnxt/tf_core/cfa_tcam_mgr_device.h| 101 + .../net/bnxt/tf_core/cfa_tcam_mgr_hwop_msg.c | 201 + .../net/bnxt/tf_core/cfa_tcam_mgr_hwop_msg.h |28 + drivers/net/bnxt/tf_core/cfa_tcam_mgr_p4.c| 921 + drivers/net/bnxt/tf_core/cfa_tcam_mgr_p4.h|20 + drivers/net/bnxt/tf_core/cfa_tcam_mgr_p58.c | 926 + drivers/net/bnxt/tf_core/cfa_tcam_mgr_p58.h |20 + drivers/net/bnxt/tf_core/cfa_tcam_mgr_sbmp.h | 126 + .../net/bnxt/tf_core/cfa_tcam_mgr_session.c | 377 + .../net/bnxt/tf_core/cfa_tcam_mgr_session.h |54 + drivers/net/bnxt/tf_core/dpool.c | 3 +- drivers/net/bnxt/tf_core/dpool.h | 3 +- drivers/net/bnxt/tf_core/ll.c | 2 +- drivers/net/bnxt/tf_core/ll.h | 2 +- drivers/net/bnxt/tf_core/lookup3.h| 1 - drivers/net/bnxt/tf_core/meson.build |38 +- drivers/net/bnxt/tf_core/rand.c | 2 +- drivers/net/bnxt/tf_core/rand.h | 3 +- 
drivers/net/bnxt/tf_core/stack.c | 2 +- drivers/net/bnxt/tf_core/stack.h | 3 +- drivers/net/bnxt/tf_core/tf_common.h | 3 +- drivers/net/bnxt/tf_core/tf_core.c|56 +- drivers/net/bnxt/tf_core/tf_core.h| 189 +- drivers/net/bnxt/tf_core/tf_
[PATCH v3 01/11] net/bnxt: remove deprecated features
From: Randy Schacher - Deprecate shadow identifier - Deprecate shadow TCAM - Remove files which are not needed anymore. Signed-off-by: Randy Schacher Signed-off-by: Kishore Padmanabha Reviewed-by: Peter Spreadborough Reviewed-by: Ajit Khaparde --- drivers/net/bnxt/bnxt_hwrm.c | 53 -- drivers/net/bnxt/bnxt_hwrm.h | 10 - drivers/net/bnxt/tf_core/meson.build | 2 - drivers/net/bnxt/tf_core/tf_core.c| 2 - drivers/net/bnxt/tf_core/tf_core.h| 91 +- drivers/net/bnxt/tf_core/tf_device.c | 35 - drivers/net/bnxt/tf_core/tf_device.h | 6 - drivers/net/bnxt/tf_core/tf_device_p4.c | 10 - drivers/net/bnxt/tf_core/tf_device_p58.c | 10 - drivers/net/bnxt/tf_core/tf_identifier.c | 108 --- drivers/net/bnxt/tf_core/tf_identifier.h | 4 - drivers/net/bnxt/tf_core/tf_if_tbl.h | 8 - drivers/net/bnxt/tf_core/tf_session.c | 9 +- drivers/net/bnxt/tf_core/tf_session.h | 18 +- .../net/bnxt/tf_core/tf_shadow_identifier.c | 190 .../net/bnxt/tf_core/tf_shadow_identifier.h | 229 - drivers/net/bnxt/tf_core/tf_shadow_tcam.c | 837 -- drivers/net/bnxt/tf_core/tf_shadow_tcam.h | 195 drivers/net/bnxt/tf_core/tf_tcam.c| 243 - drivers/net/bnxt/tf_core/tf_tcam.h| 38 +- drivers/net/bnxt/tf_core/tf_util.c| 2 - drivers/net/bnxt/tf_ulp/bnxt_ulp.c| 3 - 22 files changed, 8 insertions(+), 2095 deletions(-) delete mode 100644 drivers/net/bnxt/tf_core/tf_shadow_identifier.c delete mode 100644 drivers/net/bnxt/tf_core/tf_shadow_identifier.h delete mode 100644 drivers/net/bnxt/tf_core/tf_shadow_tcam.c delete mode 100644 drivers/net/bnxt/tf_core/tf_shadow_tcam.h diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index d86ac73293..3f273df6f3 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -407,59 +407,6 @@ int bnxt_hwrm_tf_message_direct(struct bnxt *bp, return rc; } -int bnxt_hwrm_tf_message_tunneled(struct bnxt *bp, - bool use_kong_mb, - uint16_t tf_type, - uint16_t tf_subtype, - uint32_t *tf_response_code, - void *msg, - uint32_t msg_len, - void *response, - uint32_t response_len) -{ - int rc = 0; - struct hwrm_cfa_tflib_input req = { .req_type = 0 }; - struct hwrm_cfa_tflib_output *resp = bp->hwrm_cmd_resp_addr; - bool mailbox = BNXT_USE_CHIMP_MB; - - if (msg_len > sizeof(req.tf_req)) - return -ENOMEM; - - if (use_kong_mb) - mailbox = BNXT_USE_KONG(bp); - - HWRM_PREP(&req, HWRM_TF, mailbox); - /* Build request using the user supplied request payload. -* TLV request size is checked at build time against HWRM -* request max size, thus no checking required. -*/ - req.tf_type = tf_type; - req.tf_subtype = tf_subtype; - memcpy(req.tf_req, msg, msg_len); - - rc = bnxt_hwrm_send_message(bp, &req, sizeof(req), mailbox); - HWRM_CHECK_RESULT(); - - /* Copy the resp to user provided response buffer */ - if (response != NULL) - /* Post process response data. We need to copy only -* the 'payload' as the HWRM data structure really is -* HWRM header + msg header + payload and the TFLIB -* only provided a payload place holder. 
-*/ - if (response_len != 0) { - memcpy(response, - resp->tf_resp, - response_len); - } - - /* Extract the internal tflib response code */ - *tf_response_code = resp->tf_resp_code; - HWRM_UNLOCK(); - - return rc; -} - int bnxt_hwrm_cfa_l2_clear_rx_mask(struct bnxt *bp, struct bnxt_vnic_info *vnic) { int rc = 0; diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h index a82d9fb3ef..f9d9fe0ef2 100644 --- a/drivers/net/bnxt/bnxt_hwrm.h +++ b/drivers/net/bnxt/bnxt_hwrm.h @@ -79,16 +79,6 @@ struct hwrm_func_qstats_output; bp->rx_cos_queue[x].profile = \ resp->queue_id##x##_service_profile -int bnxt_hwrm_tf_message_tunneled(struct bnxt *bp, - bool use_kong_mb, - uint16_t tf_type, - uint16_t tf_subtype, - uint32_t *tf_response_code, - void *msg, - uint32_t msg_len, - void *response, - uint32_t response_len); -
[PATCH v3 03/11] net/bnxt: update copyright date and cleanup whitespace
From: Randy Schacher Update the Copyright to 2023 Clean up extra blank lines Clean up other whitespace issues Signed-off-by: Randy Schacher Reviewed-by: Kishore Padmanabha Reviewed-by: Ajit Khaparde --- drivers/net/bnxt/bnxt_cpr.c| 2 +- drivers/net/bnxt/bnxt_cpr.h| 2 +- drivers/net/bnxt/bnxt_filter.c | 2 +- drivers/net/bnxt/bnxt_irq.c| 2 +- drivers/net/bnxt/bnxt_irq.h| 2 +- drivers/net/bnxt/bnxt_nvm_defs.h | 2 +- drivers/net/bnxt/bnxt_reps.h | 2 +- drivers/net/bnxt/bnxt_ring.h | 2 +- drivers/net/bnxt/bnxt_rxq.h| 2 +- drivers/net/bnxt/bnxt_rxr.h| 2 +- drivers/net/bnxt/bnxt_rxtx_vec_avx2.c | 2 +- drivers/net/bnxt/bnxt_rxtx_vec_common.h| 2 +- drivers/net/bnxt/bnxt_rxtx_vec_neon.c | 2 +- drivers/net/bnxt/bnxt_rxtx_vec_sse.c | 2 +- drivers/net/bnxt/bnxt_stats.c | 2 +- drivers/net/bnxt/bnxt_stats.h | 2 +- drivers/net/bnxt/bnxt_txq.h| 2 +- drivers/net/bnxt/bnxt_util.c | 2 +- drivers/net/bnxt/bnxt_util.h | 2 +- drivers/net/bnxt/meson.build | 2 +- drivers/net/bnxt/rte_pmd_bnxt.c| 2 +- drivers/net/bnxt/rte_pmd_bnxt.h| 2 +- drivers/net/bnxt/tf_core/bitalloc.c| 3 +-- drivers/net/bnxt/tf_core/bitalloc.h| 3 +-- drivers/net/bnxt/tf_core/cfa_resource_types.h | 3 +-- drivers/net/bnxt/tf_core/dpool.c | 3 ++- drivers/net/bnxt/tf_core/dpool.h | 3 +-- drivers/net/bnxt/tf_core/ll.c | 2 +- drivers/net/bnxt/tf_core/ll.h | 2 +- drivers/net/bnxt/tf_core/lookup3.h | 1 - drivers/net/bnxt/tf_core/rand.c| 2 +- drivers/net/bnxt/tf_core/rand.h| 3 +-- drivers/net/bnxt/tf_core/stack.c | 2 +- drivers/net/bnxt/tf_core/stack.h | 3 +-- drivers/net/bnxt/tf_core/tf_common.h | 3 +-- drivers/net/bnxt/tf_core/tf_core.h | 1 - drivers/net/bnxt/tf_core/tf_device.h | 1 - drivers/net/bnxt/tf_core/tf_device_p4.h| 3 +-- drivers/net/bnxt/tf_core/tf_device_p58.h | 2 +- drivers/net/bnxt/tf_core/tf_em.h | 3 +-- drivers/net/bnxt/tf_core/tf_em_common.c| 8 +--- drivers/net/bnxt/tf_core/tf_em_common.h| 4 +--- drivers/net/bnxt/tf_core/tf_em_hash_internal.c | 2 +- drivers/net/bnxt/tf_core/tf_em_host.c | 3 +-- drivers/net/bnxt/tf_core/tf_em_internal.c | 3 ++- drivers/net/bnxt/tf_core/tf_ext_flow_handle.h | 4 +--- drivers/net/bnxt/tf_core/tf_global_cfg.c | 2 +- drivers/net/bnxt/tf_core/tf_global_cfg.h | 3 +-- drivers/net/bnxt/tf_core/tf_hash.c | 2 +- drivers/net/bnxt/tf_core/tf_hash.h | 3 +-- drivers/net/bnxt/tf_core/tf_identifier.c | 2 +- drivers/net/bnxt/tf_core/tf_identifier.h | 3 +-- drivers/net/bnxt/tf_core/tf_if_tbl.h | 3 +-- drivers/net/bnxt/tf_core/tf_msg_common.h | 3 +-- drivers/net/bnxt/tf_core/tf_project.h | 3 +-- drivers/net/bnxt/tf_core/tf_resources.h| 3 +-- drivers/net/bnxt/tf_core/tf_rm.h | 6 +- drivers/net/bnxt/tf_core/tf_session.h | 1 - drivers/net/bnxt/tf_core/tf_sram_mgr.h | 1 - drivers/net/bnxt/tf_core/tf_tbl.h | 4 +--- drivers/net/bnxt/tf_core/tf_tbl_sram.h | 6 +- drivers/net/bnxt/tf_core/tf_tcam.h | 3 +-- drivers/net/bnxt/tf_core/tf_tcam_shared.h | 1 - drivers/net/bnxt/tf_core/tf_util.c | 3 +-- drivers/net/bnxt/tf_core/tf_util.h | 3 +-- drivers/net/bnxt/tf_core/tfp.c | 2 +- drivers/net/bnxt/tf_core/tfp.h | 4 +--- drivers/net/bnxt/tf_ulp/bnxt_tf_common.h | 3 +-- drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.h | 1 - drivers/net/bnxt/tf_ulp/bnxt_ulp.h | 1 - drivers/net/bnxt/tf_ulp/ulp_fc_mgr.h | 1 - drivers/net/bnxt/tf_ulp/ulp_flow_db.h | 1 - drivers/net/bnxt/tf_ulp/ulp_gen_hash.c | 2 +- drivers/net/bnxt/tf_ulp/ulp_gen_hash.h | 3 +-- drivers/net/bnxt/tf_ulp/ulp_gen_tbl.h | 1 - drivers/net/bnxt/tf_ulp/ulp_ha_mgr.h | 1 - drivers/net/bnxt/tf_ulp/ulp_mapper.h | 1 - drivers/net/bnxt/tf_ulp/ulp_mark_mgr.c | 2 +- 
drivers/net/bnxt/tf_ulp/ulp_mark_mgr.h | 3 +-- drivers/net/bnxt/tf_ulp/ulp_matcher.h | 3 +-- drivers/net/bnxt/tf_ulp/ulp_port_db.h | 1 - drivers/net/bnxt/tf_ulp/ulp_rte_parser.h | 1 - drivers/net/bnxt/tf_ulp/ulp_template_struct.h | 1 - drivers/net/bnxt/tf_ulp/ulp_tun.c | 2 +- drivers/net/bnxt/tf_ulp/ulp_tun.h | 3 +-- drivers/net/bnxt/tf_ulp/ulp_utils.c| 2 +- drivers/net/bnxt/tf_ulp/ul
[PATCH v3 07/11] net/bnxt: add support for rte meter
From: Randy Schacher Add RTE meter support into the ULP layer. Currently: - Chaining of meters is not supported - Meter can be shared by multiple flows - srtcm_rfc2697 type is supported - Stats are not supported in the implementation yet Signed-off-by: Randy Schacher Signed-off-by: Jay Ding Acked-by: Ajit Khaparde --- doc/guides/nics/features/bnxt.ini | 1 + drivers/net/bnxt/bnxt.h | 2 + drivers/net/bnxt/bnxt_ethdev.c| 1 + drivers/net/bnxt/tf_ulp/bnxt_ulp.c| 8 + drivers/net/bnxt/tf_ulp/bnxt_ulp.h| 3 + drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c | 909 ++ drivers/net/bnxt/tf_ulp/meson.build | 1 + drivers/net/bnxt/tf_ulp/ulp_rte_handler_tbl.c | 4 +- drivers/net/bnxt/tf_ulp/ulp_rte_parser.c | 29 + drivers/net/bnxt/tf_ulp/ulp_rte_parser.h | 5 + 10 files changed, 961 insertions(+), 2 deletions(-) create mode 100644 drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c diff --git a/doc/guides/nics/features/bnxt.ini b/doc/guides/nics/features/bnxt.ini index 9c456fa863..2809beb629 100644 --- a/doc/guides/nics/features/bnxt.ini +++ b/doc/guides/nics/features/bnxt.ini @@ -77,6 +77,7 @@ dec_ttl = Y drop = Y jump = Y mark = Y +meter= Y of_pop_vlan = Y of_push_vlan = Y of_set_vlan_pcp = Y diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h index 6dd3c8b87c..7d508c7c23 100644 --- a/drivers/net/bnxt/bnxt.h +++ b/drivers/net/bnxt/bnxt.h @@ -1014,6 +1014,7 @@ bnxt_udp_tunnel_port_add_op(struct rte_eth_dev *eth_dev, struct rte_eth_udp_tunnel *udp_tunnel); extern const struct rte_flow_ops bnxt_flow_ops; +extern const struct rte_flow_ops bnxt_flow_meter_ops; #define bnxt_acquire_flow_lock(bp) \ pthread_mutex_lock(&(bp)->flow_lock) @@ -1065,6 +1066,7 @@ int bnxt_flow_ops_get_op(struct rte_eth_dev *dev, int bnxt_dev_start_op(struct rte_eth_dev *eth_dev); int bnxt_dev_stop_op(struct rte_eth_dev *eth_dev); void bnxt_handle_vf_cfg_change(void *arg); +int bnxt_flow_meter_ops_get(struct rte_eth_dev *eth_dev, void *arg); struct bnxt_vnic_info *bnxt_get_default_vnic(struct bnxt *bp); struct tf *bnxt_get_tfp_session(struct bnxt *bp, enum bnxt_session_type type); #endif diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index 4d84aaee0c..7bceb0524a 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -4091,6 +4091,7 @@ static const struct eth_dev_ops bnxt_dev_ops = { .timesync_adjust_time = bnxt_timesync_adjust_time, .timesync_read_rx_timestamp = bnxt_timesync_read_rx_timestamp, .timesync_read_tx_timestamp = bnxt_timesync_read_tx_timestamp, + .mtr_ops_get = bnxt_flow_meter_ops_get, }; static uint32_t bnxt_map_reset_regs(struct bnxt *bp, uint32_t reg) diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c index 08eb0c6063..3459140f18 100644 --- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c +++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c @@ -1681,6 +1681,14 @@ bnxt_ulp_init(struct bnxt *bp, return rc; } + if (ulp_dev_id == BNXT_ULP_DEVICE_ID_THOR) { + rc = bnxt_flow_meter_init(bp); + if (rc) { + BNXT_TF_DBG(ERR, "Failed to config meter\n"); + goto jump_to_error; + } + } + BNXT_TF_DBG(DEBUG, "ulp ctx has been initialized\n"); return rc; diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.h b/drivers/net/bnxt/tf_ulp/bnxt_ulp.h index 53d76e1465..a6ad5c1eaa 100644 --- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.h +++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.h @@ -371,6 +371,9 @@ bnxt_ulp_vxlan_ip_port_set(struct bnxt_ulp_context *ulp_ctx, unsigned int bnxt_ulp_vxlan_ip_port_get(struct bnxt_ulp_context *ulp_ctx); +int32_t +bnxt_flow_meter_init(struct bnxt *bp); + uint32_t 
bnxt_ulp_cntxt_convert_dev_id(uint32_t ulp_dev_id); diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c b/drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c new file mode 100644 index 00..2461c46f90 --- /dev/null +++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp_meter.c @@ -0,0 +1,909 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2014-2023 Broadcom + * All rights reserved. + */ + +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "bnxt.h" +#include "bnxt_filter.h" +#include "bnxt_hwrm.h" +#include "bnxt_ring.h" +#include "bnxt_rxq.h" +#include "bnxt_rxr.h" +#include "bnxt_vnic.h" +#include "hsi_struct_def_dpdk.h" + +#include "tfp.h" +#include "bnxt_tf_common.h" +#include "ulp_rte_parser.h" +#include "ulp_matcher.h" +#include "ulp_flow_db.h" +#include "ulp_mapper.h" +#include "ulp_fc_mgr.h" +#include "ulp_port_db.h" +#include "ulp_ha_mgr.h" +#include "ulp_tun.h" +#
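For context on how an application would drive the meter support added above, here is a hedged sketch against the generic rte_mtr API. The IDs and srtcm_rfc2697 rates are illustrative, and meter-policy setup is elided; this is not code from the patch itself.

/* Sketch only: create an srtcm_rfc2697 profile and a shared meter. */
#include <rte_mtr.h>

static int
meter_setup(uint16_t port_id)
{
        struct rte_mtr_error err;
        struct rte_mtr_meter_profile prof = {
                .alg = RTE_MTR_SRTCM_RFC2697,
                .srtcm_rfc2697 = {
                        .cir = 1250000, /* 10 Mbit/s, in bytes/sec */
                        .cbs = 2048,    /* committed burst, bytes */
                        .ebs = 4096,    /* excess burst, bytes */
                },
        };
        struct rte_mtr_params params = {
                .meter_profile_id = 1,
                .meter_enable = 1,
        };
        int ret;

        ret = rte_mtr_meter_profile_add(port_id, 1 /* profile id */,
                                        &prof, &err);
        if (ret != 0)
                return ret;

        /* shared = 1: one meter may serve several flows, per the commit. */
        return rte_mtr_create(port_id, 1 /* mtr id */, &params, 1, &err);
}

The meter id would then be referenced from a flow rule via an RTE_FLOW_ACTION_TYPE_METER action.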
[PATCH v3 08/11] net/bnxt: update PTP support on Thor
From: Somnath Kotur add locking and time stamp checks to ptp feature Signed-off-by: Somnath Kotur Signed-off-by: Randy Schacher Reviewed-by: Kalesh AP Reviewed-by: Ajit Khaparde --- drivers/net/bnxt/bnxt.h| 5 drivers/net/bnxt/bnxt_ethdev.c | 11 + drivers/net/bnxt/bnxt_hwrm.c | 11 - drivers/net/bnxt/bnxt_ring.c | 3 +++ drivers/net/bnxt/bnxt_rxr.c| 8 -- drivers/net/bnxt/bnxt_txq.c| 1 + drivers/net/bnxt/bnxt_txr.c| 45 +++--- drivers/net/bnxt/bnxt_txr.h| 1 + 8 files changed, 79 insertions(+), 6 deletions(-) diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h index 7d508c7c23..dadd0bd95a 100644 --- a/drivers/net/bnxt/bnxt.h +++ b/drivers/net/bnxt/bnxt.h @@ -349,6 +349,7 @@ struct bnxt_ptp_cfg { BNXT_PTP_MSG_PDELAY_RESP) uint8_t tx_tstamp_en:1; int rx_filter; + uint8_t filter_all; #define BNXT_PTP_RX_TS_L 0 #define BNXT_PTP_RX_TS_H 1 @@ -372,6 +373,8 @@ struct bnxt_ptp_cfg { /* On P5, the Rx timestamp is present in the Rx completion record */ uint64_trx_timestamp; uint64_tcurrent_time; + uint64_told_time; + rte_spinlock_t ptp_lock; }; struct bnxt_coal { @@ -722,6 +725,7 @@ struct bnxt { #define BNXT_FW_CAP_LINK_ADMIN BIT(7) #define BNXT_FW_CAP_TRUFLOW_EN BIT(8) #define BNXT_FW_CAP_VLAN_TX_INSERT BIT(9) +#define BNXT_FW_CAP_RX_ALL_PKT_TS BIT(10) #define BNXT_TRUFLOW_EN(bp)((bp)->fw_cap & BNXT_FW_CAP_TRUFLOW_EN &&\ (bp)->app_id != 0xFF) @@ -849,6 +853,7 @@ struct bnxt { struct bnxt_led_info*leds; uint8_t ieee_1588; struct bnxt_ptp_cfg *ptp_cfg; + uint8_t ptp_all_rx_tstamp; uint16_tvf_resv_strategy; struct bnxt_ctx_mem_info*ctx; diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index 7bceb0524a..ea817e1fdd 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -1441,8 +1441,11 @@ static void bnxt_ptp_get_current_time(void *arg) if (!ptp) return; + rte_spinlock_lock(&ptp->ptp_lock); + ptp->old_time = ptp->current_time; bnxt_hwrm_port_ts_query(bp, BNXT_PTP_FLAGS_CURRENT_TIME, &ptp->current_time); + rte_spinlock_unlock(&ptp->ptp_lock); rc = rte_eal_alarm_set(US_PER_S, bnxt_ptp_get_current_time, (void *)bp); if (rc != 0) { PMD_DRV_LOG(ERR, "Failed to re-schedule PTP alarm\n"); @@ -1458,8 +1461,11 @@ static int bnxt_schedule_ptp_alarm(struct bnxt *bp) if (bp->flags2 & BNXT_FLAGS2_PTP_ALARM_SCHEDULED) return 0; + rte_spinlock_lock(&ptp->ptp_lock); bnxt_hwrm_port_ts_query(bp, BNXT_PTP_FLAGS_CURRENT_TIME, &ptp->current_time); + ptp->old_time = ptp->current_time; + rte_spinlock_unlock(&ptp->ptp_lock); rc = rte_eal_alarm_set(US_PER_S, bnxt_ptp_get_current_time, (void *)bp); @@ -3611,12 +3617,15 @@ bnxt_timesync_enable(struct rte_eth_dev *dev) ptp->rx_filter = 1; ptp->tx_tstamp_en = 1; + ptp->filter_all = 1; ptp->rxctl = BNXT_PTP_MSG_EVENTS; rc = bnxt_hwrm_ptp_cfg(bp); if (rc) return rc; + rte_spinlock_init(&ptp->ptp_lock); + bp->ptp_all_rx_tstamp = 1; memset(&ptp->tc, 0, sizeof(struct rte_timecounter)); memset(&ptp->rx_tstamp_tc, 0, sizeof(struct rte_timecounter)); memset(&ptp->tx_tstamp_tc, 0, sizeof(struct rte_timecounter)); @@ -3653,9 +3662,11 @@ bnxt_timesync_disable(struct rte_eth_dev *dev) ptp->rx_filter = 0; ptp->tx_tstamp_en = 0; ptp->rxctl = 0; + ptp->filter_all = 0; bnxt_hwrm_ptp_cfg(bp); + bp->ptp_all_rx_tstamp = 0; if (!BNXT_CHIP_P5(bp)) bnxt_unmap_ptp_regs(bp); else diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index 77588bdf49..82679d1b32 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -669,6 +669,11 @@ int bnxt_hwrm_ptp_cfg(struct bnxt *bp) flags |= 
HWRM_PORT_MAC_CFG_INPUT_FLAGS_PTP_TX_TS_CAPTURE_DISABLE; + if (ptp->filter_all) + flags |= HWRM_PORT_MAC_CFG_INPUT_FLAGS_ALL_RX_TS_CAPTURE_ENABLE; + else if (bp->fw_cap & BNXT_FW_CAP_RX_ALL_PKT_TS) + flags |= HWRM_PORT_MAC_CFG_INPUT_FLAGS_ALL_RX_TS_CAPTURE_DISABLE; + req.flags = rte_cpu_to_le_32(flags); req.enables = rte_cpu_to_le_32 (HWRM_PORT_MAC_CFG_INPUT_ENABLES_RX_TS_CAPTURE_PTP_MSG_TYPE); @@ -810,7 +
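For orientation, a rough application-side sketch of the ethdev timesync path this patch hardens. The third argument to the RX timestamp read and the error handling are simplified assumptions.

#include <stdio.h>
#include <time.h>

#include <rte_ethdev.h>

static int
ptp_rx_timestamp(uint16_t port_id)
{
        struct timespec ts;
        int ret;

        /* Ends up in bnxt_timesync_enable(), which per the diff above
         * now also sets filter_all and initializes ptp_lock. */
        ret = rte_eth_timesync_enable(port_id);
        if (ret != 0)
                return ret;

        /* Third argument is a device-specific flags/queue selector;
         * 0 is assumed here. */
        ret = rte_eth_timesync_read_rx_timestamp(port_id, &ts, 0);
        if (ret == 0)
                printf("RX timestamp: %ld.%09ld\n",
                       (long)ts.tv_sec, (long)ts.tv_nsec);
        return ret;
}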
[PATCH v3 09/11] net/bnxt: fix multi-root card support
From: Kishore Padmanabha Changed the logic to use device serial number to identify that different ports belong to same physical card instead of the PCI domain address. Fixes: 34a7ff5a920e ("net/bnxt: support multi root capability") Cc: sta...@dpdk.org Signed-off-by: Kishore Padmanabha Reviewed-by: Shahaji Bhosle Reviewed-by: Ajit Khaparde --- drivers/net/bnxt/bnxt.h| 3 +++ drivers/net/bnxt/bnxt_hwrm.c | 1 + drivers/net/bnxt/tf_ulp/bnxt_ulp.c | 11 --- drivers/net/bnxt/tf_ulp/bnxt_ulp.h | 2 ++ 4 files changed, 14 insertions(+), 3 deletions(-) diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h index dadd0bd95a..08791b8a17 100644 --- a/drivers/net/bnxt/bnxt.h +++ b/drivers/net/bnxt/bnxt.h @@ -138,6 +138,7 @@ #define BNXT_NUM_CMPL_DMA_AGGR 36 #define BNXT_CMPL_AGGR_DMA_TMR_DURING_INT 50 #define BNXT_NUM_CMPL_DMA_AGGR_DURING_INT 12 +#define BNXT_DEVICE_SERIAL_NUM_SIZE8 #defineBNXT_DEFAULT_VNIC_STATE_MASK\ HWRM_ASYNC_EVENT_CMPL_DEFAULT_VNIC_CHANGE_EVENT_DATA1_DEF_VNIC_STATE_MASK @@ -863,6 +864,8 @@ struct bnxt { uint16_tnum_reps; struct bnxt_rep_info*rep_info; uint16_t*cfa_code_map; + /* Device Serial Number */ + uint8_t dsn[BNXT_DEVICE_SERIAL_NUM_SIZE]; /* Struct to hold adapter error recovery related info */ struct bnxt_error_recovery_info *recovery_info; #define BNXT_MARK_TABLE_SZ (sizeof(struct bnxt_mark_info) * 64 * 1024) diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index 82679d1b32..edad84c262 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -863,6 +863,7 @@ static int __bnxt_hwrm_func_qcaps(struct bnxt *bp) bp->max_l2_ctx, bp->max_vnics); bp->max_stat_ctx = rte_le_to_cpu_16(resp->max_stat_ctx); bp->max_mcast_addr = rte_le_to_cpu_32(resp->max_mcast_filters); + memcpy(bp->dsn, resp->device_serial_number, sizeof(bp->dsn)); if (BNXT_PF(bp)) bp->pf->total_vnics = rte_le_to_cpu_16(resp->max_vnics); diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c index 3459140f18..500c177039 100644 --- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.c +++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.c @@ -1318,9 +1318,13 @@ ulp_get_session(struct bnxt *bp, struct rte_pci_addr *pci_addr) /* if multi root capability is enabled, then ignore the pci bus id */ STAILQ_FOREACH(session, &bnxt_ulp_session_list, next) { - if (session->pci_info.domain == pci_addr->domain && - (BNXT_MULTIROOT_EN(bp) || - session->pci_info.bus == pci_addr->bus)) { + if (BNXT_MULTIROOT_EN(bp)) { + if (!memcmp(bp->dsn, session->dsn, + sizeof(session->dsn))) { + return session; + } + } else if (session->pci_info.domain == pci_addr->domain && + session->pci_info.bus == pci_addr->bus) { return session; } } @@ -1364,6 +1368,7 @@ ulp_session_init(struct bnxt *bp, /* Add it to the queue */ session->pci_info.domain = pci_addr->domain; session->pci_info.bus = pci_addr->bus; + memcpy(session->dsn, bp->dsn, sizeof(session->dsn)); rc = pthread_mutex_init(&session->bnxt_ulp_mutex, NULL); if (rc) { BNXT_TF_DBG(ERR, "mutex create failed\n"); diff --git a/drivers/net/bnxt/tf_ulp/bnxt_ulp.h b/drivers/net/bnxt/tf_ulp/bnxt_ulp.h index a6ad5c1eaa..92db7751fe 100644 --- a/drivers/net/bnxt/tf_ulp/bnxt_ulp.h +++ b/drivers/net/bnxt/tf_ulp/bnxt_ulp.h @@ -131,11 +131,13 @@ struct bnxt_ulp_pci_info { uint8_t bus; }; +#define BNXT_ULP_DEVICE_SERIAL_NUM_SIZE 8 struct bnxt_ulp_session_state { STAILQ_ENTRY(bnxt_ulp_session_state)next; boolbnxt_ulp_init; pthread_mutex_t bnxt_ulp_mutex; struct bnxt_ulp_pci_infopci_info; + uint8_t dsn[BNXT_ULP_DEVICE_SERIAL_NUM_SIZE]; struct 
bnxt_ulp_data *cfg_data; struct tf *g_tfp[BNXT_ULP_SESSION_MAX]; uint32_t session_opened[BNXT_ULP_SESSION_MAX]; -- 2.39.2 (Apple Git-143)
[PATCH v3 10/11] net/bnxt: add support for eCPRI packet parsing
From: Randy Schacher Add eCPRI parsing and offload support in the TruFlow ULP layer. Signed-off-by: Randy Schacher Signed-off-by: Shahaji Bhosle Reviewed-by: Manish Kurup Reviewed-by: Ajit Khaparde --- .mailmap | 1 + doc/guides/nics/features/bnxt.ini | 1 + drivers/net/bnxt/bnxt.h | 4 + drivers/net/bnxt/bnxt_ethdev.c| 35 + drivers/net/bnxt/bnxt_hwrm.c | 17 +++ drivers/net/bnxt/bnxt_txr.c | 10 +- drivers/net/bnxt/bnxt_vnic.c | 5 +- drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c| 7 +- drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.h| 1 + drivers/net/bnxt/tf_ulp/bnxt_ulp.c| 24 drivers/net/bnxt/tf_ulp/bnxt_ulp.h| 9 +- drivers/net/bnxt/tf_ulp/ulp_mapper.c | 18 +++ drivers/net/bnxt/tf_ulp/ulp_rte_handler_tbl.c | 4 + drivers/net/bnxt/tf_ulp/ulp_rte_parser.c | 120 +- drivers/net/bnxt/tf_ulp/ulp_rte_parser.h | 5 + drivers/net/bnxt/tf_ulp/ulp_template_struct.h | 2 + 16 files changed, 256 insertions(+), 7 deletions(-) diff --git a/.mailmap b/.mailmap index aded19d181..cc22746f36 100644 --- a/.mailmap +++ b/.mailmap @@ -813,6 +813,7 @@ Malvika Gupta Mandal Purna Chandra Mandeep Rohilla Manish Chopra +Manish Kurup Manish Tomar Mao Jiang Mao YingMing diff --git a/doc/guides/nics/features/bnxt.ini b/doc/guides/nics/features/bnxt.ini index 2809beb629..7b3030a58c 100644 --- a/doc/guides/nics/features/bnxt.ini +++ b/doc/guides/nics/features/bnxt.ini @@ -57,6 +57,7 @@ Perf doc = Y [rte_flow items] any = Y +ecpri= Y eth = P ipv4 = Y ipv6 = Y diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h index 08791b8a17..ed21ba7f29 100644 --- a/drivers/net/bnxt/bnxt.h +++ b/drivers/net/bnxt/bnxt.h @@ -844,10 +844,14 @@ struct bnxt { uint8_t port_cnt; uint8_t vxlan_port_cnt; uint8_t geneve_port_cnt; + uint8_t ecpri_port_cnt; uint16_tvxlan_port; uint16_tgeneve_port; + uint16_tecpri_port; uint16_tvxlan_fw_dst_port_id; uint16_tgeneve_fw_dst_port_id; + uint16_tecpri_fw_dst_port_id; + uint16_tecpri_upar_in_use; uint32_tfw_ver; uint32_thwrm_spec_code; diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c index ea817e1fdd..ee1552452a 100644 --- a/drivers/net/bnxt/bnxt_ethdev.c +++ b/drivers/net/bnxt/bnxt_ethdev.c @@ -2405,6 +2405,20 @@ bnxt_udp_tunnel_port_add_op(struct rte_eth_dev *eth_dev, tunnel_type = HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_GENEVE; break; + case RTE_ETH_TUNNEL_TYPE_ECPRI: + if (bp->ecpri_port_cnt) { + PMD_DRV_LOG(ERR, "Tunnel Port %d already programmed\n", + udp_tunnel->udp_port); + if (bp->ecpri_port != udp_tunnel->udp_port) { + PMD_DRV_LOG(ERR, "Only one port allowed\n"); + return -ENOSPC; + } + bp->ecpri_port_cnt++; + return 0; + } + tunnel_type = + HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_ECPRI; + break; default: PMD_DRV_LOG(ERR, "Tunnel type is not supported\n"); return -ENOTSUP; @@ -2423,6 +2437,10 @@ bnxt_udp_tunnel_port_add_op(struct rte_eth_dev *eth_dev, HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_GENEVE) bp->geneve_port_cnt++; + if (tunnel_type == + HWRM_TUNNEL_DST_PORT_ALLOC_INPUT_TUNNEL_TYPE_ECPRI) + bp->ecpri_port_cnt++; + return rc; } @@ -2474,6 +2492,23 @@ bnxt_udp_tunnel_port_del_op(struct rte_eth_dev *eth_dev, HWRM_TUNNEL_DST_PORT_FREE_INPUT_TUNNEL_TYPE_GENEVE; port = bp->geneve_fw_dst_port_id; break; + case RTE_ETH_TUNNEL_TYPE_ECPRI: + if (!bp->ecpri_port_cnt) { + PMD_DRV_LOG(ERR, "No Tunnel port configured yet\n"); + return -EINVAL; + } + if (bp->ecpri_port != udp_tunnel->udp_port) { + PMD_DRV_LOG(ERR, "Req Port: %d. 
Configured port: %d\n", + udp_tunnel->udp_port, bp->ecpri_port); + return -EINVAL; + } + if (--bp->ecpri_port_cnt) + return 0; + + tunnel_type = +
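As a usage note for the new tunnel type, a small sketch of registering an eCPRI UDP port through the generic ethdev call that the diff above extends. The port number is illustrative; eCPRI over UDP has no fixed default.

#include <rte_ethdev.h>

static int
ecpri_port_add(uint16_t port_id)
{
        struct rte_eth_udp_tunnel tunnel = {
                .udp_port = 5123, /* illustrative port number */
                .prot_type = RTE_ETH_TUNNEL_TYPE_ECPRI,
        };

        /* Lands in the new ECPRI case of bnxt_udp_tunnel_port_add_op()
         * shown above. */
        return rte_eth_dev_udp_tunnel_port_add(port_id, &tunnel);
}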
[PATCH v3 11/11] net/bnxt: set RSS config based on RSS mode
From: Shuanglin Wang Avoid submitting an HWRM RSS request when RSS mode is disabled. On WH+, if RSS mode isn't enabled, then there is no RSS context. Submitting a HWRM_VNIC_RSS_CFG request to firmware would hit a failure. The fix is to check the RSS context: if there is no RSS context, then don't submit the HWRM request. Signed-off-by: Shuanglin Wang Signed-off-by: Kishore Padmanabha Reviewed-by: Mike Baucom Reviewed-by: Ajit Khaparde --- drivers/net/bnxt/bnxt_hwrm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c index b944547656..06f196760f 100644 --- a/drivers/net/bnxt/bnxt_hwrm.c +++ b/drivers/net/bnxt/bnxt_hwrm.c @@ -2405,6 +2405,9 @@ bnxt_hwrm_vnic_rss_cfg_non_p5(struct bnxt *bp, struct bnxt_vnic_info *vnic) struct hwrm_vnic_rss_cfg_output *resp = bp->hwrm_cmd_resp_addr; int rc = 0; + if (vnic->num_lb_ctxts == 0) + return rc; + HWRM_PREP(&req, HWRM_VNIC_RSS_CFG, BNXT_USE_CHIMP_MB); req.hash_type = rte_cpu_to_le_32(vnic->hash_type); -- 2.39.2 (Apple Git-143)
RE: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release
> -Original Message- > From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com] > Sent: Thursday, May 4, 2023 7:46 AM > To: Konstantin Ananyev ; wangyunjian > > Cc: dev@dpdk.org; luyicai ; sta...@dpdk.org; nd > ; nd > Subject: RE: [dpdk-dev] [PATCH v2] ring: fix use after free in ring release > > > > > > > > > > > > > > > After the memzone is freed, it is not removed from the > > > 'rte_ring_tailq'. > > > If rte_ring_lookup is called at this time, it will cause a > > > use-after-free problem. This change prevents that from happening. > > > > > > Fixes: 4e32101f9b01 ("ring: support freeing") > > > Cc: sta...@dpdk.org > > > > > > Suggested-by: Honnappa Nagarahalli > > > > > Signed-off-by: Yunjian Wang > > > --- > > > v2: update code suggested by Honnappa Nagarahalli > > > --- > > >lib/ring/rte_ring.c | 8 +++- > > >1 file changed, 3 insertions(+), 5 deletions(-) > > > > > > diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c index > > > 8ed455043d..2755323b8a 100644 > > > --- a/lib/ring/rte_ring.c > > > +++ b/lib/ring/rte_ring.c > > > @@ -333,11 +333,6 @@ rte_ring_free(struct rte_ring *r) > > > return; > > > } > > > > > > - if (rte_memzone_free(r->memzone) != 0) { > > > - RTE_LOG(ERR, RING, "Cannot free memory\n"); > > > - return; > > > - } > > > - > > > ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, > > rte_ring_list); > > > rte_mcfg_tailq_write_lock(); > > > > > > @@ -354,6 +349,9 @@ rte_ring_free(struct rte_ring *r) > > > > > > TAILQ_REMOVE(ring_list, te, next); > > > > > > + if (rte_memzone_free(r->memzone) != 0) > > > + RTE_LOG(ERR, RING, "Cannot free memory\n"); > > > + > > > > Nit: I think it is a bit better to first release the lock and > > then free the memzone. > > >>> I think both of our suggestions are contradictory. Any reason why > > >>> you want > > >> to free outside the locked region? > > >> > > >> > > >> Don't know what you mean by 'both suggestions' here. > > > I wrote 'both of our suggestions'. Essentially, in v1, freeing the > > > memzone > > was outside of the lock. I suggested to move it inside and you are > > suggesting to move it outside. > > > > > > Ah ok, I missed v1 and your comments for it. > > As I said before, I don't think that we need to hold qlock here while > > calling memzone_free(). > > Though I don't see any harm with it either. > > I'd personally move memzone_free() after releasing qlock, but if > > you guys prefer to keep it as it is - I wouldn't insist. > I looked at other libraries, stack library is the closest. Stack library > frees the > memzone outside the lock. I think we should keep it consistent. > I am fine to move the free outside the lock. Thanks, Konstantin Ananyev and Honnappa Nagarahalli. I will update the patch in v3 and move the free outside the lock.
> > >> > > >> > > >> As I understand, it is ok with current design to grab mlock while > > >> holding > > qlock. > > >> So, there is nothing wrong with current patch, I just think that in > > >> that case it is excessive, and can be safely avoided. > > >> > > >>> > > Apart from that, LGTM. > > Acked-by: Konstantin Ananyev > > > > > rte_mcfg_tailq_write_unlock(); > > > > > > rte_free(te); > > > -- > > > 2.33.0 > > > > >>> > > >
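Condensing the outcome of this thread, a sketch (not the literal v3 patch) of rte_ring_free() with the agreed ordering: tailq removal under the qlock, memzone freed after releasing it.

/* Condensed sketch of the v3 ordering agreed above (error handling
 * trimmed); includes as in lib/ring/rte_ring.c. */
#include <rte_eal_memconfig.h>
#include <rte_malloc.h>
#include <rte_memzone.h>
#include <rte_ring.h>
#include <rte_tailq.h>

void
rte_ring_free(struct rte_ring *r)
{
        struct rte_ring_list *ring_list;
        struct rte_tailq_entry *te;

        if (r == NULL)
                return;

        ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list);

        rte_mcfg_tailq_write_lock();
        TAILQ_FOREACH(te, ring_list, next) {
                if (te->data == (void *)r)
                        break;
        }
        if (te != NULL)
                TAILQ_REMOVE(ring_list, te, next);
        rte_mcfg_tailq_write_unlock();

        /* Memzones are protected by their own lock (mlock), so the
         * qlock need not be held for this call. */
        if (te != NULL && rte_memzone_free(r->memzone) != 0)
                RTE_LOG(ERR, RING, "Cannot free memory\n");

        rte_free(te);
}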
RE: 21.11.4 patches review and test
> -Original Message- > From: Kevin Traynor > Sent: Thursday, May 4, 2023 6:11 PM > To: Xu, HailinX ; sta...@dpdk.org > Cc: Stokes, Ian ; Mcnamara, John > ; Luca Boccassi ; Xu, Qian Q > ; Thomas Monjalon ; Peng, > Yuan ; Chen, Zhaoyan ; > dev@dpdk.org > Subject: Re: 21.11.4 patches review and test > > On 04/05/2023 03:13, Xu, HailinX wrote: > >> -Original Message- > >> From: Kevin Traynor > >> Sent: Tuesday, May 2, 2023 5:35 PM > >> To: Xu, HailinX ; sta...@dpdk.org > >> Cc: Stokes, Ian ; Mcnamara, John > >> ; Luca Boccassi ; Xu, Qian > >> Q ; Thomas Monjalon ; > Peng, > >> Yuan ; Chen, Zhaoyan ; > >> dev@dpdk.org > >> Subject: Re: 21.11.4 patches review and test > >> > >> On 20/04/2023 11:32, Kevin Traynor wrote: > >>> On 20/04/2023 03:40, Xu, HailinX wrote: > > -Original Message- > > From: Xu, HailinX > > Sent: Thursday, April 13, 2023 2:13 PM > > To: Kevin Traynor ; sta...@dpdk.org > > Cc: dev@dpdk.org; Abhishek Marathe > >> ; > > Ali Alnubani ; Walker, Benjamin > > ; David Christensen > > ; Hemant Agrawal > >> ; > > Stokes, Ian ; Jerin Jacob > > ; Mcnamara, John ; > > Ju-Hyoung Lee ; Luca Boccassi > > ; Pei Zhang ; Xu, Qian Q > > ; Raslan Darawsheh ; > >> Thomas > > Monjalon ; yangh...@redhat.com; Peng, Yuan > > ; Chen, Zhaoyan > > Subject: RE: 21.11.4 patches review and test > > > >> -Original Message- > >> From: Kevin Traynor > >> Sent: Thursday, April 6, 2023 7:38 PM > >> To: sta...@dpdk.org > >> Cc: dev@dpdk.org; Abhishek Marathe > >> ; Ali Alnubani > >> ; Walker, Benjamin > >> ; David Christensen > >> ; Hemant Agrawal > >> ; Stokes, Ian ; > >> Jerin Jacob ; Mcnamara, John > >> ; Ju-Hyoung Lee > ; > >> Kevin Traynor ; Luca Boccassi > >> ; Pei Zhang ; Xu, Qian Q > >> ; Raslan Darawsheh ; > > Thomas > >> Monjalon ; yangh...@redhat.com; Peng, Yuan > >> ; Chen, Zhaoyan > >> Subject: 21.11.4 patches review and test > >> > >> Hi all, > >> > >> Here is a list of patches targeted for stable release 21.11.4. > >> > >> The planned date for the final release is 25th April. > >> > >> Please help with testing and validation of your use cases and > >> report any issues/results with reply-all to this mail. For the > >> final release the fixes and reported validations will be added to > >> the > >> release notes. > >> > >> A release candidate tarball can be found at: > >> > >>https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.4-rc1 > >> > >> These patches are located at branch 21.11 of dpdk-stable repo: > >>https://dpdk.org/browse/dpdk-stable/ > >> > >> Thanks. > >> > >> Kevin > > > > HI All, > > > > Update the test status for Intel part. Till now dpdk21.11.4-rc1 > > validation test rate is 85%. No critical issue is found. > > 2 new bugs are found, 1 new issue is under confirming by Intel Dev. > > New bugs: --20.11.8-rc1 also has these two issues > > 1. > >> pvp_qemu_multi_paths_port_restart:perf_pvp_qemu_vector_rx_mac: > > performance drop about 23.5% when send small packets > > https://bugs.dpdk.org/show_bug.cgi?id=1212-- no fix yet > > 2. some of the virtio tests are failing:-- Intel dev is under > >> investigating > > # Basic Intel(R) NIC testing > > * Build & CFLAG compile: cover the build test combination with > > latest GCC/Clang version and the popular OS revision such as > > Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, > > RHEL8.4, FreeBSD13.1, SUSE15, CentOS7.9, etc. > > - All test done. No new dpdk issue is found. > > * PF(i40e, ixgbe): test scenarios including > > RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. > > - All test done. No new dpdk issue is found. 
> > * VF(i40e, ixgbe): test scenarios including > > VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. > > - All test done. No new dpdk issue is found. > > * PF/VF(ice): test scenarios including Switch features/Package > > Management/Flow Director/Advanced Tx/Advanced > RSS/ACL/DCF/Flexible > > Descriptor, etc. > > - All test done. No new dpdk issue is found. > > * Intel NIC single core/NIC performance: test scenarios including > > PF/VF single core performance test, etc. > > - All test done. No new dpdk issue is found. > > * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library > > basic test - QAT&SW/FIB library, etc. > > - On going. > > > > # Basic cryptodev and virtio testing > > * Virtio: both function and performance test are covered. Such as > > PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf > > testing/VMAWARE ESXI 8.0, etc. > > - All test done. found bug1. > > * Cryptodev: > > *Function test: test scenarios
Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions
Hi Ruifeng, Thanks for your review. On Thu, May 4, 2023 at 2:13 PM, Ruifeng Wang wrote: -Original Message- From: Konstantin Ananyev Sent: Monday, May 1, 2023 9:29 PM To: zhou...@loongson.cn Cc: dev@dpdk.org; maob...@loongson.cn; qiming.y...@intel.com; wenjun1...@intel.com; Ruifeng Wang ; d...@linux.vnet.ibm.com Subject: Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions Segmentation fault has been observed while running the ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 processor which has 64 cores and 4 NUMA nodes. From the ixgbe_recv_pkts_lro() function, we found that as long as the first packet has the EOP bit set, and the length of this packet is less than or equal to rxq->crc_len, the segmentation fault will definitely happen, even on other platforms such as X86. Because when processing the first packet, the first_seg->next will be NULL, if at the same time this packet has the EOP bit set and its length is less than or equal to rxq->crc_len, the following loop will be executed: for (lp = first_seg; lp->next != rxm; lp = lp->next) ; We know that the first_seg->next will be NULL under this condition. So the expression of lp->next->next will cause the segmentation fault. Normally, the length of the first packet with EOP bit set will be greater than rxq->crc_len. However, the out-of-order execution of CPU may make the read ordering of the status and the rest of the descriptor fields in this function not be correct. The related codes are as following: rxdp = &rx_ring[rx_id]; #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; #2 rxd = *rxdp; The sentence #2 may be executed before sentence #1. This action is likely to make the ready packet zero length. If the packet is the first packet and has the EOP bit set, the above segmentation fault will happen. So, we should add rte_rmb() to ensure the read ordering be correct. We also did the same thing in the ixgbe_recv_pkts() function to make the rxd data be valid even though we did not find a segmentation fault in this function. Signed-off-by: Min Zhou "Fixes" tag for backport. OK, I will add the "Fixes" tag in the V3 patch. --- v2: - Make the calling of rte_rmb() for all platforms --- drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, staterr = rxdp->wb.upper.status_error; if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) break; + + rte_rmb(); rxd = *rxdp; Indeed, looks like a problem to me on systems with relaxed MO. Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers. Thanks, Konstantin. About a fix - looks right, but a bit excessive to me - as I understand all we need here is to prevent re-ordering by CPU itself. So rte_smp_rmb() seems enough here. Agree that rte_rmb() is excessive. rte_smp_rmb() or rte_atomic_thread_fence(__ATOMIC_ACQUIRE) is enough. Thanks for your advice. I will compare the rte_smp_rmb(), __atomic_load_n() and rte_atomic_thread_fence() to choose a better one. And it is better to add a comment to justify the barrier. OK, I will add a comment for this change.
Or might be just: staterr = __atomic_load_n(&rxdp->wb.upper.status_error, __ATOMIC_ACQUIRE); /* @@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts, With the proper barrier in place, I think the long comments at the beginning of this loop can be removed. Yes, I think the long comments can be simplified when the proper barrier is already in place. if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; + rte_rmb(); rxd = *rxdp; PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u " -- 2.31.1 Best regards, Min
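For reference, the __atomic_load_n() option mentioned in this thread would look roughly like the following in the RX loop. This is a sketch assuming the ixgbe descriptor layout from ixgbe_rxtx.c/ixgbe_type.h, not the fix as submitted.

/* Sketch only: DD-bit check as a load-acquire, so the full descriptor
 * copy cannot be reordered before it on weakly ordered CPUs such as
 * LoongArch or aarch64. */
static inline int
rx_desc_ready(volatile union ixgbe_adv_rx_desc *rxdp,
              union ixgbe_adv_rx_desc *rxd)
{
        uint32_t staterr;

        /* Load-acquire: later loads stay after this check. */
        staterr = __atomic_load_n(&rxdp->wb.upper.status_error,
                                  __ATOMIC_ACQUIRE);
        if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
                return 0;

        /* Safe: cannot observe a stale, zero-length descriptor. */
        *rxd = *rxdp;
        return 1;
}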
Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions
Hi Morten, On Thu, May 4, 2023 at 9:21 PM, Morten Brørup wrote: From: zhoumin [mailto:zhou...@loongson.cn] Sent: Thursday, 4 May 2023 15.17 Hi Konstantin, Thanks for your comments. On 2023/5/1 9:29 PM, Konstantin Ananyev wrote: Segmentation fault has been observed while running the ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 processor which has 64 cores and 4 NUMA nodes. From the ixgbe_recv_pkts_lro() function, we found that as long as the first packet has the EOP bit set, and the length of this packet is less than or equal to rxq->crc_len, the segmentation fault will definitely happen, even on other platforms such as X86. Because when processing the first packet, the first_seg->next will be NULL, if at the same time this packet has the EOP bit set and its length is less than or equal to rxq->crc_len, the following loop will be executed: for (lp = first_seg; lp->next != rxm; lp = lp->next) ; We know that the first_seg->next will be NULL under this condition. So the expression of lp->next->next will cause the segmentation fault. Normally, the length of the first packet with EOP bit set will be greater than rxq->crc_len. However, the out-of-order execution of CPU may make the read ordering of the status and the rest of the descriptor fields in this function not be correct. The related codes are as following: rxdp = &rx_ring[rx_id]; #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; #2 rxd = *rxdp; The sentence #2 may be executed before sentence #1. This action is likely to make the ready packet zero length. If the packet is the first packet and has the EOP bit set, the above segmentation fault will happen. So, we should add rte_rmb() to ensure the read ordering be correct. We also did the same thing in the ixgbe_recv_pkts() function to make the rxd data be valid even though we did not find a segmentation fault in this function. Signed-off-by: Min Zhou --- v2: - Make the calling of rte_rmb() for all platforms --- drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, staterr = rxdp->wb.upper.status_error; if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) break; + + rte_rmb(); rxd = *rxdp; Indeed, looks like a problem to me on systems with relaxed MO. Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers. The LoongArch architecture uses the Weak Consistency model which can cause the problem, especially in scenarios with many cores, such as Loongson 3C5000 with four NUMA nodes, which has 64 cores. I cannot reproduce it on Loongson 3C5000 with one NUMA node, which just has 16 cores. About a fix - looks right, but a bit excessive to me - as I understand all we need here is to prevent re-ordering by CPU itself. Yes, thanks for cc-ing. So rte_smp_rmb() seems enough here. Or might be just: staterr = __atomic_load_n(&rxdp->wb.upper.status_error, __ATOMIC_ACQUIRE); Does __atomic_load_n() work on Windows if we use it to solve this problem? Yes, __atomic_load_n() works on Windows too. Thank you, Morten. I got it. I will compare those barriers and choose a proper one for this problem. Best regards, Min
RE: [EXT] [PATCH v5 02/15] graph: split graph worker into common and default model
> -Original Message- > From: Pavan Nikhilesh Bhagavatula > Sent: Thursday, April 27, 2023 10:11 PM > To: Yan, Zhirun ; dev@dpdk.org; Jerin Jacob > Kollanukkaran ; Kiran Kumar Kokkilagadda > ; Nithin Kumar Dabilpuram > ; step...@networkplumber.org > Cc: Liang, Cunming ; Wang, Haiyue > > Subject: RE: [EXT] [PATCH v5 02/15] graph: split graph worker into common and > default model > > > > > -Original Message- > > From: Zhirun Yan > > Sent: Friday, March 31, 2023 9:33 AM > > To: dev@dpdk.org; Jerin Jacob Kollanukkaran ; > > Kiran Kumar Kokkilagadda ; Nithin Kumar > > Dabilpuram ; step...@networkplumber.org > > Cc: cunming.li...@intel.com; haiyue.w...@intel.com; Zhirun Yan > > > > Subject: [EXT] [PATCH v5 02/15] graph: split graph worker into common > > and default model > > > > External Email > > > > -- > > To support multiple graph worker model, split graph into common and > > default. Naming the current walk function as rte_graph_model_rtc cause > > the default model is RTC(Run-to-completion). > > > > Signed-off-by: Haiyue Wang > > Signed-off-by: Cunming Liang > > Signed-off-by: Zhirun Yan > > --- > > lib/graph/graph_pcap.c | 2 +- > > lib/graph/graph_private.h | 2 +- > > lib/graph/meson.build | 2 +- > > lib/graph/rte_graph_model_rtc.h | 61 > > + > > lib/graph/rte_graph_worker.h| 34 > > lib/graph/rte_graph_worker_common.h | 57 --- > > 6 files changed, 98 insertions(+), 60 deletions(-) create mode > > 100644 lib/graph/rte_graph_model_rtc.h create mode 100644 > > lib/graph/rte_graph_worker.h > > > > diff --git a/lib/graph/graph_pcap.c b/lib/graph/graph_pcap.c index > > 8a220370fa..6c43330029 100644 > > --- a/lib/graph/graph_pcap.c > > +++ b/lib/graph/graph_pcap.c > > @@ -10,7 +10,7 @@ > > #include > > #include > > > > -#include "rte_graph_worker_common.h" > > +#include "rte_graph_worker.h" > > > > #include "graph_pcap_private.h" > > > > diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h > > index f08dbc7e9d..7d1b30b8ac 100644 > > --- a/lib/graph/graph_private.h > > +++ b/lib/graph/graph_private.h > > @@ -12,7 +12,7 @@ > > #include > > > > #include "rte_graph.h" > > -#include "rte_graph_worker_common.h" > > +#include "rte_graph_worker.h" > > > > extern int rte_graph_logtype; > > > > diff --git a/lib/graph/meson.build b/lib/graph/meson.build index > > 4e2b612ad3..3526d1b5d4 100644 > > --- a/lib/graph/meson.build > > +++ b/lib/graph/meson.build > > @@ -16,6 +16,6 @@ sources = files( > > 'graph_populate.c', > > 'graph_pcap.c', > > ) > > -headers = files('rte_graph.h', 'rte_graph_worker_common.h') > > +headers = files('rte_graph.h', 'rte_graph_worker.h') > > > > deps += ['eal', 'pcapng'] > > diff --git a/lib/graph/rte_graph_model_rtc.h > > b/lib/graph/rte_graph_model_rtc.h new file mode 100644 index > > 00..665560f831 > > --- /dev/null > > +++ b/lib/graph/rte_graph_model_rtc.h > > @@ -0,0 +1,61 @@ > > +/* SPDX-License-Identifier: BSD-3-Clause > > + * Copyright(C) 2023 Intel Corporation */ > > + > > Please retain Marvell copyright too. > Yes, I will do in next version. Thanks for reminding me. > > +#include "rte_graph_worker_common.h" > > + > > +/** > > + * Perform graph walk on the circular buffer and invoke the process > > function > > + * of the nodes and collect the stats. > > + * > > + * @param graph > > + * Graph pointer returned from rte_graph_lookup function. 
> > + * > > + * @see rte_graph_lookup() > > + */ > > +static inline void > > +rte_graph_walk_rtc(struct rte_graph *graph) { > > + const rte_graph_off_t *cir_start = graph->cir_start; > > + const rte_node_t mask = graph->cir_mask; > > + uint32_t head = graph->head; > > + struct rte_node *node; > > + uint64_t start; > > + uint16_t rc; > > + void **objs; > > + > > + /* > > +* Walk on the source node(s) ((cir_start - head) -> cir_start) and > > then > > +* on the pending streams (cir_start -> (cir_start + mask) -> cir_start) > > +* in a circular buffer fashion. > > +* > > +* +-+ <= cir_start - head [number of source nodes] > > +* | | > > +* | ... | <= source nodes > > +* | | > > +* +-+ <= cir_start [head = 0] [tail = 0] > > +* | | > > +* | ... | <= pending streams > > +* | | > > +* +-+ <= cir_start + mask > > +*/ > > + while (likely(head != graph->tail)) { > > + node = (struct rte_node *)RTE_PTR_ADD(graph, > > cir_start[(int32_t)head++]); > > + RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); > > + objs = node->objs; > > + rte_prefetch0(objs); > > + > > + if (rte_graph_has_stats_feature()) { > > + start = rte_rdtsc(); > > + rc = node->process(graph, node
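As background for the model split, the per-lcore application loop is unchanged by this refactor; a rough sketch follows, where the graph name handling and stop flag are illustrative assumptions.

#include <errno.h>
#include <stdbool.h>

#include <rte_graph.h>
#include <rte_graph_worker.h>

static volatile bool force_quit; /* illustrative stop flag */

static int
graph_main_loop(void *arg)
{
        /* arg is assumed to carry a graph name from rte_graph_create(). */
        struct rte_graph *graph = rte_graph_lookup((const char *)arg);

        if (graph == NULL)
                return -ENOENT;

        while (!force_quit)
                rte_graph_walk(graph); /* resolves to the RTC walk here */

        return 0;
}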
RE: [EXT] [PATCH v5 11/15] graph: introduce graph walk by cross-core dispatch
> -Original Message- > From: Pavan Nikhilesh Bhagavatula > Sent: Thursday, April 27, 2023 10:59 PM > To: Yan, Zhirun ; dev@dpdk.org; Jerin Jacob > Kollanukkaran ; Kiran Kumar Kokkilagadda > ; Nithin Kumar Dabilpuram > ; step...@networkplumber.org > Cc: Liang, Cunming ; Wang, Haiyue > > Subject: RE: [EXT] [PATCH v5 11/15] graph: introduce graph walk by cross-core > dispatch > > > This patch introduces the task scheduler mechanism to enable > > dispatching tasks to another worker cores. Currently, there is only a > > local work queue for one graph to walk. We introduce a scheduler > > worker queue in each worker core for dispatching tasks. It will > > perform the walk on scheduler work queue first, then handle the local work > queue. > > > > Signed-off-by: Haiyue Wang > > Signed-off-by: Cunming Liang > > Signed-off-by: Zhirun Yan > > --- > > lib/graph/rte_graph_model_dispatch.h | 42 > > > > 1 file changed, 42 insertions(+) > > > > diff --git a/lib/graph/rte_graph_model_dispatch.h > > b/lib/graph/rte_graph_model_dispatch.h > > index 18fa7ce0ab..65b2cc6d87 100644 > > --- a/lib/graph/rte_graph_model_dispatch.h > > +++ b/lib/graph/rte_graph_model_dispatch.h > > @@ -73,6 +73,48 @@ __rte_experimental > > int rte_graph_model_dispatch_lcore_affinity_set(const char *name, > > unsigned int lcore_id); > > > > +/** > > + * Perform graph walk on the circular buffer and invoke the process > > function > > + * of the nodes and collect the stats. > > + * > > + * @param graph > > + * Graph pointer returned from rte_graph_lookup function. > > + * > > + * @see rte_graph_lookup() > > + */ > > +__rte_experimental > > +static inline void > > +rte_graph_walk_mcore_dispatch(struct rte_graph *graph) { > > + const rte_graph_off_t *cir_start = graph->cir_start; > > + const rte_node_t mask = graph->cir_mask; > > + uint32_t head = graph->head; > > + struct rte_node *node; > > I think we should add a RTE_ASSERT here to make sure that the graph object is > a > cloned graph. > Ok, I will add RTE_ASSERT in next version. > > + > > + if (graph->wq != NULL) > > + __rte_graph_sched_wq_process(graph); > > + > > + while (likely(head != graph->tail)) { > > + node = (struct rte_node *)RTE_PTR_ADD(graph, > > cir_start[(int32_t)head++]); > > + > > + /* skip the src nodes which not bind with current worker */ > > + if ((int32_t)head < 0 && node->lcore_id != graph->lcore_id) > > + continue; > > + > > + /* Schedule the node until all task/objs are done */ > > + if (node->lcore_id != RTE_MAX_LCORE && > > + graph->lcore_id != node->lcore_id && graph->rq != NULL > > && > > + __rte_graph_sched_node_enqueue(node, graph->rq)) > > + continue; > > + > > + __rte_node_process(graph, node); > > + > > + head = likely((int32_t)head > 0) ? head & mask : head; > > + } > > + > > + graph->tail = 0; > > +} > > + > > #ifdef __cplusplus > > } > > #endif > > -- > > 2.37.2
RE: [EXT] [PATCH v5 09/15] graph: introduce stream moving cross cores
> -Original Message- > From: Pavan Nikhilesh Bhagavatula > Sent: Thursday, April 27, 2023 10:53 PM > To: Yan, Zhirun ; dev@dpdk.org; Jerin Jacob > Kollanukkaran ; Kiran Kumar Kokkilagadda > ; Nithin Kumar Dabilpuram > ; step...@networkplumber.org > Cc: Liang, Cunming ; Wang, Haiyue > > Subject: RE: [EXT] [PATCH v5 09/15] graph: introduce stream moving cross cores > > > This patch introduces key functions to allow a worker thread to enable > > enqueue and move streams of objects to the next nodes over different > > cores. > > > > Signed-off-by: Haiyue Wang > > Signed-off-by: Cunming Liang > > Signed-off-by: Zhirun Yan > > --- > > lib/graph/graph_private.h| 27 + > > lib/graph/meson.build| 2 +- > > lib/graph/rte_graph_model_dispatch.c | 145 > > +++ > > lib/graph/rte_graph_model_dispatch.h | 37 +++ > > lib/graph/version.map| 2 + > > 5 files changed, 212 insertions(+), 1 deletion(-) > > > > diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h > > index b66b18ebbc..e1a2a4bfd8 100644 > > --- a/lib/graph/graph_private.h > > +++ b/lib/graph/graph_private.h > > @@ -366,4 +366,31 @@ void graph_dump(FILE *f, struct graph *g); > > */ > > void node_dump(FILE *f, struct node *n); > > > > +/** > > + * @internal > > + * > > + * Create the graph schedule work queue. And all cloned graphs > > +attached to > > the > > + * parent graph MUST be destroyed together for fast schedule design > > limitation. > > + * > > + * @param _graph > > + * The graph object > > + * @param _parent_graph > > + * The parent graph object which holds the run-queue head. > > + * > > + * @return > > + * - 0: Success. > > + * - <0: Graph schedule work queue related error. > > + */ > > +int graph_sched_wq_create(struct graph *_graph, struct graph > > *_parent_graph); > > + > > +/** > > + * @internal > > + * > > + * Destroy the graph schedule work queue. > > + * > > + * @param _graph > > + * The graph object > > + */ > > +void graph_sched_wq_destroy(struct graph *_graph); > > + > > #endif /* _RTE_GRAPH_PRIVATE_H_ */ > > diff --git a/lib/graph/meson.build b/lib/graph/meson.build index > > c729d984b6..e21affa280 100644 > > --- a/lib/graph/meson.build > > +++ b/lib/graph/meson.build > > @@ -20,4 +20,4 @@ sources = files( > > ) > > headers = files('rte_graph.h', 'rte_graph_worker.h') > > > > -deps += ['eal', 'pcapng'] > > +deps += ['eal', 'pcapng', 'mempool', 'ring'] > > diff --git a/lib/graph/rte_graph_model_dispatch.c > > b/lib/graph/rte_graph_model_dispatch.c > > index 4a2f99496d..a300fefb85 100644 > > --- a/lib/graph/rte_graph_model_dispatch.c > > +++ b/lib/graph/rte_graph_model_dispatch.c > > @@ -5,6 +5,151 @@ > > #include "graph_private.h" > > #include "rte_graph_model_dispatch.h" > > > > +int > > +graph_sched_wq_create(struct graph *_graph, struct graph > > *_parent_graph) > > +{ > > + struct rte_graph *parent_graph = _parent_graph->graph; > > + struct rte_graph *graph = _graph->graph; > > + unsigned int wq_size; > > + > > + wq_size = GRAPH_SCHED_WQ_SIZE(graph->nb_nodes); > > + wq_size = rte_align32pow2(wq_size + 1); > > Hi Zhirun, > > We should introduce a new function `rte_graph_configure` which can help > application to control the ring size and mempool size of the work queue? > We could fallback to default values if nothing is configured. > > rte_graph_configure should take a > struct rte_graph_config { > struct { > u64 rsvd[8]; > } rtc; > struct { > u16 wq_size; > ... > } dispatch; > }; > > This will help future graph models to have their own configuration. 
> > We can have a rte_graph_config_init() function to initialize the > rte_graph_config > structure. > Hi Pavan, Thanks for your comments. I agree with you; it would be friendlier for users and developers. And for the ring and mempool, there are some limitations on the size (it must be a power of 2). So I prefer to use u16 wq_size_max and u32 mp_size_max for users who have limited resources. > > > + > > + graph->wq = rte_ring_create(graph->name, wq_size, graph->socket, > > + RING_F_SC_DEQ); > > + if (graph->wq == NULL) > > + SET_ERR_JMP(EIO, fail, "Failed to allocate graph WQ"); > > + > > + graph->mp = rte_mempool_create(graph->name, wq_size, > > + sizeof(struct graph_sched_wq_node), > > + 0, 0, NULL, NULL, NULL, NULL, > > + graph->socket, MEMPOOL_F_SP_PUT); > > + if (graph->mp == NULL) > > + SET_ERR_JMP(EIO, fail_mp, > > + "Failed to allocate graph WQ schedule entry"); > > + > > + graph->lcore_id = _graph->lcore_id; > > + > > + if (parent_graph->rq == NULL) { > > + parent_graph->rq = &parent_graph->rq_head; > > + SLIST_INIT(parent_graph->rq); > > + } > > + > > + graph-
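For illustration, the configuration API proposed above might be used like this. A minimal sketch: rte_graph_config_init(), rte_graph_configure() and the dispatch fields are taken from the suggestion in this thread, not from merged code:

    struct rte_graph_config cfg;

    rte_graph_config_init(&cfg);        /* fill in library defaults */
    cfg.dispatch.wq_size_max = 1024;    /* ring and mempool sizes must be powers of 2 */
    cfg.dispatch.mp_size_max = 4096;
    rte_graph_configure(&cfg);          /* call once, before creating graphs */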
RE: [EXT] [PATCH v5 03/15] graph: move node process into inline function
> -Original Message- > From: Pavan Nikhilesh Bhagavatula > Sent: Thursday, April 27, 2023 11:03 PM > To: Yan, Zhirun ; dev@dpdk.org; Jerin Jacob > Kollanukkaran ; Kiran Kumar Kokkilagadda > ; Nithin Kumar Dabilpuram > ; step...@networkplumber.org > Cc: Liang, Cunming ; Wang, Haiyue > > Subject: RE: [EXT] [PATCH v5 03/15] graph: move node process into inline > function > > > Node process is a single and reusable block, move the code into an > > inline function. > > > > Signed-off-by: Haiyue Wang > > Signed-off-by: Cunming Liang > > Signed-off-by: Zhirun Yan > > --- > > lib/graph/rte_graph_model_rtc.h | 20 ++--- > > lib/graph/rte_graph_worker_common.h | 33 > > + > > 2 files changed, 35 insertions(+), 18 deletions(-) > > > > diff --git a/lib/graph/rte_graph_model_rtc.h > > b/lib/graph/rte_graph_model_rtc.h index 665560f831..0dcb7151e9 100644 > > --- a/lib/graph/rte_graph_model_rtc.h > > +++ b/lib/graph/rte_graph_model_rtc.h > > @@ -20,9 +20,6 @@ rte_graph_walk_rtc(struct rte_graph *graph) > > const rte_node_t mask = graph->cir_mask; > > uint32_t head = graph->head; > > struct rte_node *node; > > - uint64_t start; > > - uint16_t rc; > > - void **objs; > > > > /* > > * Walk on the source node(s) ((cir_start - head) -> cir_start) and > > then @@ -41,21 +38,8 @@ rte_graph_walk_rtc(struct rte_graph *graph) > > */ > > while (likely(head != graph->tail)) { > > node = (struct rte_node *)RTE_PTR_ADD(graph, > > cir_start[(int32_t)head++]); > > - RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); > > - objs = node->objs; > > - rte_prefetch0(objs); > > - > > - if (rte_graph_has_stats_feature()) { > > - start = rte_rdtsc(); > > Since we are refactoring this function could you change rte_rdtsc() to > rte_rdtsc_precise(). Sure, I will do in next version. > > > - rc = node->process(graph, node, objs, node->idx); > > - node->total_cycles += rte_rdtsc() - start; > > - node->total_calls++; > > - node->total_objs += rc; > > - } else { > > - node->process(graph, node, objs, node->idx); > > - } > > - node->idx = 0; > > - head = likely((int32_t)head > 0) ? head & mask : > > head; > > + __rte_node_process(graph, node); > > + head = likely((int32_t)head > 0) ? head & mask : head; > > } > > graph->tail = 0; > > } > > diff --git a/lib/graph/rte_graph_worker_common.h > > b/lib/graph/rte_graph_worker_common.h > > index b58f8f6947..41428974db 100644 > > --- a/lib/graph/rte_graph_worker_common.h > > +++ b/lib/graph/rte_graph_worker_common.h > > @@ -130,6 +130,39 @@ void __rte_node_stream_alloc_size(struct > > rte_graph *graph, > > > > /* Fast path helper functions */ > > > > +/** > > + * @internal > > + * > > + * Enqueue a given node to the tail of the graph reel. > > + * > > + * @param graph > > + * Pointer Graph object. > > + * @param node > > + * Pointer to node object to be enqueued. > > + */ > > +static __rte_always_inline void > > +__rte_node_process(struct rte_graph *graph, struct rte_node *node) { > > + uint64_t start; > > + uint16_t rc; > > + void **objs; > > + > > + RTE_ASSERT(node->fence == RTE_GRAPH_FENCE); > > + objs = node->objs; > > + rte_prefetch0(objs); > > + > > + if (rte_graph_has_stats_feature()) { > > + start = rte_rdtsc(); > > + rc = node->process(graph, node, objs, node->idx); > > + node->total_cycles += rte_rdtsc() - start; > > + node->total_calls++; > > + node->total_objs += rc; > > + } else { > > + node->process(graph, node, objs, node->idx); > > + } > > + node->idx = 0; > > +} > > + > > /** > > * @internal > > * > > -- > > 2.37.2
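The suggested switch matters because a plain TSC read can be reordered around the node's process() call, skewing per-node cycle stats. rte_rdtsc_precise() is simply the barriered variant of rte_rdtsc() in DPDK:

    static inline uint64_t
    rte_rdtsc_precise(void)
    {
            rte_mb();               /* full barrier: keep the TSC read in program order */
            return rte_rdtsc();
    }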
RE: [PATCH 09/10] net/gve: add maintainers for GVE
> -Original Message- > From: Ferruh Yigit > Sent: Thursday, May 4, 2023 19:01 > To: Guo, Junfeng ; Zhang, Qi Z > ; Wu, Jingjing ; Xing, > Beilei > Cc: dev@dpdk.org; Rushil Gupta ; Jeroen de Borst > ; Joshua Washington > Subject: Re: [PATCH 09/10] net/gve: add maintainers for GVE > > On 4/13/2023 7:16 AM, Junfeng Guo wrote: > > Add maintainers from Google for GVE. > > > > Signed-off-by: Junfeng Guo > > Signed-off-by: Rushil Gupta > > --- > > MAINTAINERS | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/MAINTAINERS b/MAINTAINERS > > index 8df23e5099..08001751b0 100644 > > --- a/MAINTAINERS > > +++ b/MAINTAINERS > > @@ -713,6 +713,9 @@ F: doc/guides/nics/features/enic.ini > > > > Google Virtual Ethernet > > M: Junfeng Guo > > +M: Jeroen de Borst > > +M: Rushil Gupta > > +M: Joshua Washington > > F: drivers/net/gve/ > > F: doc/guides/nics/gve.rst > > F: doc/guides/nics/features/gve.ini > > Requested ack from new added maintainers not received (to other > version > of this patch [2]). This patch is not dependency for rest of the set, so > dropping it from the set. > > Also there is a standalone version of this patch [1], can you please > send a new version to standalone one, instead of including same patch in > various sets? > Sure! Will send out a standalone version with in-reply-to [1]. Thanks for the careful review! > > > [1] > https://patches.dpdk.org/project/dpdk/patch/20221109072352.1387300- > 1-junfeng@intel.com/ > > [2] > https://patches.dpdk.org/project/dpdk/patch/20230328094512.1796648- > 4-junfeng@intel.com/
[PATCH v2] net/gve: add maintainers for GVE
Add maintainers from Google for GVE. Signed-off-by: Junfeng Guo Signed-off-by: Jeroen de Borst Signed-off-by: Rushil Gupta Signed-off-by: Joshua Washington --- MAINTAINERS | 3 +++ 1 file changed, 3 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 8df23e5099..08001751b0 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -713,6 +713,9 @@ F: doc/guides/nics/features/enic.ini Google Virtual Ethernet M: Junfeng Guo +M: Jeroen de Borst +M: Rushil Gupta +M: Joshua Washington F: drivers/net/gve/ F: doc/guides/nics/gve.rst F: doc/guides/nics/features/gve.ini -- 2.34.1
[PATCH v2] net/iavf: add devargs to control watchdog
This patch enables the watchdog to detect VF FLR when the link state changes to down, and the default period is 2000us as defined by IAVF_DEV_WATCHDOG_PERIOD. In addition, the 'watchdog_period' devargs was added to adjust the watchdog period(microseconds), or set to 0 to disable the watchdog. Signed-off-by: Zhichao Zeng --- v2: enables watchdog when link status changes to down --- drivers/net/iavf/iavf.h| 5 ++- drivers/net/iavf/iavf_ethdev.c | 81 +- drivers/net/iavf/iavf_vchnl.c | 21 +++-- 3 files changed, 81 insertions(+), 26 deletions(-) diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h index aa18650ffa..98861e4242 100644 --- a/drivers/net/iavf/iavf.h +++ b/drivers/net/iavf/iavf.h @@ -31,7 +31,7 @@ #define IAVF_NUM_MACADDR_MAX 64 -#define IAVF_DEV_WATCHDOG_PERIOD 0 +#define IAVF_DEV_WATCHDOG_PERIOD 2000 /* microseconds, set 0 to disable*/ #define IAVF_DEFAULT_RX_PTHRESH 8 #define IAVF_DEFAULT_RX_HTHRESH 8 @@ -304,6 +304,7 @@ struct iavf_devargs { uint8_t proto_xtr_dflt; uint8_t proto_xtr[IAVF_MAX_QUEUE_NUM]; uint16_t quanta_size; + uint32_t watchdog_period; }; struct iavf_security_ctx; @@ -498,4 +499,6 @@ int iavf_flow_unsub(struct iavf_adapter *adapter, struct iavf_fsub_conf *filter); int iavf_flow_sub_check(struct iavf_adapter *adapter, struct iavf_fsub_conf *filter); +void iavf_dev_watchdog_enable(struct iavf_adapter *adapter); +void iavf_dev_watchdog_disable(struct iavf_adapter *adapter); #endif /* _IAVF_ETHDEV_H_ */ diff --git a/drivers/net/iavf/iavf_ethdev.c b/drivers/net/iavf/iavf_ethdev.c index f6d68403ce..4cf4eb20cd 100644 --- a/drivers/net/iavf/iavf_ethdev.c +++ b/drivers/net/iavf/iavf_ethdev.c @@ -36,6 +36,7 @@ /* devargs */ #define IAVF_PROTO_XTR_ARG "proto_xtr" #define IAVF_QUANTA_SIZE_ARG "quanta_size" +#define IAVF_RESET_WATCHDOG_ARG"watchdog_period" uint64_t iavf_timestamp_dynflag; int iavf_timestamp_dynfield_offset = -1; @@ -43,6 +44,7 @@ int iavf_timestamp_dynfield_offset = -1; static const char * const iavf_valid_args[] = { IAVF_PROTO_XTR_ARG, IAVF_QUANTA_SIZE_ARG, + IAVF_RESET_WATCHDOG_ARG, NULL }; @@ -301,40 +303,46 @@ iavf_dev_watchdog(void *cb_arg) /* enter reset state with VFLR event */ adapter->vf.vf_reset = true; + adapter->vf.link_up = false; rte_eth_dev_callback_process(adapter->vf.eth_dev, RTE_ETH_EVENT_INTR_RESET, NULL); } } - /* re-alarm watchdog */ - rc = rte_eal_alarm_set(IAVF_DEV_WATCHDOG_PERIOD, - &iavf_dev_watchdog, cb_arg); + if (adapter->devargs.watchdog_period) { + /* re-alarm watchdog */ + rc = rte_eal_alarm_set(adapter->devargs.watchdog_period, + &iavf_dev_watchdog, cb_arg); - if (rc) - PMD_DRV_LOG(ERR, "Failed \"%s\" to reset device watchdog alarm", - adapter->vf.eth_dev->data->name); + if (rc) + PMD_DRV_LOG(ERR, "Failed \"%s\" to reset device watchdog alarm", + adapter->vf.eth_dev->data->name); + } } -static void -iavf_dev_watchdog_enable(struct iavf_adapter *adapter __rte_unused) -{ -#if (IAVF_DEV_WATCHDOG_PERIOD > 0) - PMD_DRV_LOG(INFO, "Enabling device watchdog"); - adapter->vf.watchdog_enabled = true; - if (rte_eal_alarm_set(IAVF_DEV_WATCHDOG_PERIOD, - &iavf_dev_watchdog, (void *)adapter)) - PMD_DRV_LOG(ERR, "Failed to enabled device watchdog"); -#endif +void +iavf_dev_watchdog_enable(struct iavf_adapter *adapter) +{ + if (adapter->devargs.watchdog_period && !adapter->vf.watchdog_enabled) { + PMD_DRV_LOG(INFO, "Enabling device watchdog, period is %dμs", + adapter->devargs.watchdog_period); + adapter->vf.watchdog_enabled = true; + if (rte_eal_alarm_set(adapter->devargs.watchdog_period, + &iavf_dev_watchdog, (void *)adapter)) + 
PMD_DRV_LOG(ERR, "Failed to enabled device watchdog"); + } else { + PMD_DRV_LOG(INFO, "Device watchdog is disabled"); + } } -static void -iavf_dev_watchdog_disable(struct iavf_adapter *adapter __rte_unused) +void +iavf_dev_watchdog_disable(struct iavf_adapter *adapter) { -#if (IAVF_DEV_WATCHDOG_PERIOD > 0) - PMD_DRV_LOG(INFO, "Disabling device watchdog"); - adapter->vf.watchdog_enabled = false; -#endif + if (adapter->devargs.watchdog_period && adapter->vf.watchdog_enabled) { + PMD_DRV_LOG(INFO, "Disabling device watchdog"); + adapter->vf.watchdog_enabled =
RE: [PATCH v3] net/iavf: fix iavf query stats in intr thread
> -Original Message- > From: Ferruh Yigit > Sent: Monday, March 27, 2023 8:38 PM > To: Deng, KaiwenX ; dev@dpdk.org > Cc: sta...@dpdk.org; Yang, Qiming ; Zhou, YidingX > ; Chas Williams ; Min Hu (Connor) > ; Wu, Jingjing ; Xing, Beilei > ; Mike Pattrick ; Zhang, Qi Z > ; Doherty, Declan ; > Mrzyglod, Daniel T ; Dapeng Yu > ; Zhang, Helin ; > Mcnamara, John ; Thomas Monjalon > > Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr thread > > On 3/27/2023 1:31 PM, Ferruh Yigit wrote: > > On 3/27/2023 6:31 AM, Deng, KaiwenX wrote: > >> > >> > >>> -Original Message- > >>> From: Ferruh Yigit > >>> Sent: Thursday, March 23, 2023 11:39 PM > >>> To: Deng, KaiwenX ; dev@dpdk.org > >>> Cc: sta...@dpdk.org; Yang, Qiming ; Zhou, > >>> YidingX ; Chas Williams ; Min > >>> Hu (Connor) ; Wu, Jingjing > >>> ; Xing, Beilei ; Mike > >>> Pattrick ; Zhang, Qi Z ; > >>> Doherty, Declan ; Mrzyglod, Daniel T > >>> ; Dapeng Yu > >>> Subject: Re: [PATCH v3] net/iavf: fix iavf query stats in intr > >>> thread > >>> > >>> On 3/22/2023 7:26 AM, Kaiwen Deng wrote: > When iavf send query-stats command in eal-intr-thread through > virtual channel, there will be no response received from > iavf_dev_virtchnl_handler for this command during block and wait. > Because iavf_dev_virtchnl_handler is also registered in eal-intr-thread. > > When vf device is bonded as BONDING_MODE_TLB mode, the slave > device > update callback will registered in alarm and called by > eal-intr-thread, it would also raise the above issue. > > This commit add to poll the response for VIRTCHNL_OP_GET_STATS > when > >>> it > is called by eal-intr-thread to fix this issue. > > Fixes: 91bf37d250aa ("net/iavf: add lock for VF commands") > Fixes: 22b123a36d07 ("net/avf: initialize PMD") > Fixes: 7c76a747e68c ("bond: add mode 5") > Fixes: 435d523112cc ("net/iavf: fix multi-process shared data") > Fixes: cb5c1b91f76f ("net/iavf: add thread for event callbacks") > >>> > >>> > >>> Hi Kaiwen, > >>> > >>> Above commit already seems trying to address same issue, it creates > >>> "iavf- event-thread" control thread to asyncroniously handle the > >>> interrupts, in non- interrupt context, why it is not working? > >>> > >>> Instead of adding 'rte_thread_is_intr()' checks, can't you make sure > >>> all interrupts handled in control tread? > >>> > >>> And can you please provide a stack trace in commit log, to describe > >>> the issue better? > >> Hi Ferru, > >> Sorry for my late reply, And thanks for your review. > >> > >> The above commit does not fix this issue when we need to get the > returned data. > >> If we call iavf_query_stats and wait for response statistics in the intr- > thread. > >> iavf_handle_virtchnl_msg is also registered in the intr_thread and > >> will not be executed while waiting. > >> > > > > Got it, since return value is required, API can't be called asyncroniously. > > > > > > > > I think 'rte_thread_is_intr()' checks may cause more trouble for you > > in long term, > > > > - why 'iavf_query_stats()' is called in the iterrupt thread, can it be > > prevented? > > > > - does it make sense to allways poll messages from PF (for simplification)? > > > > > > If answer to both are 'No', I am OK to continue with current proposal > > if you are happy with it. > > > > > btw, how critical is this issue? > > If it is critical, I am OK to get it as it is for this release and > investigate it further > for next release, since only a few days left for this release. 
> Hi Ferruh, I didn't find a more suitable solution after consideration; if you have a better suggestion, please let me know, thanks.
>
>> In this commit I changed it to polling for replies to commands executed in the interrupt thread.
>>
>>   main thread                                        interrupt thread
>>        |                                                    |
>>   iavf_query_stats                                          |
>>        |                                                    |
>>   iavf_execute_vf_cmd                                       |
>>        |                                                    |
>>   iavf_aq_send_msg_to_pf and wait handle complete           |
>>        |--------------------------------------------------->|
>>        |
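In essence, the proposed fix branches on the calling context. A minimal sketch, assuming the patch's approach (rte_thread_is_intr() is a real EAL API; the two command helpers are hypothetical stand-ins for the patch's virtchnl paths):

    /* Sketch: when called from the interrupt thread, poll the admin queue
     * for the PF reply instead of blocking on iavf_handle_virtchnl_msg,
     * which runs in the same thread and would never get scheduled. */
    if (rte_thread_is_intr())
            err = iavf_execute_vf_cmd_polled(adapter, &args);   /* hypothetical: polls for reply */
    else
            err = iavf_execute_vf_cmd(adapter, &args, 0);       /* waits on the virtchnl handler */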
Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions
Hi Qi, On Thur, May 4, 2023 at 9:33PM, Zhang, Qi Z wrote: -Original Message- From: Morten Brørup Sent: Thursday, May 4, 2023 9:22 PM To: zhoumin ; Konstantin Ananyev Cc: dev@dpdk.org; maob...@loongson.cn; Yang, Qiming ; Wu, Wenjun1 ; ruifeng.w...@arm.com; d...@linux.vnet.ibm.com; Tyler Retzlaff Subject: RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx functions From: zhoumin [mailto:zhou...@loongson.cn] Sent: Thursday, 4 May 2023 15.17 Hi Konstantin, Thanks for your comments. On 2023/5/1 下午9:29, Konstantin Ananyev wrote: Segmentation fault has been observed while running the ixgbe_recv_pkts_lro() function to receive packets on the Loongson 3C5000 processor which has 64 cores and 4 NUMA nodes. From the ixgbe_recv_pkts_lro() function, we found that as long as the first packet has the EOP bit set, and the length of this packet is less than or equal to rxq->crc_len, the segmentation fault will definitely happen even though on the other platforms, such as X86. Sorry to interrupt, but I am curious why this issue still exists on x86 architecture. Can volatile be used to instruct the compiler to generate read instructions in a specific order, and does x86 guarantee not to reorder load operations? Actually, I did not see the segmentation fault on X86. I just made the first packet which had the EOP bit set had a zero length, then the segmentation fault would happen on X86. So, I thought that the out-of-order access to the descriptor might be possible to make the ready packet zero length, and this case was more likely to cause the segmentation fault. Because when processd the first packet the first_seg->next will be NULL, if at the same time this packet has the EOP bit set and its length is less than or equal to rxq->crc_len, the following loop will be excecuted: for (lp = first_seg; lp->next != rxm; lp = lp->next) ; We know that the first_seg->next will be NULL under this condition. So the expression of lp->next->next will cause the segmentation fault. Normally, the length of the first packet with EOP bit set will be greater than rxq->crc_len. However, the out-of-order execution of CPU may make the read ordering of the status and the rest of the descriptor fields in this function not be correct. The related codes are as following: rxdp = &rx_ring[rx_id]; #1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error); if (!(staterr & IXGBE_RXDADV_STAT_DD)) break; #2 rxd = *rxdp; The sentence #2 may be executed before sentence #1. This action is likely to make the ready packet zero length. If the packet is the first packet and has the EOP bit set, the above segmentation fault will happen. So, we should add rte_rmb() to ensure the read ordering be correct. We also did the same thing in the ixgbe_recv_pkts() function to make the rxd data be valid even thougth we did not find segmentation fault in this function. Signed-off-by: Min Zhou --- v2: - Make the calling of rte_rmb() for all platforms --- drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644 --- a/drivers/net/ixgbe/ixgbe_rxtx.c +++ b/drivers/net/ixgbe/ixgbe_rxtx.c @@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, staterr = rxdp->wb.upper.status_error; if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD))) break; + + rte_rmb(); rxd = *rxdp; Indeed, looks like a problem to me on systems with relaxed MO. Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers. 
The LoongArch architecture uses the Weak Consistency model, which can cause the problem, especially in scenarios with many cores, such as the Loongson 3C5000 with four NUMA nodes, which has 64 cores. I cannot reproduce it on a Loongson 3C5000 with one NUMA node, which has just 16 cores. About a fix - looks right, but a bit excessive to me - as I understand it, all we need here is to prevent re-ordering by the CPU itself. Yes, thanks for cc-ing. So rte_smp_rmb() seems enough here. Or it might be just: staterr = __atomic_load_n(&rxdp->wb.upper.status_error, __ATOMIC_ACQUIRE); Does __atomic_load_n() work on Windows if we use it to solve this problem? Yes, __atomic_load_n() works on Windows too. Best regards, Min
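Putting the two suggestions side by side: either form orders the DD-bit check before the read of the rest of the descriptor. A sketch based on the snippets quoted above:

    /* Option 1: explicit read barrier after the DD-bit check */
    staterr = rxdp->wb.upper.status_error;
    if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
            break;
    rte_rmb();      /* or rte_smp_rmb(), as suggested above */
    rxd = *rxdp;    /* can no longer be hoisted above the DD-bit check */

    /* Option 2: acquire semantics on the status load itself */
    staterr = __atomic_load_n(&rxdp->wb.upper.status_error, __ATOMIC_ACQUIRE);
    if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
            break;
    rxd = *rxdp;    /* ordered after the acquire load */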
Re: [dpdk-dev][PATCH] ethdev: add send queue flow matching item
On Thu, Apr 20, 2023 at 10:59 AM wrote: > > From: Kiran Kumar K > > Adding support for send queue flow matching item. To be consistent, use Tx queue every where.(git commit subject too) > This item is valid only for egress rules. > An example use case would be that application can > set different vlan insert rules with different PCP values > based on tx queue number. > > Signed-off-by: Kiran Kumar K > --- > app/test-pmd/cmdline_flow.c | 28 +++ > doc/guides/prog_guide/rte_flow.rst | 7 + > doc/guides/rel_notes/release_23_07.rst | 31 ++--- > doc/guides/testpmd_app_ug/testpmd_funcs.rst | 4 +++ > lib/ethdev/rte_flow.c | 1 + > lib/ethdev/rte_flow.h | 26 + > 6 files changed, 68 insertions(+), 29 deletions(-) > > diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c > index 58939ec321..a68a6080a8 100644 > --- a/app/test-pmd/cmdline_flow.c > +++ b/app/test-pmd/cmdline_flow.c > @@ -496,6 +496,8 @@ enum index { > ITEM_QUOTA_STATE_NAME, > ITEM_AGGR_AFFINITY, > ITEM_AGGR_AFFINITY_VALUE, > + ITEM_TX_QUEUE, > + ITEM_TX_QUEUE_VALUE, > > /* Validate/create actions. */ > ACTIONS, > @@ -1452,6 +1454,7 @@ static const enum index next_item[] = { > ITEM_METER, > ITEM_QUOTA, > ITEM_AGGR_AFFINITY, > + ITEM_TX_QUEUE, > END_SET, > ZERO, > }; > @@ -1953,6 +1956,12 @@ static const enum index item_aggr_affinity[] = { > ZERO, > }; > > +static const enum index item_tx_queue[] = { > + ITEM_TX_QUEUE_VALUE, > + ITEM_NEXT, > + ZERO, > +}; > + > static const enum index next_action[] = { > ACTION_END, > ACTION_VOID, > @@ -6945,6 +6954,22 @@ static const struct token token_list[] = { > .args = ARGS(ARGS_ENTRY(struct rte_flow_item_aggr_affinity, > affinity)), > }, > + [ITEM_TX_QUEUE] = { > + .name = "tx_queue", > + .help = "match on the tx queue of send packet", > + .priv = PRIV_ITEM(TX_QUEUE, > + sizeof(struct rte_flow_item_tx_queue)), > + .next = NEXT(item_tx_queue), > + .call = parse_vc, > + }, > + [ITEM_TX_QUEUE_VALUE] = { > + .name = "tx_queue_value", > + .help = "tx queue value", > + .next = NEXT(item_tx_queue, NEXT_ENTRY(COMMON_UNSIGNED), > +item_param), > + .args = ARGS(ARGS_ENTRY(struct rte_flow_item_tx_queue, > + tx_queue)), > + }, > }; > > /** Remove and return last entry from argument stack. */ > @@ -11849,6 +11874,9 @@ flow_item_default_mask(const struct rte_flow_item > *item) > case RTE_FLOW_ITEM_TYPE_AGGR_AFFINITY: > mask = &rte_flow_item_aggr_affinity_mask; > break; > + case RTE_FLOW_ITEM_TYPE_TX_QUEUE: > + mask = &rte_flow_item_tx_queue_mask; > + break; > default: > break; > } > diff --git a/doc/guides/prog_guide/rte_flow.rst > b/doc/guides/prog_guide/rte_flow.rst > index 32fc45516a..7154b56330 100644 > --- a/doc/guides/prog_guide/rte_flow.rst > +++ b/doc/guides/prog_guide/rte_flow.rst > @@ -1486,6 +1486,13 @@ This item is meant to use the same structure as `Item: > PORT_REPRESENTOR`_. > > See also `Action: REPRESENTED_PORT`_. > > +Item: ``TX_QUEUE`` > +^^^ > + > +Matches on the tx queue of send packet . tx -> Tx > + > +- ``tx_queue``: Tx queue. > + > Item: ``AGGR_AFFINITY`` > ^^^ > > diff --git a/doc/guides/rel_notes/release_23_07.rst > b/doc/guides/rel_notes/release_23_07.rst > index a9b1293689..631cbd2b58 100644 > --- a/doc/guides/rel_notes/release_23_07.rst > +++ b/doc/guides/rel_notes/release_23_07.rst > @@ -24,36 +24,9 @@ DPDK Release 23.07 > New Features > > > -.. This section should contain new features added in this release. > - Sample format: > +* **Added flow matching of tx queue.** Same as below. Right? 
> > - * **Add a title in the past tense with a full stop.** > - > - Add a short 1-2 sentence description in the past tense. > - The description should be enough to allow someone scanning > - the release notes to understand the new feature. > - > - If the feature adds a lot of sub-features you can use a bullet list > - like this: > - > - * Added feature foo to do something. > - * Enhanced feature bar to do something else. > - > - Refer to the previous release notes for examples. > - > - Suggested order in release notes items: > - * Core libs (EAL, mempool, ring, mbuf, buses) > - * Device abstraction libs and PMDs (ordered alphabetically by vendor > name) > - - ethdev (lib, PMDs) > - - c
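For reference, the use case from the commit message would look roughly like this in testpmd (port, queue number and PCP value are hypothetical; the item syntax follows the cmdline_flow.c additions above):

    testpmd> flow create 0 egress pattern tx_queue tx_queue_value is 2 / end
             actions of_push_vlan ethertype 0x8100 / of_set_vlan_pcp vlan_pcp 3 / end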
RE: [RFC 10/27] vhost: retry translating IOVA after IOTLB miss
> -Original Message- > From: Maxime Coquelin > Sent: Friday, March 31, 2023 11:43 PM > To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo > ; m...@redhat.com; f...@redhat.com; > jasow...@redhat.com; Liang, Cunming ; Xie, Yongji > ; echau...@redhat.com; epere...@redhat.com; > amore...@redhat.com > Cc: Maxime Coquelin > Subject: [RFC 10/27] vhost: retry translating IOVA after IOTLB miss > > Vhost-user backend IOTLB misses and updates are > asynchronous, so IOVA address translation function > just fails after having sent an IOTLB miss update if needed > entry was not in the IOTLB cache. > > This is not the case for VDUSE, for which the needed IOTLB > update is returned directly when sending an IOTLB miss. > > This patch retry again finding the needed entry in the > IOTLB cache after having sent an IOTLB miss. > > Signed-off-by: Maxime Coquelin > --- > lib/vhost/vhost.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c > index d35075b96c..4f16307e4d 100644 > --- a/lib/vhost/vhost.c > +++ b/lib/vhost/vhost.c > @@ -96,6 +96,12 @@ __vhost_iova_to_vva(struct virtio_net *dev, struct > vhost_virtqueue *vq, > vhost_user_iotlb_rd_lock(vq); > } > > + tmp_size = *size; > + /* Retry in case of VDUSE, as it is synchronous */ > + vva = vhost_user_iotlb_cache_find(dev, iova, &tmp_size, perm); > + if (tmp_size == *size) > + return vva; > + > return 0; > } > > -- > 2.39.2 Reviewed-by: Chenbo Xia
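The resulting control flow in __vhost_iova_to_vva() is easier to see without the locking details (simplified sketch of the function after this patch; lock handling elided):

    vva = vhost_user_iotlb_cache_find(dev, iova, &tmp_size, perm);
    if (tmp_size == *size)
            return vva;                             /* hit on the first lookup */

    vhost_user_iotlb_miss(dev, iova, perm);         /* VDUSE processes the reply synchronously */

    tmp_size = *size;
    vva = vhost_user_iotlb_cache_find(dev, iova, &tmp_size, perm);
    if (tmp_size == *size)
            return vva;                             /* hit on the retry (VDUSE case) */

    return 0;                                       /* Vhost-user: update arrives asynchronously */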
RE: [RFC 11/27] vhost: introduce backend ops
> -Original Message- > From: Maxime Coquelin > Sent: Friday, March 31, 2023 11:43 PM > To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo > ; m...@redhat.com; f...@redhat.com; > jasow...@redhat.com; Liang, Cunming ; Xie, Yongji > ; echau...@redhat.com; epere...@redhat.com; > amore...@redhat.com > Cc: Maxime Coquelin > Subject: [RFC 11/27] vhost: introduce backend ops > > This patch introduces backend ops struct, that will enable > calling backend specifics callbacks (Vhost-user, VDUSE), in > shared code. > > This is an empty shell for now, it will be filled in later > patches. > > Signed-off-by: Maxime Coquelin > --- > lib/vhost/socket.c | 2 +- > lib/vhost/vhost.c | 8 +++- > lib/vhost/vhost.h | 10 +- > lib/vhost/vhost_user.c | 8 > lib/vhost/vhost_user.h | 1 + > 5 files changed, 26 insertions(+), 3 deletions(-) > > diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c > index 669c322e12..ba54263824 100644 > --- a/lib/vhost/socket.c > +++ b/lib/vhost/socket.c > @@ -221,7 +221,7 @@ vhost_user_add_connection(int fd, struct > vhost_user_socket *vsocket) > return; > } > > - vid = vhost_new_device(); > + vid = vhost_user_new_device(); > if (vid == -1) { > goto err; > } > diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c > index 4f16307e4d..41f212315e 100644 > --- a/lib/vhost/vhost.c > +++ b/lib/vhost/vhost.c > @@ -676,11 +676,16 @@ reset_device(struct virtio_net *dev) > * there is a new virtio device being attached). > */ > int > -vhost_new_device(void) > +vhost_new_device(struct vhost_backend_ops *ops) > { > struct virtio_net *dev; > int i; > > + if (ops == NULL) { > + VHOST_LOG_CONFIG("device", ERR, "missing backend ops.\n"); > + return -1; > + } > + > pthread_mutex_lock(&vhost_dev_lock); > for (i = 0; i < RTE_MAX_VHOST_DEVICE; i++) { > if (vhost_devices[i] == NULL) > @@ -708,6 +713,7 @@ vhost_new_device(void) > dev->backend_req_fd = -1; > dev->postcopy_ufd = -1; > rte_spinlock_init(&dev->backend_req_lock); > + dev->backend_ops = ops; > > return i; > } > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h > index 4ace5ab081..cc5c707205 100644 > --- a/lib/vhost/vhost.h > +++ b/lib/vhost/vhost.h > @@ -89,6 +89,12 @@ > for (iter = val; iter < num; iter++) > #endif > > +/** > + * Structure that contains backend-specific ops. > + */ > +struct vhost_backend_ops { > +}; > + > /** > * Structure contains buffer address, length and descriptor index > * from vring to do scatter RX. 
> @@ -513,6 +519,8 @@ struct virtio_net { > void*extern_data; > /* pre and post vhost user message handlers for the device */ > struct rte_vhost_user_extern_ops extern_ops; > + > + struct vhost_backend_ops *backend_ops; > } __rte_cache_aligned; > > static inline void > @@ -812,7 +820,7 @@ get_device(int vid) > return dev; > } > > -int vhost_new_device(void); > +int vhost_new_device(struct vhost_backend_ops *ops); > void cleanup_device(struct virtio_net *dev, int destroy); > void reset_device(struct virtio_net *dev); > void vhost_destroy_device(int); > diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c > index a989f2c46d..2d5dec5bc1 100644 > --- a/lib/vhost/vhost_user.c > +++ b/lib/vhost/vhost_user.c > @@ -3464,3 +3464,11 @@ int rte_vhost_host_notifier_ctrl(int vid, uint16_t > qid, bool enable) > > return ret; > } > + > +static struct vhost_backend_ops vhost_user_backend_ops; > + > +int > +vhost_user_new_device(void) > +{ > + return vhost_new_device(&vhost_user_backend_ops); > +} > diff --git a/lib/vhost/vhost_user.h b/lib/vhost/vhost_user.h > index a0987a58f9..61456049c8 100644 > --- a/lib/vhost/vhost_user.h > +++ b/lib/vhost/vhost_user.h > @@ -185,5 +185,6 @@ int vhost_user_iotlb_miss(struct virtio_net *dev, > uint64_t iova, uint8_t perm); > int read_fd_message(char *ifname, int sockfd, char *buf, int buflen, int > *fds, int max_fds, > int *fd_num); > int send_fd_message(char *ifname, int sockfd, char *buf, int buflen, int > *fds, int fd_num); > +int vhost_user_new_device(void); > > #endif > -- > 2.39.2 Reviewed-by: Chenbo Xia
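As a sketch of where this shell is heading: each backend will fill in its own callback table and pass it to vhost_new_device(). The iotlb_miss member is borrowed from a later patch in this series; treat the initializer as illustrative:

    static struct vhost_backend_ops vhost_user_backend_ops = {
            .iotlb_miss = vhost_user_iotlb_miss,    /* added by a later patch */
    };

    int
    vhost_user_new_device(void)
    {
            return vhost_new_device(&vhost_user_backend_ops);
    }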
RE: [RFC 12/27] vhost: add IOTLB cache entry removal callback
Hi Maxime, > -Original Message- > From: Maxime Coquelin > Sent: Friday, March 31, 2023 11:43 PM > To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo > ; m...@redhat.com; f...@redhat.com; > jasow...@redhat.com; Liang, Cunming ; Xie, Yongji > ; echau...@redhat.com; epere...@redhat.com; > amore...@redhat.com > Cc: Maxime Coquelin > Subject: [RFC 12/27] vhost: add IOTLB cache entry removal callback > > VDUSE will need to munmap() the IOTLB entry on removal > from the cache, as it performs mmap() before insertion. > > This patch introduces a callback that VDUSE layer will > implement to achieve this. > > Signed-off-by: Maxime Coquelin > --- > lib/vhost/iotlb.c | 12 > lib/vhost/vhost.h | 4 > 2 files changed, 16 insertions(+) > > diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c > index 188dfb8e38..86b0be62b4 100644 > --- a/lib/vhost/iotlb.c > +++ b/lib/vhost/iotlb.c > @@ -25,6 +25,15 @@ struct vhost_iotlb_entry { > > #define IOTLB_CACHE_SIZE 2048 > > +static void > +vhost_user_iotlb_remove_notify(struct virtio_net *dev, struct > vhost_iotlb_entry *entry) > +{ > + if (dev->backend_ops->iotlb_remove_notify == NULL) > + return; > + > + dev->backend_ops->iotlb_remove_notify(entry->uaddr, entry->uoffset, > entry->size); > +} > + > static bool > vhost_user_iotlb_share_page(struct vhost_iotlb_entry *a, struct > vhost_iotlb_entry *b) > { > @@ -198,6 +207,7 @@ vhost_user_iotlb_cache_remove_all(struct virtio_net > *dev) > vhost_user_iotlb_set_dump(node); > > TAILQ_REMOVE(&dev->iotlb_list, node, next); > + vhost_user_iotlb_remove_notify(dev, node); > vhost_user_iotlb_pool_put(dev, node); > } > > @@ -223,6 +233,7 @@ vhost_user_iotlb_cache_random_evict(struct virtio_net > *dev) > vhost_user_iotlb_clear_dump(node, prev_node, next_node); > > TAILQ_REMOVE(&dev->iotlb_list, node, next); > + vhost_user_iotlb_remove_notify(dev, node); > vhost_user_iotlb_pool_put(dev, node); > dev->iotlb_cache_nr--; > break; > @@ -314,6 +325,7 @@ vhost_user_iotlb_cache_remove(struct virtio_net *dev, > uint64_t iova, uint64_t si > vhost_user_iotlb_clear_dump(node, prev_node, next_node); > > TAILQ_REMOVE(&dev->iotlb_list, node, next); > + vhost_user_iotlb_remove_notify(dev, node); > vhost_user_iotlb_pool_put(dev, node); > dev->iotlb_cache_nr--; > } else { > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h > index cc5c707205..2ad26f6951 100644 > --- a/lib/vhost/vhost.h > +++ b/lib/vhost/vhost.h > @@ -89,10 +89,14 @@ > for (iter = val; iter < num; iter++) > #endif > > +struct virtio_net; Adding this in patch 13 could be better since this patch is not using it. Thanks, Chenbo > +typedef void (*vhost_iotlb_remove_notify)(uint64_t addr, uint64_t off, > uint64_t size); > + > /** > * Structure that contains backend-specific ops. > */ > struct vhost_backend_ops { > + vhost_iotlb_remove_notify iotlb_remove_notify; > }; > > /** > -- > 2.39.2
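A minimal sketch of the VDUSE-side callback this patch enables (the actual implementation lands later in the series; the function name here is illustrative):

    #include <sys/mman.h>

    static void
    vduse_iotlb_remove_notify(uint64_t addr, uint64_t offset, uint64_t size)
    {
            /* Undo the mmap() performed when the entry was inserted */
            munmap((void *)(uintptr_t)addr, offset + size);
    }

    static struct vhost_backend_ops vduse_backend_ops = {
            .iotlb_remove_notify = vduse_iotlb_remove_notify,
    };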
RE: [RFC 15/27] vhost: add API to set max queue pairs
Hi Maxime, > -Original Message- > From: Maxime Coquelin > Sent: Friday, March 31, 2023 11:43 PM > To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo > ; m...@redhat.com; f...@redhat.com; > jasow...@redhat.com; Liang, Cunming ; Xie, Yongji > ; echau...@redhat.com; epere...@redhat.com; > amore...@redhat.com > Cc: Maxime Coquelin > Subject: [RFC 15/27] vhost: add API to set max queue pairs > > This patch introduces a new rte_vhost_driver_set_max_queues > API as preliminary work for multiqueue support with VDUSE. > > Indeed, with VDUSE we need to pre-allocate the vrings at > device creation time, so we need such API not to allocate > the 128 queue pairs supported by the Vhost library. > > Calling the API is optional, 128 queue pairs remaining the > default. > > Signed-off-by: Maxime Coquelin > --- > doc/guides/prog_guide/vhost_lib.rst | 4 > lib/vhost/rte_vhost.h | 17 ++ > lib/vhost/socket.c | 36 +++-- > lib/vhost/version.map | 3 +++ > 4 files changed, 58 insertions(+), 2 deletions(-) Also add changes in release notes? Btw: somewhere we should also mention vduse support is added in release notes. Thanks, Chenbo > > diff --git a/doc/guides/prog_guide/vhost_lib.rst > b/doc/guides/prog_guide/vhost_lib.rst > index e8bb8c9b7b..cd4b109139 100644 > --- a/doc/guides/prog_guide/vhost_lib.rst > +++ b/doc/guides/prog_guide/vhost_lib.rst > @@ -334,6 +334,10 @@ The following is an overview of some key Vhost API > functions: >Clean DMA vChannel finished to use. After this function is called, >the specified DMA vChannel should no longer be used by the Vhost > library. > > +* ``rte_vhost_driver_set_max_queue_num(path, max_queue_pairs)`` > + > + Set the maximum number of queue pairs supported by the device. > + > Vhost-user Implementations > -- > > diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h > index 58a5d4be92..44cbfcb469 100644 > --- a/lib/vhost/rte_vhost.h > +++ b/lib/vhost/rte_vhost.h > @@ -588,6 +588,23 @@ rte_vhost_driver_get_protocol_features(const char > *path, > int > rte_vhost_driver_get_queue_num(const char *path, uint32_t *queue_num); > > +/** > + * @warning > + * @b EXPERIMENTAL: this API may change, or be removed, without prior > notice. > + * > + * Set the maximum number of queue pairs supported by the device. 
> + * > + * @param path > + * The vhost-user socket file path > + * @param max_queue_pairs > + * The maximum number of queue pairs > + * @return > + * 0 on success, -1 on failure > + */ > +__rte_experimental > +int > +rte_vhost_driver_set_max_queue_num(const char *path, uint32_t > max_queue_pairs); > + > /** > * Get the feature bits after negotiation > * > diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c > index ba54263824..e95c3ffeac 100644 > --- a/lib/vhost/socket.c > +++ b/lib/vhost/socket.c > @@ -56,6 +56,8 @@ struct vhost_user_socket { > > uint64_t protocol_features; > > + uint32_t max_queue_pairs; > + > struct rte_vdpa_device *vdpa_dev; > > struct rte_vhost_device_ops const *notify_ops; > @@ -821,7 +823,7 @@ rte_vhost_driver_get_queue_num(const char *path, > uint32_t *queue_num) > > vdpa_dev = vsocket->vdpa_dev; > if (!vdpa_dev) { > - *queue_num = VHOST_MAX_QUEUE_PAIRS; > + *queue_num = vsocket->max_queue_pairs; > goto unlock_exit; > } > > @@ -831,7 +833,36 @@ rte_vhost_driver_get_queue_num(const char *path, > uint32_t *queue_num) > goto unlock_exit; > } > > - *queue_num = RTE_MIN((uint32_t)VHOST_MAX_QUEUE_PAIRS, > vdpa_queue_num); > + *queue_num = RTE_MIN(vsocket->max_queue_pairs, vdpa_queue_num); > + > +unlock_exit: > + pthread_mutex_unlock(&vhost_user.mutex); > + return ret; > +} > + > +int > +rte_vhost_driver_set_max_queue_num(const char *path, uint32_t > max_queue_pairs) > +{ > + struct vhost_user_socket *vsocket; > + int ret = 0; > + > + VHOST_LOG_CONFIG(path, INFO, "Setting max queue pairs to %u\n", > max_queue_pairs); > + > + if (max_queue_pairs > VHOST_MAX_QUEUE_PAIRS) { > + VHOST_LOG_CONFIG(path, ERR, "Library only supports up to %u > queue pairs\n", > + VHOST_MAX_QUEUE_PAIRS); > + return -1; > + } > + > + pthread_mutex_lock(&vhost_user.mutex); > + vsocket = find_vhost_user_socket(path); > + if (!vsocket) { > + VHOST_LOG_CONFIG(path, ERR, "socket file is not registered > yet.\n"); > + ret = -1; > + goto unlock_exit; > + } > + > + vsocket->max_queue_pairs = max_queue_pairs; > > unlock_exit: > pthread_mutex_unlock(&vhost_user.mutex); > @@ -890,6 +921,7 @@ rte_vhost_driver_register(const char *path, uint64_t > flags) > goto out_free; > } > vsocket->vdpa_dev = NULL; > + vsocket->max_queue_pairs = VHOST_MAX_QUEUE_PAIRS; > vsocke
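Call-order note: the socket must be registered before the limit can be set, and the limit must be set before the driver is started; the net/vhost patch that follows does exactly this. An application-level sketch (path and count hypothetical):

    rte_vhost_driver_register("/tmp/vduse-net0", 0);
    rte_vhost_driver_set_max_queue_num("/tmp/vduse-net0", 4);  /* optional; default is 128 */
    rte_vhost_driver_start("/tmp/vduse-net0");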
RE: [RFC 16/27] net/vhost: use API to set max queue pairs
> -Original Message- > From: Maxime Coquelin > Sent: Friday, March 31, 2023 11:43 PM > To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo > ; m...@redhat.com; f...@redhat.com; > jasow...@redhat.com; Liang, Cunming ; Xie, Yongji > ; echau...@redhat.com; epere...@redhat.com; > amore...@redhat.com > Cc: Maxime Coquelin > Subject: [RFC 16/27] net/vhost: use API to set max queue pairs > > In order to support multiqueue with VDUSE, we need to > be able to limit the maximum number of queue pairs, to > avoid unnecessary memory consumption since the maximum > number of queue pairs need to be allocated at device > creation time, as opposed to Vhost-user which allocate > only when the frontend initialize them. > > Signed-off-by: Maxime Coquelin > --- > drivers/net/vhost/rte_eth_vhost.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/vhost/rte_eth_vhost.c > b/drivers/net/vhost/rte_eth_vhost.c > index 62ef955ebc..8d37ec9775 100644 > --- a/drivers/net/vhost/rte_eth_vhost.c > +++ b/drivers/net/vhost/rte_eth_vhost.c > @@ -1013,6 +1013,9 @@ vhost_driver_setup(struct rte_eth_dev *eth_dev) > goto drv_unreg; > } > > + if (rte_vhost_driver_set_max_queue_num(internal->iface_name, > internal->max_queues)) > + goto drv_unreg; > + > if (rte_vhost_driver_callback_register(internal->iface_name, > &vhost_ops) < 0) { > VHOST_LOG(ERR, "Can't register callbacks\n"); > -- > 2.39.2 Reviewed-by: Chenbo Xia
RE: [RFC 14/27] vhost: add helper for interrupt injection
Hi Maxime, > -Original Message- > From: Maxime Coquelin > Sent: Friday, March 31, 2023 11:43 PM > To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo > ; m...@redhat.com; f...@redhat.com; > jasow...@redhat.com; Liang, Cunming ; Xie, Yongji > ; echau...@redhat.com; epere...@redhat.com; > amore...@redhat.com > Cc: Maxime Coquelin > Subject: [RFC 14/27] vhost: add helper for interrupt injection > > Vhost-user uses eventfd to inject IRQs, but VDUSE uses > an ioctl. > > This patch prepares vhost_vring_call_split() and > vhost_vring_call_packed() to support VDUSE by introducing > a new helper. > > It also adds a new counter to for guest notification to for -> for? With this fixed: Reviewed-by: Chenbo Xia > failures, which could happen in case of uninitialized call > file descriptor for example. > > Signed-off-by: Maxime Coquelin > --- > lib/vhost/vhost.c | 6 + > lib/vhost/vhost.h | 54 +++--- > lib/vhost/vhost_user.c | 10 > 3 files changed, 46 insertions(+), 24 deletions(-) > > diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c > index 790eb06b28..c07028f2b3 100644 > --- a/lib/vhost/vhost.c > +++ b/lib/vhost/vhost.c > @@ -44,6 +44,7 @@ static const struct vhost_vq_stats_name_off > vhost_vq_stat_strings[] = { > {"size_1024_1518_packets", offsetof(struct vhost_virtqueue, > stats.size_bins[6])}, > {"size_1519_max_packets", offsetof(struct vhost_virtqueue, > stats.size_bins[7])}, > {"guest_notifications",offsetof(struct vhost_virtqueue, > stats.guest_notifications)}, > + {"guest_notifications_error",offsetof(struct vhost_virtqueue, > stats.guest_notifications_error)}, > {"iotlb_hits", offsetof(struct vhost_virtqueue, > stats.iotlb_hits)}, > {"iotlb_misses", offsetof(struct vhost_virtqueue, > stats.iotlb_misses)}, > {"inflight_submitted", offsetof(struct vhost_virtqueue, > stats.inflight_submitted)}, > @@ -697,6 +698,11 @@ vhost_new_device(struct vhost_backend_ops *ops) > return -1; > } > > + if (ops->inject_irq == NULL) { > + VHOST_LOG_CONFIG("device", ERR, "missing IRQ injection backend > op.\n"); > + return -1; > + } > + > pthread_mutex_lock(&vhost_dev_lock); > for (i = 0; i < RTE_MAX_VHOST_DEVICE; i++) { > if (vhost_devices[i] == NULL) > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h > index ee7640e901..8f0875b4e2 100644 > --- a/lib/vhost/vhost.h > +++ b/lib/vhost/vhost.h > @@ -90,16 +90,20 @@ > #endif > > struct virtio_net; > +struct vhost_virtqueue; > + > typedef void (*vhost_iotlb_remove_notify)(uint64_t addr, uint64_t off, > uint64_t size); > > typedef int (*vhost_iotlb_miss_cb)(struct virtio_net *dev, uint64_t iova, > uint8_t perm); > > +typedef int (*vhost_vring_inject_irq_cb)(struct virtio_net *dev, struct > vhost_virtqueue *vq); > /** > * Structure that contains backend-specific ops. 
> */ > struct vhost_backend_ops { > vhost_iotlb_remove_notify iotlb_remove_notify; > vhost_iotlb_miss_cb iotlb_miss; > + vhost_vring_inject_irq_cb inject_irq; > }; > > /** > @@ -149,6 +153,7 @@ struct virtqueue_stats { > /* Size bins in array as RFC 2819, undersized [0], 64 [1], etc */ > uint64_t size_bins[8]; > uint64_t guest_notifications; > + uint64_t guest_notifications_error; > uint64_t iotlb_hits; > uint64_t iotlb_misses; > uint64_t inflight_submitted; > @@ -900,6 +905,24 @@ vhost_need_event(uint16_t event_idx, uint16_t new_idx, > uint16_t old) > return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - > old); > } > > +static __rte_always_inline void > +vhost_vring_inject_irq(struct virtio_net *dev, struct vhost_virtqueue *vq) > +{ > + int ret; > + > + ret = dev->backend_ops->inject_irq(dev, vq); > + if (ret) { > + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) > + vq->stats.guest_notifications_error++; > + return; > + } > + > + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) > + vq->stats.guest_notifications++; > + if (dev->notify_ops->guest_notified) > + dev->notify_ops->guest_notified(dev->vid); > +} > + > static __rte_always_inline void > vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) > { > @@ -919,25 +942,13 @@ vhost_vring_call_split(struct virtio_net *dev, > struct vhost_virtqueue *vq) > "%s: used_event_idx=%d, old=%d, new=%d\n", > __func__, vhost_used_event(vq), old, new); > > - if ((vhost_need_event(vhost_used_event(vq), new, old) || > - unlikely(!signalled_used_valid)) && > - vq->callfd >= 0) { > - eventfd_write(vq->callfd, (eventfd_t) 1); > - if (dev->flags & VIRTIO_DEV_STATS_ENABLED) > -
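For context, the Vhost-user implementation of the new callback boils down to the eventfd write that the helper used to do inline (sketch consistent with the diff above; the exact code is in this patch):

    static int
    vhost_user_inject_irq(struct virtio_net *dev __rte_unused, struct vhost_virtqueue *vq)
    {
            if (vq->callfd < 0)
                    return -1;      /* counted as guest_notifications_error by the helper */

            return eventfd_write(vq->callfd, (eventfd_t)1);
    }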
RE: [RFC 00/27] Add VDUSE support to Vhost library
Hi Maxime, > -Original Message- > From: Maxime Coquelin > Sent: Friday, March 31, 2023 11:43 PM > To: dev@dpdk.org; david.march...@redhat.com; Xia, Chenbo > ; m...@redhat.com; f...@redhat.com; > jasow...@redhat.com; Liang, Cunming ; Xie, Yongji > ; echau...@redhat.com; epere...@redhat.com; > amore...@redhat.com > Cc: Maxime Coquelin > Subject: [RFC 00/27] Add VDUSE support to Vhost library > > This series introduces a new type of backend, VDUSE, > to the Vhost library. > > VDUSE stands for vDPA device in Userspace; it enables > implementing a Virtio device in userspace and having it > attached to the Kernel vDPA bus. > > Once attached to the vDPA bus, it can be used either by > Kernel Virtio drivers, like virtio-net in our case, via > the virtio-vdpa driver. That way, the device is visible > to the Kernel networking stack and is exposed to userspace > as a regular netdev. > > It can also be exposed to userspace thanks to the > vhost-vdpa driver, via a vhost-vdpa chardev that can be > passed to QEMU or the Virtio-user PMD. > > While VDUSE support is already available in the upstream > Kernel, a couple of patches are required to support the > network device type: > > https://gitlab.com/mcoquelin/linux/-/tree/vduse_networking_poc > > In order to attach the created VDUSE device to the vDPA > bus, a recent iproute2 version containing the vdpa tool is > required. > -- > 2.39.2 Btw: when this series is merged in the future, will Red Hat run all the test cases of VDUSE in every release? Thanks, Chenbo
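For reference, attaching a created VDUSE device with the iproute2 vdpa tool looks like this (device name hypothetical):

    vdpa mgmtdev show                           # 'vduse' is listed once a VDUSE device exists
    vdpa dev add name vduse-net0 mgmtdev vduse  # attach the device to the vDPA bus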
RE: 22.11.2 patches review and test
> -Original Message- > From: Xu, HailinX > Sent: Thursday, April 27, 2023 4:10 PM > To: Xueming Li ; sta...@dpdk.org > Cc: dev@dpdk.org; Abhishek Marathe ; > Ali Alnubani ; Walker, Benjamin > ; David Christensen ; > Hemant Agrawal ; Stokes, Ian > ; Jerin Jacob ; Mcnamara, John > ; Ju-Hyoung Lee ; Kevin > Traynor ; Luca Boccassi ; Pei > Zhang ; Xu, Qian Q ; Raslan > Darawsheh ; Thomas Monjalon > ; Yanghang Liu ; Peng, Yuan > ; Chen, Zhaoyan > Subject: RE: 22.11.2 patches review and test > > > > -Original Message- > > From: Xueming Li > > Sent: Sunday, April 23, 2023 5:34 PM > > To: sta...@dpdk.org > > Cc: xuemi...@nvidia.com; dev@dpdk.org; Abhishek Marathe > > ; Ali Alnubani ; > > Walker, Benjamin ; David Christensen > > ; Hemant Agrawal ; > > Stokes, Ian ; Jerin Jacob ; > > Mcnamara, John ; Ju-Hyoung Lee > > ; Kevin Traynor ; Luca > > Boccassi ; Pei Zhang ; Xu, Qian > > Q ; Raslan Darawsheh ; > Thomas > > Monjalon ; Yanghang Liu ; > > Peng, Yuan ; Chen, Zhaoyan > > > > Subject: 22.11.2 patches review and test > > > > Hi all, > > > > Here is a list of patches targeted for stable release 22.11.2. > > > > The planned date for the final release is 5th MAY. > > > > Please help with testing and validation of your use cases and report > > any issues/results with reply-all to this mail. For the final release > > the fixes and reported validations will be added to the release notes. > > > > A release candidate tarball can be found at: > > > > https://dpdk.org/browse/dpdk-stable/tag/?id=v22.11.2-rc1 > > > > These patches are located at branch 22.11 of dpdk-stable repo: > > https://dpdk.org/browse/dpdk-stable/ > > > > Thanks. > > > > Xueming Li > > Update the test status for Intel part. Till now dpdk22.11.2-rc1 validation > test > rate is 85%. No critical issue is found. > # Basic Intel(R) NIC testing > * Build & CFLAG compile: cover the build test combination with latest > GCC/Clang version and the popular OS revision such as > Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4, > FreeBSD13.1, SUSE15, CentOS7.9, openEuler22.03-SP1 etc. > - All test done. No new dpdk issue is found. > * PF(i40e, ixgbe): test scenarios including > RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. > - All test done. No new dpdk issue is found. > * VF(i40e, ixgbe): test scenarios including > VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. > - All test done. No new dpdk issue is found. > * PF/VF(ice): test scenarios including Switch features/Package > Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible > Descriptor, etc. > - All test done. No new dpdk issue is found. > * Intel NIC single core/NIC performance: test scenarios including PF/VF single > core performance test, etc. > - All test done. No new dpdk issue is found. > * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test - > QAT&SW/FIB library, etc. > - Execution rate is 90%. No new dpdk issue is found. > > # Basic cryptodev and virtio testing > * Virtio: both function and performance test are covered. Such as > PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf > testing/VMAWARE ESXI 8.0, etc. > - All test done. No new dpdk issue is found. > * Cryptodev: > *Function test: test scenarios including Cryptodev API testing/CompressDev > ISA-L/QAT/ZLIB PMD Testing/FIPS, etc. > - Execution rate is 75%. No new dpdk issue is found. > *Performance test: test scenarios including Thoughput > Performance/Cryptodev Latency, etc. > - All test done. No new dpdk issue is found. 
> > Regards, > Xu, Hailin > Update the test status for Intel part. All dpdk 22.11.2-rc1 validation tests are done. No critical issue is found. 1 new issue is being confirmed by Intel dev. New bugs: 1. Some of the virtio tests are failing -- Intel dev is investigating (20.11.8-rc1 and 21.11.4-rc1 also have this issue). # Basic Intel(R) NIC testing * Build & CFLAG compile: cover the build test combination with latest GCC/Clang version and the popular OS revision such as Ubuntu20.04, Ubuntu22.04, Fedora35, Fedora37, RHEL8.6, RHEL8.4, FreeBSD13.1, SUSE15, CentOS7.9, openEuler22.03-SP1 etc. - All test done. No new dpdk issue is found. * PF(i40e, ixgbe): test scenarios including RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. - All test done. No new dpdk issue is found. * VF(i40e, ixgbe): test scenarios including VF-RTE_FLOW/TSO/Jumboframe/checksum offload/VLAN/VXLAN, etc. - All test done. No new dpdk issue is found. * PF/VF(ice): test scenarios including Switch features/Package Management/Flow Director/Advanced Tx/Advanced RSS/ACL/DCF/Flexible Descriptor, etc. - All test done. No new dpdk issue is found. * Intel NIC single core/NIC performance: test scenarios including PF/VF single core performance test, etc. - All test done. No new dpdk issue is found. * IPsec: test scenarios including ipsec/ipsec-gw/ipsec library basic test - QAT&SW/FIB library, etc. - All t
[dpdk-dev] [PATCH v3] ring: fix use after free in ring release
After the memzone is freed, it is not removed from the 'rte_ring_tailq'. If rte_ring_lookup is called at this time, it will cause a use-after-free problem. This change prevents that from happening. Fixes: 4e32101f9b01 ("ring: support freeing") Cc: sta...@dpdk.org Signed-off-by: Yunjian Wang Acked-by: Konstantin Ananyev Reviewed-by: Honnappa Nagarahalli --- v3: move memzone free outside the lock --- lib/ring/rte_ring.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/lib/ring/rte_ring.c b/lib/ring/rte_ring.c index 8ed455043d..057d25ff6f 100644 --- a/lib/ring/rte_ring.c +++ b/lib/ring/rte_ring.c @@ -333,11 +333,6 @@ rte_ring_free(struct rte_ring *r) return; } - if (rte_memzone_free(r->memzone) != 0) { - RTE_LOG(ERR, RING, "Cannot free memory\n"); - return; - } - ring_list = RTE_TAILQ_CAST(rte_ring_tailq.head, rte_ring_list); rte_mcfg_tailq_write_lock(); @@ -356,6 +351,9 @@ rte_ring_free(struct rte_ring *r) rte_mcfg_tailq_write_unlock(); + if (rte_memzone_free(r->memzone) != 0) + RTE_LOG(ERR, RING, "Cannot free memory\n"); + rte_free(te); } -- 2.33.0
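To make the ordering issue concrete, here is the interleaving the fix prevents (illustrative; ring name hypothetical):

    /*
     * Old ordering in rte_ring_free():
     *
     *   Thread A (rte_ring_free)            Thread B
     *   -------------------------           ----------------------------------
     *   rte_memzone_free(r->memzone);
     *                                       r = rte_ring_lookup("ring_name");
     *                                       rte_ring_count(r);  <- use after free
     *   TAILQ_REMOVE(ring_list, te, next);
     *
     * With the fix, the ring is unlinked from rte_ring_tailq under the tailq
     * write lock before its memzone is freed, so a concurrent lookup can no
     * longer return a ring whose backing memory is already gone.
     */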