RE: [PATCH] gro : fix pkt length when extra bytes are padded to ethernet frame

2022-10-16 Thread Hu, Jiayu
Hi Kumara,

Agree. We need to trim padding bytes first, then do the length check.

Thanks,
Jiayu

From: Jun Qiu 
Sent: Wednesday, October 12, 2022 9:35 AM
To: kumaraparameshwaran rathinavel ; dev@dpdk.org; 
David Marchand ; Hu, Jiayu 
Subject: RE: [PATCH] gro : fix pkt length when extra bytes are padded to 
ethernet frame

Yes,  It's better to do it before the tcp_dl check.


  1.  /* trim the tail padding bytes */
  2.  ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
  3.  if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
  4.  rte_pktmbuf_trim(pkt, pkt->pkt_len - ip_tlen - pkt->l2_len);
  5.
  6.  /*
  7.   * Don't process the packet whose payload length is less than or
  8.   * equal to 0.
  9.   */
  10. tcp_dl = pkt->pkt_len - hdr_len;
  11. if (tcp_dl <= 0)
  12. return -1;
From: kumaraparameshwaran rathinavel 
mailto:kumaraparames...@gmail.com>>
Sent: Tuesday, October 11, 2022 1:48 PM
To: dev@dpdk.org; David Marchand 
mailto:david.march...@redhat.com>>; Hu, Jiayu 
mailto:jiayu...@intel.com>>; Jun Qiu 
mailto:jun@jaguarmicro.com>>
Subject: Fwd: [PATCH] gro : fix pkt length when extra bytes are padded to 
ethernet frame



On Tue, Oct 11, 2022 at 1:03 AM David Marchand 
mailto:david.march...@redhat.com>> wrote:
On Mon, Oct 10, 2022 at 7:51 PM Kumara Parameshwaran
mailto:kumaraparames...@gmail.com>> wrote:
>
> From: Kumara Parameshwaran 
> mailto:kumaraparames...@gmail.com>>
>
> When GRO packets are merged the packet length is used while
> merging the adjacent packets. If the padded bytes are accounted
> we would end up acking unsent TCP segments.
>
> Signed-off-by: Kumara Parameshwaran 
> mailto:kumaraparames...@gmail.com>>
> v1:
> If there is padding to the ethernet frame cases where timestamp is 
> disabled
> the packet length should be validated with the total ip length as 
> packet length
> is used in the GRO merging logic

Please, can you test with current main branch?
We recently merged a fix:
https://git.dpdk.org/dpdk/commit/?id=b8a55871d5af6f5b8694b1cb5eacbc629734e403
Thanks, David. I did look at the patch. This would fix the problem, but lets 
say for a case of an ACK packet where TCP data length would be zero and if the 
packet contains the padded bytes, this check should be done before the follwing 
check and not after this. @Hu, Jiayu  @Jun 
Qiu  please let me know your thoughts.
/*
* Don't process the packet whose payload length is less than or
* equal to 0.
*/
tcp_dl = pkt->pkt_len - hdr_len;
if (tcp_dl <= 0)
return -1;
--
David Marchand


Re: [PATCH v4] eal: non-temporal memcpy

2022-10-16 Thread Mattias Rönnblom

On 2022-10-10 08:46, Morten Brørup wrote:

This patch provides a function for memory copy using non-temporal store,
load or both, controlled by flags passed to the function.

Applications sometimes copy data to another memory location, which is only
used much later.
In this case, it is inefficient to pollute the data cache with the copied
data.

An example use case (originating from a real life application):
Copying filtered packets, or the first part of them, into a capture buffer
for offline analysis.

The purpose of the function is to achieve a performance gain by not
polluting the cache when copying data.
Although the throughput can be improved by further optimization, I do not
have time to do it now.



The above section is a little repetitive, and only indirectly explains 
what NT loads/stores are.


"This patch provides a new function rte_memcpy_ex() for copying data 
between non-overlapping memory regions. The primary aim of 
rte_memcpy_ex() is to provide a rte_memcpy() (and memcpy()) plug-in 
replacement, where the user may opt for loads and/or stores with 
non-temporal hints to be used while copying the data.


By using a non-temporal hint, the program informs the system that it 
does not intended to further access the data any time soon.


This in turn allows the CPU to bypass the CPU caches or by other means 
avoid this unlikely-to-be-used-soon data to evict cache lines or force 
the future evictions of more useful cache lines."


You should also say something about the memory ordering issue.


The functional tests and performance tests for memory copy have been
expanded to include non-temporal copying.

A non-temporal version of the mbuf library's function to create a full
copy of a given packet mbuf is provided.

The packet capture and packet dump libraries have been updated to use
non-temporal memory copy of the packets.
 > Implementation notes:

Implementations for non-x86 architectures can be provided by anyone at a
later time. I am not going to do it.

x86 non-temporal load instructions must be 16 byte aligned [1], and
non-temporal store instructions must be 4, 8 or 16 byte aligned [2].

ARM non-temporal load and store instructions seem to require 4 byte
alignment [3].



Would this patch be better off as a series? And maybe leave some of this 
information to a cover letter?



[1] https://www.intel.com/content/www/us/en/docs/intrinsics-guide/
index.html#text=_mm_stream_load
[2] https://www.intel.com/content/www/us/en/docs/intrinsics-guide/
index.html#text=_mm_stream_si
[3] https://developer.arm.com/documentation/100076/0100/
A64-Instruction-Set-Reference/A64-Floating-point-Instructions/
LDNP--SIMD-and-FP-

This patch is a major rewrite from the RFC v3, so no version log comparing
to the RFC is provided.

v4
* Also ignore the warning for clang int the workaround for
   _mm_stream_load_si128() missing const in the parameter.
* Add missing C linkage specifier in rte_memcpy.h.

v3
* _mm_stream_si64() is not supported on 32-bit x86 architecture, so only
   use it on 64-bit x86 architecture.
* CLANG warns that _mm_stream_load_si128_const() and
   rte_memcpy_nt_15_or_less_s16a() are not public,
   so remove __rte_internal from them. It also affects the documentation
   for the functions, so the fix can't be limited to CLANG.
* Use __rte_experimental instead of __rte_internal.
* Replace  with nnn in function documentation; it doesn't look like
   HTML.
* Slightly modify the workaround for _mm_stream_load_si128() missing const
   in the parameter; the ancient GCC 4.5.8 in RHEL7 doesn't understand
   #pragma GCC diagnostic ignored "-Wdiscarded-qualifiers", so use
   #pragma GCC diagnostic ignored "-Wcast-qual" instead. I hope that works.
* Fixed one coding style issue missed in v2.

v2
* The last 16 byte block of data, incl. any trailing bytes, were not
   copied from the source memory area in rte_memcpy_nt_buf().
* Fix many coding style issues.
* Add some missing header files.
* Fix build time warning for non-x86 architectures by using a different
   method to mark the flags parameter unused.
* CLANG doesn't understand RTE_BUILD_BUG_ON(!__builtin_constant_p(flags)),
   so omit it when using CLANG.

Signed-off-by: Morten Brørup 
---
  app/test/test_memcpy.c   |   65 +-
  app/test/test_memcpy_perf.c  |  187 ++--
  lib/eal/include/generic/rte_memcpy.h |  127 +++
  lib/eal/x86/include/rte_memcpy.h | 1238 ++
  lib/mbuf/rte_mbuf.c  |   77 ++
  lib/mbuf/rte_mbuf.h  |   32 +
  lib/mbuf/version.map |1 +
  lib/pcapng/rte_pcapng.c  |3 +-
  lib/pdump/rte_pdump.c|6 +-
  9 files changed, 1645 insertions(+), 91 deletions(-)

diff --git a/app/test/test_memcpy.c b/app/test/test_memcpy.c
index 1ab86f4967..12410ce413 100644
--- a/app/test/test_memcpy.c
+++ b/app/test/test_memcpy.c
@@ -1,5 +1,6 @@
  /* SPDX-License-Identifier: BSD-3-Clause
   * Copyright(c) 2010-2014 Intel Corporat

[PATCH v2] gro : check for payload length after the trim

2022-10-16 Thread Kumara Parameshwaran
From: Kumara Parameshwaran 

When packet is padded with extra bytes the
the validation of the payload length should be done
after the trim operation

Fixes: b8a55871d5af ("gro: trim tail padding bytes")
Cc: sta...@dpdk.org

Signed-off-by: Kumara Parameshwaran 
---
v1:
If there is padding to the ethernet frame cases where timestamp is 
disabled
the packet length should be validated with the total ip length as 
packet length 
is used in the GRO merging logic

v2:
Trim the packet length and then check for the protocol payload 
validation
 lib/gro/gro_tcp4.c | 11 ++-
 lib/gro/gro_udp4.c | 10 +-
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c
index 8f5e800250..0014096e63 100644
--- a/lib/gro/gro_tcp4.c
+++ b/lib/gro/gro_tcp4.c
@@ -225,6 +225,12 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
 */
if (tcp_hdr->tcp_flags != RTE_TCP_ACK_FLAG)
return -1;
+
+   /* trim the tail padding bytes */
+   ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
+   if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
+   rte_pktmbuf_trim(pkt, pkt->pkt_len - ip_tlen - pkt->l2_len);
+
/*
 * Don't process the packet whose payload length is less than or
 * equal to 0.
@@ -233,11 +239,6 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
if (tcp_dl <= 0)
return -1;
 
-   /* trim the tail padding bytes */
-   ip_tlen = rte_be_to_cpu_16(ipv4_hdr->total_length);
-   if (pkt->pkt_len > (uint32_t)(ip_tlen + pkt->l2_len))
-   rte_pktmbuf_trim(pkt, pkt->pkt_len - ip_tlen - pkt->l2_len);
-
/*
 * Save IPv4 ID for the packet whose DF bit is 0. For the packet
 * whose DF bit is 1, IPv4 ID is ignored.
diff --git a/lib/gro/gro_udp4.c b/lib/gro/gro_udp4.c
index 839f9748b7..42596d33b6 100644
--- a/lib/gro/gro_udp4.c
+++ b/lib/gro/gro_udp4.c
@@ -220,6 +220,11 @@ gro_udp4_reassemble(struct rte_mbuf *pkt,
if (!is_ipv4_fragment(ipv4_hdr))
return -1;
 
+   ip_dl = rte_be_to_cpu_16(ipv4_hdr->total_length);
+   /* trim the tail padding bytes */
+   if (pkt->pkt_len > (uint32_t)(ip_dl + pkt->l2_len))
+   rte_pktmbuf_trim(pkt, pkt->pkt_len - ip_dl - pkt->l2_len);
+
/*
 * Don't process the packet whose payload length is less than or
 * equal to 0.
@@ -227,14 +232,9 @@ gro_udp4_reassemble(struct rte_mbuf *pkt,
if (pkt->pkt_len <= hdr_len)
return -1;
 
-   ip_dl = rte_be_to_cpu_16(ipv4_hdr->total_length);
if (ip_dl <= pkt->l3_len)
return -1;
 
-   /* trim the tail padding bytes */
-   if (pkt->pkt_len > (uint32_t)(ip_dl + pkt->l2_len))
-   rte_pktmbuf_trim(pkt, pkt->pkt_len - ip_dl - pkt->l2_len);
-
ip_dl -= pkt->l3_len;
ip_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
frag_offset = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
-- 
2.25.1



Re: [PATCH v4] eal: non-temporal memcpy

2022-10-16 Thread Mattias Rönnblom

On 2022-10-10 08:46, Morten Brørup wrote:

This patch provides a function for memory copy using non-temporal store,
load or both, controlled by flags passed to the function.

Applications sometimes copy data to another memory location, which is only
used much later.
In this case, it is inefficient to pollute the data cache with the copied
data.

An example use case (originating from a real life application):
Copying filtered packets, or the first part of them, into a capture buffer
for offline analysis.

The purpose of the function is to achieve a performance gain by not
polluting the cache when copying data.
Although the throughput can be improved by further optimization, I do not
have time to do it now.

The functional tests and performance tests for memory copy have been
expanded to include non-temporal copying.

A non-temporal version of the mbuf library's function to create a full
copy of a given packet mbuf is provided.

The packet capture and packet dump libraries have been updated to use
non-temporal memory copy of the packets.

Implementation notes:

Implementations for non-x86 architectures can be provided by anyone at a
later time. I am not going to do it.

x86 non-temporal load instructions must be 16 byte aligned [1], and
non-temporal store instructions must be 4, 8 or 16 byte aligned [2].

ARM non-temporal load and store instructions seem to require 4 byte
alignment [3].

[1] https://www.intel.com/content/www/us/en/docs/intrinsics-guide/
index.html#text=_mm_stream_load
[2] https://www.intel.com/content/www/us/en/docs/intrinsics-guide/
index.html#text=_mm_stream_si
[3] https://developer.arm.com/documentation/100076/0100/
A64-Instruction-Set-Reference/A64-Floating-point-Instructions/
LDNP--SIMD-and-FP-

This patch is a major rewrite from the RFC v3, so no version log comparing
to the RFC is provided.

v4
* Also ignore the warning for clang int the workaround for
   _mm_stream_load_si128() missing const in the parameter.
* Add missing C linkage specifier in rte_memcpy.h.

v3
* _mm_stream_si64() is not supported on 32-bit x86 architecture, so only
   use it on 64-bit x86 architecture.
* CLANG warns that _mm_stream_load_si128_const() and
   rte_memcpy_nt_15_or_less_s16a() are not public,
   so remove __rte_internal from them. It also affects the documentation
   for the functions, so the fix can't be limited to CLANG.
* Use __rte_experimental instead of __rte_internal.
* Replace  with nnn in function documentation; it doesn't look like
   HTML.
* Slightly modify the workaround for _mm_stream_load_si128() missing const
   in the parameter; the ancient GCC 4.5.8 in RHEL7 doesn't understand
   #pragma GCC diagnostic ignored "-Wdiscarded-qualifiers", so use
   #pragma GCC diagnostic ignored "-Wcast-qual" instead. I hope that works.
* Fixed one coding style issue missed in v2.

v2
* The last 16 byte block of data, incl. any trailing bytes, were not
   copied from the source memory area in rte_memcpy_nt_buf().
* Fix many coding style issues.
* Add some missing header files.
* Fix build time warning for non-x86 architectures by using a different
   method to mark the flags parameter unused.
* CLANG doesn't understand RTE_BUILD_BUG_ON(!__builtin_constant_p(flags)),
   so omit it when using CLANG.

Signed-off-by: Morten Brørup 
---
  app/test/test_memcpy.c   |   65 +-
  app/test/test_memcpy_perf.c  |  187 ++--
  lib/eal/include/generic/rte_memcpy.h |  127 +++
  lib/eal/x86/include/rte_memcpy.h | 1238 ++
  lib/mbuf/rte_mbuf.c  |   77 ++
  lib/mbuf/rte_mbuf.h  |   32 +
  lib/mbuf/version.map |1 +
  lib/pcapng/rte_pcapng.c  |3 +-
  lib/pdump/rte_pdump.c|6 +-
  9 files changed, 1645 insertions(+), 91 deletions(-)

diff --git a/app/test/test_memcpy.c b/app/test/test_memcpy.c
index 1ab86f4967..12410ce413 100644
--- a/app/test/test_memcpy.c
+++ b/app/test/test_memcpy.c
@@ -1,5 +1,6 @@
  /* SPDX-License-Identifier: BSD-3-Clause
   * Copyright(c) 2010-2014 Intel Corporation
+ * Copyright(c) 2022 SmartShare Systems
   */
  
  #include 

@@ -36,6 +37,19 @@ static size_t buf_sizes[TEST_VALUE_RANGE];
  /* Data is aligned on this many bytes (power of 2) */
  #define ALIGNMENT_UNIT  32
  
+const uint64_t nt_mode_flags[4] = {

+   0,
+   RTE_MEMOPS_F_SRC_NT,
+   RTE_MEMOPS_F_DST_NT,
+   RTE_MEMOPS_F_SRC_NT | RTE_MEMOPS_F_DST_NT
+};
+const char * const nt_mode_str[4] = {
+   "none",
+   "src",
+   "dst",
+   "src+dst"
+};
+
  
  /*

   * Create two buffers, and initialise one with random values. These are copied
@@ -44,12 +58,13 @@ static size_t buf_sizes[TEST_VALUE_RANGE];
   * changed.
   */
  static int
-test_single_memcpy(unsigned int off_src, unsigned int off_dst, size_t size)
+test_single_memcpy(unsigned int off_src, unsigned int off_dst, size_t size, 
unsigned int nt_mode)
  {
unsigned int i;
uint8_t dest[SMALL_BUFFER_SIZE + A

RE: [PATCH v2] common/qat: read hw slice configuration

2022-10-16 Thread Ji, Kai


> -Original Message-
> From: Kusztal, ArkadiuszX 
> Sent: Thursday, October 13, 2022 3:03 PM
> To: dev@dpdk.org
> Cc: gak...@marvell.com; Ji, Kai ; Kusztal, ArkadiuszX
> 
> Subject: [PATCH v2] common/qat: read hw slice configuration
> 
> Read slice configuration of QAT capabilities.
> This will allow to recognize if specific hw function is available on 
> particular
> device.
> 
> Signed-off-by: Arek Kusztal 

[KJ] any documentation update needed for qat.rst ?

> ---
>  drivers/common/qat/dev/qat_dev_gen1.c|  8 +++
>  drivers/common/qat/dev/qat_dev_gen2.c|  8 +++
>  drivers/common/qat/dev/qat_dev_gen3.c| 14 +
>  drivers/common/qat/dev/qat_dev_gen4.c|  8 +++
>  drivers/common/qat/qat_adf/icp_qat_hw.h  | 18 +++
>  drivers/common/qat/qat_device.c  | 10 +++-
>  drivers/common/qat/qat_device.h  |  5 ++
>  drivers/crypto/qat/dev/qat_asym_pmd_gen1.c   | 42 ---
>  drivers/crypto/qat/dev/qat_crypto_pmd_gen2.c | 41 +++---
> drivers/crypto/qat/dev/qat_crypto_pmd_gen3.c | 80
> +---
> drivers/crypto/qat/dev/qat_crypto_pmd_gen4.c | 41 +++---
> drivers/crypto/qat/dev/qat_crypto_pmd_gens.h |  5 +-
>  drivers/crypto/qat/dev/qat_sym_pmd_gen1.c| 41 +++---
>  drivers/crypto/qat/qat_asym.c| 49 -
>  drivers/crypto/qat/qat_crypto.h  |  4 +-
>  drivers/crypto/qat/qat_sym.c | 38 +
>  16 files changed, 324 insertions(+), 88 deletions(-)
> 
> diff --git a/drivers/common/qat/dev/qat_dev_gen1.c
> b/drivers/common/qat/dev/qat_dev_gen1.c
> index c34ae5a51c..0be5b9077c 100644
> --- a/drivers/common/qat/dev/qat_dev_gen1.c
> +++ b/drivers/common/qat/dev/qat_dev_gen1.c
> @@ -241,12 +241,20 @@ qat_dev_get_extra_size_gen1(void)
>   return 0;
>  }
> 
> +static int
> +qat_get_slice_map_gen1(uint16_t *map __rte_unused,
> + const struct rte_pci_device *pci_dev __rte_unused) {
> + return 0;
> +}
> +
>  static struct qat_dev_hw_spec_funcs qat_dev_hw_spec_gen1 = {
>   .qat_dev_reset_ring_pairs = qat_reset_ring_pairs_gen1,
>   .qat_dev_get_transport_bar = qat_dev_get_transport_bar_gen1,
>   .qat_dev_get_misc_bar = qat_dev_get_misc_bar_gen1,
>   .qat_dev_read_config = qat_dev_read_config_gen1,
>   .qat_dev_get_extra_size = qat_dev_get_extra_size_gen1,
> + .qat_get_slice_map = qat_get_slice_map_gen1,
[KJ] qat_dev_get_slice_map ?
>  };
> 
>  RTE_INIT(qat_dev_gen_gen1_init)
> diff --git a/drivers/common/qat/dev/qat_dev_gen2.c
> b/drivers/common/qat/dev/qat_dev_gen2.c
> index f077fe9eef..cb60fe5309 100644
> --- a/drivers/common/qat/dev/qat_dev_gen2.c
> +++ b/drivers/common/qat/dev/qat_dev_gen2.c
> @@ -21,12 +21,20 @@ static struct qat_qp_hw_spec_funcs
> qat_qp_hw_spec_gen2 = {
>   .qat_qp_get_hw_data = qat_qp_get_hw_data_gen1,  };
> 
> +static int
> +qat_get_slice_map_gen2(uint16_t *map __rte_unused,
> + const struct rte_pci_device *pci_dev __rte_unused) {
> + return 0;
> +}
> +
>  static struct qat_dev_hw_spec_funcs qat_dev_hw_spec_gen2 = {
>   .qat_dev_reset_ring_pairs = qat_reset_ring_pairs_gen1,
>   .qat_dev_get_transport_bar = qat_dev_get_transport_bar_gen1,
>   .qat_dev_get_misc_bar = qat_dev_get_misc_bar_gen1,
>   .qat_dev_read_config = qat_dev_read_config_gen1,
>   .qat_dev_get_extra_size = qat_dev_get_extra_size_gen1,
> + .qat_get_slice_map = qat_get_slice_map_gen2,
>  };
> 
>  RTE_INIT(qat_dev_gen_gen2_init)
> diff --git a/drivers/common/qat/dev/qat_dev_gen3.c
> b/drivers/common/qat/dev/qat_dev_gen3.c
> index de3fa17fa9..681c946ede 100644
> --- a/drivers/common/qat/dev/qat_dev_gen3.c
> +++ b/drivers/common/qat/dev/qat_dev_gen3.c
> @@ -5,6 +5,7 @@
>  #include "qat_device.h"
>  #include "qat_qp.h"
>  #include "adf_transport_access_macros.h"
> +#include "icp_qat_hw.h"
[KJ] consider move this header file in qat_device.h

>  #include "qat_dev_gens.h"
> 


> diff --git a/drivers/common/qat/qat_device.h
> b/drivers/common/qat/qat_device.h index d1512f3b89..175b801a6c 100644
> --- a/drivers/common/qat/qat_device.h
> +++ b/drivers/common/qat/qat_device.h
> @@ -20,6 +20,8 @@
>  #define SYM_ENQ_THRESHOLD_NAME "qat_sym_enq_threshold"
>  #define ASYM_ENQ_THRESHOLD_NAME "qat_asym_enq_threshold"
>  #define COMP_ENQ_THRESHOLD_NAME "qat_comp_enq_threshold"
> +#define QAT_CMD_SLICE_MAP "qat_slice_disable"
[KJ] consider to rename the command to match the #define name 

> +#define QAT_CMD_SLICE_MAP_POS4
>  #define MAX_QP_THRESHOLD_SIZE32
> 


> 
>  extern struct qat_dev_hw_spec_funcs *qat_dev_hw_spec[]; diff --git
> a/drivers/crypto/qat/dev/qat_asym_pmd_gen1.c
> b/drivers/crypto/qat/dev/qat_asym_pmd_gen1.c
> index 4499fdaf2d..67b1892c32 100644
> --- a/drivers/crypto/qat/dev/qat_asym_pmd_gen1.c
> +++ b/drivers/crypto/qat/dev/qat_asym_pmd_gen1.c
> @@ -41,14 +41,42 @@ static struct rte_cryptodev_capabilities
> qat_asym_crypto_caps_gen1[] = {
>   RTE_CRYPTODEV_END_OF_C

[PATCH v10 0/7] introduce memarea library

2022-10-16 Thread Chengwen Feng
The memarea library is an allocator of variable-size object which based
on a memory region. The main features are as follows:

- The memory region can be initialized from the following memory
  sources:
  1. HEAP: e.g. invoke rte_malloc_socket.
  2. LIBC: e.g. invoke posix_memalign.
  3. Another memarea: it can be from another memarea.

- It provides refcnt feature which could be useful in multi-reader
  scenario.

- It supports MT-safe as long as it's specified at creation time.

Note:
a) The memarea is oriented towards the application layer, which could
provides 'region-based memory management' [1] function.
b) The eal library also provide memory zone/heap management, but these
are tied to huge pages management.

[1] https://en.wikipedia.org/wiki/Region-based_memory_management

Signed-off-by: Chengwen Feng 

Chengwen Feng (7):
  memarea: introduce memarea library
  test/memarea: support memarea test
  memarea: support alloc/free/refcnt-update API
  test/memarea: support alloc/free/refcnt-update test
  memarea: support dump API
  test/memarea: support dump test
  app/test: add memarea to malloc-perf-autotest

---
v10:
* support windows platform.
* add rte_memarea.libc perftest to malloc-perf-autotest.
v9:
* address Dmitry's comments.
* drop features of SOURCE_USER and backup memarea mechanism.
* rename rte_memarea_update_refcnt to rte_memarea_refcnt_update
  to keep with rte_mbuf_refcnt_update name style.
* fix memarea perftest compile failed at windows platform.
* fix spell warning.
v8:
* address Mattias's comments (rename ALG_DEFAULT with ALG_NEXTFIT).
* small feature patches are combined.
* enhanced backup memory mechanism.
* add memarea to malloc-perf-autotest.
* other tiny naming optimize.
v7:
* repost patches as there are spread over different series in patchwork.
v6:
* address Mattias's comments.
v5:
* fix 09/10 patch spell warning.
v4:
* repost patches as there are spread over different series in patchwork.
v3:
* add memory source of RTE memory.
* add algorithm field to facilitate the introduction of new algorithms.
* fix memarea log don't output problem.
v2:
* fix compile issues reported by dpdk-test-report.
* address Dimitry and Jerin's comments.
* add no MT-safe test.

 MAINTAINERS|   6 +
 app/test/meson.build   |   2 +
 app/test/test_malloc_perf.c|  82 +
 app/test/test_memarea.c| 308 ++
 doc/api/doxy-api-index.md  |   3 +-
 doc/api/doxy-api.conf.in   |   1 +
 doc/guides/prog_guide/index.rst|   1 +
 doc/guides/prog_guide/memarea_lib.rst  |  52 
 doc/guides/rel_notes/release_22_11.rst |   6 +
 lib/eal/common/eal_common_log.c|   1 +
 lib/eal/include/rte_log.h  |   1 +
 lib/memarea/memarea_private.h  |  36 +++
 lib/memarea/meson.build|  10 +
 lib/memarea/rte_memarea.c  | 413 +
 lib/memarea/rte_memarea.h  | 219 +
 lib/memarea/version.map|  16 +
 lib/meson.build|   1 +
 17 files changed, 1157 insertions(+), 1 deletion(-)
 create mode 100644 app/test/test_memarea.c
 create mode 100644 doc/guides/prog_guide/memarea_lib.rst
 create mode 100644 lib/memarea/memarea_private.h
 create mode 100644 lib/memarea/meson.build
 create mode 100644 lib/memarea/rte_memarea.c
 create mode 100644 lib/memarea/rte_memarea.h
 create mode 100644 lib/memarea/version.map

-- 
2.17.1



[PATCH v10 2/7] test/memarea: support memarea test

2022-10-16 Thread Chengwen Feng
This patch supports memarea test of rte_memarea_create() and
rte_memarea_destroy() API.

Signed-off-by: Chengwen Feng 
---
 MAINTAINERS |   1 +
 app/test/meson.build|   2 +
 app/test/test_memarea.c | 130 
 3 files changed, 133 insertions(+)
 create mode 100644 app/test/test_memarea.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 3d8d4c2dbe..38625ae1be 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1555,6 +1555,7 @@ Memarea - EXPERIMENTAL
 M: Chengwen Feng 
 F: lib/memarea
 F: doc/guides/prog_guide/memarea_lib.rst
+F: app/test/test_memarea*
 
 Membership - EXPERIMENTAL
 M: Yipeng Wang 
diff --git a/app/test/meson.build b/app/test/meson.build
index 396b133959..8943f24668 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -84,6 +84,7 @@ test_sources = files(
 'test_malloc.c',
 'test_malloc_perf.c',
 'test_mbuf.c',
+'test_memarea.c',
 'test_member.c',
 'test_member_perf.c',
 'test_memcpy.c',
@@ -200,6 +201,7 @@ fast_tests = [
 ['malloc_autotest', false, true],
 ['mbuf_autotest', false, true],
 ['mcslock_autotest', false, true],
+['memarea_autotest', true, true],
 ['memcpy_autotest', true, true],
 ['memory_autotest', false, true],
 ['mempool_autotest', false, true],
diff --git a/app/test/test_memarea.c b/app/test/test_memarea.c
new file mode 100644
index 00..cfd50de45f
--- /dev/null
+++ b/app/test/test_memarea.c
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 HiSilicon Limited
+ */
+
+#include 
+#include 
+
+#include "test.h"
+
+#include 
+#include 
+
+#define MEMAREA_TEST_DEFAULT_SIZE  0x1000
+
+#define MEMAREA_TEST_API_RUN(test_func) \
+   do { \
+   int ret = test_func(); \
+   if (ret < 0) { \
+   printf("%s Failed\n", #test_func); \
+   fails++; \
+   } else { \
+   printf("%s Passed\n", #test_func); \
+   } \
+   } while (0)
+
+static int fails;
+
+static void
+test_memarea_prepare(void)
+{
+   fails = 0;
+}
+
+static int
+test_memarea_retcode(void)
+{
+   return fails > 0 ? -1 : 0;
+}
+
+static void
+test_memarea_init_param(struct rte_memarea_param *init)
+{
+   memset(init, 0, sizeof(struct rte_memarea_param));
+   sprintf(init->name, "%s", "test-memarea");
+   init->source = RTE_MEMAREA_SOURCE_LIBC;
+   init->total_sz = MEMAREA_TEST_DEFAULT_SIZE;
+   init->mt_safe = 1;
+}
+
+static int
+test_memarea_create_bad_param(void)
+{
+   struct rte_memarea_param init;
+   struct rte_memarea *ma;
+
+   /* test for NULL */
+   ma = rte_memarea_create(NULL);
+   RTE_TEST_ASSERT(ma == NULL, "Expected NULL");
+
+   /* test for invalid name */
+   memset(&init, 0, sizeof(init));
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma == NULL, "Expected NULL");
+   memset(&init.name, 1, sizeof(init.name));
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma == NULL, "Expected NULL");
+
+   /* test for invalid source */
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_MEMAREA + 1;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma == NULL, "Expected NULL");
+
+   /* test for total_sz */
+   test_memarea_init_param(&init);
+   init.total_sz = 0;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma == NULL, "Expected NULL");
+
+   /* test for memarea NULL */
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_MEMAREA;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma == NULL, "Expected NULL");
+
+   /* test for alg invalid */
+   test_memarea_init_param(&init);
+   init.alg = RTE_MEMAREA_ALG_NEXTFIT + 1;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma == NULL, "Expected NULL");
+
+   return 0;
+}
+
+static int
+test_memarea_create_destroy(void)
+{
+   struct rte_memarea_param init;
+   struct rte_memarea *ma;
+
+   /* test for create with HEAP */
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_HEAP;
+   init.numa_socket = SOCKET_ID_ANY;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma != NULL, "Expected Non-NULL");
+   rte_memarea_destroy(ma);
+
+   /* test for create with LIBC */
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_LIBC;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma != NULL, "Expected Non-NULL");
+   rte_memarea_destroy(ma);
+
+   return 0;
+}
+
+static int
+test_memarea(void)
+{
+   test_memarea_prepare();
+
+   MEMAREA_TEST_API_RUN(test_memarea_create_bad_param);
+   MEMAREA_TEST_API_RUN(test_memarea_create_destroy);
+
+   return test_memarea_retcode();
+}
+
+REGISTER_TEST_CO

[PATCH v10 3/7] memarea: support alloc/free/refcnt-update API

2022-10-16 Thread Chengwen Feng
This patch supports rte_memarea_alloc()/rte_memarea_free()/
rte_memarea_refcnt_update() API.

Signed-off-by: Chengwen Feng 
---
 doc/guides/prog_guide/memarea_lib.rst |  10 ++
 lib/memarea/memarea_private.h |   3 +
 lib/memarea/rte_memarea.c | 157 ++
 lib/memarea/rte_memarea.h |  74 
 lib/memarea/version.map   |   3 +
 5 files changed, 247 insertions(+)

diff --git a/doc/guides/prog_guide/memarea_lib.rst 
b/doc/guides/prog_guide/memarea_lib.rst
index d032b24ce9..8fbfd06a5d 100644
--- a/doc/guides/prog_guide/memarea_lib.rst
+++ b/doc/guides/prog_guide/memarea_lib.rst
@@ -33,6 +33,16 @@ returns the pointer to the created memarea or ``NULL`` if 
the creation failed.
 
 The ``rte_memarea_destroy()`` function is used to destroy a memarea.
 
+The ``rte_memarea_alloc()`` function is used to alloc one memory object from
+the memarea.
+
+The ``rte_memarea_free()`` function is used to free one memory object which
+allocated by ``rte_memarea_alloc()``.
+
+The ``rte_memarea_refcnt_update()`` function is used to update the memory
+object's reference count, if the count reaches zero, the memory object will
+be freed to memarea.
+
 Reference
 -
 
diff --git a/lib/memarea/memarea_private.h b/lib/memarea/memarea_private.h
index 59be9c1d00..f5accf2987 100644
--- a/lib/memarea/memarea_private.h
+++ b/lib/memarea/memarea_private.h
@@ -28,6 +28,9 @@ struct rte_memarea {
void*area_addr;
struct memarea_elem_list elem_list;
struct memarea_elem_list free_list;
+
+   uint64_t alloc_fails;
+   uint64_t refcnt_check_fails;
 } __rte_cache_aligned;
 
 #endif /* MEMAREA_PRIVATE_H */
diff --git a/lib/memarea/rte_memarea.c b/lib/memarea/rte_memarea.c
index 41e5c007c2..a0935dafb6 100644
--- a/lib/memarea/rte_memarea.c
+++ b/lib/memarea/rte_memarea.c
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -84,6 +85,8 @@ memarea_alloc_area(const struct rte_memarea_param *init)
init->numa_socket);
else if (init->source == RTE_MEMAREA_SOURCE_LIBC)
ptr = memarea_alloc_from_libc(init->total_sz);
+   else if (init->source == RTE_MEMAREA_SOURCE_MEMAREA)
+   ptr = rte_memarea_alloc(init->src_memarea, init->total_sz, 0);
 
if (ptr == NULL)
RTE_LOG(ERR, MEMAREA, "memarea: %s alloc memory area fail!\n", 
init->name);
@@ -146,6 +149,8 @@ memarea_free_area(struct rte_memarea *ma)
rte_free(ma->area_addr);
else if (ma->init.source == RTE_MEMAREA_SOURCE_LIBC)
free(ma->area_addr);
+   else if (ma->init.source == RTE_MEMAREA_SOURCE_MEMAREA)
+   rte_memarea_free(ma->init.src_memarea, ma->area_addr);
 }
 
 void
@@ -156,3 +161,155 @@ rte_memarea_destroy(struct rte_memarea *ma)
memarea_free_area(ma);
rte_free(ma);
 }
+
+static inline void
+memarea_lock(struct rte_memarea *ma)
+{
+   if (ma->init.mt_safe)
+   rte_spinlock_lock(&ma->lock);
+}
+
+static inline void
+memarea_unlock(struct rte_memarea *ma)
+{
+   if (ma->init.mt_safe)
+   rte_spinlock_unlock(&ma->lock);
+}
+
+static inline bool
+memarea_whether_add_node(size_t free_size, size_t need_size)
+{
+   size_t align_size = RTE_ALIGN_CEIL(need_size, RTE_CACHE_LINE_SIZE);
+   return free_size > align_size && (free_size - align_size) > 
sizeof(struct memarea_elem);
+}
+
+static inline void
+memarea_add_node(struct rte_memarea *ma, struct memarea_elem *elem, size_t 
need_size)
+{
+   size_t align_size = RTE_ALIGN_CEIL(need_size, RTE_CACHE_LINE_SIZE);
+   struct memarea_elem *new_elem;
+
+   new_elem = (struct memarea_elem *)RTE_PTR_ADD(elem, sizeof(struct 
memarea_elem) +
+   align_size);
+   new_elem->size = elem->size - align_size - sizeof(struct memarea_elem);
+   new_elem->magic = MEMAREA_AVAILABLE_ELEM_MAGIC;
+   new_elem->cookie = MEMAREA_AVAILABLE_ELEM_COOKIE;
+   new_elem->refcnt = 0;
+   TAILQ_INSERT_AFTER(&ma->elem_list, elem, new_elem, elem_node);
+   TAILQ_INSERT_AFTER(&ma->free_list, elem, new_elem, free_node);
+   elem->size = align_size;
+}
+
+void *
+rte_memarea_alloc(struct rte_memarea *ma, size_t size, uint32_t cookie)
+{
+   struct memarea_elem *elem;
+   void *ptr = NULL;
+
+   if (unlikely(ma == NULL || size == 0))
+   return NULL;
+
+   memarea_lock(ma);
+   TAILQ_FOREACH(elem, &ma->free_list, free_node) {
+   if (unlikely(elem->magic != MEMAREA_AVAILABLE_ELEM_MAGIC))
+   break;
+   if (elem->size < size)
+   continue;
+   if (memarea_whether_add_node(elem->size, size))
+   memarea_add_node(ma, elem, size);
+   elem->magic = MEMAREA_ALLOCATED_ELEM_MAGIC;
+   elem->cookie = coo

[PATCH v10 1/7] memarea: introduce memarea library

2022-10-16 Thread Chengwen Feng
The memarea library is an allocator of variable-size object which based
on a memory region.

This patch provides rte_memarea_create() and rte_memarea_destroy() API.

Signed-off-by: Chengwen Feng 
---
 MAINTAINERS|   5 +
 doc/api/doxy-api-index.md  |   3 +-
 doc/api/doxy-api.conf.in   |   1 +
 doc/guides/prog_guide/index.rst|   1 +
 doc/guides/prog_guide/memarea_lib.rst  |  39 ++
 doc/guides/rel_notes/release_22_11.rst |   6 +
 lib/eal/common/eal_common_log.c|   1 +
 lib/eal/include/rte_log.h  |   1 +
 lib/memarea/memarea_private.h  |  33 ++
 lib/memarea/meson.build|  10 ++
 lib/memarea/rte_memarea.c  | 158 +
 lib/memarea/rte_memarea.h  | 124 +++
 lib/memarea/version.map|  12 ++
 lib/meson.build|   1 +
 14 files changed, 394 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/prog_guide/memarea_lib.rst
 create mode 100644 lib/memarea/memarea_private.h
 create mode 100644 lib/memarea/meson.build
 create mode 100644 lib/memarea/rte_memarea.c
 create mode 100644 lib/memarea/rte_memarea.h
 create mode 100644 lib/memarea/version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 2bd4a55f1b..3d8d4c2dbe 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1551,6 +1551,11 @@ F: app/test/test_lpm*
 F: app/test/test_func_reentrancy.c
 F: app/test/test_xmmt_ops.h
 
+Memarea - EXPERIMENTAL
+M: Chengwen Feng 
+F: lib/memarea
+F: doc/guides/prog_guide/memarea_lib.rst
+
 Membership - EXPERIMENTAL
 M: Yipeng Wang 
 M: Sameh Gobriel 
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index de488c7abf..24456604f8 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -62,7 +62,8 @@ The public API headers are grouped by topics:
   [memzone](@ref rte_memzone.h),
   [mempool](@ref rte_mempool.h),
   [malloc](@ref rte_malloc.h),
-  [memcpy](@ref rte_memcpy.h)
+  [memcpy](@ref rte_memcpy.h),
+  [memarea](@ref rte_memarea.h)
 
 - **timers**:
   [cycles](@ref rte_cycles.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index f0886c3bd1..8334ebcbd6 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -53,6 +53,7 @@ INPUT   = @TOPDIR@/doc/api/doxy-api-index.md \
   @TOPDIR@/lib/latencystats \
   @TOPDIR@/lib/lpm \
   @TOPDIR@/lib/mbuf \
+  @TOPDIR@/lib/memarea \
   @TOPDIR@/lib/member \
   @TOPDIR@/lib/mempool \
   @TOPDIR@/lib/meter \
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 8564883018..e9015d65e3 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -37,6 +37,7 @@ Programmer's Guide
 hash_lib
 toeplitz_hash_lib
 efd_lib
+memarea_lib
 member_lib
 lpm_lib
 lpm6_lib
diff --git a/doc/guides/prog_guide/memarea_lib.rst 
b/doc/guides/prog_guide/memarea_lib.rst
new file mode 100644
index 00..d032b24ce9
--- /dev/null
+++ b/doc/guides/prog_guide/memarea_lib.rst
@@ -0,0 +1,39 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+Copyright(c) 2022 HiSilicon Limited
+
+Memarea Library
+===
+
+Introduction
+
+
+The memarea library provides an allocator of variable-size objects, it is
+oriented towards the application layer, providing 'region-based memory
+management' function [1].
+
+The main features are as follows:
+
+* The memory region can be initialized from the following memory sources:
+
+  - HEAP: e.g. invoke ``rte_malloc_socket``.
+
+  - LIBC: e.g. invoke posix_memalign.
+
+  - Another memarea: it can be allocated from another memarea.
+
+* It provides refcnt feature which could be useful in multi-reader scenario.
+
+* It supports MT-safe as long as it's specified at creation time.
+
+Library API Overview
+
+
+The ``rte_memarea_create()`` function is used to create a memarea, the function
+returns the pointer to the created memarea or ``NULL`` if the creation failed.
+
+The ``rte_memarea_destroy()`` function is used to destroy a memarea.
+
+Reference
+-
+
+[1] https://en.wikipedia.org/wiki/Region-based_memory_management
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 2da8bc9661..f8e3a23db7 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -63,6 +63,12 @@ New Features
   In theory this implementation should work with any target based on
   ``LoongArch`` ISA.
 
+* **Added memarea library.**
+
+  The memarea library is an allocator of variable-size objects, it is oriented
+  towards the application layer, providing 'region-based memory management'
+  function.
+
 * **Added support for multiple mbuf pools per ethdev R

[PATCH v10 4/7] test/memarea: support alloc/free/refcnt-update test

2022-10-16 Thread Chengwen Feng
This patch supports rte_memarea_alloc()/rte_memarea_free()/
rte_memarea_refcnt_update() test.

Signed-off-by: Chengwen Feng 
---
 app/test/test_memarea.c | 145 +++-
 1 file changed, 144 insertions(+), 1 deletion(-)

diff --git a/app/test/test_memarea.c b/app/test/test_memarea.c
index cfd50de45f..00f9d07ed4 100644
--- a/app/test/test_memarea.c
+++ b/app/test/test_memarea.c
@@ -47,6 +47,12 @@ test_memarea_init_param(struct rte_memarea_param *init)
init->mt_safe = 1;
 }
 
+static void
+test_memarea_fill_region(void *ptr, size_t size)
+{
+   memset(ptr, 0xff, size);
+}
+
 static int
 test_memarea_create_bad_param(void)
 {
@@ -95,8 +101,8 @@ test_memarea_create_bad_param(void)
 static int
 test_memarea_create_destroy(void)
 {
+   struct rte_memarea *ma, *src_ma;
struct rte_memarea_param init;
-   struct rte_memarea *ma;
 
/* test for create with HEAP */
test_memarea_init_param(&init);
@@ -113,6 +119,140 @@ test_memarea_create_destroy(void)
RTE_TEST_ASSERT(ma != NULL, "Expected Non-NULL");
rte_memarea_destroy(ma);
 
+   /* test for create with another memarea */
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_LIBC;
+   src_ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(src_ma != NULL, "Expected Non-NULL");
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_MEMAREA;
+   init.total_sz = init.total_sz >> 1;
+   init.src_memarea = src_ma;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma != NULL, "Expected Non-NULL");
+   rte_memarea_destroy(ma);
+   rte_memarea_destroy(src_ma);
+
+   return 0;
+}
+
+static int
+test_memarea_alloc_fail(void)
+{
+   struct rte_memarea_param init;
+   struct rte_memarea *ma;
+   void *ptr[2];
+
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_LIBC;
+   init.total_sz = MEMAREA_TEST_DEFAULT_SIZE;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma != NULL, "Expected Non-NULL");
+
+   /* test alloc fail with big size */
+   ptr[0] = rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE, 0);
+   RTE_TEST_ASSERT(ptr[0] == NULL, "Expected NULL");
+
+   /* test alloc fail because no memory */
+   ptr[0] = rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE - 
RTE_CACHE_LINE_SIZE, 0);
+   RTE_TEST_ASSERT(ptr[0] != NULL, "Expected Non-NULL");
+   ptr[1] = rte_memarea_alloc(ma, 1, 0);
+   RTE_TEST_ASSERT(ptr[1] == NULL, "Expected NULL");
+   rte_memarea_free(ma, ptr[0]);
+
+   /* test alloc fail when second fail */
+   ptr[0] = rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE >> 1, 0);
+   RTE_TEST_ASSERT(ptr[0] != NULL, "Expected Non-NULL");
+   test_memarea_fill_region(ptr[0], MEMAREA_TEST_DEFAULT_SIZE >> 1);
+   ptr[1] = rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE >> 1, 0);
+   RTE_TEST_ASSERT(ptr[1] == NULL, "Expected NULL");
+   rte_memarea_free(ma, ptr[0]);
+   ptr[1] = rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE >> 1, 0);
+   RTE_TEST_ASSERT(ptr[1] != NULL, "Expected Non-NULL");
+   test_memarea_fill_region(ptr[1], MEMAREA_TEST_DEFAULT_SIZE >> 1);
+   rte_memarea_free(ma, ptr[1]);
+
+   rte_memarea_destroy(ma);
+
+   return 0;
+}
+
+static int
+test_memarea_free_fail(void)
+{
+   struct rte_memarea_param init;
+   struct rte_memarea *ma;
+   void *ptr;
+
+   /* prepare env */
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_LIBC;
+   init.total_sz = MEMAREA_TEST_DEFAULT_SIZE;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma != NULL, "Expected Non-NULL");
+
+   /* test free with refcnt fail */
+   ptr = rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE >> 1, 0);
+   RTE_TEST_ASSERT(ptr != NULL, "Expected Non-NULL");
+   test_memarea_fill_region(ptr, MEMAREA_TEST_DEFAULT_SIZE >> 1);
+   rte_memarea_free(ma, ptr);
+   rte_memarea_free(ma, ptr);
+
+   /* test update refcnt with fail */
+   ptr = rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE >> 1, 0);
+   RTE_TEST_ASSERT(ptr != NULL, "Expected Non-NULL");
+   test_memarea_fill_region(ptr, MEMAREA_TEST_DEFAULT_SIZE >> 1);
+   rte_memarea_refcnt_update(ma, ptr, -2);
+   ptr = rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE >> 1, 0);
+   RTE_TEST_ASSERT(ptr == NULL, "Expected NULL");
+
+   rte_memarea_destroy(ma);
+
+   return 0;
+}
+
+static int
+test_memarea_alloc_free(void)
+{
+#define ALLOC_MAX_NUM  8
+   struct rte_memarea_param init;
+   struct rte_memarea *ma;
+   void *ptr[ALLOC_MAX_NUM];
+   int i;
+
+   /* prepare env */
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_LIBC;
+   init.total_sz = MEMAREA_TEST_DEFAULT_SIZE;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma != NULL, "Ex

[PATCH v10 6/7] test/memarea: support dump test

2022-10-16 Thread Chengwen Feng
This patch supports rte_memarea_dump() test.

Signed-off-by: Chengwen Feng 
---
 app/test/test_memarea.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/app/test/test_memarea.c b/app/test/test_memarea.c
index 00f9d07ed4..da712a14e8 100644
--- a/app/test/test_memarea.c
+++ b/app/test/test_memarea.c
@@ -256,6 +256,40 @@ test_memarea_alloc_free(void)
return 0;
 }
 
+static int
+test_memarea_dump(void)
+{
+   struct rte_memarea_param init;
+   struct rte_memarea *ma;
+   int ret;
+
+   /* prepare env */
+   test_memarea_init_param(&init);
+   init.source = RTE_MEMAREA_SOURCE_LIBC;
+   init.total_sz = MEMAREA_TEST_DEFAULT_SIZE;
+   ma = rte_memarea_create(&init);
+   RTE_TEST_ASSERT(ma != NULL, "Expected Non-NULL");
+
+   /* test for invalid parameters */
+   ret = rte_memarea_dump(NULL, stderr, false);
+   RTE_TEST_ASSERT(ret == -EINVAL, "Expected EINVAL");
+   ret = rte_memarea_dump(ma, NULL, false);
+   RTE_TEST_ASSERT(ret == -EINVAL, "Expected EINVAL");
+
+   /* test for dump */
+   (void)rte_memarea_alloc(ma, 1, 0);
+   (void)rte_memarea_alloc(ma, 1, 0);
+   (void)rte_memarea_alloc(ma, 1, 0);
+   (void)rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE, 0);
+   (void)rte_memarea_alloc(ma, MEMAREA_TEST_DEFAULT_SIZE, 0);
+   ret = rte_memarea_dump(ma, stderr, true);
+   RTE_TEST_ASSERT(ret == 0, "Expected ZERO");
+
+   rte_memarea_destroy(ma);
+
+   return 0;
+}
+
 static int
 test_memarea(void)
 {
@@ -266,6 +300,7 @@ test_memarea(void)
MEMAREA_TEST_API_RUN(test_memarea_alloc_fail);
MEMAREA_TEST_API_RUN(test_memarea_free_fail);
MEMAREA_TEST_API_RUN(test_memarea_alloc_free);
+   MEMAREA_TEST_API_RUN(test_memarea_dump);
 
return test_memarea_retcode();
 }
-- 
2.17.1



[PATCH v10 5/7] memarea: support dump API

2022-10-16 Thread Chengwen Feng
This patch supports rte_memarea_dump() API which could be used for
debug.

Signed-off-by: Chengwen Feng 
---
 doc/guides/prog_guide/memarea_lib.rst |  3 +
 lib/memarea/rte_memarea.c | 98 +++
 lib/memarea/rte_memarea.h | 21 ++
 lib/memarea/version.map   |  1 +
 4 files changed, 123 insertions(+)

diff --git a/doc/guides/prog_guide/memarea_lib.rst 
b/doc/guides/prog_guide/memarea_lib.rst
index 8fbfd06a5d..c33a548417 100644
--- a/doc/guides/prog_guide/memarea_lib.rst
+++ b/doc/guides/prog_guide/memarea_lib.rst
@@ -43,6 +43,9 @@ The ``rte_memarea_refcnt_update()`` function is used to 
update the memory
 object's reference count, if the count reaches zero, the memory object will
 be freed to memarea.
 
++The ``rte_memarea_dump()`` function is used to dump the internal information
++of a memarea.
+
 Reference
 -
 
diff --git a/lib/memarea/rte_memarea.c b/lib/memarea/rte_memarea.c
index a0935dafb6..c58f68df53 100644
--- a/lib/memarea/rte_memarea.c
+++ b/lib/memarea/rte_memarea.c
@@ -2,6 +2,7 @@
  * Copyright(c) 2022 HiSilicon Limited
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -313,3 +314,100 @@ rte_memarea_refcnt_update(struct rte_memarea *ma, void 
*ptr, int32_t value)
memarea_free_elem(ma, elem);
memarea_unlock(ma);
 }
+
+static const char *
+memarea_source_name(enum rte_memarea_source source)
+{
+   if (source == RTE_MEMAREA_SOURCE_HEAP)
+   return "heap";
+   else if (source == RTE_MEMAREA_SOURCE_LIBC)
+   return "libc";
+   else if (source == RTE_MEMAREA_SOURCE_MEMAREA)
+   return "memarea";
+   else
+   return "unknown";
+}
+
+static const char *
+memarea_alg_name(enum rte_memarea_alg alg)
+{
+   if (alg == RTE_MEMAREA_ALG_NEXTFIT)
+   return "nextfit";
+   else
+   return "unknown";
+}
+
+static uint32_t
+memarea_elem_list_num(struct rte_memarea *ma)
+{
+   struct memarea_elem *elem;
+   uint32_t num = 0;
+
+   TAILQ_FOREACH(elem, &ma->elem_list, elem_node) {
+   if (elem->magic != MEMAREA_AVAILABLE_ELEM_MAGIC &&
+   elem->magic != MEMAREA_ALLOCATED_ELEM_MAGIC)
+   break;
+   num++;
+   }
+
+   return num;
+}
+
+static uint32_t
+memarea_free_list_num(struct rte_memarea *ma)
+{
+   struct memarea_elem *elem;
+   uint32_t num = 0;
+
+   TAILQ_FOREACH(elem, &ma->free_list, free_node) {
+   if (elem->magic != MEMAREA_AVAILABLE_ELEM_MAGIC)
+   break;
+   num++;
+   }
+
+   return num;
+}
+
+static void
+memarea_dump_all(struct rte_memarea *ma, FILE *f)
+{
+   struct memarea_elem *elem;
+
+   fprintf(f, "  regions:\n");
+   TAILQ_FOREACH(elem, &ma->elem_list, elem_node) {
+   if (elem->magic != MEMAREA_AVAILABLE_ELEM_MAGIC &&
+   elem->magic != MEMAREA_ALLOCATED_ELEM_MAGIC) {
+   fprintf(f, "magic: 0x%x check fail!\n", 
elem->magic);
+   break;
+   }
+   fprintf(f, "size: 0x%zx cookie: 0x%x refcnt: %d\n",
+   elem->size, elem->cookie, elem->refcnt);
+   }
+}
+
+int
+rte_memarea_dump(struct rte_memarea *ma, FILE *f, bool dump_all)
+{
+   if (ma == NULL || f == NULL)
+   return -EINVAL;
+
+   memarea_lock(ma);
+   fprintf(f, "memarea name: %s\n", ma->init.name);
+   fprintf(f, "  source: %s\n", memarea_source_name(ma->init.source));
+   if (ma->init.source == RTE_MEMAREA_SOURCE_HEAP)
+   fprintf(f, "  heap-numa-socket: %d\n", ma->init.numa_socket);
+   else if (ma->init.source == RTE_MEMAREA_SOURCE_MEMAREA)
+   fprintf(f, "  source-memarea: %s\n", 
ma->init.src_memarea->init.name);
+   fprintf(f, "  algorithm: %s\n", memarea_alg_name(ma->init.alg));
+   fprintf(f, "  total-size: 0x%zx\n", ma->init.total_sz);
+   fprintf(f, "  mt-safe: %s\n", ma->init.mt_safe ? "yes" : "no");
+   fprintf(f, "  total-regions: %u\n", memarea_elem_list_num(ma));
+   fprintf(f, "  total-free-regions: %u\n", memarea_free_list_num(ma));
+   fprintf(f, "  alloc_fails: %" PRIu64 "\n", ma->alloc_fails);
+   fprintf(f, "  refcnt_check_fails: %" PRIu64 "\n", 
ma->refcnt_check_fails);
+   if (dump_all)
+   memarea_dump_all(ma, f);
+   memarea_unlock(ma);
+
+   return 0;
+}
diff --git a/lib/memarea/rte_memarea.h b/lib/memarea/rte_memarea.h
index 31c499ec07..61042a8d0d 100644
--- a/lib/memarea/rte_memarea.h
+++ b/lib/memarea/rte_memarea.h
@@ -191,6 +191,27 @@ void rte_memarea_free(struct rte_memarea *ma, void *ptr);
 __rte_experimental
 void rte_memarea_refcnt_update(struct rte_memarea *ma, void *ptr, int32_t 
value);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Dump memarea.
+ *
+ * Dump one memarea.
+ *
+ * @param 

[PATCH v10 7/7] app/test: add memarea to malloc-perf-autotest

2022-10-16 Thread Chengwen Feng
This patch adds memarea perftest to malloc_perf_autotest.

The memarea perftest contains two part:
1) rte_memarea.heap: the memory source comes from invoking
rte_malloc_socket().
2) rte_memarea.libc: the memory source comes from invoking malloc().

Test platform: Kunpeng920
Test command: dpdk-test -a :7d:00.3 -l 10-12
Test result:
USER1: Performance: rte_memarea.heap
USER1:Size (B)   Runs  Alloc (us) Free (us) Total (us)  memset (us)
USER1:  64  10.04  0.04   0.08 0.01
USER1: 128  10.03  0.03   0.06 0.01
USER1:1024  10.03  0.04   0.07 0.20
USER1:4096  10.03  0.05   0.08 0.34
USER1:   65536  10.10  0.06   0.16 2.15
USER1: 1048576   10230.11  0.04   0.1529.15
USER1: 20971525110.12  0.04   0.1657.72
USER1: 41943042550.14  0.04   0.17   114.93
USER1:16777216 630.15  0.07   0.21   457.51
USER1:  1073741824 Interrupted: out of memory.

USER1: Performance: rte_memarea.libc
USER1:Size (B)   Runs  Alloc (us) Free (us) Total (us)  memset (us)
USER1:  64  10.12  0.03   0.15 0.01
USER1: 128  10.03  0.03   0.06 0.01
USER1:1024  10.15  0.04   0.19 0.19
USER1:4096  10.51  0.04   0.54 0.34
USER1:   65536  19.40  0.06   9.46 2.14
USER1: 1048576   1023   48.39  0.07  48.4629.15
USER1: 20971525110.11  0.06   0.1757.79
USER1: 41943042550.11  0.06   0.18   114.82
USER1:16777216 630.12  0.07   0.19   457.86
USER1:  1073741824 Interrupted: out of memory.

Compared with rte_malloc:
USER1: Performance: rte_malloc
USER1:Size (B)   Runs  Alloc (us) Free (us) Total (us)  memset (us)
USER1:  64  10.14  0.07   0.21 0.01
USER1: 128  10.10  0.05   0.15 0.01
USER1:1024  10.11  0.18   0.29 0.21
USER1:4096  10.13  0.39   0.53 0.35
USER1:   65536  10.17  2.27   2.44 2.15
USER1: 1048576  1   37.21 71.63 108.8429.08
USER1: 2097152  1 8831.15160.028991.1763.52
USER1: 4194304  147131.88413.75   47545.62   173.79
USER1:16777216   4221   119604.60   2209.73  121814.34   964.42
USER1:  1073741824 31   335058.32 223369.31  558427.63 62440.87

Note: The rte_malloc time includes obtaining memory from the system,
but rte_memarea time don't includes these (it uses pre-allocated
policy).

Signed-off-by: Chengwen Feng 
---
 app/test/test_malloc_perf.c | 82 +
 1 file changed, 82 insertions(+)

diff --git a/app/test/test_malloc_perf.c b/app/test/test_malloc_perf.c
index ccec43ae84..0972f4dec7 100644
--- a/app/test/test_malloc_perf.c
+++ b/app/test/test_malloc_perf.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "test.h"
@@ -20,6 +21,8 @@ typedef void * (memset_t)(void *addr, int value, size_t size);
 static const uint64_t KB = 1 << 10;
 static const uint64_t GB = 1 << 30;
 
+static struct rte_memarea *ma_perftest_handle;
+
 static double
 tsc_to_us(uint64_t tsc, size_t runs)
 {
@@ -147,6 +150,80 @@ memzone_free(void *addr)
rte_memzone_free((struct rte_memzone *)addr);
 }
 
+static const char *
+memarea_perftest_source_name(enum rte_memarea_source source)
+{
+   if (source == RTE_MEMAREA_SOURCE_HEAP)
+   return "heap";
+   else if (source == RTE_MEMAREA_SOURCE_LIBC)
+   return "libc";
+   else
+   return "unknown";
+}
+
+static int
+memarea_perftest_pre_env(enum rte_memarea_source source)
+{
+   struct rte_memarea_param init;
+
+   memset(&init, 0, sizeof(init));
+   snprintf(init.name, sizeof(init.name), "perftest");
+   init.source = source;
+   init.alg = RTE_MEMAREA_ALG_NEXTFIT;
+   init.total_sz = GB;
+   init.mt_safe = 1;
+   init.numa_socket = SOCKET_ID_ANY;
+
+   ma_perftest_handle = rte_memarea_create(&init);
+   if (ma_perftest_handle == NULL) {
+   fprintf(stderr, "memarea create failed, skip memarea source: %s 
perftest!\n",
+   memarea_perftest_source_name(source));
+   return -1;
+   }
+   return 0;
+}
+
+static void
+memarea_perftest_clear_env(void)
+{
+   rte_memarea_destroy(ma_perftest_handle);
+   ma_perftest_handle = NULL;
+}
+
+static void *
+memarea_perftest_alloc(const char *name, size_t size, unsigned int align)
+{
+   RTE_SET_USED(name);
+   RTE_SET_USED(

RE: [RFT] dumpcap: add file-prefix option

2022-10-16 Thread Kaur, Arshdeep
Hi Stephen,

These changes are looking good as compared to 
http://patches.dpdk.org/project/dpdk/patch/20220912190330.73159-1-step...@networkplumber.org/.
I have tested the changes. Works for me.
I am looking for this change in this release. Can you send the v1?

Thanks and regards,
Arshdeep Kaur

> -Original Message-
> From: Stephen Hemminger 
> Sent: Tuesday, September 13, 2022 12:34 AM
> To: dev@dpdk.org
> Cc: Stephen Hemminger ; Kaur, Arshdeep
> 
> Subject: [RFT] dumpcap: add file-prefix option
> 
> When using dumpcap in container environment or with multiple DPDK
> processes, it is useful to be able to specify file prefix.
> 
> This version only accepts the long format option used by other commands.
> If no prefix is specified then the default is used.
> 
> Suggested-by: Arshdeep Kaur 
> Signed-off-by: Stephen Hemminger 
> ---
> Did basic command line test, but still needs testing with a prefix being used
> (ie multiple apps).
> 
>  app/dumpcap/main.c | 24 ++--
>  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c index
> a6041d4ff495..bdeef96d9c0b 100644
> --- a/app/dumpcap/main.c
> +++ b/app/dumpcap/main.c
> @@ -61,6 +61,7 @@ static char *output_name;  static const char
> *filter_str;  static unsigned int ring_size = 2048;  static const char
> *capture_comment;
> +static const char *file_prefix;
>  static uint32_t snaplen = RTE_MBUF_DEFAULT_BUF_SIZE;  static bool
> dump_bpf;  static struct { @@ -122,6 +123,7 @@ static void usage(void)
>  "   add a capture comment to the output 
> file\n"
>  "\n"
>  "Miscellaneous:\n"
> +"  --file-prefix=   prefix to use for multi-process\n"
>  "  -q   don't report packet capture counts\n"
>  "  -v, --versionprint version information and exit\n"
>  "  -h, --help   display this help and exit\n"
> @@ -310,6 +312,7 @@ static void parse_opts(int argc, char **argv)
>   static const struct option long_options[] = {
>   { "autostop",required_argument, NULL, 'a' },
>   { "capture-comment", required_argument, NULL, 0 },
> + { "file-prefix", required_argument, NULL, 0 },
>   { "help",no_argument,   NULL, 'h' },
>   { "interface",   required_argument, NULL, 'i' },
>   { "list-interfaces", no_argument,   NULL, 'D' },
> @@ -330,11 +333,13 @@ static void parse_opts(int argc, char **argv)
> 
>   switch (c) {
>   case 0:
> - switch (option_index) {
> - case 0:
> + if (!strcmp(long_options[option_index].name,
> + "capture-comment")) {
>   capture_comment = optarg;
> - break;
> - default:
> + } else if (!strcmp(long_options[option_index].name,
> +"file-prefix")) {
> + file_prefix = optarg;
> + } else {
>   usage();
>   exit(1);
>   }
> @@ -512,12 +517,14 @@ static void dpdk_init(void)
>   static const char * const args[] = {
>   "dumpcap", "--proc-type", "secondary",
>   "--log-level", "notice"
> -
>   };
> - const int eal_argc = RTE_DIM(args);
> + int eal_argc = RTE_DIM(args);
>   char **eal_argv;
>   unsigned int i;
> 
> + if (file_prefix != NULL)
> + eal_argc += 2;
> +
>   /* DPDK API requires mutable versions of command line arguments.
> */
>   eal_argv = calloc(eal_argc + 1, sizeof(char *));
>   if (eal_argv == NULL)
> @@ -527,6 +534,11 @@ static void dpdk_init(void)
>   for (i = 1; i < RTE_DIM(args); i++)
>   eal_argv[i] = strdup(args[i]);
> 
> + if (file_prefix != NULL) {
> + eal_argv[i++] = strdup("--file-prefix");
> + eal_argv[i++] = strdup(file_prefix);
> + }
> +
>   if (rte_eal_init(eal_argc, eal_argv) < 0)
>   rte_exit(EXIT_FAILURE, "EAL init failed: is primary process
> running?\n");
> 
> --
> 2.35.1



RE: [PATCH 1/4] eventdev: have crypto adapter appropriately report idle

2022-10-16 Thread Gujjar, Abhinandan S
Acked-by: Abhinandan Gujjar 

> -Original Message-
> From: Mattias Rönnblom 
> Sent: Monday, October 10, 2022 8:24 PM
> To: Jayatheerthan, Jay ; Carrillo, Erik G
> ; Gujjar, Abhinandan S
> ; Jerin Jacob 
> Cc: dev@dpdk.org; Van Haaren, Harry ;
> hof...@lysator.liu.se; mattias.ronnblom 
> Subject: [PATCH 1/4] eventdev: have crypto adapter appropriately report idle
> 
> Update the event crypto adapter's service function to report as idle (i.e., 
> return
> -EAGAIN) in case no crypto operations were performed.
> 
> Signed-off-by: Mattias Rönnblom 
> ---
>  lib/eventdev/rte_event_crypto_adapter.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/eventdev/rte_event_crypto_adapter.c
> b/lib/eventdev/rte_event_crypto_adapter.c
> index a11cbcf4f3..5926f6 100644
> --- a/lib/eventdev/rte_event_crypto_adapter.c
> +++ b/lib/eventdev/rte_event_crypto_adapter.c
> @@ -771,7 +771,7 @@ eca_crypto_adapter_deq_run(struct
> event_crypto_adapter *adapter,
>   return nb_deq;
>  }
> 
> -static void
> +static int
>  eca_crypto_adapter_run(struct event_crypto_adapter *adapter,
>  unsigned int max_ops)
>  {
> @@ -791,22 +791,26 @@ eca_crypto_adapter_run(struct
> event_crypto_adapter *adapter,
> 
>   }
> 
> - if (ops_left == max_ops)
> + if (ops_left == max_ops) {
>   rte_event_maintain(adapter->eventdev_id,
>  adapter->event_port_id, 0);
> + return -EAGAIN;
> + } else
> + return 0;
>  }
> 
>  static int
>  eca_service_func(void *args)
>  {
>   struct event_crypto_adapter *adapter = args;
> + int ret;
> 
>   if (rte_spinlock_trylock(&adapter->lock) == 0)
>   return 0;
> - eca_crypto_adapter_run(adapter, adapter->max_nb);
> + ret = eca_crypto_adapter_run(adapter, adapter->max_nb);
>   rte_spinlock_unlock(&adapter->lock);
> 
> - return 0;
> + return ret;
>  }
> 
>  static int
> --
> 2.34.1



RE: [PATCH v4 2/8] vdpa/ifc: add multi-queue support

2022-10-16 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, October 13, 2022 4:44 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com
> Subject: [PATCH v4 2/8] vdpa/ifc: add multi-queue support
> 
> Enable VHOST_USER_PROTOCOL_F_MQ feature.
> Expose IFCVF_MQ_OFFSET register to enable multi-queue.
> 
> Signed-off-by: Andy Pei 
> Signed-off-by: Huang Wei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 9 +
>  drivers/vdpa/ifc/base/ifcvf.h | 2 ++
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 1 +
>  3 files changed, 12 insertions(+)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index f1e1474..81c68c0 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -90,6 +90,15 @@
>   if (!hw->lm_cfg)
>   WARNINGOUT("HW support live migration not support!\n");
> 
> + /* For some hardware implementation, for example:
> +  * the BAR 4 of PF is NULL, while BAR 4 of VF is not.
> +  * This code makes sure hw->mq_cfg is a valid address.
> +  */
> + if (hw->mem_resource[4].addr)
> + hw->mq_cfg = hw->mem_resource[4].addr + IFCVF_MQ_OFFSET;
> + else
> + hw->mq_cfg = NULL;
> +
>   if (hw->common_cfg == NULL || hw->notify_base == NULL ||
>   hw->isr == NULL || hw->dev_cfg == NULL) {
>   DEBUGOUT("capability incomplete\n");
> diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> index ef7697a..d16d9ab 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.h
> +++ b/drivers/vdpa/ifc/base/ifcvf.h
> @@ -50,6 +50,7 @@
> 
>  #define IFCVF_LM_CFG_SIZE0x40
>  #define IFCVF_LM_RING_STATE_OFFSET   0x20
> +#define IFCVF_MQ_OFFSET  0x28
> 
>  #define IFCVF_LM_LOGGING_CTRL0x0
> 
> @@ -149,6 +150,7 @@ struct ifcvf_hw {
>   u16*notify_base;
>   u16*notify_addr[IFCVF_MAX_QUEUES * 2];
>   u8 *lm_cfg;
> + u8 *mq_cfg;
>   struct vring_info vring[IFCVF_MAX_QUEUES * 2];
>   u8 nr_vring;
>   int device_type;
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index b4389a0..008cf89 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -1248,6 +1248,7 @@ struct rte_vdpa_dev_info {
>1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
>1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
>1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | \
> +  1ULL << VHOST_USER_PROTOCOL_F_MQ | \
>1ULL << VHOST_USER_PROTOCOL_F_STATUS)
> 
>  #define VDPA_BLK_PROTOCOL_FEATURES \
> --
> 1.8.3.1

Reviewed-by: Chenbo Xia 


RE: [PATCH v4 3/8] vdpa/ifc: set max queues based on virtio spec

2022-10-16 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, October 13, 2022 4:44 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com
> Subject: [PATCH v4 3/8] vdpa/ifc: set max queues based on virtio spec
> 
> Set max_queues according to virtio spec.
> For virtio BLK device, set max_queues to the value of num_queues
> in struct virtio_blk_config

Missing '.'

With this fixed:

Reviewed-by: Chenbo Xia 

> For virtio NET device, read num_queues from struct ifcvf_pci_common_cfg,
> get the queue pair number using num_queues and set max_queues to it.
> 
> Signed-off-by: Andy Pei 
> Signed-off-by: Huang Wei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.h |  2 +-
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 19 ++-
>  2 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h
> index d16d9ab..1e133c0 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.h
> +++ b/drivers/vdpa/ifc/base/ifcvf.h
> @@ -21,7 +21,7 @@
>  #define IFCVF_SUBSYS_NET_DEVICE_ID  0x0001
>  #define IFCVF_SUBSYS_BLK_DEVICE_ID  0x0002
> 
> -#define IFCVF_MAX_QUEUES 1
> +#define IFCVF_MAX_QUEUES 32
> 
>  #ifndef VIRTIO_F_IOMMU_PLATFORM
>  #define VIRTIO_F_IOMMU_PLATFORM  33
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 008cf89..5a24204 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -26,6 +26,12 @@
> 
>  #include "base/ifcvf.h"
> 
> +/*
> + * RTE_MIN() cannot be used since braced-group within expression allowed
> + * only inside a function.
> + */
> +#define MIN(v1, v2)  ((v1) < (v2) ? (v1) : (v2))
> +
>  RTE_LOG_REGISTER(ifcvf_vdpa_logtype, pmd.vdpa.ifcvf, NOTICE);
>  #define DRV_LOG(level, fmt, args...) \
>   rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \
> @@ -1512,6 +1518,7 @@ struct rte_vdpa_dev_info dev_info[] = {
>   uint64_t capacity = 0;
>   uint8_t *byte;
>   uint32_t i;
> + uint16_t queue_pairs;
> 
>   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
>   return 0;
> @@ -1559,7 +1566,6 @@ struct rte_vdpa_dev_info dev_info[] = {
>   }
> 
>   internal->configured = 0;
> - internal->max_queues = IFCVF_MAX_QUEUES;
>   features = ifcvf_get_features(&internal->hw);
> 
>   device_id = ifcvf_pci_get_device_type(pci_dev);
> @@ -1570,6 +1576,14 @@ struct rte_vdpa_dev_info dev_info[] = {
> 
>   if (device_id == VIRTIO_ID_NET) {
>   internal->hw.device_type = IFCVF_NET;
> + /*
> +  * ifc device always has CTRL_VQ,
> +  * and supports VIRTIO_NET_F_CTRL_VQ feature.
> +  */
> + queue_pairs = (internal->hw.common_cfg->num_queues - 1) / 2;
> + DRV_LOG(INFO, "%s support %u queue pairs", pci_dev->name,
> + queue_pairs);
> + internal->max_queues = MIN(IFCVF_MAX_QUEUES, queue_pairs);
>   internal->features = features &
>   ~(1ULL << VIRTIO_F_IOMMU_PLATFORM);
>   internal->features |= dev_info[IFCVF_NET].features;
> @@ -1609,6 +1623,9 @@ struct rte_vdpa_dev_info dev_info[] = {
>   internal->hw.blk_cfg->geometry.sectors);
>   DRV_LOG(DEBUG, "num_queues: 0x%08x",
>   internal->hw.blk_cfg->num_queues);
> +
> + internal->max_queues = MIN(IFCVF_MAX_QUEUES,
> + internal->hw.blk_cfg->num_queues);
>   }
> 
>   list->internal = internal;
> --
> 1.8.3.1



RE: [PATCH v4 4/8] vdpa/ifc: write queue count to MQ register

2022-10-16 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, October 13, 2022 4:45 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com
> Subject: [PATCH v4 4/8] vdpa/ifc: write queue count to MQ register
> 
> Write queue count to IFCVF_MQ_OFFSET register
> to enable multi-queue feature.
> 
> Signed-off-by: Andy Pei 
> Signed-off-by: Huang Wei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 32 
>  1 file changed, 32 insertions(+)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index 81c68c0..60c7017 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -202,6 +202,37 @@
>   IFCVF_WRITE_REG32(val >> 32, hi);
>  }
> 
> +STATIC void
> +ifcvf_enable_mq(struct ifcvf_hw *hw)
> +{
> + u8 *mq_cfg;
> + u8 qid;
> + int nr_queue = 0;
> +
> + for (qid = 0; qid < hw->nr_vring; qid++) {
> + if (!hw->vring[qid].enable)
> + continue;
> + nr_queue++;
> + }
> +
> + if (nr_queue == 0) {
> + WARNINGOUT("no enabled vring\n");
> + return;
> + }
> +
> + mq_cfg = hw->mq_cfg;
> + if (mq_cfg) {
> + if (hw->device_type == IFCVF_BLK) {
> + *(u32 *)mq_cfg = nr_queue;
> + RTE_LOG(INFO, PMD, "%d queue are enabled\n", nr_queue);

queue -> queues

With this fixed:

Reviewed-by: Chenbo Xia 

> + } else {
> + *(u32 *)mq_cfg = nr_queue / 2;
> + RTE_LOG(INFO, PMD, "%d queue pairs are enabled\n",
> + nr_queue / 2);
> + }
> + }
> +}
> +
>  STATIC int
>  ifcvf_hw_enable(struct ifcvf_hw *hw)
>  {
> @@ -219,6 +250,7 @@
>   return -1;
>   }
> 
> + ifcvf_enable_mq(hw);
>   for (i = 0; i < hw->nr_vring; i++) {
>   IFCVF_WRITE_REG16(i, &cfg->queue_select);
>   io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
> --
> 1.8.3.1



RE: [PATCH v4 5/8] vdpa/ifc: only configure enabled queue

2022-10-16 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, October 13, 2022 4:45 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com
> Subject: [PATCH v4 5/8] vdpa/ifc: only configure enabled queue
> 
> When configuring the hardware queue, we only configure queues which
> have been enabled by vhost.
> 
> Signed-off-by: Andy Pei 
> Signed-off-by: Huang Wei 
> Reviewed-by: Chenbo Xia 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c |  3 +++
>  drivers/vdpa/ifc/ifcvf_vdpa.c | 16 ++--
>  2 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index 60c7017..4d85911 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -252,6 +252,9 @@
> 
>   ifcvf_enable_mq(hw);
>   for (i = 0; i < hw->nr_vring; i++) {
> + if (!hw->vring[i].enable)
> + continue;
> +
>   IFCVF_WRITE_REG16(i, &cfg->queue_select);
>   io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
>   &cfg->queue_desc_hi);
> diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c b/drivers/vdpa/ifc/ifcvf_vdpa.c
> index 5a24204..0c3407a 100644
> --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> @@ -284,6 +284,8 @@ struct rte_vdpa_dev_info {
>   rte_vhost_get_negotiated_features(vid, &hw->req_features);
> 
>   for (i = 0; i < nr_vring; i++) {
> + if (!hw->vring[i].enable)
> + continue;
>   rte_vhost_get_vhost_vring(vid, i, &vq);
>   gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
>   if (gpa == 0) {
> @@ -499,6 +501,8 @@ struct rte_vdpa_dev_info {
> 
>   vring.kickfd = -1;
>   for (qid = 0; qid < q_num; qid++) {
> + if (!hw->vring[qid].enable)
> + continue;
>   ev.events = EPOLLIN | EPOLLPRI;
>   rte_vhost_get_vhost_vring(internal->vid, qid, &vring);
>   ev.data.u64 = qid | (uint64_t)vring.kickfd << 32;
> @@ -1058,6 +1062,8 @@ struct rte_vdpa_dev_info {
>   struct rte_vdpa_device *vdev;
>   struct internal_list *list;
>   struct ifcvf_internal *internal;
> + struct ifcvf_hw *hw;
> + uint16_t i;
> 
>   vdev = rte_vhost_get_vdpa_device(vid);
>   list = find_internal_resource_by_vdev(vdev);
> @@ -1071,11 +1077,17 @@ struct rte_vdpa_dev_info {
>   rte_atomic32_set(&internal->dev_attached, 1);
>   update_datapath(internal);
> 
> - if (rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, true) !=
> 0)
> - DRV_LOG(NOTICE, "vDPA (%s): software relay is used.",
> + hw = &internal->hw;
> + for (i = 0; i < hw->nr_vring; i++) {
> + if (!hw->vring[i].enable)
> + continue;
> + if (rte_vhost_host_notifier_ctrl(vid, i, true) != 0)
> + DRV_LOG(NOTICE, "vDPA (%s): software relay is used.",
>   vdev->device->name);
> + }
> 
>   internal->configured = 1;
> + DRV_LOG(INFO, "vDPA device %s is configured", vdev->device->name);
>   return 0;
>  }
> 
> --
> 1.8.3.1

Reviewed-by: Chenbo Xia 


RE: [PATCH v4 6/8] vdpa/ifc: support dynamic enable/disable queue

2022-10-16 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, October 13, 2022 4:45 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com
> Subject: [PATCH v4 6/8] vdpa/ifc: support dynamic enable/disable queue
> 
> From: Huang Wei 
> 
> Support dynamic enable or disable queue.
> For front end, like QEMU, user can use ethtool to configure queue.
> For example, "ethtool -L eth0 combined 3" to enable 3 queues pairs.
> 
> Signed-off-by: Huang Wei 
> Signed-off-by: Andy Pei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 100
> ++
>  drivers/vdpa/ifc/base/ifcvf.h |   6 +++
>  drivers/vdpa/ifc/ifcvf_vdpa.c |  93 -
> --
>  3 files changed, 184 insertions(+), 15 deletions(-)
> 

Reviewed-by: Chenbo Xia 


RE: [PATCH v4 2/8] vdpa/ifc: add multi-queue support

2022-10-16 Thread Pei, Andy
Hi Chenbo,

Thanks for your effort.

> -Original Message-
> From: Xia, Chenbo 
> Sent: Monday, October 17, 2022 2:21 PM
> To: Pei, Andy ; dev@dpdk.org
> Cc: Xu, Rosen ; Huang, Wei ;
> Cao, Gang ; maxime.coque...@redhat.com
> Subject: RE: [PATCH v4 2/8] vdpa/ifc: add multi-queue support
> 
> > -Original Message-
> > From: Pei, Andy 
> > Sent: Thursday, October 13, 2022 4:44 PM
> > To: dev@dpdk.org
> > Cc: Xia, Chenbo ; Xu, Rosen
> > ; Huang, Wei ; Cao, Gang
> > ; maxime.coque...@redhat.com
> > Subject: [PATCH v4 2/8] vdpa/ifc: add multi-queue support
> >
> > Enable VHOST_USER_PROTOCOL_F_MQ feature.
> > Expose IFCVF_MQ_OFFSET register to enable multi-queue.
> >
> > Signed-off-by: Andy Pei 
> > Signed-off-by: Huang Wei 
> > ---
> >  drivers/vdpa/ifc/base/ifcvf.c | 9 +
> > drivers/vdpa/ifc/base/ifcvf.h | 2 ++  drivers/vdpa/ifc/ifcvf_vdpa.c |
> > 1 +
> >  3 files changed, 12 insertions(+)
> >
> > diff --git a/drivers/vdpa/ifc/base/ifcvf.c
> > b/drivers/vdpa/ifc/base/ifcvf.c index f1e1474..81c68c0 100644
> > --- a/drivers/vdpa/ifc/base/ifcvf.c
> > +++ b/drivers/vdpa/ifc/base/ifcvf.c
> > @@ -90,6 +90,15 @@
> > if (!hw->lm_cfg)
> > WARNINGOUT("HW support live migration not support!\n");
> >
> > +   /* For some hardware implementation, for example:
> > +* the BAR 4 of PF is NULL, while BAR 4 of VF is not.
> > +* This code makes sure hw->mq_cfg is a valid address.
> > +*/
> > +   if (hw->mem_resource[4].addr)
> > +   hw->mq_cfg = hw->mem_resource[4].addr +
> IFCVF_MQ_OFFSET;
> > +   else
> > +   hw->mq_cfg = NULL;
> > +
> > if (hw->common_cfg == NULL || hw->notify_base == NULL ||
> > hw->isr == NULL || hw->dev_cfg == NULL) {
> > DEBUGOUT("capability incomplete\n"); diff --git
> > a/drivers/vdpa/ifc/base/ifcvf.h b/drivers/vdpa/ifc/base/ifcvf.h index
> > ef7697a..d16d9ab 100644
> > --- a/drivers/vdpa/ifc/base/ifcvf.h
> > +++ b/drivers/vdpa/ifc/base/ifcvf.h
> > @@ -50,6 +50,7 @@
> >
> >  #define IFCVF_LM_CFG_SIZE  0x40
> >  #define IFCVF_LM_RING_STATE_OFFSET 0x20
> > +#define IFCVF_MQ_OFFSET0x28
> >
> >  #define IFCVF_LM_LOGGING_CTRL  0x0
> >
> > @@ -149,6 +150,7 @@ struct ifcvf_hw {
> > u16*notify_base;
> > u16*notify_addr[IFCVF_MAX_QUEUES * 2];
> > u8 *lm_cfg;
> > +   u8 *mq_cfg;
> > struct vring_info vring[IFCVF_MAX_QUEUES * 2];
> > u8 nr_vring;
> > int device_type;
> > diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > b/drivers/vdpa/ifc/ifcvf_vdpa.c index b4389a0..008cf89 100644
> > --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > @@ -1248,6 +1248,7 @@ struct rte_vdpa_dev_info {
> >  1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD | \
> >  1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER | \
> >  1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD | \
> > +1ULL << VHOST_USER_PROTOCOL_F_MQ | \
> >  1ULL << VHOST_USER_PROTOCOL_F_STATUS)
> >
> >  #define VDPA_BLK_PROTOCOL_FEATURES \
> > --
> > 1.8.3.1
> 
> Reviewed-by: Chenbo Xia 


RE: [PATCH v4 3/8] vdpa/ifc: set max queues based on virtio spec

2022-10-16 Thread Pei, Andy
Hi Chenbo,

Thanks for your effort.


> -Original Message-
> From: Xia, Chenbo 
> Sent: Monday, October 17, 2022 2:22 PM
> To: Pei, Andy ; dev@dpdk.org
> Cc: Xu, Rosen ; Huang, Wei ;
> Cao, Gang ; maxime.coque...@redhat.com
> Subject: RE: [PATCH v4 3/8] vdpa/ifc: set max queues based on virtio spec
> 
> > -Original Message-
> > From: Pei, Andy 
> > Sent: Thursday, October 13, 2022 4:44 PM
> > To: dev@dpdk.org
> > Cc: Xia, Chenbo ; Xu, Rosen
> > ; Huang, Wei ; Cao, Gang
> > ; maxime.coque...@redhat.com
> > Subject: [PATCH v4 3/8] vdpa/ifc: set max queues based on virtio spec
> >
> > Set max_queues according to virtio spec.
> > For virtio BLK device, set max_queues to the value of num_queues in
> > struct virtio_blk_config
> 
> Missing '.'
> 
I will fix in next version.

> With this fixed:
> 
> Reviewed-by: Chenbo Xia 
> 
> > For virtio NET device, read num_queues from struct
> > ifcvf_pci_common_cfg, get the queue pair number using num_queues and
> set max_queues to it.
> >
> > Signed-off-by: Andy Pei 
> > Signed-off-by: Huang Wei 
> > ---
> >  drivers/vdpa/ifc/base/ifcvf.h |  2 +-  drivers/vdpa/ifc/ifcvf_vdpa.c
> > | 19 ++-
> >  2 files changed, 19 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/vdpa/ifc/base/ifcvf.h
> > b/drivers/vdpa/ifc/base/ifcvf.h index d16d9ab..1e133c0 100644
> > --- a/drivers/vdpa/ifc/base/ifcvf.h
> > +++ b/drivers/vdpa/ifc/base/ifcvf.h
> > @@ -21,7 +21,7 @@
> >  #define IFCVF_SUBSYS_NET_DEVICE_ID  0x0001
> >  #define IFCVF_SUBSYS_BLK_DEVICE_ID  0x0002
> >
> > -#define IFCVF_MAX_QUEUES   1
> > +#define IFCVF_MAX_QUEUES   32
> >
> >  #ifndef VIRTIO_F_IOMMU_PLATFORM
> >  #define VIRTIO_F_IOMMU_PLATFORM33
> > diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > b/drivers/vdpa/ifc/ifcvf_vdpa.c index 008cf89..5a24204 100644
> > --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > @@ -26,6 +26,12 @@
> >
> >  #include "base/ifcvf.h"
> >
> > +/*
> > + * RTE_MIN() cannot be used since braced-group within expression
> > +allowed
> > + * only inside a function.
> > + */
> > +#define MIN(v1, v2)((v1) < (v2) ? (v1) : (v2))
> > +
> >  RTE_LOG_REGISTER(ifcvf_vdpa_logtype, pmd.vdpa.ifcvf, NOTICE);
> > #define DRV_LOG(level, fmt, args...) \
> > rte_log(RTE_LOG_ ## level, ifcvf_vdpa_logtype, \ @@ -1512,6
> +1518,7
> > @@ struct rte_vdpa_dev_info dev_info[] = {
> > uint64_t capacity = 0;
> > uint8_t *byte;
> > uint32_t i;
> > +   uint16_t queue_pairs;
> >
> > if (rte_eal_process_type() != RTE_PROC_PRIMARY)
> > return 0;
> > @@ -1559,7 +1566,6 @@ struct rte_vdpa_dev_info dev_info[] = {
> > }
> >
> > internal->configured = 0;
> > -   internal->max_queues = IFCVF_MAX_QUEUES;
> > features = ifcvf_get_features(&internal->hw);
> >
> > device_id = ifcvf_pci_get_device_type(pci_dev);
> > @@ -1570,6 +1576,14 @@ struct rte_vdpa_dev_info dev_info[] = {
> >
> > if (device_id == VIRTIO_ID_NET) {
> > internal->hw.device_type = IFCVF_NET;
> > +   /*
> > +* ifc device always has CTRL_VQ,
> > +* and supports VIRTIO_NET_F_CTRL_VQ feature.
> > +*/
> > +   queue_pairs = (internal->hw.common_cfg->num_queues - 1)
> / 2;
> > +   DRV_LOG(INFO, "%s support %u queue pairs", pci_dev-
> >name,
> > +   queue_pairs);
> > +   internal->max_queues = MIN(IFCVF_MAX_QUEUES,
> queue_pairs);
> > internal->features = features &
> > ~(1ULL <<
> VIRTIO_F_IOMMU_PLATFORM);
> > internal->features |= dev_info[IFCVF_NET].features; @@ -
> 1609,6
> > +1623,9 @@ struct rte_vdpa_dev_info dev_info[] = {
> > internal->hw.blk_cfg->geometry.sectors);
> > DRV_LOG(DEBUG, "num_queues: 0x%08x",
> > internal->hw.blk_cfg->num_queues);
> > +
> > +   internal->max_queues = MIN(IFCVF_MAX_QUEUES,
> > +   internal->hw.blk_cfg->num_queues);
> > }
> >
> > list->internal = internal;
> > --
> > 1.8.3.1



RE: [PATCH v4 7/8] vhost: vDPA blk device gets ready when any queue is ready

2022-10-16 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, October 13, 2022 4:45 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com
> Subject: [PATCH v4 7/8] vhost: vDPA blk device gets ready when any queue
> is ready

This title does not match to the code. You mean first queue?

> 
> When boot from virtio blk device, seabios in QEMU only enables one queue.
> To work in this scenario, vDPA BLK device back-end configure device
> when any queue is ready.
> 
> Signed-off-by: Andy Pei 
> Signed-off-by: Huang Wei 
> ---
>  lib/vhost/vhost_user.c | 49 ++---
> 
>  1 file changed, 30 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index cd65257..0509025 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -1441,9 +1441,10 @@
>  }
> 
>  #define VIRTIO_BUILTIN_NUM_VQS_TO_BE_READY 2u
> +#define VIRTIO_BLK_NUM_VQS_TO_BE_READY 1u
> 
>  static int
> -virtio_is_ready(struct virtio_net *dev)
> +virtio_is_ready(struct virtio_net *dev, uint32_t vdpa_type)

I agree with Maxime's v3 comment. We don't need a new parameter.

Thanks,
Chenbo

>  {
>   struct vhost_virtqueue *vq;
>   uint32_t i, nr_vring = dev->nr_vring;
> @@ -1454,13 +1455,16 @@
>   if (!dev->nr_vring)
>   return 0;
> 
> - if (dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET) {
> - nr_vring = VIRTIO_BUILTIN_NUM_VQS_TO_BE_READY;
> -
> - if (dev->nr_vring < nr_vring)
> - return 0;
> + if (vdpa_type == RTE_VHOST_VDPA_DEVICE_TYPE_BLK) {
> + nr_vring = VIRTIO_BLK_NUM_VQS_TO_BE_READY;
> + } else {
> + if (dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET)
> + nr_vring = VIRTIO_BUILTIN_NUM_VQS_TO_BE_READY;
>   }
> 
> + if (dev->nr_vring < nr_vring)
> + return 0;
> +
>   for (i = 0; i < nr_vring; i++) {
>   vq = dev->virtqueue[i];
> 
> @@ -2958,7 +2962,7 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   int ret;
>   int unlock_required = 0;
>   bool handled;
> - uint32_t vdpa_type = 0;
> + uint32_t vdpa_type = -1;
>   uint32_t request;
>   uint32_t i;
> 
> @@ -3152,7 +3156,25 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   if (unlock_required)
>   vhost_user_unlock_all_queue_pairs(dev);
> 
> - if (ret != 0 || !virtio_is_ready(dev))
> + if (ret != 0)
> + goto out;
> +
> + vdpa_dev = dev->vdpa_dev;
> + if (vdpa_dev) {
> + if (vdpa_dev->ops->get_dev_type) {
> + ret = vdpa_dev->ops->get_dev_type(vdpa_dev, &vdpa_type);
> + if (ret) {
> + VHOST_LOG_CONFIG(dev->ifname, ERR,
> + "failed to get vdpa dev type.\n");
> + ret = -1;
> + goto out;
> + }
> + } else {
> + vdpa_type = RTE_VHOST_VDPA_DEVICE_TYPE_NET;
> + }
> + }
> +
> + if (!virtio_is_ready(dev, vdpa_type))
>   goto out;
> 
>   /*
> @@ -3166,20 +3188,9 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   dev->flags |= VIRTIO_DEV_RUNNING;
>   }
> 
> - vdpa_dev = dev->vdpa_dev;
>   if (!vdpa_dev)
>   goto out;
> 
> - if (vdpa_dev->ops->get_dev_type) {
> - ret = vdpa_dev->ops->get_dev_type(vdpa_dev, &vdpa_type);
> - if (ret) {
> - VHOST_LOG_CONFIG(dev->ifname, ERR, "failed to get vdpa
> dev type.\n");
> - ret = -1;
> - goto out;
> - }
> - } else {
> - vdpa_type = RTE_VHOST_VDPA_DEVICE_TYPE_NET;
> - }
>   if (vdpa_type == RTE_VHOST_VDPA_DEVICE_TYPE_BLK
>   && request != VHOST_USER_SET_VRING_CALL)
>   goto out;
> --
> 1.8.3.1



RE: [PATCH v4 8/8] vhost: improve vDPA blk device configure condition

2022-10-16 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Thursday, October 13, 2022 4:45 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; Xu, Rosen ;
> Huang, Wei ; Cao, Gang ;
> maxime.coque...@redhat.com
> Subject: [PATCH v4 8/8] vhost: improve vDPA blk device configure condition
> 
> To support multi-queue, configure device
> after call fd of all queues are set.
> 
> Signed-off-by: Andy Pei 
> Signed-off-by: Huang Wei 
> ---
>  lib/vhost/vhost_user.c | 13 ++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 0509025..3ede83f 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -2965,6 +2965,7 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   uint32_t vdpa_type = -1;
>   uint32_t request;
>   uint32_t i;
> + uint16_t blk_call_fd;
> 
>   dev = get_device(vid);
>   if (dev == NULL)
> @@ -3191,9 +3192,15 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   if (!vdpa_dev)
>   goto out;
> 
> - if (vdpa_type == RTE_VHOST_VDPA_DEVICE_TYPE_BLK
> - && request != VHOST_USER_SET_VRING_CALL)
> - goto out;
> + if (vdpa_type == RTE_VHOST_VDPA_DEVICE_TYPE_BLK) {
> + if (request == VHOST_USER_SET_VRING_CALL) {
> + blk_call_fd = ctx.msg.payload.u64 &
> VHOST_USER_VRING_IDX_MASK;
> + if (blk_call_fd != dev->nr_vring - 1)
> + goto out;
> + } else {
> + goto out;
> + }
> + }
> 
>   if (!(dev->flags & VIRTIO_DEV_VDPA_CONFIGURED)) {
>   if (vdpa_dev->ops->dev_conf(dev->vid))
> --
> 1.8.3.1

Reviewed-by: Chenbo Xia 


RE: [PATCH v4 4/8] vdpa/ifc: write queue count to MQ register

2022-10-16 Thread Pei, Andy
Hi Chenbo,

Thanks for your effort.
I will send a new version to fix this.

> -Original Message-
> From: Xia, Chenbo 
> Sent: Monday, October 17, 2022 2:24 PM
> To: Pei, Andy ; dev@dpdk.org
> Cc: Xu, Rosen ; Huang, Wei ;
> Cao, Gang ; maxime.coque...@redhat.com
> Subject: RE: [PATCH v4 4/8] vdpa/ifc: write queue count to MQ register
> 
> > -Original Message-
> > From: Pei, Andy 
> > Sent: Thursday, October 13, 2022 4:45 PM
> > To: dev@dpdk.org
> > Cc: Xia, Chenbo ; Xu, Rosen
> > ; Huang, Wei ; Cao, Gang
> > ; maxime.coque...@redhat.com
> > Subject: [PATCH v4 4/8] vdpa/ifc: write queue count to MQ register
> >
> > Write queue count to IFCVF_MQ_OFFSET register to enable multi-queue
> > feature.
> >
> > Signed-off-by: Andy Pei 
> > Signed-off-by: Huang Wei 
> > ---
> >  drivers/vdpa/ifc/base/ifcvf.c | 32 
> >  1 file changed, 32 insertions(+)
> >
> > diff --git a/drivers/vdpa/ifc/base/ifcvf.c
> > b/drivers/vdpa/ifc/base/ifcvf.c index 81c68c0..60c7017 100644
> > --- a/drivers/vdpa/ifc/base/ifcvf.c
> > +++ b/drivers/vdpa/ifc/base/ifcvf.c
> > @@ -202,6 +202,37 @@
> > IFCVF_WRITE_REG32(val >> 32, hi);
> >  }
> >
> > +STATIC void
> > +ifcvf_enable_mq(struct ifcvf_hw *hw)
> > +{
> > +   u8 *mq_cfg;
> > +   u8 qid;
> > +   int nr_queue = 0;
> > +
> > +   for (qid = 0; qid < hw->nr_vring; qid++) {
> > +   if (!hw->vring[qid].enable)
> > +   continue;
> > +   nr_queue++;
> > +   }
> > +
> > +   if (nr_queue == 0) {
> > +   WARNINGOUT("no enabled vring\n");
> > +   return;
> > +   }
> > +
> > +   mq_cfg = hw->mq_cfg;
> > +   if (mq_cfg) {
> > +   if (hw->device_type == IFCVF_BLK) {
> > +   *(u32 *)mq_cfg = nr_queue;
> > +   RTE_LOG(INFO, PMD, "%d queue are enabled\n",
> nr_queue);
> 
> queue -> queues
> 
> With this fixed:
> 
> Reviewed-by: Chenbo Xia 
> 
> > +   } else {
> > +   *(u32 *)mq_cfg = nr_queue / 2;
> > +   RTE_LOG(INFO, PMD, "%d queue pairs are
> enabled\n",
> > +   nr_queue / 2);
> > +   }
> > +   }
> > +}
> > +
> >  STATIC int
> >  ifcvf_hw_enable(struct ifcvf_hw *hw)
> >  {
> > @@ -219,6 +250,7 @@
> > return -1;
> > }
> >
> > +   ifcvf_enable_mq(hw);
> > for (i = 0; i < hw->nr_vring; i++) {
> > IFCVF_WRITE_REG16(i, &cfg->queue_select);
> > io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
> > --
> > 1.8.3.1



RE: [PATCH v4 5/8] vdpa/ifc: only configure enabled queue

2022-10-16 Thread Pei, Andy
Hi  Chenbo,
Thanks for your efforts.

> -Original Message-
> From: Xia, Chenbo 
> Sent: Monday, October 17, 2022 2:24 PM
> To: Pei, Andy ; dev@dpdk.org
> Cc: Xu, Rosen ; Huang, Wei ;
> Cao, Gang ; maxime.coque...@redhat.com
> Subject: RE: [PATCH v4 5/8] vdpa/ifc: only configure enabled queue
> 
> > -Original Message-
> > From: Pei, Andy 
> > Sent: Thursday, October 13, 2022 4:45 PM
> > To: dev@dpdk.org
> > Cc: Xia, Chenbo ; Xu, Rosen
> > ; Huang, Wei ; Cao, Gang
> > ; maxime.coque...@redhat.com
> > Subject: [PATCH v4 5/8] vdpa/ifc: only configure enabled queue
> >
> > When configuring the hardware queue, we only configure queues which
> > have been enabled by vhost.
> >
> > Signed-off-by: Andy Pei 
> > Signed-off-by: Huang Wei 
> > Reviewed-by: Chenbo Xia 
> > ---
> >  drivers/vdpa/ifc/base/ifcvf.c |  3 +++  drivers/vdpa/ifc/ifcvf_vdpa.c
> > | 16 ++--
> >  2 files changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/vdpa/ifc/base/ifcvf.c
> > b/drivers/vdpa/ifc/base/ifcvf.c index 60c7017..4d85911 100644
> > --- a/drivers/vdpa/ifc/base/ifcvf.c
> > +++ b/drivers/vdpa/ifc/base/ifcvf.c
> > @@ -252,6 +252,9 @@
> >
> > ifcvf_enable_mq(hw);
> > for (i = 0; i < hw->nr_vring; i++) {
> > +   if (!hw->vring[i].enable)
> > +   continue;
> > +
> > IFCVF_WRITE_REG16(i, &cfg->queue_select);
> > io_write64_twopart(hw->vring[i].desc, &cfg->queue_desc_lo,
> > &cfg->queue_desc_hi);
> > diff --git a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > b/drivers/vdpa/ifc/ifcvf_vdpa.c index 5a24204..0c3407a 100644
> > --- a/drivers/vdpa/ifc/ifcvf_vdpa.c
> > +++ b/drivers/vdpa/ifc/ifcvf_vdpa.c
> > @@ -284,6 +284,8 @@ struct rte_vdpa_dev_info {
> > rte_vhost_get_negotiated_features(vid, &hw->req_features);
> >
> > for (i = 0; i < nr_vring; i++) {
> > +   if (!hw->vring[i].enable)
> > +   continue;
> > rte_vhost_get_vhost_vring(vid, i, &vq);
> > gpa = hva_to_gpa(vid, (uint64_t)(uintptr_t)vq.desc);
> > if (gpa == 0) {
> > @@ -499,6 +501,8 @@ struct rte_vdpa_dev_info {
> >
> > vring.kickfd = -1;
> > for (qid = 0; qid < q_num; qid++) {
> > +   if (!hw->vring[qid].enable)
> > +   continue;
> > ev.events = EPOLLIN | EPOLLPRI;
> > rte_vhost_get_vhost_vring(internal->vid, qid, &vring);
> > ev.data.u64 = qid | (uint64_t)vring.kickfd << 32; @@ -1058,6
> > +1062,8 @@ struct rte_vdpa_dev_info {
> > struct rte_vdpa_device *vdev;
> > struct internal_list *list;
> > struct ifcvf_internal *internal;
> > +   struct ifcvf_hw *hw;
> > +   uint16_t i;
> >
> > vdev = rte_vhost_get_vdpa_device(vid);
> > list = find_internal_resource_by_vdev(vdev);
> > @@ -1071,11 +1077,17 @@ struct rte_vdpa_dev_info {
> > rte_atomic32_set(&internal->dev_attached, 1);
> > update_datapath(internal);
> >
> > -   if (rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, true) !=
> > 0)
> > -   DRV_LOG(NOTICE, "vDPA (%s): software relay is used.",
> > +   hw = &internal->hw;
> > +   for (i = 0; i < hw->nr_vring; i++) {
> > +   if (!hw->vring[i].enable)
> > +   continue;
> > +   if (rte_vhost_host_notifier_ctrl(vid, i, true) != 0)
> > +   DRV_LOG(NOTICE, "vDPA (%s): software relay is
> used.",
> > vdev->device->name);
> > +   }
> >
> > internal->configured = 1;
> > +   DRV_LOG(INFO, "vDPA device %s is configured", vdev->device-
> >name);
> > return 0;
> >  }
> >
> > --
> > 1.8.3.1
> 
> Reviewed-by: Chenbo Xia 


RE: [PATCH v4 6/8] vdpa/ifc: support dynamic enable/disable queue

2022-10-16 Thread Pei, Andy
Hi Chenbo,

Thanks for your efforts.

> -Original Message-
> From: Xia, Chenbo 
> Sent: Monday, October 17, 2022 2:26 PM
> To: Pei, Andy ; dev@dpdk.org
> Cc: Xu, Rosen ; Huang, Wei ;
> Cao, Gang ; maxime.coque...@redhat.com
> Subject: RE: [PATCH v4 6/8] vdpa/ifc: support dynamic enable/disable queue
> 
> > -Original Message-
> > From: Pei, Andy 
> > Sent: Thursday, October 13, 2022 4:45 PM
> > To: dev@dpdk.org
> > Cc: Xia, Chenbo ; Xu, Rosen
> > ; Huang, Wei ; Cao, Gang
> > ; maxime.coque...@redhat.com
> > Subject: [PATCH v4 6/8] vdpa/ifc: support dynamic enable/disable queue
> >
> > From: Huang Wei 
> >
> > Support dynamic enable or disable queue.
> > For front end, like QEMU, user can use ethtool to configure queue.
> > For example, "ethtool -L eth0 combined 3" to enable 3 queues pairs.
> >
> > Signed-off-by: Huang Wei 
> > Signed-off-by: Andy Pei 
> > ---
> >  drivers/vdpa/ifc/base/ifcvf.c | 100
> > ++
> >  drivers/vdpa/ifc/base/ifcvf.h |   6 +++
> >  drivers/vdpa/ifc/ifcvf_vdpa.c |  93
> > -
> > --
> >  3 files changed, 184 insertions(+), 15 deletions(-)
> >
> 
> Reviewed-by: Chenbo Xia 


RE: [PATCH v4 7/8] vhost: vDPA blk device gets ready when any queue is ready

2022-10-16 Thread Pei, Andy
Hi Chenbo,

Thanks for your efforts, my reply is inline.

> -Original Message-
> From: Xia, Chenbo 
> Sent: Monday, October 17, 2022 2:34 PM
> To: Pei, Andy ; dev@dpdk.org
> Cc: Xu, Rosen ; Huang, Wei ;
> Cao, Gang ; maxime.coque...@redhat.com
> Subject: RE: [PATCH v4 7/8] vhost: vDPA blk device gets ready when any
> queue is ready
> 
> > -Original Message-
> > From: Pei, Andy 
> > Sent: Thursday, October 13, 2022 4:45 PM
> > To: dev@dpdk.org
> > Cc: Xia, Chenbo ; Xu, Rosen
> > ; Huang, Wei ; Cao, Gang
> > ; maxime.coque...@redhat.com
> > Subject: [PATCH v4 7/8] vhost: vDPA blk device gets ready when any
> > queue is ready
> 
> This title does not match to the code. You mean first queue?
> 
Yes, I think I can make the title the first queue.
> >
> > When boot from virtio blk device, seabios in QEMU only enables one queue.
> > To work in this scenario, vDPA BLK device back-end configure device
> > when any queue is ready.
> >
> > Signed-off-by: Andy Pei 
> > Signed-off-by: Huang Wei 
> > ---
> >  lib/vhost/vhost_user.c | 49
> > ++---
> > 
> >  1 file changed, 30 insertions(+), 19 deletions(-)
> >
> > diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index
> > cd65257..0509025 100644
> > --- a/lib/vhost/vhost_user.c
> > +++ b/lib/vhost/vhost_user.c
> > @@ -1441,9 +1441,10 @@
> >  }
> >
> >  #define VIRTIO_BUILTIN_NUM_VQS_TO_BE_READY 2u
> > +#define VIRTIO_BLK_NUM_VQS_TO_BE_READY 1u
> >
> >  static int
> > -virtio_is_ready(struct virtio_net *dev)
> > +virtio_is_ready(struct virtio_net *dev, uint32_t vdpa_type)
> 
> I agree with Maxime's v3 comment. We don't need a new parameter.
> 
OK, I will send a new version to fix this.
> Thanks,
> Chenbo
> 
> >  {
> > struct vhost_virtqueue *vq;
> > uint32_t i, nr_vring = dev->nr_vring; @@ -1454,13 +1455,16 @@
> > if (!dev->nr_vring)
> > return 0;
> >
> > -   if (dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET) {
> > -   nr_vring = VIRTIO_BUILTIN_NUM_VQS_TO_BE_READY;
> > -
> > -   if (dev->nr_vring < nr_vring)
> > -   return 0;
> > +   if (vdpa_type == RTE_VHOST_VDPA_DEVICE_TYPE_BLK) {
> > +   nr_vring = VIRTIO_BLK_NUM_VQS_TO_BE_READY;
> > +   } else {
> > +   if (dev->flags & VIRTIO_DEV_BUILTIN_VIRTIO_NET)
> > +   nr_vring =
> VIRTIO_BUILTIN_NUM_VQS_TO_BE_READY;
> > }
> >
> > +   if (dev->nr_vring < nr_vring)
> > +   return 0;
> > +
> > for (i = 0; i < nr_vring; i++) {
> > vq = dev->virtqueue[i];
> >
> > @@ -2958,7 +2962,7 @@ static int is_vring_iotlb(struct virtio_net *dev,
> > int ret;
> > int unlock_required = 0;
> > bool handled;
> > -   uint32_t vdpa_type = 0;
> > +   uint32_t vdpa_type = -1;
> > uint32_t request;
> > uint32_t i;
> >
> > @@ -3152,7 +3156,25 @@ static int is_vring_iotlb(struct virtio_net *dev,
> > if (unlock_required)
> > vhost_user_unlock_all_queue_pairs(dev);
> >
> > -   if (ret != 0 || !virtio_is_ready(dev))
> > +   if (ret != 0)
> > +   goto out;
> > +
> > +   vdpa_dev = dev->vdpa_dev;
> > +   if (vdpa_dev) {
> > +   if (vdpa_dev->ops->get_dev_type) {
> > +   ret = vdpa_dev->ops->get_dev_type(vdpa_dev,
> &vdpa_type);
> > +   if (ret) {
> > +   VHOST_LOG_CONFIG(dev->ifname, ERR,
> > +   "failed to get vdpa dev type.\n");
> > +   ret = -1;
> > +   goto out;
> > +   }
> > +   } else {
> > +   vdpa_type = RTE_VHOST_VDPA_DEVICE_TYPE_NET;
> > +   }
> > +   }
> > +
> > +   if (!virtio_is_ready(dev, vdpa_type))
> > goto out;
> >
> > /*
> > @@ -3166,20 +3188,9 @@ static int is_vring_iotlb(struct virtio_net *dev,
> > dev->flags |= VIRTIO_DEV_RUNNING;
> > }
> >
> > -   vdpa_dev = dev->vdpa_dev;
> > if (!vdpa_dev)
> > goto out;
> >
> > -   if (vdpa_dev->ops->get_dev_type) {
> > -   ret = vdpa_dev->ops->get_dev_type(vdpa_dev, &vdpa_type);
> > -   if (ret) {
> > -   VHOST_LOG_CONFIG(dev->ifname, ERR, "failed to
> get vdpa
> > dev type.\n");
> > -   ret = -1;
> > -   goto out;
> > -   }
> > -   } else {
> > -   vdpa_type = RTE_VHOST_VDPA_DEVICE_TYPE_NET;
> > -   }
> > if (vdpa_type == RTE_VHOST_VDPA_DEVICE_TYPE_BLK
> > && request != VHOST_USER_SET_VRING_CALL)
> > goto out;
> > --
> > 1.8.3.1



RE: [PATCH v2] vhost: add new `rte_vhost_vring_call_nonblock` API

2022-10-16 Thread Xia, Chenbo
Hi Changpeng,

> -Original Message-
> From: Liu, Changpeng 
> Sent: Wednesday, October 12, 2022 2:40 PM
> To: dev@dpdk.org
> Cc: Liu, Changpeng ; Maxime Coquelin
> ; Xia, Chenbo ; David
> Marchand 
> Subject: [PATCH v2] vhost: add new `rte_vhost_vring_call_nonblock` API
> 
> Vhost-user library locks all VQ's access lock when processing
> vring based messages, such as SET_VRING_KICK and SET_VRING_CALL,
> and the data processing thread may already be started, e.g: SPDK
> vhost-blk and vhost-scsi will start the data processing thread
> when one vring is ready, then deadlock may happen when SPDK is
> posting interrupts to VM.  Here, we add a new API which allows
> caller to try again later for this case.
> 
> Bugzilla ID: 1015
> Fixes: c5736998305d ("vhost: fix missing virtqueue lock protection")
> 
> Signed-off-by: Changpeng Liu 
> ---
>  lib/vhost/rte_vhost.h | 15 +++
>  lib/vhost/version.map |  3 +++
>  lib/vhost/vhost.c | 30 ++
>  3 files changed, 48 insertions(+)

For new API, we need to update release_22_11.rst and vhost_lib.rst.

You can refer to 
http://patchwork.dpdk.org/project/dpdk/patch/20221013092708.4922-2-xuan.d...@intel.com/

Thanks,
Chenbo

> 
> diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h
> index bb7d86a432..d22b25cd4e 100644
> --- a/lib/vhost/rte_vhost.h
> +++ b/lib/vhost/rte_vhost.h
> @@ -909,6 +909,21 @@ rte_vhost_clr_inflight_desc_packed(int vid, uint16_t
> vring_idx,
>   */
>  int rte_vhost_vring_call(int vid, uint16_t vring_idx);
> 
> +/**
> + * Notify the guest that used descriptors have been added to the vring.
> This
> + * function acts as a memory barrier.  This function will return -EAGAIN
> when
> + * vq's access lock is held by other thread, user should try again later.
> + *
> + * @param vid
> + *  vhost device ID
> + * @param vring_idx
> + *  vring index
> + * @return
> + *  0 on success, -1 on failure, -EAGAIN for another retry
> + */
> +__rte_experimental
> +int rte_vhost_vring_call_nonblock(int vid, uint16_t vring_idx);
> +
>  /**
>   * Get vhost RX queue avail count.
>   *
> diff --git a/lib/vhost/version.map b/lib/vhost/version.map
> index 7a00b65740..c8c44b8326 100644
> --- a/lib/vhost/version.map
> +++ b/lib/vhost/version.map
> @@ -94,6 +94,9 @@ EXPERIMENTAL {
>   rte_vhost_async_try_dequeue_burst;
>   rte_vhost_driver_get_vdpa_dev_type;
>   rte_vhost_clear_queue;
> +
> + # added in 22.11
> + rte_vhost_vring_call_nonblock;
>  };
> 
>  INTERNAL {
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index 8740aa2788..ed6efb003f 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -1317,6 +1317,36 @@ rte_vhost_vring_call(int vid, uint16_t vring_idx)
>   return 0;
>  }
> 
> +int
> +rte_vhost_vring_call_nonblock(int vid, uint16_t vring_idx)
> +{
> + struct virtio_net *dev;
> + struct vhost_virtqueue *vq;
> +
> + dev = get_device(vid);
> + if (!dev)
> + return -1;
> +
> + if (vring_idx >= VHOST_MAX_VRING)
> + return -1;
> +
> + vq = dev->virtqueue[vring_idx];
> + if (!vq)
> + return -1;
> +
> + if (!rte_spinlock_trylock(&vq->access_lock))
> + return -EAGAIN;
> +
> + if (vq_is_packed(dev))
> + vhost_vring_call_packed(dev, vq);
> + else
> + vhost_vring_call_split(dev, vq);
> +
> + rte_spinlock_unlock(&vq->access_lock);
> +
> + return 0;
> +}
> +
>  uint16_t
>  rte_vhost_avail_entries(int vid, uint16_t queue_id)
>  {
> --
> 2.21.3



RE: [PATCH] net/virtio: remove declaration of undefined function

2022-10-16 Thread Xia, Chenbo
> -Original Message-
> From: Olivier Matz 
> Sent: Thursday, September 29, 2022 8:22 PM
> To: dev@dpdk.org
> Cc: Maxime Coquelin ; Xia, Chenbo
> 
> Subject: [PATCH] net/virtio: remove declaration of undefined function
> 
> This function is not defined, remove its declaration.
> 
> Fixes: c1f86306a026 ("virtio: add new driver")
> 
> Signed-off-by: Olivier Matz 
> ---
>  drivers/net/virtio/virtqueue.h | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtqueue.h
> b/drivers/net/virtio/virtqueue.h
> index d100ed8762..f5d8b40cad 100644
> --- a/drivers/net/virtio/virtqueue.h
> +++ b/drivers/net/virtio/virtqueue.h
> @@ -474,10 +474,6 @@ virtqueue_enable_intr(struct virtqueue *vq)
>   virtqueue_enable_intr_split(vq);
>  }
> 
> -/**
> - *  Dump virtqueue internal structures, for debug purpose only.
> - */
> -void virtqueue_dump(struct virtqueue *vq);
>  /**
>   *  Get all mbufs to be freed.
>   */
> --
> 2.30.2

Reviewed-by: Chenbo Xia 


RE: [PATCH] net/virtio: add queue and port ID in some logs

2022-10-16 Thread Xia, Chenbo
Hi Olivier,

> -Original Message-
> From: Olivier Matz 
> Sent: Thursday, September 29, 2022 8:22 PM
> To: dev@dpdk.org
> Cc: Maxime Coquelin ; Xia, Chenbo
> 
> Subject: [PATCH] net/virtio: add queue and port ID in some logs
> 
> Add the queue id and/or the port id in some logs, so it is easier to
> understand what happens.
> 
> Signed-off-by: Olivier Matz 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 6 --
>  drivers/net/virtio/virtio_rxtx.c   | 3 ++-
>  2 files changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index 7e07270a8b..44811c299b 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -2807,7 +2807,8 @@ virtio_dev_start(struct rte_eth_dev *dev)
>   return -EINVAL;
>   }
> 
> - PMD_INIT_LOG(DEBUG, "nb_queues=%d", nb_queues);
> + PMD_INIT_LOG(DEBUG, "nb_queues=%d (port=%d)", nb_queues,
> +  dev->data->port_id);

Better to use %u for all port_id since it's uint16_t

Thanks,
Chenbo

> 
>   for (i = 0; i < dev->data->nb_rx_queues; i++) {
>   vq = virtnet_rxq_to_vq(dev->data->rx_queues[i]);
> @@ -2821,7 +2822,8 @@ virtio_dev_start(struct rte_eth_dev *dev)
>   virtqueue_notify(vq);
>   }
> 
> - PMD_INIT_LOG(DEBUG, "Notified backend at initialization");
> + PMD_INIT_LOG(DEBUG, "Notified backend at initialization (port=%d)",
> +  dev->data->port_id);
> 
>   for (i = 0; i < dev->data->nb_rx_queues; i++) {
>   vq = virtnet_rxq_to_vq(dev->data->rx_queues[i]);
> diff --git a/drivers/net/virtio/virtio_rxtx.c
> b/drivers/net/virtio/virtio_rxtx.c
> index 4795893ec7..f8a9ee5cdb 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -793,7 +793,8 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev
> *dev, uint16_t queue_idx)
>   vq_update_avail_idx(vq);
>   }
> 
> - PMD_INIT_LOG(DEBUG, "Allocated %d bufs", nbufs);
> + PMD_INIT_LOG(DEBUG, "Allocated %d bufs (port=%d queue=%d)", nbufs,
> +  dev->data->port_id, queue_idx);
> 
>   VIRTQUEUE_DUMP(vq);
> 
> --
> 2.30.2