Re: [PATCH v12 2/7] bbdev: add device status info

2022-10-05 Thread Maxime Coquelin




On 10/4/22 19:16, Nicolas Chautru wrote:

Added device status information, so that the PMD can
expose information related to the underlying accelerator device status.
Minor order change in structure to fit into padding hole.

Signed-off-by: Nicolas Chautru 
Acked-by: Mingshan Zhang 
Acked-by: Hemant Agrawal 
---
  doc/guides/rel_notes/deprecation.rst  |  3 --
  doc/guides/rel_notes/release_22_11.rst|  3 ++
  drivers/baseband/acc100/rte_acc100_pmd.c  |  1 +
  .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
  drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |  1 +
  drivers/baseband/la12xx/bbdev_la12xx.c|  1 +
  drivers/baseband/null/bbdev_null.c|  1 +
  .../baseband/turbo_sw/bbdev_turbo_software.c  |  1 +
  lib/bbdev/rte_bbdev.c | 22 
  lib/bbdev/rte_bbdev.h | 35 +--
  lib/bbdev/rte_bbdev_op.h  |  2 +-
  lib/bbdev/version.map |  7 
  12 files changed, 72 insertions(+), 6 deletions(-)


Acked-by: Maxime Coquelin 

Thanks,
Maxime



Re: [PATCH v9] eal: add bus cleanup to eal cleanup

2022-10-05 Thread David Marchand
On Tue, Oct 4, 2022 at 6:47 PM Kevin Laatz  wrote:
>
> During EAL init, all buses are probed and the devices found are
> initialized. On eal_cleanup(), the inverse does not happen, meaning any
> allocated memory and other configuration will not be cleaned up
> appropriately on exit.
>
> Currently, in order for device cleanup to take place, applications must
> call the driver-relevant functions to ensure proper cleanup is done before
> the application exits. Since initialization occurs for all devices on the
> bus, not just the devices used by an application, it requires a)
> application awareness of all bus devices that could have been probed on the
> system, and b) code duplication across applications to ensure cleanup is
> performed. An example of this is rte_eth_dev_close() which is commonly used
> across the example applications.
>
> This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
> init/exit more symmetrical, ensuring all bus devices are cleaned up
> appropriately without the application needing to be aware of all bus types
> that may have been probed during initialization.
>
> Contained in this patch are the changes required to perform cleanup for
> devices on the PCI bus and VDEV bus during eal_cleanup(). There would be an
> ask for bus maintainers to add the relevant cleanup for their buses since
> they have the domain expertise.

Cc: maintainers for info.


> Signed-off-by: Kevin Laatz 
> Acked-by: Morten Brørup 
> Reviewed-by: Bruce Richardson 

Applied, thanks.


-- 
David Marchand



[PATCH v2] doc: add removal note for power empty poll API

2022-10-05 Thread Reshma Pattan
Add removal note for experimental empty poll API.

CC: David Hunt 

Signed-off-by: Reshma Pattan 
Acked-by: David Hunt 
---
 doc/guides/prog_guide/power_man.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/doc/guides/prog_guide/power_man.rst 
b/doc/guides/prog_guide/power_man.rst
index 98cfd3c1f3..9a40242dd0 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -192,6 +192,12 @@ User Cases
 --
 The mechanism can applied to any device which is based on polling. e.g. NIC, 
FPGA.
 
+Removal Note
+
+The experimental empty poll APIs will be removed from the library in a future
+DPDK release.
+
+
 Ethernet PMD Power Management API
 -
 
-- 
2.31.1



Re: [PATCH v2] drivers/bus: set device NUMA node to unknown by default

2022-10-05 Thread David Marchand
On Tue, Oct 4, 2022 at 4:59 PM Olivier Matz  wrote:
>
> The dev->device.numa_node field is set by each bus driver for
> every device it manages to indicate on which NUMA node this device lies.
>
> When this information is unknown, the assigned value is not consistent
> across the bus drivers.
>
> Set the default value to SOCKET_ID_ANY (-1) by all bus drivers
> when the NUMA information is unavailable. This change impacts
> rte_eth_dev_socket_id() in the same manner.
>
> Signed-off-by: Olivier Matz 
> ---
>
> v2
> * use SOCKET_ID_ANY instead of -1 in drivers/dma/idxd (David)
> * document the behavior change of rte_eth_dev_socket_id()
> * fix few examples where rte_eth_dev_socket_id() was expected to
>   return 0 on unknown socket

Cc: ethdev maintainers.

[snip]

> diff --git a/doc/guides/rel_notes/release_22_11.rst 
> b/doc/guides/rel_notes/release_22_11.rst
> index 53fe21453c..d52f823694 100644
> --- a/doc/guides/rel_notes/release_22_11.rst
> +++ b/doc/guides/rel_notes/release_22_11.rst
> @@ -317,6 +317,12 @@ ABI Changes
>  * eventdev: Added ``weight`` and ``affinity`` fields
>to ``rte_event_queue_conf`` structure.
>
> +* bus: Changed the device numa node to -1 when NUMA information is 
> unavailable.
> +  The ``dev->device.numa_node`` field is set by each bus driver for
> +  every device it manages to indicate on which NUMA node this device lies.
> +  When this information is unknown, the assigned value was not consistent
> +  across the bus drivers. This similarly impacts ``rte_eth_dev_socket_id()``.
> +
>
>  Known Issues
>  

[snip]

> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index a21f58b9cd..dd8d25d6d4 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -2445,8 +2445,8 @@ int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t 
> rx_port);
>   *   The port identifier of the Ethernet device
>   * @return
>   *   The NUMA socket ID to which the Ethernet device is connected or
> - *   a default of zero if the socket could not be determined.
> - *   -1 is returned is the port_id value is out of range.
> + *   a default of -1 (SOCKET_ID_ANY) if the socket could not be determined.
> + *   -1 is also returned if the port_id is invalid.
>   */
>  int rte_eth_dev_socket_id(uint16_t port_id);

It would be better to distinguish the two cases, using rte_errno.
Something like:

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 2821770e2d..1baf302804 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -562,8 +562,16 @@ rte_eth_dev_owner_get(const uint16_t port_id,
struct rte_eth_dev_owner *owner)
 int
 rte_eth_dev_socket_id(uint16_t port_id)
 {
-   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -1);
-   return rte_eth_devices[port_id].data->numa_node;
+   int socket_id = SOCKET_ID_ANY;
+
+   if (!rte_eth_dev_is_valid_port(port_id))
+   rte_errno = EINVAL;
+   } else {
+   socket_id = rte_eth_devices[port_id].data->numa_node;
+   if (socket_id == SOCKET_ID_ANY)
+   rte_errno = 0;
+   }
+   return socket_id;
 }

 void *
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index dd8d25d6d4..03456b2dbb 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2444,9 +2444,11 @@ int rte_eth_hairpin_unbind(uint16_t tx_port,
uint16_t rx_port);
  * @param port_id
  *   The port identifier of the Ethernet device
  * @return
- *   The NUMA socket ID to which the Ethernet device is connected or
- *   a default of -1 (SOCKET_ID_ANY) if the socket could not be determined.
- *   -1 is also returned if the port_id is invalid.
+ *   - The NUMA socket ID which the Ethernet device is connected to.
+ *   - -1 (which translates to SOCKET_ID_ANY) if the socket could not be
+ * determined. rte_errno is then set to:
+ * - EINVAL is the port_id is invalid,
+ * - 0 is the socket could not be determined,
  */
 int rte_eth_dev_socket_id(uint16_t port_id);

WDYT?


-- 
David Marchand



Re: [PATCH 2/3] eal: change rte_fls and rte_bsf to return uint32_t

2022-10-05 Thread Thomas Monjalon
08/08/2022 23:21, Tyler Retzlaff:
> From: Tyler Retzlaff 
> 
> return fixed width uint32_t to be consistent with what appears to
> be the original authors intent. it doesn't make much sense to return
> signed integers for these functions.
> 
> Signed-off-by: Tyler Retzlaff 
> ---
>  lib/eal/include/rte_common.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h
> index a96cc2a..bd4184d 100644
> --- a/lib/eal/include/rte_common.h
> +++ b/lib/eal/include/rte_common.h
> @@ -707,7 +707,7 @@ static void __attribute__((destructor(RTE_PRIO(prio)), 
> used)) func(void)
>   * @return
>   * The last (most-significant) bit set, or 0 if the input is 0.
>   */
> -static inline int
> +static inline uint32_t
>  rte_fls_u32(uint32_t x)
>  {
>   return (x == 0) ? 0 : 32 - __builtin_clz(x);
> @@ -724,7 +724,7 @@ static void __attribute__((destructor(RTE_PRIO(prio)), 
> used)) func(void)
>   * @return
>   * least significant set bit in the input parameter.
>   */
> -static inline int
> +static inline uint32_t
>  rte_bsf64(uint64_t v)
>  {
>   return (uint32_t)__builtin_ctzll(v);
> @@ -766,7 +766,7 @@ static void __attribute__((destructor(RTE_PRIO(prio)), 
> used)) func(void)
>   * @return
>   * The last (most-significant) bit set, or 0 if the input is 0.
>   */
> -static inline int
> +static inline uint32_t
>  rte_fls_u64(uint64_t x)
>  {
>   return (x == 0) ? 0 : 64 - __builtin_clzll(x);
> 

You forgot the _safe versions:

--- a/lib/eal/include/rte_common.h
+++ b/lib/eal/include/rte_common.h
@@ -660,7 +660,7 @@ rte_bsf32(uint32_t v)
  * @return
  * Returns 0 if ``v`` was 0, otherwise returns 1.
  */
-static inline int
+static inline uint32_t
 rte_bsf32_safe(uint32_t v, uint32_t *pos)
 {
if (v == 0)
@@ -739,7 +739,7 @@ rte_bsf64(uint64_t v)
  * @return
  * Returns 0 if ``v`` was 0, otherwise returns 1.
  */
-static inline int
+static inline uint32_t
 rte_bsf64_safe(uint64_t v, uint32_t *pos)
 {
if (v == 0)







Re: [PATCH v2] drivers/bus: set device NUMA node to unknown by default

2022-10-05 Thread Olivier Matz
Hi David,

On Wed, Oct 05, 2022 at 10:52:49AM +0200, David Marchand wrote:
> On Tue, Oct 4, 2022 at 4:59 PM Olivier Matz  wrote:
> >
> > The dev->device.numa_node field is set by each bus driver for
> > every device it manages to indicate on which NUMA node this device lies.
> >
> > When this information is unknown, the assigned value is not consistent
> > across the bus drivers.
> >
> > Set the default value to SOCKET_ID_ANY (-1) by all bus drivers
> > when the NUMA information is unavailable. This change impacts
> > rte_eth_dev_socket_id() in the same manner.
> >
> > Signed-off-by: Olivier Matz 
> > ---
> >
> > v2
> > * use SOCKET_ID_ANY instead of -1 in drivers/dma/idxd (David)
> > * document the behavior change of rte_eth_dev_socket_id()
> > * fix few examples where rte_eth_dev_socket_id() was expected to
> >   return 0 on unknown socket
> 
> Cc: ethdev maintainers.
> 
> [snip]
> 
> > diff --git a/doc/guides/rel_notes/release_22_11.rst 
> > b/doc/guides/rel_notes/release_22_11.rst
> > index 53fe21453c..d52f823694 100644
> > --- a/doc/guides/rel_notes/release_22_11.rst
> > +++ b/doc/guides/rel_notes/release_22_11.rst
> > @@ -317,6 +317,12 @@ ABI Changes
> >  * eventdev: Added ``weight`` and ``affinity`` fields
> >to ``rte_event_queue_conf`` structure.
> >
> > +* bus: Changed the device numa node to -1 when NUMA information is 
> > unavailable.
> > +  The ``dev->device.numa_node`` field is set by each bus driver for
> > +  every device it manages to indicate on which NUMA node this device lies.
> > +  When this information is unknown, the assigned value was not consistent
> > +  across the bus drivers. This similarly impacts 
> > ``rte_eth_dev_socket_id()``.
> > +
> >
> >  Known Issues
> >  
> 
> [snip]
> 
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index a21f58b9cd..dd8d25d6d4 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -2445,8 +2445,8 @@ int rte_eth_hairpin_unbind(uint16_t tx_port, uint16_t 
> > rx_port);
> >   *   The port identifier of the Ethernet device
> >   * @return
> >   *   The NUMA socket ID to which the Ethernet device is connected or
> > - *   a default of zero if the socket could not be determined.
> > - *   -1 is returned is the port_id value is out of range.
> > + *   a default of -1 (SOCKET_ID_ANY) if the socket could not be determined.
> > + *   -1 is also returned if the port_id is invalid.
> >   */
> >  int rte_eth_dev_socket_id(uint16_t port_id);
> 
> It would be better to distinguish the two cases, using rte_errno.
> Something like:
> 
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index 2821770e2d..1baf302804 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -562,8 +562,16 @@ rte_eth_dev_owner_get(const uint16_t port_id,
> struct rte_eth_dev_owner *owner)
>  int
>  rte_eth_dev_socket_id(uint16_t port_id)
>  {
> -   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -1);
> -   return rte_eth_devices[port_id].data->numa_node;
> +   int socket_id = SOCKET_ID_ANY;
> +
> +   if (!rte_eth_dev_is_valid_port(port_id))
> +   rte_errno = EINVAL;
> +   } else {
> +   socket_id = rte_eth_devices[port_id].data->numa_node;
> +   if (socket_id == SOCKET_ID_ANY)
> +   rte_errno = 0;
> +   }
> +   return socket_id;
>  }
> 
>  void *
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index dd8d25d6d4..03456b2dbb 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -2444,9 +2444,11 @@ int rte_eth_hairpin_unbind(uint16_t tx_port,
> uint16_t rx_port);
>   * @param port_id
>   *   The port identifier of the Ethernet device
>   * @return
> - *   The NUMA socket ID to which the Ethernet device is connected or
> - *   a default of -1 (SOCKET_ID_ANY) if the socket could not be determined.
> - *   -1 is also returned if the port_id is invalid.
> + *   - The NUMA socket ID which the Ethernet device is connected to.
> + *   - -1 (which translates to SOCKET_ID_ANY) if the socket could not be
> + * determined. rte_errno is then set to:
> + * - EINVAL is the port_id is invalid,
> + * - 0 is the socket could not be determined,
>   */
>  int rte_eth_dev_socket_id(uint16_t port_id);
> 
> WDYT?

As discussed off-list, it is indeed better.

Thanks for the review
Olivier


Re: [PATCH 1/3] doc: announce cleanup of rte_{bsf, fls} inline functions type use

2022-10-05 Thread Thomas Monjalon
08/08/2022 23:21, Tyler Retzlaff:
> From: Tyler Retzlaff 
> 
> The cleanup resulted from request to review [1] the following
> functions where there appeared to be inconsistency in return type
> or parameter type selections for the following inline functions.
> 
> rte_bsf32()
> rte_bsf32_safe()
> rte_bsf64()
> rte_bsf64_safe()

Why _safe functions are not changed?

> rte_fls_u32()
> rte_fls_u64()
> rte_log2_u32()
> rte_log2_u64()
> 
> [1] http://mails.dpdk.org/archives/dev/2021-March/201590.html
> 
> Signed-off-by: Tyler Retzlaff 
> ---
>  doc/guides/rel_notes/deprecation.rst | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/doc/guides/rel_notes/deprecation.rst 
> b/doc/guides/rel_notes/deprecation.rst
> index e7583ca..58f4c24 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> +* eal: Fix inline function return and parameter types for rte_{bsf,fls}
> +  inline functions to be consistent in DPDK 22.11.
> +  Change ``rte_bsf64`` return type to  ``uint32_t`` instead of ``int``.
> +  Change ``rte_fls_u32`` return type to ``uint32_t`` instead of ``int``.
> +  Change ``rte_fls_u64`` return type to ``uint32_t`` instead of ``int``.

It is too late for the announce,
but the release notes requires to be updated.




[PATCH v2 0/6] Service cores performance and statistics improvements

2022-10-05 Thread Mattias Rönnblom
This series contains performance improvements and new statistics-
related functionality for the EAL service cores framework.

A new per-lcore TSC cycle counter is introduced, which reflect the
total amount of cycles spent by that lcore running services. This may
be used to estimate service lcore load.

The patchset introduces a backward-compatible convention, where a DPDK
service may signal to the framework that no useful work was performed,
which in turn is used to make the busy cycles statistics more
accurate.

Depends-on: series-23959 ("test/service: add perf measurements for with stats 
mode ")

Mattias Rönnblom (6):
  service: reduce statistics overhead for parallel services
  service: introduce per-lcore cycles counter
  service: reduce average case service core overhead
  service: tweak cycle statistics semantics
  event/sw: report idle when no work is performed
  service: provide links to functions in documentation

 app/test/test_service_cores.c   |   2 +-
 drivers/event/sw/sw_evdev.c |   3 +-
 drivers/event/sw/sw_evdev.h |   2 +-
 drivers/event/sw/sw_evdev_scheduler.c   |   6 +-
 lib/eal/common/rte_service.c| 228 +---
 lib/eal/include/rte_service.h   |  32 ++--
 lib/eal/include/rte_service_component.h |   5 +
 7 files changed, 192 insertions(+), 86 deletions(-)

-- 
2.34.1



[PATCH v2 4/6] service: tweak cycle statistics semantics

2022-10-05 Thread Mattias Rönnblom
As a part of its service function, a service usually polls some kind
of source (e.g., an RX queue, a ring, an eventdev port, or a timer
wheel) to retrieve one or more items of work.

In low-load situations, the service framework reports a significant
amount of cycles spent for all running services, despite the fact they
have performed little or no actual work.

The per-call cycle expenditure for an idle service (i.e., a service
currently without pending jobs) is typically very low. Polling an
empty ring or RX queue is inexpensive. However, since the service
function call frequency on an idle or lightly loaded lcore is going to
be very high indeed, the service function calls' cycles adds up to a
significant amount. The only thing preventing the idle services'
cycles counters to make up 100% of the available CPU cycles is the
overhead of the service framework itself.

If the RTE_SERVICE_ATTR_CYCLES or RTE_SERVICE_LCORE_ATTR_CYCLES are
used to estimate service core load, the cores may look very busy when
the system is mostly doing nothing useful at all.

This patch allows for an idle service to indicate that no actual work
was performed during a particular service function call (by returning
-EAGAIN). In such cases the RTE_SERVICE_ATTR_CYCLES and
RTE_SERVICE_LCORE_ATTR_CYCLES values are not incremented.

The convention of returning -EAGAIN for idle services may in the
future also be used to have the lcore enter a short sleep, or reduce
its operating frequency, in case all services are currently idle.

This change is backward-compatible.

Signed-off-by: Mattias Rönnblom 
---
 lib/eal/common/rte_service.c| 26 +
 lib/eal/include/rte_service_component.h |  5 +
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 035c36b8bb..718371814f 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -10,6 +10,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -373,28 +374,29 @@ service_runner_do_callback(struct rte_service_spec_impl 
*s,
 
if (service_stats_enabled(s)) {
uint64_t start = rte_rdtsc();
-   s->spec.callback(userdata);
-   uint64_t end = rte_rdtsc();
-   uint64_t cycles = end - start;
+   int rc = s->spec.callback(userdata);
 
/* The lcore service worker thread is the only writer,
 * and thus only a non-atomic load and an atomic store
 * is needed, and not the more expensive atomic
 * add.
 */
-   __atomic_store_n(&cs->cycles, cs->cycles + cycles,
-__ATOMIC_RELAXED);
-
struct service_stats *service_stats =
&cs->service_stats[service_idx];
 
-   __atomic_store_n(&service_stats->calls,
-service_stats->calls + 1,
-__ATOMIC_RELAXED);
+   if (likely(rc != -EAGAIN)) {
+   uint64_t end = rte_rdtsc();
+   uint64_t cycles = end - start;
 
-   __atomic_store_n(&service_stats->cycles,
-service_stats->cycles + cycles,
-__ATOMIC_RELAXED);
+   __atomic_store_n(&cs->cycles, cs->cycles + cycles,
+__ATOMIC_RELAXED);
+   __atomic_store_n(&service_stats->cycles,
+service_stats->cycles + cycles,
+__ATOMIC_RELAXED);
+   }
+
+   __atomic_store_n(&service_stats->calls,
+service_stats->calls + 1, __ATOMIC_RELAXED);
} else
s->spec.callback(userdata);
 }
diff --git a/lib/eal/include/rte_service_component.h 
b/lib/eal/include/rte_service_component.h
index 9e66ee7e29..9be49d698a 100644
--- a/lib/eal/include/rte_service_component.h
+++ b/lib/eal/include/rte_service_component.h
@@ -19,6 +19,11 @@ extern "C" {
 
 /**
  * Signature of callback function to run a service.
+ *
+ * A service function call resulting in no actual work being
+ * performed, should return -EAGAIN. In that case, the (presumbly few)
+ * cycles spent will not be counted toward the lcore or service-level
+ * cycles attributes.
  */
 typedef int32_t (*rte_service_func)(void *args);
 
-- 
2.34.1



[PATCH v2 2/6] service: introduce per-lcore cycles counter

2022-10-05 Thread Mattias Rönnblom
Introduce a per-lcore counter for the total time spent on processing
services on that core.

This counter is useful when measuring individual lcore load.

Signed-off-by: Mattias Rönnblom 
---
 app/test/test_service_cores.c |  2 +-
 lib/eal/common/rte_service.c  | 15 +++
 lib/eal/include/rte_service.h |  6 ++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index 7415b6b686..096405133b 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -403,7 +403,7 @@ service_lcore_attr_get(void)
"lcore_attr_get() failed to get loops "
"(expected > zero)");
 
-   lcore_attr_id++;  // invalid lcore attr id
+   lcore_attr_id = 42; /* invalid lcore attr id */
TEST_ASSERT_EQUAL(-EINVAL, rte_service_lcore_attr_get(slcore_id,
lcore_attr_id, &lcore_attr_value),
"Invalid lcore attr didn't return -EINVAL");
diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 6b807afacf..4d51de638d 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -66,6 +66,7 @@ struct core_state {
uint8_t is_service_core; /* set if core is currently a service core */
uint8_t service_active_on_lcore[RTE_SERVICE_NUM_MAX];
uint64_t loops;
+   uint64_t cycles;
struct service_stats service_stats[RTE_SERVICE_NUM_MAX];
 } __rte_cache_aligned;
 
@@ -376,6 +377,9 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
 * is needed, and not the more expensive atomic
 * add.
 */
+   __atomic_store_n(&cs->cycles, cs->cycles + cycles,
+__ATOMIC_RELAXED);
+
struct service_stats *service_stats =
&cs->service_stats[service_idx];
 
@@ -819,6 +823,14 @@ lcore_attr_get_loops(unsigned int lcore)
return __atomic_load_n(&cs->loops, __ATOMIC_RELAXED);
 }
 
+static uint64_t
+lcore_attr_get_cycles(unsigned int lcore)
+{
+   struct core_state *cs = &lcore_states[lcore];
+
+   return __atomic_load_n(&cs->cycles, __ATOMIC_RELAXED);
+}
+
 static uint64_t
 lcore_attr_get_service_calls(uint32_t service_id, unsigned int lcore)
 {
@@ -903,6 +915,9 @@ rte_service_lcore_attr_get(uint32_t lcore, uint32_t attr_id,
case RTE_SERVICE_LCORE_ATTR_LOOPS:
*attr_value = lcore_attr_get_loops(lcore);
return 0;
+   case RTE_SERVICE_LCORE_ATTR_CYCLES:
+   *attr_value = lcore_attr_get_cycles(lcore);
+   return 0;
default:
return -EINVAL;
}
diff --git a/lib/eal/include/rte_service.h b/lib/eal/include/rte_service.h
index 35d8018684..70deb6e53a 100644
--- a/lib/eal/include/rte_service.h
+++ b/lib/eal/include/rte_service.h
@@ -407,6 +407,12 @@ int32_t rte_service_attr_reset_all(uint32_t id);
  */
 #define RTE_SERVICE_LCORE_ATTR_LOOPS 0
 
+/**
+ * Returns the total number of cycles that the lcore has spent on
+ * running services.
+ */
+#define RTE_SERVICE_LCORE_ATTR_CYCLES 1
+
 /**
  * Get an attribute from a service core.
  *
-- 
2.34.1



[PATCH v2 3/6] service: reduce average case service core overhead

2022-10-05 Thread Mattias Rönnblom
Optimize service loop so that the starting point is the lowest-indexed
service mapped to the lcore in question, and terminate the loop at the
highest-indexed service.

While the worst case latency remains the same, this patch
significantly reduces the service framework overhead for the average
case. In particular, scenarios where an lcore only runs a single
service, or multiple services which id values are close (e.g., three
services with ids 17, 18 and 22), show significant improvements.

The worse case is a where the lcore two services mapped to it; one
with service id 0 and the other with id 63.

On a service lcore serving a single service, the service loop overhead
is reduced from ~190 core clock cycles to ~46, on an Intel Cascade
Lake generation Xeon. On weakly ordered CPUs, the gain is larger,
since the loop included load-acquire atomic operations.

Signed-off-by: Mattias Rönnblom 

---

v2: Added build-time assertion to prevent the maximum number of
services to accidentally be changed to a higher value than
the implementation supports. (Harry van Haaren)
---
 lib/eal/common/rte_service.c | 19 +++
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 4d51de638d..035c36b8bb 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -78,6 +78,11 @@ static uint32_t rte_service_library_initialized;
 int32_t
 rte_service_init(void)
 {
+   /* Hard limit due to the use of an uint64_t-based bitmask (and the
+* clzl intrinsic).
+*/
+   RTE_BUILD_BUG_ON(RTE_SERVICE_NUM_MAX > 64);
+
if (rte_service_library_initialized) {
RTE_LOG(NOTICE, EAL,
"service library init() called, init flag %d\n",
@@ -472,7 +477,6 @@ static int32_t
 service_runner_func(void *arg)
 {
RTE_SET_USED(arg);
-   uint32_t i;
const int lcore = rte_lcore_id();
struct core_state *cs = &lcore_states[lcore];
 
@@ -486,10 +490,17 @@ service_runner_func(void *arg)
RUNSTATE_RUNNING) {
 
const uint64_t service_mask = cs->service_mask;
+   uint8_t start_id;
+   uint8_t end_id;
+   uint8_t i;
 
-   for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
-   if (!service_registered(i))
-   continue;
+   if (service_mask == 0)
+   continue;
+
+   start_id = __builtin_ctzl(service_mask);
+   end_id = 64 - __builtin_clzl(service_mask);
+
+   for (i = start_id; i < end_id; i++) {
/* return value ignored as no change to code flow */
service_run(i, cs, service_mask, service_get(i), 1);
}
-- 
2.34.1



[PATCH v2 5/6] event/sw: report idle when no work is performed

2022-10-05 Thread Mattias Rönnblom
Have the SW event device conform to the service core convention, where
-EAGAIN is return in case no work was performed.

Prior to this patch, for an idle SW event device, a service lcore load
estimate based on RTE_SERVICE_ATTR_CYCLES would suggest 48% core
load.

At 7% of its maximum capacity, the SW event device needs about 15% of
the available CPU cycles* to perform its duties, but
RTE_SERVICE_ATTR_CYCLES would suggest the SW service used 48% of the
service core.

After this change, load deduced from RTE_SERVICE_ATTR_CYCLES will only
be a minor overestimation of the actual cycles used.

* The SW scheduler becomes more efficient at higher loads.

Signed-off-by: Mattias Rönnblom 
---
 drivers/event/sw/sw_evdev.c   | 3 +--
 drivers/event/sw/sw_evdev.h   | 2 +-
 drivers/event/sw/sw_evdev_scheduler.c | 6 --
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
index bfa9469e29..3531821dd4 100644
--- a/drivers/event/sw/sw_evdev.c
+++ b/drivers/event/sw/sw_evdev.c
@@ -934,8 +934,7 @@ set_refill_once(const char *key __rte_unused, const char 
*value, void *opaque)
 static int32_t sw_sched_service_func(void *args)
 {
struct rte_eventdev *dev = args;
-   sw_event_schedule(dev);
-   return 0;
+   return sw_event_schedule(dev);
 }
 
 static int
diff --git a/drivers/event/sw/sw_evdev.h b/drivers/event/sw/sw_evdev.h
index 4fd1054470..8542b7d34d 100644
--- a/drivers/event/sw/sw_evdev.h
+++ b/drivers/event/sw/sw_evdev.h
@@ -295,7 +295,7 @@ uint16_t sw_event_enqueue_burst(void *port, const struct 
rte_event ev[],
 uint16_t sw_event_dequeue(void *port, struct rte_event *ev, uint64_t wait);
 uint16_t sw_event_dequeue_burst(void *port, struct rte_event *ev, uint16_t num,
uint64_t wait);
-void sw_event_schedule(struct rte_eventdev *dev);
+int32_t sw_event_schedule(struct rte_eventdev *dev);
 int sw_xstats_init(struct sw_evdev *dev);
 int sw_xstats_uninit(struct sw_evdev *dev);
 int sw_xstats_get_names(const struct rte_eventdev *dev,
diff --git a/drivers/event/sw/sw_evdev_scheduler.c 
b/drivers/event/sw/sw_evdev_scheduler.c
index 809a54d731..8bc21944f5 100644
--- a/drivers/event/sw/sw_evdev_scheduler.c
+++ b/drivers/event/sw/sw_evdev_scheduler.c
@@ -506,7 +506,7 @@ sw_schedule_pull_port_dir(struct sw_evdev *sw, uint32_t 
port_id)
return pkts_iter;
 }
 
-void
+int32_t
 sw_event_schedule(struct rte_eventdev *dev)
 {
struct sw_evdev *sw = sw_pmd_priv(dev);
@@ -517,7 +517,7 @@ sw_event_schedule(struct rte_eventdev *dev)
 
sw->sched_called++;
if (unlikely(!sw->started))
-   return;
+   return -EAGAIN;
 
do {
uint32_t in_pkts_this_iteration = 0;
@@ -610,4 +610,6 @@ sw_event_schedule(struct rte_eventdev *dev)
sw->sched_last_iter_bitmask = cqs_scheds_last_iter;
if (unlikely(sw->port_count >= 64))
sw->sched_last_iter_bitmask = UINT64_MAX;
+
+   return work_done ? 0 : -EAGAIN;
 }
-- 
2.34.1



[PATCH v2 6/6] service: provide links to functions in documentation

2022-10-05 Thread Mattias Rönnblom
Refer to API functions with parenthesis, making doxygen create
hyperlinks.

Signed-off-by: Mattias Rönnblom 
---
 lib/eal/include/rte_service.h | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/lib/eal/include/rte_service.h b/lib/eal/include/rte_service.h
index 70deb6e53a..90116a773a 100644
--- a/lib/eal/include/rte_service.h
+++ b/lib/eal/include/rte_service.h
@@ -37,7 +37,7 @@ extern "C" {
 
 /* Capabilities of a service.
  *
- * Use the *rte_service_probe_capability* function to check if a service is
+ * Use the rte_service_probe_capability() function to check if a service is
  * capable of a specific capability.
  */
 /** When set, the service is capable of having multiple threads run it at the
@@ -147,13 +147,13 @@ int32_t rte_service_map_lcore_get(uint32_t service_id, 
uint32_t lcore);
 int32_t rte_service_runstate_set(uint32_t id, uint32_t runstate);
 
 /**
- * Get the runstate for the service with *id*. See *rte_service_runstate_set*
+ * Get the runstate for the service with *id*. See rte_service_runstate_set()
  * for details of runstates. A service can call this function to ensure that
  * the application has indicated that it will receive CPU cycles. Either a
  * service-core is mapped (default case), or the application has explicitly
  * disabled the check that a service-cores is mapped to the service and takes
  * responsibility to run the service manually using the available function
- * *rte_service_run_iter_on_app_lcore* to do so.
+ * rte_service_run_iter_on_app_lcore() to do so.
  *
  * @retval 1 Service is running
  * @retval 0 Service is stopped
@@ -181,7 +181,7 @@ rte_service_may_be_active(uint32_t id);
 /**
  * Enable or disable the check for a service-core being mapped to the service.
  * An application can disable the check when takes the responsibility to run a
- * service itself using *rte_service_run_iter_on_app_lcore*.
+ * service itself using rte_service_run_iter_on_app_lcore().
  *
  * @param id The id of the service to set the check on
  * @param enable When zero, the check is disabled. Non-zero enables the check.
@@ -216,7 +216,7 @@ int32_t rte_service_set_runstate_mapped_check(uint32_t id, 
int32_t enable);
  *   atomics, applications can choose to enable or disable this feature
  *
  * Note that any thread calling this function MUST be a DPDK EAL thread, as
- * the *rte_lcore_id* function is used to access internal data structures.
+ * the rte_lcore_id() function is used to access internal data structures.
  *
  * @retval 0 Service was run on the calling thread successfully
  * @retval -EBUSY Another lcore is executing the service, and it is not a
@@ -232,7 +232,7 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
  *
  * Starting a core makes the core begin polling. Any services assigned to it
  * will be run as fast as possible. The application must ensure that the lcore
- * is in a launchable state: e.g. call *rte_eal_lcore_wait* on the lcore_id
+ * is in a launchable state: e.g. call rte_eal_lcore_wait() on the lcore_id
  * before calling this function.
  *
  * @retval 0 Success
@@ -248,7 +248,7 @@ int32_t rte_service_lcore_start(uint32_t lcore_id);
  * service core. Note that the service lcore thread may not have returned from
  * the service it is running when this API returns.
  *
- * The *rte_service_lcore_may_be_active* API can be used to check if the
+ * The rte_service_lcore_may_be_active() API can be used to check if the
  * service lcore is * still active.
  *
  * @retval 0 Success
@@ -265,7 +265,7 @@ int32_t rte_service_lcore_stop(uint32_t lcore_id);
  * Reports if a service lcore is currently running.
  *
  * This function returns if the core has finished service cores code, and has
- * returned to EAL control. If *rte_service_lcore_stop* has been called but
+ * returned to EAL control. If rte_service_lcore_stop() has been called but
  * the lcore has not returned to EAL yet, it might be required to wait and call
  * this function again. The amount of time to wait before the core returns
  * depends on the duration of the services being run.
@@ -293,7 +293,7 @@ int32_t rte_service_lcore_add(uint32_t lcore);
 /**
  * Removes lcore from the list of service cores.
  *
- * This can fail if the core is not stopped, see *rte_service_core_stop*.
+ * This can fail if the core is not stopped, see rte_service_core_stop().
  *
  * @retval 0 Success
  * @retval -EBUSY Lcore is not stopped, stop service core before removing.
@@ -308,7 +308,7 @@ int32_t rte_service_lcore_del(uint32_t lcore);
  * service core count can be used in mapping logic when creating mappings
  * from service cores to services.
  *
- * See *rte_service_lcore_list* for details on retrieving the lcore_id of each
+ * See rte_service_lcore_list() for details on retrieving the lcore_id of each
  * service core.
  *
  * @return The number of service cores currently configured.
@@ -344,14 +344,14 @@ int32_t rte_service_set_stats_enable(uint32_t i

[PATCH v2 1/6] service: reduce statistics overhead for parallel services

2022-10-05 Thread Mattias Rönnblom
Move the statistics from the service data structure to the per-lcore
struct. This eliminates contention for the counter cache lines, which
decreases the producer-side statistics overhead for services deployed
across many lcores.

Prior to this patch, enabling statistics for a service with a
per-service function call latency of 1000 clock cycles deployed across
16 cores on a Intel Xeon 6230N @ 2,3 GHz would incur a cost of ~1
core clock cycles per service call. After this patch, the statistics
overhead is reduce to 22 clock cycles per call.

Signed-off-by: Mattias Rönnblom 

--

v2:
* Introduce service stats struct which hold per-lcore service stats
  state, to improve spatial locality.
---
 lib/eal/common/rte_service.c | 190 +++
 1 file changed, 128 insertions(+), 62 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 94cb056196..6b807afacf 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -50,16 +50,12 @@ struct rte_service_spec_impl {
 * on currently.
 */
uint32_t num_mapped_cores;
-
-   /* 32-bit builds won't naturally align a uint64_t, so force alignment,
-* allowing regular reads to be atomic.
-*/
-   uint64_t calls __rte_aligned(8);
-   uint64_t cycles_spent __rte_aligned(8);
 } __rte_cache_aligned;
 
-/* Mask used to ensure uint64_t 8 byte vars are naturally aligned. */
-#define RTE_SERVICE_STAT_ALIGN_MASK (8 - 1)
+struct service_stats {
+   uint64_t calls;
+   uint64_t cycles;
+};
 
 /* the internal values of a service core */
 struct core_state {
@@ -70,7 +66,7 @@ struct core_state {
uint8_t is_service_core; /* set if core is currently a service core */
uint8_t service_active_on_lcore[RTE_SERVICE_NUM_MAX];
uint64_t loops;
-   uint64_t calls_per_service[RTE_SERVICE_NUM_MAX];
+   struct service_stats service_stats[RTE_SERVICE_NUM_MAX];
 } __rte_cache_aligned;
 
 static uint32_t rte_service_count;
@@ -138,13 +134,16 @@ rte_service_finalize(void)
rte_service_library_initialized = 0;
 }
 
-/* returns 1 if service is registered and has not been unregistered
- * Returns 0 if service never registered, or has been unregistered
- */
-static inline int
+static inline bool
+service_registered(uint32_t id)
+{
+   return rte_services[id].internal_flags & SERVICE_F_REGISTERED;
+}
+
+static inline bool
 service_valid(uint32_t id)
 {
-   return !!(rte_services[id].internal_flags & SERVICE_F_REGISTERED);
+   return id < RTE_SERVICE_NUM_MAX && service_registered(id);
 }
 
 static struct rte_service_spec_impl *
@@ -155,7 +154,7 @@ service_get(uint32_t id)
 
 /* validate ID and retrieve service pointer, or return error value */
 #define SERVICE_VALID_GET_OR_ERR_RET(id, service, retval) do {  \
-   if (id >= RTE_SERVICE_NUM_MAX || !service_valid(id))\
+   if (!service_valid(id)) \
return retval;  \
service = &rte_services[id];\
 } while (0)
@@ -217,7 +216,7 @@ rte_service_get_by_name(const char *name, uint32_t 
*service_id)
 
int i;
for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
-   if (service_valid(i) &&
+   if (service_registered(i) &&
strcmp(name, rte_services[i].spec.name) == 0) {
*service_id = i;
return 0;
@@ -254,7 +253,7 @@ rte_service_component_register(const struct 
rte_service_spec *spec,
return -EINVAL;
 
for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
-   if (!service_valid(i)) {
+   if (!service_registered(i)) {
free_slot = i;
break;
}
@@ -366,29 +365,27 @@ service_runner_do_callback(struct rte_service_spec_impl 
*s,
 {
void *userdata = s->spec.callback_userdata;
 
-   /* Ensure the atomically stored variables are naturally aligned,
-* as required for regular loads to be atomic.
-*/
-   RTE_BUILD_BUG_ON((offsetof(struct rte_service_spec_impl, calls)
-   & RTE_SERVICE_STAT_ALIGN_MASK) != 0);
-   RTE_BUILD_BUG_ON((offsetof(struct rte_service_spec_impl, cycles_spent)
-   & RTE_SERVICE_STAT_ALIGN_MASK) != 0);
-
if (service_stats_enabled(s)) {
uint64_t start = rte_rdtsc();
s->spec.callback(userdata);
uint64_t end = rte_rdtsc();
uint64_t cycles = end - start;
-   cs->calls_per_service[service_idx]++;
-   if (service_mt_safe(s)) {
-   __atomic_fetch_add(&s->cycles_spent, cycles, 
__ATOMIC_RELAXED);
-   __atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
-   } else {
-   uint64_t cycles_ne

Re: [PATCH v2] drivers/bus: set device NUMA node to unknown by default

2022-10-05 Thread Thomas Monjalon
05/10/2022 11:04, Olivier Matz:
> Hi David,
> 
> On Wed, Oct 05, 2022 at 10:52:49AM +0200, David Marchand wrote:
> > On Tue, Oct 4, 2022 at 4:59 PM Olivier Matz  wrote:
> > >
> > > The dev->device.numa_node field is set by each bus driver for
> > > every device it manages to indicate on which NUMA node this device lies.
> > >
> > > When this information is unknown, the assigned value is not consistent
> > > across the bus drivers.
> > >
> > > Set the default value to SOCKET_ID_ANY (-1) by all bus drivers
> > > when the NUMA information is unavailable. This change impacts
> > > rte_eth_dev_socket_id() in the same manner.
> > >
> > > Signed-off-by: Olivier Matz 
> > > ---
> > >
> > > v2
> > > * use SOCKET_ID_ANY instead of -1 in drivers/dma/idxd (David)
> > > * document the behavior change of rte_eth_dev_socket_id()
> > > * fix few examples where rte_eth_dev_socket_id() was expected to
> > >   return 0 on unknown socket
> > 
> > Cc: ethdev maintainers.
> > 
> > [snip]
> > 
> > > diff --git a/doc/guides/rel_notes/release_22_11.rst 
> > > b/doc/guides/rel_notes/release_22_11.rst
> > > index 53fe21453c..d52f823694 100644
> > > --- a/doc/guides/rel_notes/release_22_11.rst
> > > +++ b/doc/guides/rel_notes/release_22_11.rst
> > > @@ -317,6 +317,12 @@ ABI Changes
> > >  * eventdev: Added ``weight`` and ``affinity`` fields
> > >to ``rte_event_queue_conf`` structure.
> > >
> > > +* bus: Changed the device numa node to -1 when NUMA information is 
> > > unavailable.
> > > +  The ``dev->device.numa_node`` field is set by each bus driver for
> > > +  every device it manages to indicate on which NUMA node this device 
> > > lies.
> > > +  When this information is unknown, the assigned value was not consistent
> > > +  across the bus drivers. This similarly impacts 
> > > ``rte_eth_dev_socket_id()``.
> > > +
> > >
> > >  Known Issues
> > >  
> > 
> > [snip]
> > 
> > > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > > index a21f58b9cd..dd8d25d6d4 100644
> > > --- a/lib/ethdev/rte_ethdev.h
> > > +++ b/lib/ethdev/rte_ethdev.h
> > > @@ -2445,8 +2445,8 @@ int rte_eth_hairpin_unbind(uint16_t tx_port, 
> > > uint16_t rx_port);
> > >   *   The port identifier of the Ethernet device
> > >   * @return
> > >   *   The NUMA socket ID to which the Ethernet device is connected or
> > > - *   a default of zero if the socket could not be determined.
> > > - *   -1 is returned is the port_id value is out of range.
> > > + *   a default of -1 (SOCKET_ID_ANY) if the socket could not be 
> > > determined.
> > > + *   -1 is also returned if the port_id is invalid.
> > >   */
> > >  int rte_eth_dev_socket_id(uint16_t port_id);
> > 
> > It would be better to distinguish the two cases, using rte_errno.
> > Something like:
> > 
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> > index 2821770e2d..1baf302804 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -562,8 +562,16 @@ rte_eth_dev_owner_get(const uint16_t port_id,
> > struct rte_eth_dev_owner *owner)
> >  int
> >  rte_eth_dev_socket_id(uint16_t port_id)
> >  {
> > -   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -1);
> > -   return rte_eth_devices[port_id].data->numa_node;
> > +   int socket_id = SOCKET_ID_ANY;
> > +
> > +   if (!rte_eth_dev_is_valid_port(port_id))
> > +   rte_errno = EINVAL;
> > +   } else {
> > +   socket_id = rte_eth_devices[port_id].data->numa_node;
> > +   if (socket_id == SOCKET_ID_ANY)
> > +   rte_errno = 0;
> > +   }
> > +   return socket_id;
> >  }
> > 
> >  void *
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index dd8d25d6d4..03456b2dbb 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -2444,9 +2444,11 @@ int rte_eth_hairpin_unbind(uint16_t tx_port,
> > uint16_t rx_port);
> >   * @param port_id
> >   *   The port identifier of the Ethernet device
> >   * @return
> > - *   The NUMA socket ID to which the Ethernet device is connected or
> > - *   a default of -1 (SOCKET_ID_ANY) if the socket could not be determined.
> > - *   -1 is also returned if the port_id is invalid.
> > + *   - The NUMA socket ID which the Ethernet device is connected to.
> > + *   - -1 (which translates to SOCKET_ID_ANY) if the socket could not be
> > + * determined. rte_errno is then set to:
> > + * - EINVAL is the port_id is invalid,
> > + * - 0 is the socket could not be determined,
> >   */
> >  int rte_eth_dev_socket_id(uint16_t port_id);
> > 
> > WDYT?
> 
> As discussed off-list, it is indeed better.

+1

Note that the decision to take in case of EINVAL
requires some work for each call to rte_eth_dev_socket_id().
I suppose we can leave it as a future exercise.




Re: [PATCH v9] eal: add bus cleanup to eal cleanup

2022-10-05 Thread Thomas Monjalon
05/10/2022 09:45, David Marchand:
> On Tue, Oct 4, 2022 at 6:47 PM Kevin Laatz  wrote:
> > This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
> > init/exit more symmetrical, ensuring all bus devices are cleaned up
> > appropriately without the application needing to be aware of all bus types
> > that may have been probed during initialization.
> >
> > Contained in this patch are the changes required to perform cleanup for
> > devices on the PCI bus and VDEV bus during eal_cleanup(). There would be an
> > ask for bus maintainers to add the relevant cleanup for their buses since
> > they have the domain expertise.
> 
> Cc: maintainers for info.

Kevin,
When you don't go to the end of a task, you must ask help to complete it.
Here you assume others will do it,
but you don't even Cc the relevant maintainers.

Please could you track that all buses will get the implementation?
You should open a bugzilla ticket for each bus,
and assign it to the relevant maintainer.
Thanks




RE: [PATCH v2 0/6] Service cores performance and statistics improvements

2022-10-05 Thread Morten Brørup
> From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
> Sent: Wednesday, 5 October 2022 11.16
> 
> This series contains performance improvements and new statistics-
> related functionality for the EAL service cores framework.
> 
> A new per-lcore TSC cycle counter is introduced, which reflect the
> total amount of cycles spent by that lcore running services. This may
> be used to estimate service lcore load.
> 
> The patchset introduces a backward-compatible convention, where a DPDK
> service may signal to the framework that no useful work was performed,
> which in turn is used to make the busy cycles statistics more
> accurate.
> 
> Depends-on: series-23959 ("test/service: add perf measurements for with
> stats mode ")
> 
> Mattias Rönnblom (6):
>   service: reduce statistics overhead for parallel services
>   service: introduce per-lcore cycles counter
>   service: reduce average case service core overhead
>   service: tweak cycle statistics semantics
>   event/sw: report idle when no work is performed
>   service: provide links to functions in documentation
> 
>  app/test/test_service_cores.c   |   2 +-
>  drivers/event/sw/sw_evdev.c |   3 +-
>  drivers/event/sw/sw_evdev.h |   2 +-
>  drivers/event/sw/sw_evdev_scheduler.c   |   6 +-
>  lib/eal/common/rte_service.c| 228 +---
>  lib/eal/include/rte_service.h   |  32 ++--
>  lib/eal/include/rte_service_component.h |   5 +
>  7 files changed, 192 insertions(+), 86 deletions(-)
> 
> --
> 2.34.1
> 

In case it wasn't noticed with v1 of the patch series...

Series-Acked-by: Morten Brørup 



[PATCH v3 0/6] net/af_xdp: make compatible with libbpf v0.8.0

2022-10-05 Thread Andrew Rybchenko
Update net/af_xdp build to support libbfp v0.8.0.
Avoid library version based checks, check for function presense
instead.

v3:
- avoid version-based checks

Andrew Rybchenko (5):
  net/af_xdp: move XDP library presence flag to right branch
  net/af_xdp: make it clear which libxdp version is required
  net/af_xdp: avoid version-based check for shared UMEM
  net/af_xdp: avoid version-based check for program load mech
  net/af_xdp: log errors on XDP program removal failures

Ciara Loftus (1):
  net/af_xdp: make compatible with libbpf v0.8.0

 doc/guides/nics/af_xdp.rst |  3 +-
 doc/guides/rel_notes/release_22_11.rst |  4 ++
 drivers/net/af_xdp/meson.build | 47 +---
 drivers/net/af_xdp/rte_eth_af_xdp.c| 59 ++
 4 files changed, 88 insertions(+), 25 deletions(-)

-- 
2.30.2



[PATCH v3] mempool: fix get objects from mempool with cache

2022-10-05 Thread Andrew Rybchenko
From: Morten Brørup 

A flush threshold for the mempool cache was introduced in DPDK version
1.3, but rte_mempool_do_generic_get() was not completely updated back
then, and some inefficiencies were introduced.

Fix the following in rte_mempool_do_generic_get():

1. The code that initially screens the cache request was not updated
with the change in DPDK version 1.3.
The initial screening compared the request length to the cache size,
which was correct before, but became irrelevant with the introduction of
the flush threshold. E.g. the cache can hold up to flushthresh objects,
which is more than its size, so some requests were not served from the
cache, even though they could be.
The initial screening has now been corrected to match the initial
screening in rte_mempool_do_generic_put(), which verifies that a cache
is present, and that the length of the request does not overflow the
memory allocated for the cache.

This bug caused a major performance degradation in scenarios where the
application burst length is the same as the cache size. In such cases,
the objects were not ever fetched from the mempool cache, regardless if
they could have been.
This scenario occurs e.g. if an application has configured a mempool
with a size matching the application's burst size.

2. The function is a helper for rte_mempool_generic_get(), so it must
behave according to the description of that function.
Specifically, objects must first be returned from the cache,
subsequently from the ring.
After the change in DPDK version 1.3, this was not the behavior when
the request was partially satisfied from the cache; instead, the objects
from the ring were returned ahead of the objects from the cache.
This bug degraded application performance on CPUs with a small L1 cache,
which benefit from having the hot objects first in the returned array.
(This is probably also the reason why the function returns the objects
in reverse order, which it still does.)
Now, all code paths first return objects from the cache, subsequently
from the ring.

The function was not behaving as described (by the function using it)
and expected by applications using it. This in itself is also a bug.

3. If the cache could not be backfilled, the function would attempt
to get all the requested objects from the ring (instead of only the
number of requested objects minus the objects available in the ring),
and the function would fail if that failed.
Now, the first part of the request is always satisfied from the cache,
and if the subsequent backfilling of the cache from the ring fails, only
the remaining requested objects are retrieved from the ring.

The function would fail despite there are enough objects in the cache
plus the common pool.

4. The code flow for satisfying the request from the cache was slightly
inefficient:
The likely code path where the objects are simply served from the cache
was treated as unlikely. Now it is treated as likely.

Signed-off-by: Morten Brørup 
Signed-off-by: Andrew Rybchenko 
---
v3 changes (Andrew Rybchenko)
 - Always get first objects from the cache even if request is bigger
   than cache size. Remove one corresponding condition from the path
   when request is fully served from cache.
 - Simplify code to avoid duplication:
- Get objects directly from backend in single place only.
- Share code which gets from the cache first regardless if
  everythihg is obtained from the cache or just the first part.
 - Rollback cache length in unlikely failure branch to avoid cache
   vs NULL check in success branch.

v2 changes
- Do not modify description of return value. This belongs in a separate
doc fix.
- Elaborate even more on which bugs the modifications fix.

 lib/mempool/rte_mempool.h | 74 +--
 1 file changed, 48 insertions(+), 26 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index a3c4ee351d..58e41ed401 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1443,41 +1443,54 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void 
**obj_table,
   unsigned int n, struct rte_mempool_cache *cache)
 {
int ret;
+   unsigned int remaining = n;
uint32_t index, len;
void **cache_objs;
 
-   /* No cache provided or cannot be satisfied from cache */
-   if (unlikely(cache == NULL || n >= cache->size))
+   /* No cache provided */
+   if (unlikely(cache == NULL))
goto ring_dequeue;
 
-   cache_objs = cache->objs;
+   /* Use the cache as much as we have to return hot objects first */
+   len = RTE_MIN(remaining, cache->len);
+   cache_objs = &cache->objs[cache->len];
+   cache->len -= len;
+   remaining -= len;
+   for (index = 0; index < len; index++)
+   *obj_table++ = *--cache_objs;
 
-   /* Can this be satisfied from the cache? */
-   if (cache->len < n) {
-   /* No. Backfill the cache first, an

[PATCH v3 1/6] net/af_xdp: move XDP library presence flag to right branch

2022-10-05 Thread Andrew Rybchenko
RTE_NET_AF_XDP_LIBXDP is a conditional to include xdp/xsk.h and should
be set as soon as we know that the header is present.
RTE_NET_AF_XDP_SHARED_UMEM is one of conditions to use
xsk_socket__create_shared().
Both do not depend on libbpf and bpf/bpf.h presence.

Since else branch below returns error, there is no functional changes,
just style which will help on further rework.

Signed-off-by: Andrew Rybchenko 
---
 drivers/net/af_xdp/meson.build | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/af_xdp/meson.build b/drivers/net/af_xdp/meson.build
index 1e0de23705..882d0b9518 100644
--- a/drivers/net/af_xdp/meson.build
+++ b/drivers/net/af_xdp/meson.build
@@ -17,10 +17,10 @@ endif
 
 if cc.has_header('linux/if_xdp.h')
 if xdp_dep.found() and cc.has_header('xdp/xsk.h')
+cflags += ['-DRTE_NET_AF_XDP_LIBXDP']
+cflags += ['-DRTE_NET_AF_XDP_SHARED_UMEM']
+ext_deps += xdp_dep
 if bpf_dep.found() and cc.has_header('bpf/bpf.h')
-cflags += ['-DRTE_NET_AF_XDP_LIBXDP']
-cflags += ['-DRTE_NET_AF_XDP_SHARED_UMEM']
-ext_deps += xdp_dep
 ext_deps += bpf_dep
 bpf_ver_dep = dependency('libbpf', version : '>=0.7.0',
  required: false, method: 'pkg-config')
-- 
2.30.2



[PATCH v3 3/6] net/af_xdp: avoid version-based check for shared UMEM

2022-10-05 Thread Andrew Rybchenko
Check for xsk_socket__create_shared() function instead.

Signed-off-by: Andrew Rybchenko 
---
 drivers/net/af_xdp/meson.build | 21 +++--
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/net/af_xdp/meson.build b/drivers/net/af_xdp/meson.build
index fa011c357d..a01a67c7e7 100644
--- a/drivers/net/af_xdp/meson.build
+++ b/drivers/net/af_xdp/meson.build
@@ -19,7 +19,6 @@ endif
 if cc.has_header('linux/if_xdp.h')
 if xdp_dep.found() and cc.has_header('xdp/xsk.h')
 cflags += ['-DRTE_NET_AF_XDP_LIBXDP']
-cflags += ['-DRTE_NET_AF_XDP_SHARED_UMEM']
 ext_deps += xdp_dep
 if bpf_dep.found() and cc.has_header('bpf/bpf.h')
 ext_deps += bpf_dep
@@ -39,11 +38,6 @@ if cc.has_header('linux/if_xdp.h')
  required: false, method: 'pkg-config')
 if bpf_ver_dep.found()
 ext_deps += bpf_dep
-bpf_shumem_ver_dep = dependency('libbpf', version : '>=0.2.0',
-required: false, method: 'pkg-config')
-if bpf_shumem_ver_dep.found()
-cflags += ['-DRTE_NET_AF_XDP_SHARED_UMEM']
-endif
 else
 build = false
 reason = 'missing dependency, "libxdp ' + libxdp_ver + '" or 
"libbpf <= v0.6.0"'
@@ -56,3 +50,18 @@ else
 build = false
 reason = 'missing header, "linux/if_xdp.h"'
 endif
+
+if build
+  xsk_check_prefix = '''
+#ifdef RTE_NET_AF_XDP_LIBXDP
+#include 
+#else
+#include 
+#endif
+  '''
+
+  if cc.has_function('xsk_socket__create_shared', prefix : xsk_check_prefix,
+ dependencies : ext_deps)
+  cflags += ['-DRTE_NET_AF_XDP_SHARED_UMEM']
+  endif
+endif
-- 
2.30.2



[PATCH v3 2/6] net/af_xdp: make it clear which libxdp version is required

2022-10-05 Thread Andrew Rybchenko
Include checked libxdp version in driver build skip reason.

Signed-off-by: Andrew Rybchenko 
---
 drivers/net/af_xdp/meson.build | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/af_xdp/meson.build b/drivers/net/af_xdp/meson.build
index 882d0b9518..fa011c357d 100644
--- a/drivers/net/af_xdp/meson.build
+++ b/drivers/net/af_xdp/meson.build
@@ -9,7 +9,8 @@ endif
 
 sources = files('rte_eth_af_xdp.c')
 
-xdp_dep = dependency('libxdp', version : '>=1.2.2', required: false, method: 
'pkg-config')
+libxdp_ver = '>=1.2.2'
+xdp_dep = dependency('libxdp', version : libxdp_ver, required: false, method: 
'pkg-config')
 bpf_dep = dependency('libbpf', required: false, method: 'pkg-config')
 if not bpf_dep.found()
 bpf_dep = cc.find_library('bpf', required: false)
@@ -45,11 +46,11 @@ if cc.has_header('linux/if_xdp.h')
 endif
 else
 build = false
-reason = 'missing dependency, "libxdp" or "libbpf <= v0.6.0"'
+reason = 'missing dependency, "libxdp ' + libxdp_ver + '" or 
"libbpf <= v0.6.0"'
 endif
 else
 build = false
-reason = 'missing dependency, "libxdp" and "libbpf"'
+reason = 'missing dependency, "libxdp ' + libxdp_ver + '" and "libbpf"'
 endif
 else
 build = false
-- 
2.30.2



[PATCH v3 4/6] net/af_xdp: avoid version-based check for program load mech

2022-10-05 Thread Andrew Rybchenko
Version-based checks are bad. It is better to check for required
functions. Check for bpf_object__next_program() in this case since
it appears last in libbpf among functions used to load program
without bpf_prog_load() which is deprecated in libbpf v0.7.0.

Signed-off-by: Andrew Rybchenko 
---
 drivers/net/af_xdp/meson.build | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/af_xdp/meson.build b/drivers/net/af_xdp/meson.build
index a01a67c7e7..9d5ffab96b 100644
--- a/drivers/net/af_xdp/meson.build
+++ b/drivers/net/af_xdp/meson.build
@@ -22,11 +22,6 @@ if cc.has_header('linux/if_xdp.h')
 ext_deps += xdp_dep
 if bpf_dep.found() and cc.has_header('bpf/bpf.h')
 ext_deps += bpf_dep
-bpf_ver_dep = dependency('libbpf', version : '>=0.7.0',
- required: false, method: 'pkg-config')
-if bpf_ver_dep.found()
-cflags += ['-DRTE_NET_AF_XDP_LIBBPF_OBJ_OPEN']
-endif
 else
 build = false
 reason = 'missing dependency, libbpf'
@@ -64,4 +59,9 @@ if build
  dependencies : ext_deps)
   cflags += ['-DRTE_NET_AF_XDP_SHARED_UMEM']
   endif
+  if cc.has_function('bpf_object__next_program',
+ prefix : '#include ',
+ dependencies : bpf_dep)
+  cflags += ['-DRTE_NET_AF_XDP_LIBBPF_OBJ_OPEN']
+  endif
 endif
-- 
2.30.2



[PATCH v3 5/6] net/af_xdp: log errors on XDP program removal failures

2022-10-05 Thread Andrew Rybchenko
Make it visible in logs if something goes wrong on XDP program
removal failure.

Signed-off-by: Andrew Rybchenko 
---
 drivers/net/af_xdp/rte_eth_af_xdp.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index 9957de2314..f7c2321a18 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -866,18 +866,24 @@ eth_stats_reset(struct rte_eth_dev *dev)
return 0;
 }
 
-static void
+static int
 remove_xdp_program(struct pmd_internals *internals)
 {
uint32_t curr_prog_id = 0;
+   int ret;
 
-   if (bpf_get_link_xdp_id(internals->if_index, &curr_prog_id,
-   XDP_FLAGS_UPDATE_IF_NOEXIST)) {
+   ret = bpf_get_link_xdp_id(internals->if_index, &curr_prog_id,
+ XDP_FLAGS_UPDATE_IF_NOEXIST);
+   if (ret != 0) {
AF_XDP_LOG(ERR, "bpf_get_link_xdp_id failed\n");
-   return;
+   return ret;
}
-   bpf_set_link_xdp_fd(internals->if_index, -1,
-   XDP_FLAGS_UPDATE_IF_NOEXIST);
+
+   ret = bpf_set_link_xdp_fd(internals->if_index, -1,
+ XDP_FLAGS_UPDATE_IF_NOEXIST);
+   if (ret != 0)
+   AF_XDP_LOG(ERR, "bpf_set_link_xdp_fd failed\n");
+   return ret;
 }
 
 static void
@@ -932,7 +938,8 @@ eth_dev_close(struct rte_eth_dev *dev)
 */
dev->data->mac_addrs = NULL;
 
-   remove_xdp_program(internals);
+   if (remove_xdp_program(internals) != 0)
+   AF_XDP_LOG(ERR, "Error while removing XDP program.\n");
 
if (internals->shared_umem) {
struct internal_list *list;
-- 
2.30.2



[PATCH v3 6/6] net/af_xdp: make compatible with libbpf v0.8.0

2022-10-05 Thread Andrew Rybchenko
From: Ciara Loftus 

libbpf v0.8.0 deprecates the bpf_get_link_xdp_id() and
bpf_set_link_xdp_fd() functions. Use meson to detect if
bpf_xdp_attach() is available and if so, use the recommended
replacement functions bpf_xdp_query_id(), bpf_xdp_attach()
and bpf_xdp_detach().

Signed-off-by: Ciara Loftus 
Signed-off-by: Andrew Rybchenko 
---
 doc/guides/nics/af_xdp.rst |  3 +-
 doc/guides/rel_notes/release_22_11.rst |  4 +++
 drivers/net/af_xdp/meson.build |  5 
 drivers/net/af_xdp/rte_eth_af_xdp.c| 38 +-
 4 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/doc/guides/nics/af_xdp.rst b/doc/guides/nics/af_xdp.rst
index d42e0f1f79..ca23940e22 100644
--- a/doc/guides/nics/af_xdp.rst
+++ b/doc/guides/nics/af_xdp.rst
@@ -45,7 +45,8 @@ Prerequisites
 This is a Linux-specific PMD, thus the following prerequisites apply:
 
 *  A Linux Kernel (version > v4.18) with XDP sockets configuration enabled;
-*  Both libxdp >=v1.2.2 and libbpf libraries installed, or, libbpf <=v0.6.0
+*  Both libxdp >=v1.2.2 and libbpf <=v0.8.0 libraries installed, or, libbpf
+   <=v0.6.0.
 *  If using libxdp, it requires an environment variable called
LIBXDP_OBJECT_PATH to be set to the location of where libxdp placed its bpf
object files. This is usually in /usr/local/lib/bpf or /usr/local/lib64/bpf.
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 7e03388306..c1f18aecc9 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -78,6 +78,10 @@ New Features
   Added new rte_flow action which allows application to re-route packets
   directly to the kernel without software involvement.
 
+* **Updated AF_XDP PMD.**
+
+  * Made compatible with libbpf v0.8.0 (when used with libxdp).
+
 * **Updated Intel iavf driver.**
 
   * Added flow subscription support.
diff --git a/drivers/net/af_xdp/meson.build b/drivers/net/af_xdp/meson.build
index 9d5ffab96b..858047989e 100644
--- a/drivers/net/af_xdp/meson.build
+++ b/drivers/net/af_xdp/meson.build
@@ -64,4 +64,9 @@ if build
  dependencies : bpf_dep)
   cflags += ['-DRTE_NET_AF_XDP_LIBBPF_OBJ_OPEN']
   endif
+  if cc.has_function('bpf_xdp_attach',
+ prefix : '#include ',
+ dependencies : bpf_dep)
+  cflags += ['-DRTE_NET_AF_XDP_LIBBPF_XDP_ATTACH']
+  endif
 endif
diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c 
b/drivers/net/af_xdp/rte_eth_af_xdp.c
index f7c2321a18..b6ec9bf490 100644
--- a/drivers/net/af_xdp/rte_eth_af_xdp.c
+++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
@@ -866,6 +866,40 @@ eth_stats_reset(struct rte_eth_dev *dev)
return 0;
 }
 
+#ifdef RTE_NET_AF_XDP_LIBBPF_XDP_ATTACH
+
+static int link_xdp_prog_with_dev(int ifindex, int fd, __u32 flags)
+{
+   return bpf_xdp_attach(ifindex, fd, flags, NULL);
+}
+
+static int
+remove_xdp_program(struct pmd_internals *internals)
+{
+   uint32_t curr_prog_id = 0;
+   int ret;
+
+   ret = bpf_xdp_query_id(internals->if_index, XDP_FLAGS_UPDATE_IF_NOEXIST,
+  &curr_prog_id);
+   if (ret != 0) {
+   AF_XDP_LOG(ERR, "bpf_xdp_query_id failed\n");
+   return ret;
+   }
+
+   ret = bpf_xdp_detach(internals->if_index, XDP_FLAGS_UPDATE_IF_NOEXIST,
+NULL);
+   if (ret != 0)
+   AF_XDP_LOG(ERR, "bpf_xdp_detach failed\n");
+   return ret;
+}
+
+#else
+
+static int link_xdp_prog_with_dev(int ifindex, int fd, __u32 flags)
+{
+   return bpf_set_link_xdp_fd(ifindex, fd, flags);
+}
+
 static int
 remove_xdp_program(struct pmd_internals *internals)
 {
@@ -886,6 +920,8 @@ remove_xdp_program(struct pmd_internals *internals)
return ret;
 }
 
+#endif
+
 static void
 xdp_umem_destroy(struct xsk_umem_info *umem)
 {
@@ -1205,7 +1241,7 @@ load_custom_xdp_prog(const char *prog_path, int if_index, 
struct bpf_map **map)
}
 
/* Link the program with the given network device */
-   ret = bpf_set_link_xdp_fd(if_index, prog_fd,
+   ret = link_xdp_prog_with_dev(if_index, prog_fd,
XDP_FLAGS_UPDATE_IF_NOEXIST);
if (ret) {
AF_XDP_LOG(ERR, "Failed to set prog fd %d on interface\n",
-- 
2.30.2



Re: [PATCH v2] net/af_xdp: improve documentation

2022-10-05 Thread Andrew Rybchenko

On 9/6/22 15:06, Zhang, Qi Z wrote:




-Original Message-
From: Koikkara Reeny, Shibin 
Sent: Friday, July 22, 2022 4:51 PM
To: dev@dpdk.org
Cc: Loftus, Ciara ; Zhang, Qi Z 
Subject: [PATCH v2] net/af_xdp: improve documentation

From: Ciara Loftus 

Instead of a one-liner describing each vdev argument, add a description and
example for each. Move the information describing preferred busy polling
from the "Limitations" section to the "Options" section where it is better
placed. Also make general grammar improvements.

Signed-off-by: Ciara Loftus 


Reviewed by: Qi Zhang 



Reviewed-by: Qi Zhang 

Applied to dpdk-next-net/main, thanks.


Re: [PATCH v3] mempool: fix get objects from mempool with cache

2022-10-05 Thread Andrew Rybchenko

I'm sorry, below duplicate is sent my mistake.

On 10/5/22 12:50, Andrew Rybchenko wrote:

From: Morten Brørup 

A flush threshold for the mempool cache was introduced in DPDK version
1.3, but rte_mempool_do_generic_get() was not completely updated back
then, and some inefficiencies were introduced.

Fix the following in rte_mempool_do_generic_get():

1. The code that initially screens the cache request was not updated
with the change in DPDK version 1.3.
The initial screening compared the request length to the cache size,
which was correct before, but became irrelevant with the introduction of
the flush threshold. E.g. the cache can hold up to flushthresh objects,
which is more than its size, so some requests were not served from the
cache, even though they could be.
The initial screening has now been corrected to match the initial
screening in rte_mempool_do_generic_put(), which verifies that a cache
is present, and that the length of the request does not overflow the
memory allocated for the cache.

This bug caused a major performance degradation in scenarios where the
application burst length is the same as the cache size. In such cases,
the objects were not ever fetched from the mempool cache, regardless if
they could have been.
This scenario occurs e.g. if an application has configured a mempool
with a size matching the application's burst size.

2. The function is a helper for rte_mempool_generic_get(), so it must
behave according to the description of that function.
Specifically, objects must first be returned from the cache,
subsequently from the ring.
After the change in DPDK version 1.3, this was not the behavior when
the request was partially satisfied from the cache; instead, the objects
from the ring were returned ahead of the objects from the cache.
This bug degraded application performance on CPUs with a small L1 cache,
which benefit from having the hot objects first in the returned array.
(This is probably also the reason why the function returns the objects
in reverse order, which it still does.)
Now, all code paths first return objects from the cache, subsequently
from the ring.

The function was not behaving as described (by the function using it)
and expected by applications using it. This in itself is also a bug.

3. If the cache could not be backfilled, the function would attempt
to get all the requested objects from the ring (instead of only the
number of requested objects minus the objects available in the ring),
and the function would fail if that failed.
Now, the first part of the request is always satisfied from the cache,
and if the subsequent backfilling of the cache from the ring fails, only
the remaining requested objects are retrieved from the ring.

The function would fail despite there are enough objects in the cache
plus the common pool.

4. The code flow for satisfying the request from the cache was slightly
inefficient:
The likely code path where the objects are simply served from the cache
was treated as unlikely. Now it is treated as likely.

Signed-off-by: Morten Brørup 
Signed-off-by: Andrew Rybchenko 
---
v3 changes (Andrew Rybchenko)
  - Always get first objects from the cache even if request is bigger
than cache size. Remove one corresponding condition from the path
when request is fully served from cache.
  - Simplify code to avoid duplication:
 - Get objects directly from backend in single place only.
 - Share code which gets from the cache first regardless if
   everythihg is obtained from the cache or just the first part.
  - Rollback cache length in unlikely failure branch to avoid cache
vs NULL check in success branch.

v2 changes
- Do not modify description of return value. This belongs in a separate
doc fix.
- Elaborate even more on which bugs the modifications fix.

  lib/mempool/rte_mempool.h | 74 +--
  1 file changed, 48 insertions(+), 26 deletions(-)





Re: [PATCH 0/3] cleanup bsf and fls inline function return types

2022-10-05 Thread Thomas Monjalon
09/08/2022 10:26, Morten Brørup:
> > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com]
> > Sent: Monday, 8 August 2022 23.21
> > 
> > The cleanup resulted from request to review [1] the following
> > functions where there appeared to be inconsistency in return type
> > or parameter type selections for the following inline functions.
> > 
> > Since the original request some instances have been fixed so this
> > series completes the outstanding return value type updates.
> > 
> > [1] http://mails.dpdk.org/archives/dev/2021-March/201590.html
> > 
> > Tyler Retzlaff (3):
> >   doc: announce cleanup of rte_{bsf, fls} inline functions type use
> >   eal: change rte_fls and rte_bsf to return uint32_t
> >   test: fix sign compare warning for rte_bsf64 return type change
> > 
> >  app/test/test_mbuf.c | 2 +-
> >  doc/guides/rel_notes/deprecation.rst | 6 ++
> >  lib/eal/include/rte_common.h | 6 +++---
> >  3 files changed, 10 insertions(+), 4 deletions(-)
> 
> Series-Acked-by: Morten Brørup 

Applied and completed.
Release notes added.
All squashed.






Re: [PATCH v2 0/6] Service cores performance and statistics improvements

2022-10-05 Thread Mattias Rönnblom
On 2022-10-05 11:49, Morten Brørup wrote:
>> From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
>> Sent: Wednesday, 5 October 2022 11.16
>>
>> This series contains performance improvements and new statistics-
>> related functionality for the EAL service cores framework.
>>
>> A new per-lcore TSC cycle counter is introduced, which reflect the
>> total amount of cycles spent by that lcore running services. This may
>> be used to estimate service lcore load.
>>
>> The patchset introduces a backward-compatible convention, where a DPDK
>> service may signal to the framework that no useful work was performed,
>> which in turn is used to make the busy cycles statistics more
>> accurate.
>>
>> Depends-on: series-23959 ("test/service: add perf measurements for with
>> stats mode ")
>>
>> Mattias Rönnblom (6):
>>service: reduce statistics overhead for parallel services
>>service: introduce per-lcore cycles counter
>>service: reduce average case service core overhead
>>service: tweak cycle statistics semantics
>>event/sw: report idle when no work is performed
>>service: provide links to functions in documentation
>>
>>   app/test/test_service_cores.c   |   2 +-
>>   drivers/event/sw/sw_evdev.c |   3 +-
>>   drivers/event/sw/sw_evdev.h |   2 +-
>>   drivers/event/sw/sw_evdev_scheduler.c   |   6 +-
>>   lib/eal/common/rte_service.c| 228 +---
>>   lib/eal/include/rte_service.h   |  32 ++--
>>   lib/eal/include/rte_service_component.h |   5 +
>>   7 files changed, 192 insertions(+), 86 deletions(-)
>>
>> --
>> 2.34.1
>>
> 
> In case it wasn't noticed with v1 of the patch series...
> 
> Series-Acked-by: Morten Brørup 
> 

It was noticed, but then forgotten. Just like I forgot to add the
Acked-by from Harry van Haaren.



Re: [PATCH v5] examples/vm_power_manager: use safe version of list iterator

2022-10-05 Thread Thomas Monjalon
05/10/2022 00:09, Reshma Pattan:
> From: Hamza Khan 
> 
> Currently, when vm_power_manager exits, we are using a LIST_FOREACH
> macro to iterate over VM info structures while freeing them. This
> leads to use-after-free error. To address this, replace all usages of
> LIST_* with TAILQ_* macros, and use the RTE_TAILQ_FOREACH_SAFE macro
> to iterate and delete VM info structures.
> 
> Fixes: e8ae9b662506 ("examples/vm_power: channel manager and monitor in host")
> Cc: alan.ca...@intel.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Hamza Khan 
> Signed-off-by: Reshma Pattan 
> Acked-by: David Hunt 
> ---
>  examples/vm_power_manager/channel_manager.c | 20 +++-
>  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/examples/vm_power_manager/channel_manager.c 
> b/examples/vm_power_manager/channel_manager.c
> index 838465ab4b..cb872ad2d5 100644
> --- a/examples/vm_power_manager/channel_manager.c
> +++ b/examples/vm_power_manager/channel_manager.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  
> @@ -30,6 +31,7 @@
>  #include "power_manager.h"
>  
>  
> +
>  #define RTE_LOGTYPE_CHANNEL_MANAGER RTE_LOGTYPE_USER1

I suppose we don't need a third blank line here.
Removing.

[...]
> - LIST_FOREACH(vm_info, &vm_list_head, vms_info) {
> + RTE_TAILQ_FOREACH_SAFE(vm_info, &vm_list_head, vms_info, tmp) {

Applied, thanks.




[Bug 1094] Add FPGA bus cleanup

2022-10-05 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1094

Bug ID: 1094
   Summary: Add FPGA bus cleanup
   Product: DPDK
   Version: unspecified
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: other
  Assignee: dev@dpdk.org
  Reporter: kevin.la...@intel.com
  Target Milestone: future

Bus cleanup is now done as part of EAL cleanup [1].

A new cleanup function should be implemented and registered to the bus struct
to ensure proper cleanup is done for devices on the FPGA bus on shutdown.

[1]
https://github.com/DPDK/dpdk/commit/1cab1a40ea9b858821aaf4655486e31ca1b52456

-- 
You are receiving this mail because:
You are the assignee for the bug.

RE: [PATCH v7 1/3] power: add uncore frequency control API to the power library

2022-10-05 Thread Kearney, Tadhg
> -Original Message-
> From: Thomas Monjalon 
> Sent: Tuesday 4 October 2022 18:09
> To: Kearney, Tadhg 
> Cc: dev@dpdk.org; Hunt, David ; Burakov, Anatoly
> ; Pattan, Reshma 
> Subject: Re: [PATCH v7 1/3] power: add uncore frequency control API to the
> power library
> 
> 28/09/2022 15:30, Tadhg Kearney:
> > Add API to allow uncore frequency adjustment. This is done through
> > manipulating related uncore frequency control sysfs entries to adjust
> > the minimum and maximum uncore frequency values.
> > Nine API's are being added that are all public and experimental.
> 
> You cannot introduce an API without explaining what it is about.
> Maybe I'm an idiot but I don't know what is "uncore".
> I see it is explained in the documentation, but few words in the commit
> message would not be too much.
> At least you must say it for Linux on Intel, and which feature it is 
> controlling.
> 
> > +Uncore API
> > +--
> > +
> > +Abstract
> > +
> > +
> > +Uncore is a term used by Intel to describe the functions of a
> > +microprocessor that are not in the core, but which must be closely
> > +connected to the core to achieve high performance;
> > +L3 cache, on-die memory controller, etc.
> > +Significant power savings can be achieved by reducing the uncore
> frequency to its lowest value.
> 
> So this is an Intel thing.

Yes, so far the uncore kernel implementation covers Intel hardware.

> 
> > +
> > +The Linux kernel provides the driver “intel-uncore-frequency" to
> > +control the uncore frequency limits for x86 platform. The driver is
> available from kernel version 5.6 and above.
> > +Also CONFIG_INTEL_UNCORE_FREQ_CONTROL will need to be enabled in
> the kernel, which was added in 5.6.
> > +This manipulates the contest of MSR 0x620, which sets min/max of the
> uncore for the SKU.
> 
> It is correctly named "intel-uncore" in the Linux kernel.
> Why not having "Intel" in the DPDK feature name?
> 
> > +
> > +
> > +API Overview for Uncore
> > +~~~
> 
> A blank line is missing here.
> 
> > +* **Uncore Power Init**: Initialise uncore power, populate frequency
> > +array and record
> > +  original min & max for pkg & die.
> > +
> > +* **Uncore Power Exit**: Exit uncore power, restoring original min & max
> for pkg & die.
> > +
> > +* **Get Uncore Power Freq**: Get current uncore freq index for pkg &
> die.
> > +
> > +* **Set Uncore Power Freq**: Set min & max uncore freq index for pkg &
> die (min and max will be the same).
> > +
> > +* **Uncore Power Max**: Set max uncore freq index for pkg & die.
> > +
> > +* **Uncore Power Min**: Set min uncore freq index for pkg & die.
> > +
> > +* **Get Num Freqs**: Get the number of frequencies in the index array.
> > +
> > +* **Get Num Pkgs**: Get the number of packages (CPUs) on the system.
> > +
> > +* **Get Num Dies**: Get the number of die's on a given package.
> 
> Not sure what you are listing here. Are they functions?
> If you really want to keep a list, I suggest using a definition list 
> available in RST
> syntax.
> If you want to provide an explanation easy to read, full sentences connecting
> things together would be better.

Agreed.

> 
> > +
> >  References
> >  --
> >
> > diff --git a/doc/guides/rel_notes/release_22_11.rst
> > b/doc/guides/rel_notes/release_22_11.rst
> > index cb7677fd3c..5d3f815b54 100644
> > --- a/doc/guides/rel_notes/release_22_11.rst
> > +++ b/doc/guides/rel_notes/release_22_11.rst
> > @@ -75,6 +75,11 @@ New Features
> >* Added ``rte_event_eth_tx_adapter_instance_get`` to get Tx adapter
> >  instance ID for specified ethernet device ID and Tx queue index.
> >
> > +* **Added uncore frequency control API to the power library.**
> > +
> > +  Add api to allow uncore frequency adjustment. This is done through
> 
> s/api/API/
> 
> > +  manipulating related uncore frequency control sysfs entries to
> > + adjust the minimum and maximum uncore frequency values.
> 
> It is Linux-only for Intel hardware only.
> 
> > --- /dev/null
> > +++ b/lib/power/rte_power_uncore.c
> 
> I would add "intel" in the filename.
> 
> [...]
> > +#define UNCORE_FREQUENCY_DIR
> "/sys/devices/system/cpu/intel_uncore_frequency"
> > +#define POWER_GOVERNOR_PERF "performance"
> > +#define POWER_UNCORE_SYSFILE_MAX_FREQ \
> > +
>   "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_
> die_%02u/max_freq_khz"
> > +#define POWER_UNCORE_SYSFILE_MIN_FREQ  \
> > +
>   "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_
> die_%02u/min_freq_khz"
> > +#define POWER_UNCORE_SYSFILE_BASE_MAX_FREQ \
> > +
>   "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_
> die_%02u/initial_max_freq_khz"
> > +#define POWER_UNCORE_SYSFILE_BASE_MIN_FREQ  \
> > +
>   "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_
> die_%02u/initial_min_freq_khz"
> 
> It is for Intel CPU only, right?

Currently only Intel CPUs are covered by these sysfs entries, but it is 
possible that other platforms will be i

Re: [PATCH v9] eal: add bus cleanup to eal cleanup

2022-10-05 Thread Kevin Laatz

Hi Thomas, All,

On 05/10/2022 10:41, Thomas Monjalon wrote:

05/10/2022 09:45, David Marchand:

On Tue, Oct 4, 2022 at 6:47 PM Kevin Laatz  wrote:

This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
init/exit more symmetrical, ensuring all bus devices are cleaned up
appropriately without the application needing to be aware of all bus types
that may have been probed during initialization.

Contained in this patch are the changes required to perform cleanup for
devices on the PCI bus and VDEV bus during eal_cleanup(). There would be an
ask for bus maintainers to add the relevant cleanup for their buses since
they have the domain expertise.

Cc: maintainers for info.

Kevin,
When you don't go to the end of a task, you must ask help to complete it.
Here you assume others will do it,
but you don't even Cc the relevant maintainers.

Please could you track that all buses will get the implementation?
You should open a bugzilla ticket for each bus,
and assign it to the relevant maintainer.
Thanks


Bugzilla tickets created:

[auxiliary] https://bugs.dpdk.org/show_bug.cgi?id=1090
[dpaa] https://bugs.dpdk.org/show_bug.cgi?id=1091
[fslmc] https://bugs.dpdk.org/show_bug.cgi?id=1092
[vmbus] https://bugs.dpdk.org/show_bug.cgi?id=1093
[ifpga] https://bugs.dpdk.org/show_bug.cgi?id=1094

I've assigned to maintainers that have a registered account with Bugzilla.

I was not able to assign the iFPGA ticket to Rosen Xu since his email is 
not registered.


--
Kevin


[PATCH] doc: clarify deprecation status for flow actions PF and VF

2022-10-05 Thread Ivan Malov
These actions have been deprecated since DPDK 21.11 as
ambiguous and hard-to-use, but their removal might not
be popular because net drivers i40e, ixgbe and txgbe
employ these actions in complicated "PF/VF + QUEUE"
tunnel rule support. Maintainers of these drivers
should voice their attitude to the said problem.

For now, document the status in deprecation notes.

Signed-off-by: Ivan Malov 
Reviewed-by: Andrew Rybchenko 
---
 doc/guides/rel_notes/deprecation.rst | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index c31b92a41f..12867dc3cf 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -83,8 +83,13 @@ Deprecation Notices
   will be limited to maximum 256 queues.
   Also compile time flag ``RTE_ETHDEV_QUEUE_STAT_CNTRS`` will be removed.
 
-* ethdev: Items and actions ``PF``, ``VF``, ``PHY_PORT``, ``PORT_ID`` are
-  deprecated as hard-to-use / ambiguous and will be removed in DPDK 22.11.
+* ethdev: Flow actions ``PF`` and ``VF`` have been deprecated since DPDK 21.11
+  and are yet to be removed. That still has not happened because there are net
+  drivers which support combined use of either action ``PF`` or action ``VF``
+  with action ``QUEUE``, namely, i40e, ixgbe and txgbe (L2 tunnel rule).
+  It is unclear whether it is acceptable to just drop support for
+  such a complex use case, so maintainers of the said drivers
+  should take a closer look at this and provide assistance.
 
 * ethdev: Actions ``OF_DEC_NW_TTL``, ``SET_IPV4_SRC``, ``SET_IPV4_DST``,
   ``SET_IPV6_SRC``, ``SET_IPV6_DST``, ``SET_TP_SRC``, ``SET_TP_DST``,
-- 
2.30.2



Re: [PATCH v10 00/13] preparation for the rte_flow offload of nfp PMD

2022-10-05 Thread Ferruh Yigit

On 9/26/2022 7:59 AM, Chaoyong He wrote:

This is the first patch series to add the support of rte_flow offload for
nfp PMD, includes:
Add the support of flower firmware application
Add the support of representor port
Add the flower service infrastructure
Add the cmsg interactive channels between pmd and fw

* Changes since v9
- Remove the use of rte_eth_tx_burst()
- Remove the logics rely on OvS

* Changes since v8
- Update the nfp.rst
- Fix the 'app_hw' to 'app_fw'
- Remove the ovs compatible header file
- Remove the use of 
rte_eth_dev_configure()/rte_eth_rx_burst()/rte_eth_dev_start() API

* Changes since v7
- Adjust the logics to make sure not break the pci probe process
- Change 'app' to 'app_fw' in all logics to avoid confuse
- Fix problem about log level

* Changes since v6
- Fix the compile error

* Changes since v5
- Compare integer with 0 explicitly
- Change helper macro to function
- Implement the dummy functions
- Remove some unnecessary logics

* Changes since v4
- Remove the unneeded '__rte_unused' attribute
- Fixup a potential memory leak problem

* Changes since v3
- Add the 'Depends-on' tag

* Changes since v2
- Remove the use of rte_panic()

* Changes since v1
- Fix the compile error

Depends-on: series-23707 ("Add support of NFP3800 chip and firmware with NFDk")

Chaoyong He (13):
   net/nfp: move app specific attributes to own struct
   net/nfp: simplify initialization and remove dead code
   net/nfp: move app specific init logic to own function
   net/nfp: add initial flower firmware support
   net/nfp: add flower PF setup logic
   net/nfp: add flower ctrl VNIC related logics
   net/nfp: move common rxtx function for flower use
   net/nfp: add flower ctrl VNIC rxtx logic
   net/nfp: add flower representor framework
   net/nfp: add flower PF related routines
   net/nfp: move rxtx function to header file
   net/nfp: add flower PF rxtx logic
   net/nfp: add the representor port rxtx logic



Series applied to dpdk-next-net/main, thanks.



Re: [PATCH v9] eal: add bus cleanup to eal cleanup

2022-10-05 Thread Thomas Monjalon
05/10/2022 13:03, Kevin Laatz:
> Hi Thomas, All,
> 
> On 05/10/2022 10:41, Thomas Monjalon wrote:
> > 05/10/2022 09:45, David Marchand:
> >> On Tue, Oct 4, 2022 at 6:47 PM Kevin Laatz  wrote:
> >>> This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
> >>> init/exit more symmetrical, ensuring all bus devices are cleaned up
> >>> appropriately without the application needing to be aware of all bus types
> >>> that may have been probed during initialization.
> >>>
> >>> Contained in this patch are the changes required to perform cleanup for
> >>> devices on the PCI bus and VDEV bus during eal_cleanup(). There would be 
> >>> an
> >>> ask for bus maintainers to add the relevant cleanup for their buses since
> >>> they have the domain expertise.
> >> Cc: maintainers for info.
> > Kevin,
> > When you don't go to the end of a task, you must ask help to complete it.
> > Here you assume others will do it,
> > but you don't even Cc the relevant maintainers.
> >
> > Please could you track that all buses will get the implementation?
> > You should open a bugzilla ticket for each bus,
> > and assign it to the relevant maintainer.
> > Thanks
> >
> Bugzilla tickets created:
> 
> [auxiliary] https://bugs.dpdk.org/show_bug.cgi?id=1090
> [dpaa] https://bugs.dpdk.org/show_bug.cgi?id=1091
> [fslmc] https://bugs.dpdk.org/show_bug.cgi?id=1092
> [vmbus] https://bugs.dpdk.org/show_bug.cgi?id=1093
> [ifpga] https://bugs.dpdk.org/show_bug.cgi?id=1094
> 
> I've assigned to maintainers that have a registered account with Bugzilla.
> 
> I was not able to assign the iFPGA ticket to Rosen Xu since his email is 
> not registered.

Thanks, please don't hesitate to ping again if there is no fast enough progress.




Re: [PATCH v7 1/3] power: add uncore frequency control API to the power library

2022-10-05 Thread Thomas Monjalon
05/10/2022 12:50, Kearney, Tadhg:
> > -Original Message-
> > From: Thomas Monjalon 
> > Sent: Tuesday 4 October 2022 18:09
> > To: Kearney, Tadhg 
> > Cc: dev@dpdk.org; Hunt, David ; Burakov, Anatoly
> > ; Pattan, Reshma 
> > Subject: Re: [PATCH v7 1/3] power: add uncore frequency control API to the
> > power library
> > 
> > 28/09/2022 15:30, Tadhg Kearney:
> > > Add API to allow uncore frequency adjustment. This is done through
> > > manipulating related uncore frequency control sysfs entries to adjust
> > > the minimum and maximum uncore frequency values.
> > > Nine API's are being added that are all public and experimental.
> > 
> > You cannot introduce an API without explaining what it is about.
> > Maybe I'm an idiot but I don't know what is "uncore".
> > I see it is explained in the documentation, but few words in the commit
> > message would not be too much.
> > At least you must say it for Linux on Intel, and which feature it is 
> > controlling.
> > 
> > > +Uncore API
> > > +--
> > > +
> > > +Abstract
> > > +
> > > +
> > > +Uncore is a term used by Intel to describe the functions of a
> > > +microprocessor that are not in the core, but which must be closely
> > > +connected to the core to achieve high performance;
> > > +L3 cache, on-die memory controller, etc.
> > > +Significant power savings can be achieved by reducing the uncore
> > frequency to its lowest value.
> > 
> > So this is an Intel thing.
> 
> Yes, so far the uncore kernel implementation covers Intel hardware.
> 
> > 
> > > +
> > > +The Linux kernel provides the driver “intel-uncore-frequency" to
> > > +control the uncore frequency limits for x86 platform. The driver is
> > available from kernel version 5.6 and above.
> > > +Also CONFIG_INTEL_UNCORE_FREQ_CONTROL will need to be enabled in
> > the kernel, which was added in 5.6.
> > > +This manipulates the contest of MSR 0x620, which sets min/max of the
> > uncore for the SKU.
> > 
> > It is correctly named "intel-uncore" in the Linux kernel.
> > Why not having "Intel" in the DPDK feature name?
> > 
> > > +
> > > +
> > > +API Overview for Uncore
> > > +~~~
> > 
> > A blank line is missing here.
> > 
> > > +* **Uncore Power Init**: Initialise uncore power, populate frequency
> > > +array and record
> > > +  original min & max for pkg & die.
> > > +
> > > +* **Uncore Power Exit**: Exit uncore power, restoring original min & max
> > for pkg & die.
> > > +
> > > +* **Get Uncore Power Freq**: Get current uncore freq index for pkg &
> > die.
> > > +
> > > +* **Set Uncore Power Freq**: Set min & max uncore freq index for pkg &
> > die (min and max will be the same).
> > > +
> > > +* **Uncore Power Max**: Set max uncore freq index for pkg & die.
> > > +
> > > +* **Uncore Power Min**: Set min uncore freq index for pkg & die.
> > > +
> > > +* **Get Num Freqs**: Get the number of frequencies in the index array.
> > > +
> > > +* **Get Num Pkgs**: Get the number of packages (CPUs) on the system.
> > > +
> > > +* **Get Num Dies**: Get the number of die's on a given package.
> > 
> > Not sure what you are listing here. Are they functions?
> > If you really want to keep a list, I suggest using a definition list 
> > available in RST
> > syntax.
> > If you want to provide an explanation easy to read, full sentences 
> > connecting
> > things together would be better.
> 
> Agreed.
> 
> > 
> > > +
> > >  References
> > >  --
> > >
> > > diff --git a/doc/guides/rel_notes/release_22_11.rst
> > > b/doc/guides/rel_notes/release_22_11.rst
> > > index cb7677fd3c..5d3f815b54 100644
> > > --- a/doc/guides/rel_notes/release_22_11.rst
> > > +++ b/doc/guides/rel_notes/release_22_11.rst
> > > @@ -75,6 +75,11 @@ New Features
> > >* Added ``rte_event_eth_tx_adapter_instance_get`` to get Tx adapter
> > >  instance ID for specified ethernet device ID and Tx queue index.
> > >
> > > +* **Added uncore frequency control API to the power library.**
> > > +
> > > +  Add api to allow uncore frequency adjustment. This is done through
> > 
> > s/api/API/
> > 
> > > +  manipulating related uncore frequency control sysfs entries to
> > > + adjust the minimum and maximum uncore frequency values.
> > 
> > It is Linux-only for Intel hardware only.
> > 
> > > --- /dev/null
> > > +++ b/lib/power/rte_power_uncore.c
> > 
> > I would add "intel" in the filename.
> > 
> > [...]
> > > +#define UNCORE_FREQUENCY_DIR
> > "/sys/devices/system/cpu/intel_uncore_frequency"
> > > +#define POWER_GOVERNOR_PERF "performance"
> > > +#define POWER_UNCORE_SYSFILE_MAX_FREQ \
> > > +
> > "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_
> > die_%02u/max_freq_khz"
> > > +#define POWER_UNCORE_SYSFILE_MIN_FREQ  \
> > > +
> > "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_
> > die_%02u/min_freq_khz"
> > > +#define POWER_UNCORE_SYSFILE_BASE_MAX_FREQ \
> > > +
> > "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_
> > die_%02u/initial_max_freq_khz"
> > 

Re: [PATCH] gro: fix the chain index in insert_new_item for more than 2 packets

2022-10-05 Thread Thomas Monjalon
> > When more than two packets are merged in a flow, and if we receive a 3rd
> > packet which is matching the sequence of the 2nd packet the prev_idx will
> > be 1 and not 2, hence resulting in packet re-ordering
> > 
> > Signed-off-by: Kumara Parameshwaran 
> > ---
> > V1:
> > Initial changes to fix packet reordering issue when
> > more than 2 items are chained in a flow.
> > Ex:
> > 3 mergeable TCP packets received in order.
> > packet_0 - no flow found so insert the packet and new start
> > index -> 0
> > packet_1-> flow found. prev_idx, curr_index = 0. So merge
> > works
> > find packet_0->packet_1
> > packet_2 flow found. prev_indx =0, curr_index = 1. Matching
> > dequence numbers found but chained as
> > packet_0->packet_2->packet_1
> > 
> >  lib/gro/gro_tcp4.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/lib/gro/gro_tcp4.c b/lib/gro/gro_tcp4.c index
> > 7498c66141..9758e28fd5 100644
> > --- a/lib/gro/gro_tcp4.c
> > +++ b/lib/gro/gro_tcp4.c
> > @@ -305,7 +305,7 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
> >  * length is greater than the max value. Store
> >  * the packet into the flow.
> >  */
> > -   if (insert_new_item(tbl, pkt, start_time, prev_idx,
> > +   if (insert_new_item(tbl, pkt, start_time, cur_idx,
> > sent_seq, ip_id, is_atomic)
> 
> Good catch.
> 
> Acked-by: Jiayu Hu 

Applied, thanks.





Re: [PATCH v2] doc: add removal note for power empty poll API

2022-10-05 Thread Thomas Monjalon
05/10/2022 10:44, Reshma Pattan:
> Add removal note for experimental empty poll API.
> 
> CC: David Hunt 
> 
> Signed-off-by: Reshma Pattan 
> Acked-by: David Hunt 
[...]
> --- a/doc/guides/prog_guide/power_man.rst
> +++ b/doc/guides/prog_guide/power_man.rst
> @@ -192,6 +192,12 @@ User Cases
>  --
>  The mechanism can applied to any device which is based on polling. e.g. NIC, 
> FPGA.
>  
> +Removal Note
> +
> +The experimental empty poll APIs will be removed from the library in a future
> +DPDK release.

It looks common in this file to forget the empty line after a title,
but please don't forget it.

I am surprised Sphinx is not complaining.




Re: [PATCH] examples/pipeline: fix memory free

2022-10-05 Thread Thomas Monjalon
27/09/2022 06:37, Harshad Narayane:
> Coverity issue: 380863
> Coverity issue: 380866
> Fixes: 6bc14d9f ("examples/pipeline: add command for shared library build")
> 
> Fix memory resource free for buffer allocation failure at pipeline
> library build.
> 
> Signed-off-by: Harshad Narayane 
> Signed-off-by: Kamalakannan R 
> Acked-by: Cristian Dumitrescu 

Reordered lines in the commit log and applied, thanks.
Note: this is not fixing "free", it was missing,
but it is fixing a leak.





Re: [PATCH] examples/pipeline: fix file close

2022-10-05 Thread Thomas Monjalon
27/09/2022 06:35, Harshad Narayane:
> Coverity issue: 380860
> Fixes: 9043f66a ("examples/pipeline: add command for code generation")
> 
> Fix file close at pipeline code generation.
> 
> Signed-off-by: Harshad Narayane 
> Signed-off-by: Kamalakannan R 
> Acked-by: Cristian Dumitrescu 
> ---
> --- a/examples/pipeline/cli.c
> +++ b/examples/pipeline/cli.c
> @@ -545,6 +545,7 @@ cmd_pipeline_codegen(char **tokens,
>   code_file = fopen(tokens[3], "w");
>   if (!code_file) {
>   snprintf(out, out_size, "Cannot open file %s.\n", tokens[3]);
> + fclose(spec_file);
>   return;
>   }


Reordered lines in the commit log and applied, thanks.
Note: this is not fixing "close", it was missing,
but it is fixing a leak.




Re: [PATCH v3 2/2] vhost: improve error handling in desc_to_mbuf

2022-10-05 Thread Maxime Coquelin




On 8/2/22 02:49, Claudio Fontana wrote:

check when increasing vec_idx that it is still valid
in the (buf_len < dev->vhost_hlen) case too.

Tested-by: Claudio Fontana 
Signed-off-by: Claudio Fontana 
---
  lib/vhost/virtio_net.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index eb19e54c2b..20ed951979 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -2704,12 +2704,15 @@ desc_to_mbuf(struct virtio_net *dev, struct 
vhost_virtqueue *vq,
if (unlikely(buf_len < dev->vhost_hlen)) {
buf_offset = dev->vhost_hlen - buf_len;
vec_idx++;
+   if (unlikely(vec_idx >= nr_vec))
+   goto error;
buf_addr = buf_vec[vec_idx].buf_addr;
buf_iova = buf_vec[vec_idx].buf_iova;
buf_len = buf_vec[vec_idx].buf_len;
buf_avail  = buf_len - buf_offset;
} else if (buf_len == dev->vhost_hlen) {
-   if (unlikely(++vec_idx >= nr_vec))
+   vec_idx++;
+   if (unlikely(vec_idx >= nr_vec))
goto error;
buf_addr = buf_vec[vec_idx].buf_addr;
buf_iova = buf_vec[vec_idx].buf_iova;


This patch is no more required since fixes for CVE-2022-2132 takes care
of this:
dc1516e260a0 ("vhost: fix header spanned across more than two descriptors")
71bd0cc536ad ("vhost: discard too small descriptor chains")

Maxime



Re: [PATCH v7 2/3] timer: fix function to stop all timers

2022-10-05 Thread Thomas Monjalon
> > On Wed, Sep 14, 2022 at 9:03 PM Naga Harish K S V
> >  wrote:
> > >
> > > There is a possibility of deadlock in this API, as same spinlock is
> > > tried to be acquired in nested manner.
> > >
> > > If the lcore that is stopping the timer is different from the lcore
> > > that owns the timer, the timer list lock is acquired in timer_del(),
> > > even if local_is_locked is true. Because the same lock was already
> > > acquired in rte_timer_stop_all(), the thread will hang.
> > >
> > > This patch removes the acquisition of nested lock.
> > >
> > > Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a
> > > list")
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Naga Harish K S V 
> Acked-by: Erik Gabriel Carrillo 
> 
> Added missing ack

Applied, thanks.





Re: [PATCH v3 2/2] service: fix potential stats race-condition on MT services

2022-10-05 Thread David Marchand
On Mon, Jul 11, 2022 at 3:18 PM Harry van Haaren
 wrote:
>
> This commit fixes a potential racey-add that could occur if
> multiple service-lcores were executing the same MT-safe service
> at the same time, with service statistics collection enabled.
>
> Because multiple threads can run and execute the service, the
> stats values can have multiple writer threads, resulting in the
> requirement of using atomic addition for correctness.
>
> Note that when a MT unsafe service is executed, a spinlock is
> held, so the stats increments are protected. This fact is used
> to avoid executing atomic add instructions when not required.
> Regular reads and increments are used, and only the store is
> specified as atomic, reducing perf impact on e.g. x86 arch.
>
> This patch causes a 1.25x increase in cycle-cost for polling a
> MT safe service when statistics are enabled. No change was seen
> for MT unsafe services, or when statistics are disabled.

Fixes: 21698354c832 ("service: introduce service cores concept")

I did not mark for backport since the commitlog indicates a performance impact.
You can still ask for backport by pinging LTS maintainers.

>
> Reported-by: Mattias Rönnblom 
> Suggested-by: Honnappa Nagarahalli 
> Suggested-by: Morten Brørup 
> Suggested-by: Bruce Richardson 
> Signed-off-by: Harry van Haaren 


Series applied, thanks.

-- 
David Marchand



Re: [PATCH v2 0/6] Service cores performance and statistics improvements

2022-10-05 Thread David Marchand
On Wed, Oct 5, 2022 at 11:20 AM Mattias Rönnblom
 wrote:
>
> This series contains performance improvements and new statistics-
> related functionality for the EAL service cores framework.
>
> A new per-lcore TSC cycle counter is introduced, which reflect the
> total amount of cycles spent by that lcore running services. This may
> be used to estimate service lcore load.
>
> The patchset introduces a backward-compatible convention, where a DPDK
> service may signal to the framework that no useful work was performed,
> which in turn is used to make the busy cycles statistics more
> accurate.
>
> Depends-on: series-23959 ("test/service: add perf measurements for with stats 
> mode ")
>
> Mattias Rönnblom (6):
>   service: reduce statistics overhead for parallel services
>   service: introduce per-lcore cycles counter
>   service: reduce average case service core overhead
>   service: tweak cycle statistics semantics
>   event/sw: report idle when no work is performed
>   service: provide links to functions in documentation
>
>  app/test/test_service_cores.c   |   2 +-
>  drivers/event/sw/sw_evdev.c |   3 +-
>  drivers/event/sw/sw_evdev.h |   2 +-
>  drivers/event/sw/sw_evdev_scheduler.c   |   6 +-
>  lib/eal/common/rte_service.c| 228 +---
>  lib/eal/include/rte_service.h   |  32 ++--
>  lib/eal/include/rte_service_component.h |   5 +
>  7 files changed, 192 insertions(+), 86 deletions(-)

Added acks.
Series applied, thanks.


-- 
David Marchand



Re: [PATCH v7 0/4] Add lcore poll busyness telemetry

2022-10-05 Thread Kevin Laatz

On 14/09/2022 10:29, Kevin Laatz wrote:

Currently, there is no way to measure lcore polling busyness in a passive
way, without any modifications to the application. This patchset adds a new
EAL API that will be able to passively track core polling busyness. As part
of the set, new telemetry endpoints are added to read the generate metrics.

---
v7:
   * Rename funcs, vars, files to include "poll" where missing.

v6:
   * Add API and perf unit tests

v5:
   * Fix Windows build
   * Make lcore_telemetry_free() an internal interface
   * Minor cleanup

v4:
   * Fix doc build
   * Rename timestamp macro to RTE_LCORE_POLL_BUSYNESS_TIMESTAMP
   * Make enable/disable read and write atomic
   * Change rte_lcore_poll_busyness_enabled_set() param to bool
   * Move mem alloc from enable/disable to init/cleanup
   * Other minor fixes

v3:
   * Fix missing renaming to poll busyness
   * Fix clang compilation
   * Fix arm compilation

v2:
   * Use rte_get_tsc_hz() to adjust the telemetry period
   * Rename to reflect polling busyness vs general busyness
   * Fix segfault when calling telemetry timestamp from an unregistered
 non-EAL thread.
   * Minor cleanup

Anatoly Burakov (2):
   eal: add lcore poll busyness telemetry
   eal: add cpuset lcore telemetry entries

Kevin Laatz (2):
   app/test: add unit tests for lcore poll busyness
   doc: add howto guide for lcore poll busyness

  app/test/meson.build  |   4 +
  app/test/test_lcore_poll_busyness_api.c   | 134 +++
  app/test/test_lcore_poll_busyness_perf.c  |  72 
  config/meson.build|   1 +
  config/rte_config.h   |   1 +
  doc/guides/howto/index.rst|   1 +
  doc/guides/howto/lcore_poll_busyness.rst  |  93 +
  lib/bbdev/rte_bbdev.h |  17 +-
  lib/compressdev/rte_compressdev.c |   2 +
  lib/cryptodev/rte_cryptodev.h |   2 +
  lib/distributor/rte_distributor.c |  21 +-
  lib/distributor/rte_distributor_single.c  |  14 +-
  lib/dmadev/rte_dmadev.h   |  15 +-
  .../common/eal_common_lcore_poll_telemetry.c  | 350 ++
  lib/eal/common/meson.build|   1 +
  lib/eal/freebsd/eal.c |   1 +
  lib/eal/include/rte_lcore.h   |  85 -
  lib/eal/linux/eal.c   |   1 +
  lib/eal/meson.build   |   3 +
  lib/eal/version.map   |   7 +
  lib/ethdev/rte_ethdev.h   |   2 +
  lib/eventdev/rte_eventdev.h   |  10 +-
  lib/rawdev/rte_rawdev.c   |   6 +-
  lib/regexdev/rte_regexdev.h   |   5 +-
  lib/ring/rte_ring_elem_pvt.h  |   1 +
  meson_options.txt |   2 +
  26 files changed, 826 insertions(+), 25 deletions(-)
  create mode 100644 app/test/test_lcore_poll_busyness_api.c
  create mode 100644 app/test/test_lcore_poll_busyness_perf.c
  create mode 100644 doc/guides/howto/lcore_poll_busyness.rst
  create mode 100644 lib/eal/common/eal_common_lcore_poll_telemetry.c


Based on the feedback in the discussions on this patchset, we have 
decided to revoke the submission of this patchset for the 22.11 release.


We will re-evaluate the design with the aim to provide a more acceptable 
solution in a future release.


---
Kevin




Re: [PATCH 06/15] net/dpaa: support ESP packet type in packet parsing

2022-10-05 Thread Ferruh Yigit

On 9/28/2022 6:25 AM, Gagandeep Singh wrote:

Add support of ESP packet type in packet receive path.

Signed-off-by: Gagandeep Singh 
---
  drivers/net/dpaa/dpaa_rxtx.c | 10 ++
  drivers/net/dpaa/dpaa_rxtx.h |  6 ++
  2 files changed, 16 insertions(+)

diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index 20b75efb63..22205cec30 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -178,6 +178,16 @@ static inline void dpaa_eth_packet_info(struct rte_mbuf 
*m, void *fd_virt_addr)
m->packet_type = RTE_PTYPE_L2_ETHER |
RTE_PTYPE_L3_IPV6 | RTE_PTYPE_L4_UDP;
break;
+   case DPAA_PKT_TYPE_IPSEC_IPV4:
+   if (*((uintptr_t *)&annot->parse) & DPAA_PARSE_ESP_MASK)
+   m->packet_type = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV4 | RTE_PTYPE_TUNNEL_ESP;
+   break;
+   case DPAA_PKT_TYPE_IPSEC_IPV6:
+   if (*((uintptr_t *)&annot->parse) & DPAA_PARSE_ESP_MASK)
+   m->packet_type = RTE_PTYPE_L2_ETHER |
+   RTE_PTYPE_L3_IPV6 | RTE_PTYPE_TUNNEL_ESP;
+   break;
case DPAA_PKT_TYPE_IPV4_EXT_UDP:
m->packet_type = RTE_PTYPE_L2_ETHER |
RTE_PTYPE_L3_IPV4_EXT | RTE_PTYPE_L4_UDP;


Shouldn't 'dpaa_supported_ptypes_get()' needs to be updated to notify 
host on the capability to report ESP types.



diff --git a/drivers/net/dpaa/dpaa_rxtx.h b/drivers/net/dpaa/dpaa_rxtx.h
index 99e09196e9..b2d7c0f2a3 100644
--- a/drivers/net/dpaa/dpaa_rxtx.h
+++ b/drivers/net/dpaa/dpaa_rxtx.h
@@ -47,6 +47,7 @@
   *L4R 0xE0 -
   *0x20 - TCP
   *0x40 - UDP
+ * 0x60 - IPsec
   *0x80 - SCTP
   *L3R 0xEDC4 (in Big Endian) -
   *0x8000 - IPv4
@@ -63,6 +64,7 @@
   */
  #define DPAA_PARSE_MASK   0x00F044EF0080
  #define DPAA_PARSE_VLAN_MASK  0x0070
+#define DPAA_PARSE_ESP_MASK0x0008
  
  /* Parsed values (Little Endian) */

  #define DPAA_PKT_TYPE_NONE0x
@@ -137,6 +139,10 @@
(0x0020 | DPAA_PKT_TYPE_TUNNEL_4_6)
  #define DPAA_PKT_TYPE_TUNNEL_6_4_TCP \
(0x0020 | DPAA_PKT_TYPE_TUNNEL_6_4)
+#define DPAA_PKT_TYPE_IPSEC_IPV4 \
+   (0x0060 | DPAA_PKT_TYPE_IPV4)
+#define DPAA_PKT_TYPE_IPSEC_IPV6 \
+   (0x0060 | DPAA_PKT_TYPE_IPV6)
  
  /* Checksum Errors */

  #define DPAA_PKT_IP_CSUM_ERR  0x4002




Re: [PATCH 07/15] net/dpaa2: use internal mempool for SG table

2022-10-05 Thread Ferruh Yigit

On 9/28/2022 6:25 AM, Gagandeep Singh wrote:

Creating and using driver's mempool for
allocating the SG table memory required for
FD creation instead of relying on user mempool.



As far as I can see this is in the Tx path, can you please explain why 
driver need an internal pktmbuf pool?


And shouldn't mempool needs to be freed on driver close and release.

Same comment applies for dpaa version of the patch (12/15).


Signed-off-by: Gagandeep Singh 
---
  drivers/net/dpaa2/dpaa2_ethdev.c | 14 ++
  drivers/net/dpaa2/dpaa2_ethdev.h |  9 +
  drivers/net/dpaa2/dpaa2_rxtx.c   | 13 ++---
  3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 37a8b43114..c7aae70300 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -78,6 +78,8 @@ bool dpaa2_enable_err_queue;
  #define MAX_NB_RX_DESC11264
  int total_nb_rx_desc;
  
+struct rte_mempool *dpaa2_tx_sg_pool;

+
  struct rte_dpaa2_xstats_name_off {
char name[RTE_ETH_XSTATS_NAME_SIZE];
uint8_t page_id; /* dpni statistics page id */
@@ -2907,6 +2909,18 @@ rte_dpaa2_probe(struct rte_dpaa2_driver *dpaa2_drv,
/* Invoke PMD device initialization function */
diag = dpaa2_dev_init(eth_dev);
if (diag == 0) {
+   if (!dpaa2_tx_sg_pool) {
+   dpaa2_tx_sg_pool =
+   rte_pktmbuf_pool_create("dpaa2_mbuf_tx_sg_pool",
+   DPAA2_POOL_SIZE,
+   DPAA2_POOL_CACHE_SIZE, 0,
+   DPAA2_MAX_SGS * sizeof(struct qbman_sge),
+   rte_socket_id());
+   if (dpaa2_tx_sg_pool == NULL) {
+   DPAA2_PMD_ERR("SG pool creation failed\n");
+   return -ENOMEM;
+   }
+   }
rte_eth_dev_probing_finish(eth_dev);
return 0;
}
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index 32ae762e4a..872dced517 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -121,6 +121,15 @@
  #define DPAA2_PKT_TYPE_VLAN_1 0x0160
  #define DPAA2_PKT_TYPE_VLAN_2 0x0260
  
+/* Global pool used by driver for SG list TX */

+extern struct rte_mempool *dpaa2_tx_sg_pool;
+/* Maximum SG segments */
+#define DPAA2_MAX_SGS 128
+/* SG pool size */
+#define DPAA2_POOL_SIZE 2048
+/* SG pool cache size */
+#define DPAA2_POOL_CACHE_SIZE 256
+
  /* enable timestamp in mbuf*/
  extern bool dpaa2_enable_ts[];
  extern uint64_t dpaa2_timestamp_rx_dynflag;
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index bc0e49b0d4..dcd86c4056 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -403,7 +403,7 @@ eth_fd_to_mbuf(const struct qbman_fd *fd,
  static int __rte_noinline __rte_hot
  eth_mbuf_to_sg_fd(struct rte_mbuf *mbuf,
  struct qbman_fd *fd,
- struct rte_mempool *mp, uint16_t bpid)
+ uint16_t bpid)
  {
struct rte_mbuf *cur_seg = mbuf, *prev_seg, *mi, *temp;
struct qbman_sge *sgt, *sge = NULL;
@@ -433,12 +433,12 @@ eth_mbuf_to_sg_fd(struct rte_mbuf *mbuf,
}
DPAA2_SET_FD_OFFSET(fd, offset);
} else {
-   temp = rte_pktmbuf_alloc(mp);
+   temp = rte_pktmbuf_alloc(dpaa2_tx_sg_pool);
if (temp == NULL) {
DPAA2_PMD_DP_DEBUG("No memory to allocate S/G table\n");
return -ENOMEM;
}
-   DPAA2_SET_ONLY_FD_BPID(fd, bpid);
+   DPAA2_SET_ONLY_FD_BPID(fd, mempool_to_bpid(dpaa2_tx_sg_pool));
DPAA2_SET_FD_OFFSET(fd, temp->data_off);
  #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
rte_mempool_check_cookies(rte_mempool_from_obj((void *)temp),
@@ -1321,9 +1321,10 @@ dpaa2_dev_tx(void *queue, struct rte_mbuf **bufs, 
uint16_t nb_pkts)
  
  			if (unlikely(RTE_MBUF_HAS_EXTBUF(*bufs))) {

if (unlikely((*bufs)->nb_segs > 1)) {
+   mp = (*bufs)->pool;
if (eth_mbuf_to_sg_fd(*bufs,
  &fd_arr[loop],
- mp, 0))
+ 
mempool_to_bpid(mp)))
goto send_n_return;
} else {
eth_mbuf_to_fd(*bufs,
@@ -1372,7 +1373,7 @@ dpaa2_dev_tx(void *queue, struct rte_mbuf **bufs, 
uint16_t nb_pkts)
if (unlikely((*bufs)->nb_segs > 1)) {
  

Re: [PATCH 15/15] net/dpaa: fix buffer free in slow path

2022-10-05 Thread Ferruh Yigit

On 9/28/2022 6:25 AM, Gagandeep Singh wrote:

Adding a check in slow path to free those buffers
which are not external.



Can you please explain what was the error before fix, what was happening 
when you try to free all mbufs?


Also it seems previous logic was different, with 'prev_seg' etc, can you 
explain what/why changed there?



Fixes: 9124e65dd3eb ("net/dpaa: enable Tx queue taildrop")
Cc: sta...@dpdk.org

Signed-off-by: Gagandeep Singh 
---
  drivers/net/dpaa/dpaa_rxtx.c | 23 ---
  1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_rxtx.c b/drivers/net/dpaa/dpaa_rxtx.c
index 4d285b4f38..ce4f3d6c85 100644
--- a/drivers/net/dpaa/dpaa_rxtx.c
+++ b/drivers/net/dpaa/dpaa_rxtx.c
@@ -455,7 +455,7 @@ dpaa_free_mbuf(const struct qm_fd *fd)
bp_info = DPAA_BPID_TO_POOL_INFO(fd->bpid);
format = (fd->opaque & DPAA_FD_FORMAT_MASK) >> DPAA_FD_FORMAT_SHIFT;
if (unlikely(format == qm_fd_sg)) {
-   struct rte_mbuf *first_seg, *prev_seg, *cur_seg, *temp;
+   struct rte_mbuf *first_seg, *cur_seg;
struct qm_sg_entry *sgt, *sg_temp;
void *vaddr, *sg_vaddr;
int i = 0;
@@ -469,32 +469,25 @@ dpaa_free_mbuf(const struct qm_fd *fd)
sgt = vaddr + fd_offset;
sg_temp = &sgt[i++];
hw_sg_to_cpu(sg_temp);
-   temp = (struct rte_mbuf *)
-   ((char *)vaddr - bp_info->meta_data_size);
sg_vaddr = DPAA_MEMPOOL_PTOV(bp_info,
qm_sg_entry_get64(sg_temp));
-
first_seg = (struct rte_mbuf *)((char *)sg_vaddr -
bp_info->meta_data_size);
first_seg->nb_segs = 1;
-   prev_seg = first_seg;
while (i < DPAA_SGT_MAX_ENTRIES) {
sg_temp = &sgt[i++];
hw_sg_to_cpu(sg_temp);
-   sg_vaddr = DPAA_MEMPOOL_PTOV(bp_info,
+   if (sg_temp->bpid != 0xFF) {
+   bp_info = DPAA_BPID_TO_POOL_INFO(sg_temp->bpid);
+   sg_vaddr = DPAA_MEMPOOL_PTOV(bp_info,
qm_sg_entry_get64(sg_temp));
-   cur_seg = (struct rte_mbuf *)((char *)sg_vaddr -
+   cur_seg = (struct rte_mbuf *)((char *)sg_vaddr -
  bp_info->meta_data_size);
-   first_seg->nb_segs += 1;
-   prev_seg->next = cur_seg;
-   if (sg_temp->final) {
-   cur_seg->next = NULL;
-   break;
+   rte_pktmbuf_free_seg(cur_seg);
}
-   prev_seg = cur_seg;
+   if (sg_temp->final)
+   break;
}
-
-   rte_pktmbuf_free_seg(temp);
rte_pktmbuf_free_seg(first_seg);
return 0;
}




Re: [PATCH 11/15] bus/dpaa: pass interface name as a string instead of pointer

2022-10-05 Thread Ferruh Yigit

On 9/28/2022 6:25 AM, Gagandeep Singh wrote:

From: Rohit Raj 

Due to change in latest kernel, passing the interface name to
kernel through IOCTL as string instead of character pointer.



This kernel component is the one that is delivered part of SDK I assume, 
instead of an upstreamed one.


What is the way for user match kernel code and DPDK driver, like is 
there any matching version information documented?



Signed-off-by: Rohit Raj 
---
  drivers/bus/dpaa/base/qbman/process.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/bus/dpaa/base/qbman/process.c 
b/drivers/bus/dpaa/base/qbman/process.c
index 9bc92681cd..3504ec97db 100644
--- a/drivers/bus/dpaa/base/qbman/process.c
+++ b/drivers/bus/dpaa/base/qbman/process.c
@@ -302,7 +302,7 @@ int bman_free_raw_portal(struct dpaa_raw_portal *portal)
_IOW(DPAA_IOCTL_MAGIC, 0x0E, struct usdpaa_ioctl_link_status)
  
  #define DPAA_IOCTL_DISABLE_LINK_STATUS_INTERRUPT \

-   _IOW(DPAA_IOCTL_MAGIC, 0x0F, char*)
+   _IOW(DPAA_IOCTL_MAGIC, 0x0F, char[IF_NAME_MAX_LEN])
  
  int dpaa_intr_enable(char *if_name, int efd)

  {
@@ -330,7 +330,7 @@ int dpaa_intr_disable(char *if_name)
if (ret)
return ret;
  
-	ret = ioctl(fd, DPAA_IOCTL_DISABLE_LINK_STATUS_INTERRUPT, &if_name);

+   ret = ioctl(fd, DPAA_IOCTL_DISABLE_LINK_STATUS_INTERRUPT, if_name);
if (ret) {
if (errno == EINVAL)
printf("Failed to disable interrupt: Not Supported\n");
@@ -472,7 +472,7 @@ int dpaa_update_link_speed(char *if_name, int link_speed, 
int link_duplex)
  }
  
  #define DPAA_IOCTL_RESTART_LINK_AUTONEG \

-   _IOW(DPAA_IOCTL_MAGIC, 0x13, char *)
+   _IOW(DPAA_IOCTL_MAGIC, 0x13, char[IF_NAME_MAX_LEN])
  
  int dpaa_restart_link_autoneg(char *if_name)

  {
@@ -481,7 +481,7 @@ int dpaa_restart_link_autoneg(char *if_name)
if (ret)
return ret;
  
-	ret = ioctl(fd, DPAA_IOCTL_RESTART_LINK_AUTONEG, &if_name);

+   ret = ioctl(fd, DPAA_IOCTL_RESTART_LINK_AUTONEG, if_name);
if (ret) {
if (errno == EINVAL)
printf("Failed to restart autoneg: Not Supported\n");




Re: [PATCH 00/15] DPAA and DPAA2 driver changes

2022-10-05 Thread Ferruh Yigit

On 9/28/2022 6:25 AM, Gagandeep Singh wrote:

This series have list of patch for bug fixes and
some enhancements to DPAA1 and DPAA2 net drivers.

Apeksha Gupta (2):
   net/enetfec: fix restart issue
   net/enetfec: fix buffer leak issue

Gagandeep Singh (7):
   net/dpaa: support ESP packet type in packet parsing
   net/dpaa2: use internal mempool for SG table
   net/dpaa2: fix buffer free on transmit SG packets
   net/dpaa: use internal mempool for SG table
   bus/dpaa: mempool ops registration change
   net/dpaa: fix buffer free on transmit SG packets
   net/dpaa: fix buffer free in slow path

Rohit Raj (3):
   bus/fslmc: add timeout in MC send command API
   net/dpaa: fix Jumbo packet Rx in case of VSP
   bus/dpaa: pass interface name as a string instead of pointer

Vanshika Shukla (2):
   bus/dpaa: use non-block mode for FD open
   net/dpaa2: fix dpdmux configuration for error behaviour

brick (1):
   net/dpaa2: check free enqueue descriptors before Tx



Hi Gagandeep,

I put some comments/questions to some patches, can you please check?

Also patches requires ack/review tag from their maintainer to proceed.



Re: [PATCH 05/15] net/dpaa2: check free enqueue descriptors before Tx

2022-10-05 Thread Ferruh Yigit

On 9/28/2022 6:25 AM, Gagandeep Singh wrote:

From: brick 

Check if there exists free enqueue descriptors before enqueuing Tx
packet. Also try to free enqueue descriptors in case they are not
free.

Fixes: ed1cdbed6a15 ("net/dpaa2: support multiple Tx queues enqueue for 
ordered")
Cc: sta...@dpdk.org

Signed-off-by: brick 


Can you please use name tag as "Name Surname ", like
 Signed-off-by: Brick Yang 

<...>


+   DPAA2_PMD_DP_DEBUG("===> eth_data =%p, fqid =%d\n",
+  eth_data, dpaa2_q[loop]->fqid);
+
+   /*Check if the queue is congested*/


syntax, more common to put space before/after '/* ' & ' */'


+   retry_count = 0;
+   while (qbman_result_SCN_state(dpaa2_q[loop]->cscn)) {
+   retry_count++;
+   /* Retry for some time before giving up */
+   if (retry_count > CONG_RETRY_COUNT)
+   goto send_frames;
+   }
+
+   /*Prepare enqueue descriptor*/


ditto


[PATCH 1/2] kni: flag deprecated status at build time

2022-10-05 Thread Bruce Richardson
To ensure all users are aware of KNI's deprecated status at build time
we can take the following actions:

1. disable the library by default. It can be re-enabled by setting
   disabled_libs to the empty string (or other string not including
   'kni')

2. add support for a list of deprecated libs to the lib/meson.build file
   so the error message for KNI being disabled includes the fact that it
   is a deprecated library.

3. for the dependent NIC driver, drivers/net/kni, modify its disabled
   message to also reference the fact that KNI is disabled.

NOTE: This is part of the deprecation process for KNI agreed by the DPDK
technical board.[1]

[1] http://mails.dpdk.org/archives/dev/2022-June/243596.html

Signed-off-by: Bruce Richardson 
---
 doc/guides/prog_guide/kernel_nic_interface.rst | 4 
 drivers/net/kni/meson.build| 5 +
 lib/meson.build| 7 +++
 meson_options.txt  | 2 +-
 4 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst 
b/doc/guides/prog_guide/kernel_nic_interface.rst
index 03b5bca958..6a564f61ca 100644
--- a/doc/guides/prog_guide/kernel_nic_interface.rst
+++ b/doc/guides/prog_guide/kernel_nic_interface.rst
@@ -14,6 +14,10 @@ Kernel NIC Interface
For an alternative to KNI, that does not require any out-of-tree Linux 
kernel modules,
or a custom library, see :ref:`virtio_user_as_exception_path`.
 
+.. note::
+
+   KNI is disabled by default in the DPDK build.
+   To re-enable the library, remove 'kni' from the "disable_libs" meson option 
when configuring a build.
 
 The DPDK Kernel NIC Interface (KNI) allows userspace applications access to 
the Linux* control plane.
 
diff --git a/drivers/net/kni/meson.build b/drivers/net/kni/meson.build
index 2acc989694..2deb2c4166 100644
--- a/drivers/net/kni/meson.build
+++ b/drivers/net/kni/meson.build
@@ -6,6 +6,11 @@ if is_windows
 reason = 'not supported on Windows'
 subdir_done()
 endif
+if not dpdk_conf.has('kni')
+build = false
+reason = 'needs deprecated lib, "kni"'
+subdir_done()
+endif
 
 deps += 'kni'
 sources = files('rte_eth_kni.c')
diff --git a/lib/meson.build b/lib/meson.build
index c648f7d800..264a1671df 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -85,6 +85,7 @@ optional_libs = [
 'vhost',
 ]
 
+deprecated_libs = ['kni']
 disabled_libs = []
 opt_disabled_libs = run_command(list_dir_globs, get_option('disable_libs'),
 check: true).stdout().split()
@@ -133,7 +134,13 @@ foreach l:libraries
 if disabled_libs.contains(l)
 build = false
 reason = 'explicitly disabled via build config'
+if deprecated_libs.contains(l)
+reason += ' (deprecated lib)'
+endif
 else
+if deprecated_libs.contains(l)
+warning('Enabling deprecated library, "@0@"'.format(l))
+endif
 subdir(l)
 endif
 if name != l
diff --git a/meson_options.txt b/meson_options.txt
index 8640f599ae..39af78787c 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -8,7 +8,7 @@ option('developer_mode', type: 'feature', description:
'turn on additional build checks relevant for DPDK developers')
 option('disable_drivers', type: 'string', value: '', description:
'Comma-separated list of drivers to explicitly disable.')
-option('disable_libs', type: 'string', value: '', description:
+option('disable_libs', type: 'string', value: 'kni', description:
'Comma-separated list of libraries to explicitly disable. [NOTE: not 
all libs can be disabled]')
 option('drivers_install_subdir', type: 'string', value: 'dpdk/pmds-', 
description:
'Subdirectory of libdir where to install PMDs. Defaults to using a 
versioned subdirectory.')
-- 
2.34.1



[PATCH 2/2] kni: add deprecation warning at runtime

2022-10-05 Thread Bruce Richardson
When KNI is being used at runtime, output a warning message about its
deprecated status. This is part of the deprecation process for KNI
agreed by the DPDK technical board.[1]

[1] http://mails.dpdk.org/archives/dev/2022-June/243596.html

Signed-off-by: Bruce Richardson 
---
 doc/guides/rel_notes/deprecation.rst | 6 ++
 lib/kni/rte_kni.c| 2 ++
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 6c2fc15c77..8d99ce5f2f 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -44,10 +44,8 @@ Deprecation Notices
   applications - other technologies such as virtio-user are recommended 
instead.
   Following the DPDK technical board
   `decision `_
-  and `refinement `_:
-
-  * Some deprecation warnings will be added in DPDK 22.11.
-  * The KNI kernel module, library and PMD will be removed from the DPDK 23.11.
+  and `refinement `_,
+  the KNI kernel module, library and PMD will be removed from the DPDK 23.11 
release.
 
 * lib: will fix extending some enum/define breaking the ABI. There are multiple
   samples in DPDK that enum/define terminated with a ``.*MAX.*`` value which is
diff --git a/lib/kni/rte_kni.c b/lib/kni/rte_kni.c
index 7971c56bb4..eb7c10ff19 100644
--- a/lib/kni/rte_kni.c
+++ b/lib/kni/rte_kni.c
@@ -96,6 +96,8 @@ static volatile int kni_fd = -1;
 int
 rte_kni_init(unsigned int max_kni_ifaces __rte_unused)
 {
+   RTE_LOG(WARNING, KNI, "WARNING: KNI is deprecated and will be removed 
in DPDK 23.11\n");
+
 #if LINUX_VERSION_CODE < KERNEL_VERSION(4, 10, 0)
if (rte_eal_iova_mode() != RTE_IOVA_PA) {
RTE_LOG(ERR, KNI, "KNI requires IOVA as PA\n");
-- 
2.34.1



RE: [PATCH 1/2] kni: flag deprecated status at build time

2022-10-05 Thread Morten Brørup
> From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> Sent: Wednesday, 5 October 2022 16.35
> 
> To ensure all users are aware of KNI's deprecated status at build time
> we can take the following actions:
> 
> 1. disable the library by default. It can be re-enabled by setting
>disabled_libs to the empty string (or other string not including
>'kni')
> 
> 2. add support for a list of deprecated libs to the lib/meson.build
> file
>so the error message for KNI being disabled includes the fact that
> it
>is a deprecated library.
> 
> 3. for the dependent NIC driver, drivers/net/kni, modify its disabled
>message to also reference the fact that KNI is disabled.
> 
> NOTE: This is part of the deprecation process for KNI agreed by the
> DPDK
> technical board.[1]
> 
> [1] http://mails.dpdk.org/archives/dev/2022-June/243596.html
> 
> Signed-off-by: Bruce Richardson 
> ---

Probably not required, but cannot hurt either:

Series-acked-by: Morten Brørup 



[PATCH] vhost: fix compilation issue in async path

2022-10-05 Thread Maxime Coquelin
This patch fixes a compilation issue met with GCC on
Loongarch64:

In function ‘mbuf_to_desc’,
inlined from ‘vhost_enqueue_async_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1822:6,
inlined from ‘virtio_dev_rx_async_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1836:6,
inlined from ‘virtio_dev_rx_async_submit_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1895:7:
../../../dpdk/lib/vhost/virtio_net.c:1159:18: error: ‘buf_vec[0].buf_addr’ may 
be used uninitialized [-Werror=maybe-uninitialized]
 1159 | buf_addr = buf_vec[vec_idx].buf_addr;
  | ~^~~
../../../dpdk/lib/vhost/virtio_net.c: In function 
‘virtio_dev_rx_async_submit_packed’:
../../../dpdk/lib/vhost/virtio_net.c:1834:27: note: ‘buf_vec’ declared here
 1834 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
  |   ^~~

It happens because the compiler assumes that 'size'
variable in vhost_enqueue_async_packed could wrap to 0 since
'size' is uint32_t and pkt->pkt_len too.

In practice, it would never happen since 'pkt->pkt_len' is
unlikely to be close to UINT32_MAX, but let's just change
'size' to uint64_t to make the compiler happy without
having to add runtime checks.

Fixes: 873e8dad6f49 ("vhost: support packed ring in async datapath")
Cc: sta...@dpdk.org

Signed-off-by: Maxime Coquelin 
---
 lib/vhost/virtio_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 8f4d0f0502..b86fb26040 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -1780,7 +1780,7 @@ vhost_enqueue_async_packed(struct virtio_net *dev,
uint16_t buf_id = 0;
uint32_t len = 0;
uint16_t desc_count = 0;
-   uint32_t size = pkt->pkt_len + sizeof(struct virtio_net_hdr_mrg_rxbuf);
+   uint64_t size = pkt->pkt_len + sizeof(struct virtio_net_hdr_mrg_rxbuf);
uint32_t buffer_len[vq->size];
uint16_t buffer_buf_id[vq->size];
uint16_t buffer_desc_count[vq->size];
-- 
2.37.3



Re: [PATCH v2] vhost: fix build

2022-10-05 Thread Maxime Coquelin

Hi Min,

On 9/26/22 05:25, zhoumin wrote:

Hi Chenbo,


On Mon, 26 Sep 2022, 10:57, Xia, Chenbo wrote:

Hi Min,


-Original Message-
From: Min Zhou 
Sent: Monday, August 29, 2022 4:29 PM
To: david.march...@redhat.com; maxime.coque...@redhat.com; Xia, Chenbo
; zhou...@loongson.cn
Cc: dev@dpdk.org; maob...@loongson.cn
Subject: [PATCH v2] vhost: fix build

On CentOS 8 or Debian 10.4 systems using gcc 12.1 to cross
compile DPDK, gcc shows a following warning which will cause
build to fail when build is run with -werror:

In function 'mbuf_to_desc',
 inlined from 'vhost_enqueue_async_packed'
at ../lib/vhost/virtio_net.c:1826:6,
 inlined from 'virtio_dev_rx_async_packed'
at ../lib/vhost/virtio_net.c:1840:6,
 inlined from 'virtio_dev_rx_async_submit_packed.constprop'
at ../lib/vhost/virtio_net.c:1900:7:
../lib/vhost/virtio_net.c:1161:35: error: 'buf_vec[0].buf_len' may be 
used

uninitialized [-Werror=maybe-uninitialized]
  1161 | buf_len = buf_vec[vec_idx].buf_len;
   |   ^~~~
../lib/vhost/virtio_net.c: In function
'virtio_dev_rx_async_submit_packed.constprop':
../lib/vhost/virtio_net.c:1838:27: note: 'buf_vec' declared here
  1838 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
   |   ^~~
cc1: all warnings being treated as errors

Actually, there are eight places to see the same codes in the file
lib/vhost/virtio_net.c, and all these `buf_vec` arraies are
initialized by sub-function calls under various conditions.

Although It's hard to understand why gcc just emits warning at one
of the eight places, adding validity checks for array length is
reasonable and can also fix the warning.

Signed-off-by: David Marchand 
Signed-off-by: Min Zhou 
---
  lib/vhost/virtio_net.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)
Just want you to know that your patch is still pending because by 
accident
your fix is almost the same as a previous patch that fixes a real 
issue but

that patch is still in progress:

http://patchwork.dpdk.org/project/dpdk/patch/20220802004938.23670-2-cfont...@suse.de/

Thanks,
Chenbo


Thanks for your helpful reply.
I think I can drop this patch if the patch you mentioned above could be 
accepted.


I just sent another patch that fixes your build issue without having to
add runtime checks:

http://patches.dpdk.org/project/dpdk/patch/20221005144859.70717-1-maxime.coque...@redhat.com/

Thanks,
Maxime




Re: [PATCH] vhost: fix compilation issue in async path

2022-10-05 Thread David Marchand
On Wed, Oct 5, 2022 at 4:49 PM Maxime Coquelin
 wrote:
>
> This patch fixes a compilation issue met with GCC on

GCC 12 (it is worth detailing, since I and Thomas hit the same issue
on some other cross compiling toolchains using GCC 12).

> Loongarch64:

LoongArch64

>
> In function ‘mbuf_to_desc’,
> inlined from ‘vhost_enqueue_async_packed’ at 
> ../../../dpdk/lib/vhost/virtio_net.c:1822:6,
> inlined from ‘virtio_dev_rx_async_packed’ at 
> ../../../dpdk/lib/vhost/virtio_net.c:1836:6,
> inlined from ‘virtio_dev_rx_async_submit_packed’ at 
> ../../../dpdk/lib/vhost/virtio_net.c:1895:7:
> ../../../dpdk/lib/vhost/virtio_net.c:1159:18: error: ‘buf_vec[0].buf_addr’ 
> may be used uninitialized [-Werror=maybe-uninitialized]
>  1159 | buf_addr = buf_vec[vec_idx].buf_addr;
>   | ~^~~
> ../../../dpdk/lib/vhost/virtio_net.c: In function 
> ‘virtio_dev_rx_async_submit_packed’:
> ../../../dpdk/lib/vhost/virtio_net.c:1834:27: note: ‘buf_vec’ declared here
>  1834 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
>   |   ^~~
>
> It happens because the compiler assumes that 'size'
> variable in vhost_enqueue_async_packed could wrap to 0 since
> 'size' is uint32_t and pkt->pkt_len too.
>
> In practice, it would never happen since 'pkt->pkt_len' is
> unlikely to be close to UINT32_MAX, but let's just change
> 'size' to uint64_t to make the compiler happy without
> having to add runtime checks.
>
> Fixes: 873e8dad6f49 ("vhost: support packed ring in async datapath")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Maxime Coquelin 
Reviewed-by: David Marchand 


-- 
David Marchand



Re: [PATCH v3 1/2] vhost: check for nr_vec == 0 in desc_to_mbuf, mbuf_to_desc

2022-10-05 Thread Maxime Coquelin




On 8/2/22 02:49, Claudio Fontana wrote:

in virtio_dev_split we cannot currently call desc_to_mbuf with
nr_vec == 0, or we end up trying to rte_memcpy from a source address
buf_vec[0] that is an uninitialized stack variable.

Improve this in general by having desc_to_mbuf and mbuf_to_desc
return -1 when called with an invalid nr_vec == 0, which should
fix any other instance of this problem.

This should fix errors that have been reported in multiple occasions
from telcos to the DPDK, OVS and QEMU projects, as this affects in
particular the openvswitch/DPDK, QEMU vhost-user setup when the
guest DPDK application abruptly goes away via SIGKILL and then
reconnects.

The back trace looks roughly like this, depending on the specific
rte_memcpy selected, etc, in any case the "src" parameter is garbage
(in this example containing 0 + dev->host_hlen(12 = 0xc)).

Thread 153 "pmd-c88/id:150" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f64e5e6b700 (LWP 141373)]
rte_mov128blocks (n=2048, src=0xc ,
  dst=0x150da4480) at ../lib/eal/x86/include/rte_memcpy.h:384
(gdb) bt
0  rte_mov128blocks (n=2048, src=0xc, dst=0x150da4480)
1  rte_memcpy_generic (n=2048, src=0xc, dst=0x150da4480)
2  rte_memcpy (n=2048, src=0xc, dst=)
3  sync_fill_seg
4  desc_to_mbuf
5  virtio_dev_tx_split
6  virtio_dev_tx_split_legacy
7  0x7f676fea0fef in rte_vhost_dequeue_burst
8  0x7f6772005a62 in netdev_dpdk_vhost_rxq_recv
9  0x7f6771f38116 in netdev_rxq_recv
10 0x7f6771f03d96 in dp_netdev_process_rxq_port
11 0x7f6771f04239 in pmd_thread_main
12 0x7f6771f92aff in ovsthread_wrapper
13 0x7f6771c1b6ea in start_thread
14 0x7f6771933a8f in clone

Tested-by: Claudio Fontana 
Signed-off-by: Claudio Fontana 
---
  lib/vhost/virtio_net.c | 11 ---
  1 file changed, 8 insertions(+), 3 deletions(-)


This patch is also no more necessary since CVE-2022-2132 has been fixed.
Latests LTS versions and upstream main branch contain the fixes:

dc1516e260a0 ("vhost: fix header spanned across more than two descriptors")
71bd0cc536ad ("vhost: discard too small descriptor chains")

desc_to_mbuf callers now check that the descriptors are at least the
size of the virtio_net header, so nr_vec cannot be 0 in desc_to_mbuf.

Since I cannot reproduce, if you are willing to try please let us know
the results.

Maxime



Re: [PATCH] vhost: fix compilation issue in async path

2022-10-05 Thread Maxime Coquelin




On 10/5/22 16:56, David Marchand wrote:

On Wed, Oct 5, 2022 at 4:49 PM Maxime Coquelin
 wrote:


This patch fixes a compilation issue met with GCC on


GCC 12 (it is worth detailing, since I and Thomas hit the same issue
on some other cross compiling toolchains using GCC 12).


Loongarch64:


LoongArch64



Oops, I did the changes you asked me off-list, but forgot to format
the patch again!

Posting v2.



In function ‘mbuf_to_desc’,
 inlined from ‘vhost_enqueue_async_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1822:6,
 inlined from ‘virtio_dev_rx_async_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1836:6,
 inlined from ‘virtio_dev_rx_async_submit_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1895:7:
../../../dpdk/lib/vhost/virtio_net.c:1159:18: error: ‘buf_vec[0].buf_addr’ may 
be used uninitialized [-Werror=maybe-uninitialized]
  1159 | buf_addr = buf_vec[vec_idx].buf_addr;
   | ~^~~
../../../dpdk/lib/vhost/virtio_net.c: In function 
‘virtio_dev_rx_async_submit_packed’:
../../../dpdk/lib/vhost/virtio_net.c:1834:27: note: ‘buf_vec’ declared here
  1834 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
   |   ^~~

It happens because the compiler assumes that 'size'
variable in vhost_enqueue_async_packed could wrap to 0 since
'size' is uint32_t and pkt->pkt_len too.

In practice, it would never happen since 'pkt->pkt_len' is
unlikely to be close to UINT32_MAX, but let's just change
'size' to uint64_t to make the compiler happy without
having to add runtime checks.

Fixes: 873e8dad6f49 ("vhost: support packed ring in async datapath")
Cc: sta...@dpdk.org

Signed-off-by: Maxime Coquelin 

Reviewed-by: David Marchand 






[PATCH v2] vhost: fix compilation issue in async path

2022-10-05 Thread Maxime Coquelin
This patch fixes a compilation issue met with GCC 12 on
LoongArch64:

In function ‘mbuf_to_desc’,
inlined from ‘vhost_enqueue_async_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1822:6,
inlined from ‘virtio_dev_rx_async_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1836:6,
inlined from ‘virtio_dev_rx_async_submit_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1895:7:
../../../dpdk/lib/vhost/virtio_net.c:1159:18: error: ‘buf_vec[0].buf_addr’ may 
be used uninitialized [-Werror=maybe-uninitialized]
 1159 | buf_addr = buf_vec[vec_idx].buf_addr;
  | ~^~~
../../../dpdk/lib/vhost/virtio_net.c: In function 
‘virtio_dev_rx_async_submit_packed’:
../../../dpdk/lib/vhost/virtio_net.c:1834:27: note: ‘buf_vec’ declared here
 1834 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
  |   ^~~

It happens because the compiler assumes that 'size'
variable in vhost_enqueue_async_packed could wrap to 0 since
'size' is uint32_t and pkt->pkt_len too.

In practice, it would never happen since 'pkt->pkt_len' is
unlikely to be close to UINT32_MAX, but let's just change
'size' to uint64_t to make the compiler happy without
having to add runtime checks.

Fixes: 873e8dad6f49 ("vhost: support packed ring in async datapath")
Cc: sta...@dpdk.org

Signed-off-by: Maxime Coquelin 
---
 lib/vhost/virtio_net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 8f4d0f0502..b86fb26040 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -1780,7 +1780,7 @@ vhost_enqueue_async_packed(struct virtio_net *dev,
uint16_t buf_id = 0;
uint32_t len = 0;
uint16_t desc_count = 0;
-   uint32_t size = pkt->pkt_len + sizeof(struct virtio_net_hdr_mrg_rxbuf);
+   uint64_t size = pkt->pkt_len + sizeof(struct virtio_net_hdr_mrg_rxbuf);
uint32_t buffer_len[vq->size];
uint16_t buffer_buf_id[vq->size];
uint16_t buffer_desc_count[vq->size];
-- 
2.37.3



Re: [PATCH 2/3] eal: change rte_fls and rte_bsf to return uint32_t

2022-10-05 Thread Tyler Retzlaff
On Wed, Oct 05, 2022 at 11:02:45AM +0200, Thomas Monjalon wrote:
> 08/08/2022 23:21, Tyler Retzlaff:
> > From: Tyler Retzlaff 
> > 
> 
> You forgot the _safe versions:

> 
> --- a/lib/eal/include/rte_common.h
> +++ b/lib/eal/include/rte_common.h
> @@ -660,7 +660,7 @@ rte_bsf32(uint32_t v)
>   * @return
>   * Returns 0 if ``v`` was 0, otherwise returns 1.
>   */
> -static inline int
> +static inline uint32_t
>  rte_bsf32_safe(uint32_t v, uint32_t *pos)
>  {
> if (v == 0)
> @@ -739,7 +739,7 @@ rte_bsf64(uint64_t v)
>   * @return
>   * Returns 0 if ``v`` was 0, otherwise returns 1.
>   */
> -static inline int
> +static inline uint32_t
>  rte_bsf64_safe(uint64_t v, uint32_t *pos)
>  {
> if (v == 0)
> 
> 
> 
> 

the return values from the _safe versions are `int' treated
like a bool type. they have been left as is to be consistent
with the rest of dpdk return value types.

the non-safe version returns were returning actual values
and not an indication of success or failure.

they could certainly be changed to C99 fixed width types but
if they are changed at all perhaps they should be changed to
_Bool or bool from stdbool.h?

it looks like this change has been merged already but if you
would like to make any further changes let me know i'll take
care of it.

thanks


Re: [PATCH 2/3] eal: change rte_fls and rte_bsf to return uint32_t

2022-10-05 Thread Thomas Monjalon
05/10/2022 17:15, Tyler Retzlaff:
> On Wed, Oct 05, 2022 at 11:02:45AM +0200, Thomas Monjalon wrote:
> > 08/08/2022 23:21, Tyler Retzlaff:
> > > From: Tyler Retzlaff 
> > > 
> > 
> > You forgot the _safe versions:
> 
> > 
> > --- a/lib/eal/include/rte_common.h
> > +++ b/lib/eal/include/rte_common.h
> > @@ -660,7 +660,7 @@ rte_bsf32(uint32_t v)
> >   * @return
> >   * Returns 0 if ``v`` was 0, otherwise returns 1.
> >   */
> > -static inline int
> > +static inline uint32_t
> >  rte_bsf32_safe(uint32_t v, uint32_t *pos)
> >  {
> > if (v == 0)
> > @@ -739,7 +739,7 @@ rte_bsf64(uint64_t v)
> >   * @return
> >   * Returns 0 if ``v`` was 0, otherwise returns 1.
> >   */
> > -static inline int
> > +static inline uint32_t
> >  rte_bsf64_safe(uint64_t v, uint32_t *pos)
> >  {
> > if (v == 0)
> > 
> > 
> > 
> > 
> 
> the return values from the _safe versions are `int' treated
> like a bool type. they have been left as is to be consistent
> with the rest of dpdk return value types.
> 
> the non-safe version returns were returning actual values
> and not an indication of success or failure.
> 
> they could certainly be changed to C99 fixed width types but
> if they are changed at all perhaps they should be changed to
> _Bool or bool from stdbool.h?
> 
> it looks like this change has been merged already but if you
> would like to make any further changes let me know i'll take
> care of it.

Sorry, it's my mistake, I went too fast.
I'll revert them to int.




[PATCH] examples/l2fwd-cat: fix build

2022-10-05 Thread Kevin Traynor
 and  need to be included for the build
since they were removed from .

../examples/l2fwd-cat/cat.c: In function ‘parse_set’:
../examples/l2fwd-cat/cat.c:66:16:
warning: implicit declaration of function ‘isblank’
[-Wimplicit-function-declaration]
   66 | while (isblank(*str))
  |^~~
../examples/l2fwd-cat/cat.c:18:1:
note: include ‘’ or provide a declaration of ‘isblank’
   17 | #include "cat.h"
  +++ |+#include 
   18 |
../examples/l2fwd-cat/cat.c:70:15:
warning: implicit declaration of function ‘isdigit’
[-Wimplicit-function-declaration]
   70 | if ((!isdigit(*str) && *str != '(') || *str == '\0')
  |   ^~~
../examples/l2fwd-cat/cat.c:70:15:
note: include ‘’ or provide a declaration of ‘isdigit’
../examples/l2fwd-cat/cat.c:75:17:
error: ‘errno’ undeclared (first use in this function)
   75 | errno = 0;
  | ^
../examples/l2fwd-cat/cat.c:18:1:
note: ‘errno’ is defined in header ‘’;
did you forget to ‘#include ’?
   17 | #include "cat.h"
  +++ |+#include 

Fixes: 72b452c5f259 ("eal: remove unneeded includes from a public header")

Signed-off-by: Kevin Traynor 
---
 examples/l2fwd-cat/cat.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/examples/l2fwd-cat/cat.c b/examples/l2fwd-cat/cat.c
index 77154c8b7e..00e4cde48b 100644
--- a/examples/l2fwd-cat/cat.c
+++ b/examples/l2fwd-cat/cat.c
@@ -3,4 +3,6 @@
  */
 
+#include 
+#include 
 #include 
 #include 
-- 
2.37.3



Re: [PATCH] examples/l2fwd-cat: fix build

2022-10-05 Thread David Marchand
On Wed, Oct 5, 2022 at 5:24 PM Kevin Traynor  wrote:
>
>  and  need to be included for the build
> since they were removed from .
>
> ../examples/l2fwd-cat/cat.c: In function ‘parse_set’:
> ../examples/l2fwd-cat/cat.c:66:16:
> warning: implicit declaration of function ‘isblank’
> [-Wimplicit-function-declaration]
>66 | while (isblank(*str))
>   |^~~
> ../examples/l2fwd-cat/cat.c:18:1:
> note: include ‘’ or provide a declaration of ‘isblank’
>17 | #include "cat.h"
>   +++ |+#include 
>18 |
> ../examples/l2fwd-cat/cat.c:70:15:
> warning: implicit declaration of function ‘isdigit’
> [-Wimplicit-function-declaration]
>70 | if ((!isdigit(*str) && *str != '(') || *str == '\0')
>   |   ^~~
> ../examples/l2fwd-cat/cat.c:70:15:
> note: include ‘’ or provide a declaration of ‘isdigit’
> ../examples/l2fwd-cat/cat.c:75:17:
> error: ‘errno’ undeclared (first use in this function)
>75 | errno = 0;
>   | ^
> ../examples/l2fwd-cat/cat.c:18:1:
> note: ‘errno’ is defined in header ‘’;
> did you forget to ‘#include ’?
>17 | #include "cat.h"
>   +++ |+#include 
>
> Fixes: 72b452c5f259 ("eal: remove unneeded includes from a public header")
>
> Signed-off-by: Kevin Traynor 
Reviewed-by: David Marchand 


-- 
David Marchand



[PATCH v8 0/4] support protocol based buffer split

2022-10-05 Thread Yuan Wang
Protocol type based buffer split consists of splitting a received packet
into several separate segments based on the packet content. It is useful
in some scenarios, such as GPU acceleration. The splitting will help to
enable true zero copy and hence improve the performance significantly.

This patchset aims to support protocol header split based on current buffer
split. When Rx queue is configured with RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT
offload and corresponding protocol, packets received will be directly split
into different mempools.

Change log:
v8:
Restrict length == 0 and proto_hdr == 0 in another buffer split.
Add check for proto_hdr == 0 in last segment.
Use heap instead of stack for array to avoid vulnerabilities.
Define the proto_hdr for two segments and multi-segments, respectively.
Separate variable definition and code.
Refine the doc and commit log.

v7:
ice: Fix CI issue.

v6:
ice: Fix proto_hdr mappings to NIC configuration.

v5:
Define proto_hdr to use mask instead of single protocol type.
Define PMD to return protocol header mask.
Refine the doc and commit log.
Remove deprecated RTE_FUNC_PTR_OR_ERR_RET.

v4:
Change proto_hdr to a bit mask of RTE_PTYPE_*.
Add the description on how to put the unsplit packages.
Use proto_hdr to determine whether to use protocol based split.

v3:
Fix mail thread.

v2:
Add mbuf dump to the driver's buffer split path.
Add buffer split to the driver feature list.
Remove unsupported header protocols from the driver.

Yuan Wang (4):
  ethdev: introduce protocol header API
  ethdev: introduce protocol hdr based buffer split
  app/testpmd: add rxhdrs commands and parameters
  net/ice: support buffer split in Rx path

 app/test-pmd/cmdline.c | 152 +-
 app/test-pmd/config.c  |  95 +
 app/test-pmd/parameters.c  |  16 +-
 app/test-pmd/testpmd.c |  11 +-
 app/test-pmd/testpmd.h |   6 +
 doc/guides/rel_notes/release_22_11.rst |  13 ++
 drivers/net/ice/ice_ethdev.c   |  97 -
 drivers/net/ice/ice_rxtx.c | 266 ++---
 drivers/net/ice/ice_rxtx.h |  16 ++
 drivers/net/ice/ice_rxtx_vec_common.h  |   3 +
 lib/ethdev/ethdev_driver.h |  15 ++
 lib/ethdev/rte_ethdev.c| 122 +++-
 lib/ethdev/rte_ethdev.h|  64 +-
 lib/ethdev/version.map |   1 +
 14 files changed, 826 insertions(+), 51 deletions(-)

-- 
2.25.1



[PATCH v8 1/4] ethdev: introduce protocol header API

2022-10-05 Thread Yuan Wang
Add a new ethdev API to retrieve supported protocol headers
of a PMD, which helps to configure protocol header based buffer split.

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
Reviewed-by: Andrew Rybchenko 
---
 doc/guides/rel_notes/release_22_11.rst |  5 
 lib/ethdev/ethdev_driver.h | 15 
 lib/ethdev/rte_ethdev.c| 33 ++
 lib/ethdev/rte_ethdev.h| 30 +++
 lib/ethdev/version.map |  1 +
 5 files changed, 84 insertions(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index ac67e7e710..141fd9442b 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -123,6 +123,11 @@ New Features
   into single event containing ``rte_event_vector``
   whose event type is ``RTE_EVENT_TYPE_CRYPTODEV_VECTOR``.
 
+* **Added protocol header based buffer split.**
+
+  * Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get supported
+header protocols of a PMD to split.
+
 
 Removed Items
 -
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 8cd8eb8685..791b264610 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1055,6 +1055,18 @@ typedef int (*eth_ip_reassembly_conf_get_t)(struct 
rte_eth_dev *dev,
 typedef int (*eth_ip_reassembly_conf_set_t)(struct rte_eth_dev *dev,
const struct rte_eth_ip_reassembly_params *conf);
 
+/**
+ * @internal
+ * Get supported header protocols of a PMD to split.
+ *
+ * @param dev
+ *   Ethdev handle of port.
+ *
+ * @return
+ *   An array pointer to store supported protocol headers.
+ */
+typedef const uint32_t *(*eth_buffer_split_supported_hdr_ptypes_get_t)(struct 
rte_eth_dev *dev);
+
 /**
  * @internal
  * Dump private info from device to a file.
@@ -1302,6 +1314,9 @@ struct eth_dev_ops {
/** Set IP reassembly configuration */
eth_ip_reassembly_conf_set_t ip_reassembly_conf_set;
 
+   /** Get supported header ptypes to split */
+   eth_buffer_split_supported_hdr_ptypes_get_t 
buffer_split_supported_hdr_ptypes_get;
+
/** Dump private info from device */
eth_dev_priv_dump_t eth_dev_priv_dump;
 
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 2821770e2d..ee3b490889 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6045,6 +6045,39 @@ rte_eth_dev_priv_dump(uint16_t port_id, FILE *file)
return eth_err(port_id, (*dev->dev_ops->eth_dev_priv_dump)(dev, file));
 }
 
+int
+rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t 
*ptypes, int num)
+{
+   int i, j;
+   struct rte_eth_dev *dev;
+   const uint32_t *all_types;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+   dev = &rte_eth_devices[port_id];
+
+   if (ptypes == NULL && num > 0) {
+   RTE_ETHDEV_LOG(ERR,
+   "Cannot get ethdev port %u supported header protocol 
types to NULL when array size is non zero\n",
+   port_id);
+   return -EINVAL;
+   }
+
+   if (*dev->dev_ops->buffer_split_supported_hdr_ptypes_get == NULL)
+   return -ENOTSUP;
+   all_types = (*dev->dev_ops->buffer_split_supported_hdr_ptypes_get)(dev);
+
+   if (all_types == NULL)
+   return 0;
+
+   for (i = 0, j = 0; all_types[i] != RTE_PTYPE_UNKNOWN; ++i) {
+   if (j < num)
+   ptypes[j] = all_types[i];
+   j++;
+   }
+
+   return j;
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index a21f58b9cd..c51c1f3fa0 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -6025,6 +6025,36 @@ rte_eth_tx_buffer(uint16_t port_id, uint16_t queue_id,
return rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Get supported header protocols to split on Rx.
+ *
+ * When a packet type is announced to be split, it *must* be supported by
+ * the PMD. For instance, if eth-ipv4, eth-ipv4-udp is announced, the PMD must
+ * return the following packet types for these packets:
+ * - Ether/IPv4 -> RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4
+ * - Ether/IPv4/UDP -> RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4 | 
RTE_PTYPE_L4_UDP
+ *
+ * @param port_id
+ *   The port identifier of the device.
+ * @param[out] ptypes
+ *   An array pointer to store supported protocol headers, allocated by caller.
+ *   These ptypes are composed with RTE_PTYPE_*.
+ * @param num
+ *   Size of the array pointed by param ptypes.
+ * @return
+ *   - (>=0) Number of supported ptypes. If the number of types exceeds num,
+ *   only num entries will be filled into the pty

[PATCH v8 2/4] ethdev: introduce protocol hdr based buffer split

2022-10-05 Thread Yuan Wang
Currently, Rx buffer split supports length based split. With Rx queue
offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment
configured, PMD will be able to split the received packets into
multiple segments.

However, length based buffer split is not suitable for NICs that do split
based on protocol headers. Given an arbitrarily variable length in Rx
packet segment, it is almost impossible to pass a fixed protocol header to
driver. Besides, the existence of tunneling results in the composition of
a packet is various, which makes the situation even worse.

This patch extends current buffer split to support protocol header based
buffer split. A new proto_hdr field is introduced in the reserved field
of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr
field defines the split position of packet, splitting will always happen
after the protocol header defined in the Rx packet segment. When Rx queue
offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
protocol header is configured, driver will split the ingress packets into
multiple segments.

Examples for proto_hdr field defines:
To split after ETH-IPV4-UDP, it should be defined as
proto_hdr = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN |
RTE_PTYPE_L4_UDP

For inner ETH-IPV4-UDP, it should be defined as
proto_hdr = RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_INNER_L4_UDP

If the protocol header is repeated with the previously defined one,
the repeated part can be omitted. For example, split after ETH, ETH-IPV4
and ETH-IPV4-UDP, it should be defined as
proto_hdr0 = RTE_PTYPE_L2_ETHER
proto_hdr1 = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN
proto_hdr2 = RTE_PTYPE_L4_UDP

struct rte_eth_rxseg_split {
struct rte_mempool *mp; /* memory pools to allocate segment from */
uint16_t length; /* segment maximal data length,
configures split point */
uint16_t offset; /* data offset from beginning
of mbuf data buffer */
/**
 * Proto_hdr defines a bit mask of the protocol sequence as
 * RTE_PTYPE_*, configures split point. The last RTE_PTYPE*
 * in the mask indicates the split position.
 * If one protocol header is defined to split packets into two
 * segments, for non-tunneling packets, the complete protocol
 * sequence should be defined.
 * For tunneling packets, for simplicity,
 * only the tunnel and inner part of comple protocol sequence
 * is required.
 * If several protocol headers are defined to split packets into
 * multi-segments, the repeated parts of adjacent segments
 * should be omitted.
 */
uint32_t proto_hdr;
};

If protocol header split can be supported by a PMD, the
rte_eth_buffer_split_get_supported_hdr_ptypes function can
be use to obtain a list of these protocol headers.

For example, let's suppose we configured the Rx queue with the
following segments:
seg0 - pool0, proto_hdr0=RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4,
   off0=2B
seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B
seg2 - pool2, off1=0B

The packet consists of ETH_IPV4_UDP_PAYLOAD will be split like
following:
seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
seg1 - udp header @ 128 in mbuf from pool1
seg2 - payload @ 0 in mbuf from pool2

Note: NIC will only do split when the packets exactly match all the
protocol headers in the segments. For example, if ARP packets received
with above config, the NIC won't do split for ARP packets since
it does not contains ipv4 header and udp header. These packets will be put
into the last valid mempool, with zero offset.

Now buffer split can be configured in two modes. For length based
buffer split, the mp, length, offset field in Rx packet segment should
be configured, while the proto_hdr field will be ignored.
For protocol header based buffer split, the mp, offset, proto_hdr field
in Rx packet segment should be configured, while the length field will
be ignored.

The split limitations imposed by underlying driver is reported in the
rte_eth_dev_info->rx_seg_capa field. The memory attributes for the split
parts may differ either, dpdk memory and external memory, respectively.

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
---
 doc/guides/rel_notes/release_22_11.rst |  4 ++
 lib/ethdev/rte_ethdev.c| 89 ++
 lib/ethdev/rte_ethdev.h| 34 +-
 3 files changed, 115 insertions(+), 12 deletions(-)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 141fd9442b..4c3a7f8b8b 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -127,6 +127,10 @@ New Features
 
   * Added ``rte_eth_buf

[PATCH v8 3/4] app/testpmd: add rxhdrs commands and parameters

2022-10-05 Thread Yuan Wang
Add command line parameter:
--rxhdrs=eth[,ipv4,udp]

Set the protocol_hdr of segments to scatter packets on receiving if
split feature is engaged. And the queues with BUFFER_SPLIT flag.

Add interactive mode command:
testpmd>set rxhdrs eth,ipv4,udp
(protocol sequence should be valid)

The protocol split feature is off by default. To enable protocol split,
you need:
1. Start testpmd with multiple mempools. E.g. --mbuf-size=2048,2048
2. Configure Rx queue with rx_offload buffer split on.
3. Set the protocol type of buffer split. E.g. set rxhdrs eth,eth-ipv4
(default protocols of testpmd : eth|ipv4|ipv6|ipv4-tcp|ipv6-tcp|
 ipv4-udp|ipv6-udp|ipv4-sctp|ipv6-sctp|grenat|inner-eth|
 inner-ipv4|inner-ipv6|inner-ipv4-tcp|inner-ipv6-tcp|
 inner-ipv4-udp|inner-ipv6-udp|inner-ipv4-sctp|inner-ipv6-sctp)
Above protocols can be configured in testpmd. But the configuration can
only be applied when it is supported by specific pmd.

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
---
 app/test-pmd/cmdline.c| 152 +-
 app/test-pmd/config.c |  95 
 app/test-pmd/parameters.c |  16 +++-
 app/test-pmd/testpmd.c|  11 ++-
 app/test-pmd/testpmd.h|   6 ++
 5 files changed, 273 insertions(+), 7 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ba749f588a..49e0321786 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -181,7 +181,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"show (rxq|txq) info (port_id) (queue_id)\n"
"Display information for configured RX/TX 
queue.\n\n"
 
-   "show config (rxtx|cores|fwd|rxoffs|rxpkts|txpkts)\n"
+   "show config 
(rxtx|cores|fwd|rxoffs|rxpkts|rxhdrs|txpkts)\n"
"Display the given configuration.\n\n"
 
"read rxd (port_id) (queue_id) (rxd_id)\n"
@@ -305,6 +305,17 @@ static void cmd_help_long_parsed(void *parsed_result,
" Affects only the queues configured with split"
" offloads.\n\n"
 
+   "set rxhdrs (eth[,ipv4,udp])*\n"
+   "Set the protocol hdr of each segment to scatter"
+   " packets on receiving if split feature is engaged."
+   " Affects only the queues configured with split"
+   " offloads.\n"
+   "Supported values: eth|ipv4|ipv6|ipv4-tcp|ipv6-tcp|"
+   "ipv4-udp|ipv6-udp|ipv4-sctp|ipv6-sctp|"
+   "grenat|inner-eth|inner-ipv4|inner-ipv6|inner-ipv4-tcp|"
+   "inner-ipv6-tcp|inner-ipv4-udp|inner-ipv6-udp|"
+   "inner-ipv4-sctp|inner-ipv6-sctp\n\n"
+
"set txpkts (x[,y]*)\n"
"Set the length of each segment of TXONLY"
" and optionally CSUM packets.\n\n"
@@ -3366,6 +3377,94 @@ static cmdline_parse_inst_t cmd_stop = {
},
 };
 
+static unsigned int
+get_ptype(char *value)
+{
+   uint32_t protocol;
+
+   if (!strcmp(value, "eth"))
+   protocol = RTE_PTYPE_L2_ETHER;
+   else if (!strcmp(value, "ipv4"))
+   protocol = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
+   else if (!strcmp(value, "ipv6"))
+   protocol = RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
+   else if (!strcmp(value, "ipv4-tcp"))
+   protocol = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_L4_TCP;
+   else if (!strcmp(value, "ipv6-tcp"))
+   protocol = RTE_PTYPE_L3_IPV6_EXT_UNKNOWN | RTE_PTYPE_L4_TCP;
+   else if (!strcmp(value, "ipv4-udp"))
+   protocol = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_L4_UDP;
+   else if (!strcmp(value, "ipv6-udp"))
+   protocol = RTE_PTYPE_L3_IPV6_EXT_UNKNOWN | RTE_PTYPE_L4_UDP;
+   else if (!strcmp(value, "ipv4-sctp"))
+   protocol = RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_L4_SCTP;
+   else if (!strcmp(value, "ipv6-sctp"))
+   protocol = RTE_PTYPE_L3_IPV6_EXT_UNKNOWN | RTE_PTYPE_L4_SCTP;
+   else if (!strcmp(value, "tcp"))
+   protocol = RTE_PTYPE_L4_TCP;
+   else if (!strcmp(value, "udp"))
+   protocol = RTE_PTYPE_L4_UDP;
+   else if (!strcmp(value, "sctp"))
+   protocol = RTE_PTYPE_L4_SCTP;
+   else if (!strcmp(value, "grenat"))
+   protocol = RTE_PTYPE_TUNNEL_GRENAT;
+   else if (!strcmp(value, "inner-eth"))
+   protocol = RTE_PTYPE_INNER_L2_ETHER;
+   else if (!strcmp(value, "inner-ipv4"))
+   protocol = RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN;
+   else if (!strcmp(value, "inner-ipv6"))
+   protocol = RTE_PTYPE_INNER_L3_IPV6_EXT_UNKNOWN;
+   else if (!strcmp(value, "inner-ipv4-tcp"))
+   protoco

[PATCH v8 4/4] net/ice: support buffer split in Rx path

2022-10-05 Thread Yuan Wang
This patch adds support for protocol based buffer split in normal Rx
data paths. When the Rx queue is configured with specific protocol type,
packets received will be directly split into protocol header and
payload parts limitation of pmd. And the two parts will be
put into different mempools.

Currently, protocol based buffer split is not supported in vectorized
paths.

A new api ice_buffer_split_supported_hdr_ptypes_get() has been
introduced, it will return the supported header protocols of ice PMD
to app for splitting.

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
---
 doc/guides/rel_notes/release_22_11.rst |   4 +
 drivers/net/ice/ice_ethdev.c   |  97 -
 drivers/net/ice/ice_rxtx.c | 266 ++---
 drivers/net/ice/ice_rxtx.h |  16 ++
 drivers/net/ice/ice_rxtx_vec_common.h  |   3 +
 5 files changed, 354 insertions(+), 32 deletions(-)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 4c3a7f8b8b..513db6476b 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -132,6 +132,10 @@ New Features
 User can choose length or protocol header to configure buffer split
 according to NIC's capability.
 
+* **Updated Intel ice driver.**
+
+  * Added protocol based buffer split support in scalar path.
+
 
 Removed Items
 -
diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index 8aa37722c3..320e92d760 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -161,6 +161,7 @@ static int ice_timesync_read_time(struct rte_eth_dev *dev,
 static int ice_timesync_write_time(struct rte_eth_dev *dev,
   const struct timespec *timestamp);
 static int ice_timesync_disable(struct rte_eth_dev *dev);
+static const uint32_t *ice_buffer_split_supported_hdr_ptypes_get(struct 
rte_eth_dev *dev);
 
 static const struct rte_pci_id pci_id_ice_map[] = {
{ RTE_PCI_DEVICE(ICE_INTEL_VENDOR_ID, ICE_DEV_ID_E823L_BACKPLANE) },
@@ -275,6 +276,7 @@ static const struct eth_dev_ops ice_eth_dev_ops = {
.timesync_write_time  = ice_timesync_write_time,
.timesync_disable = ice_timesync_disable,
.tm_ops_get   = ice_tm_ops_get,
+   .buffer_split_supported_hdr_ptypes_get = 
ice_buffer_split_supported_hdr_ptypes_get,
 };
 
 /* store statistics names and its offset in stats structure */
@@ -3802,7 +3804,8 @@ ice_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
RTE_ETH_RX_OFFLOAD_OUTER_IPV4_CKSUM |
RTE_ETH_RX_OFFLOAD_VLAN_EXTEND |
RTE_ETH_RX_OFFLOAD_RSS_HASH |
-   RTE_ETH_RX_OFFLOAD_TIMESTAMP;
+   RTE_ETH_RX_OFFLOAD_TIMESTAMP |
+   RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT;
dev_info->tx_offload_capa |=
RTE_ETH_TX_OFFLOAD_QINQ_INSERT |
RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
@@ -3814,7 +3817,7 @@ ice_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->flow_type_rss_offloads |= ICE_RSS_OFFLOAD_ALL;
}
 
-   dev_info->rx_queue_offload_capa = 0;
+   dev_info->rx_queue_offload_capa = RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT;
dev_info->tx_queue_offload_capa = RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;
 
dev_info->reta_size = pf->hash_lut_size;
@@ -3883,6 +3886,11 @@ ice_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->default_rxportconf.ring_size = ICE_BUF_SIZE_MIN;
dev_info->default_txportconf.ring_size = ICE_BUF_SIZE_MIN;
 
+   dev_info->rx_seg_capa.max_nseg = ICE_RX_MAX_NSEG;
+   dev_info->rx_seg_capa.multi_pools = 1;
+   dev_info->rx_seg_capa.offset_allowed = 0;
+   dev_info->rx_seg_capa.offset_align_log2 = 0;
+
return 0;
 }
 
@@ -5960,6 +5968,91 @@ ice_timesync_disable(struct rte_eth_dev *dev)
return 0;
 }
 
+static const uint32_t *
+ice_buffer_split_supported_hdr_ptypes_get(struct rte_eth_dev *dev __rte_unused)
+{
+   /* Buffer split protocol header capability. */
+   static const uint32_t ptypes[] = {
+   /* Non tunneled */
+   RTE_PTYPE_L2_ETHER,
+   RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
+   RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
+   RTE_PTYPE_L4_UDP,
+   RTE_PTYPE_L4_TCP,
+   RTE_PTYPE_L4_SCTP,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_UDP,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_TCP,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_SCTP,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
+   

Re: [PATCH v2] vhost: fix compilation issue in async path

2022-10-05 Thread Maxime Coquelin




On 10/5/22 17:11, Maxime Coquelin wrote:

This patch fixes a compilation issue met with GCC 12 on
LoongArch64:

In function ‘mbuf_to_desc’,
 inlined from ‘vhost_enqueue_async_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1822:6,
 inlined from ‘virtio_dev_rx_async_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1836:6,
 inlined from ‘virtio_dev_rx_async_submit_packed’ at 
../../../dpdk/lib/vhost/virtio_net.c:1895:7:
../../../dpdk/lib/vhost/virtio_net.c:1159:18: error: ‘buf_vec[0].buf_addr’ may 
be used uninitialized [-Werror=maybe-uninitialized]
  1159 | buf_addr = buf_vec[vec_idx].buf_addr;
   | ~^~~
../../../dpdk/lib/vhost/virtio_net.c: In function 
‘virtio_dev_rx_async_submit_packed’:
../../../dpdk/lib/vhost/virtio_net.c:1834:27: note: ‘buf_vec’ declared here
  1834 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
   |   ^~~

It happens because the compiler assumes that 'size'
variable in vhost_enqueue_async_packed could wrap to 0 since
'size' is uint32_t and pkt->pkt_len too.

In practice, it would never happen since 'pkt->pkt_len' is
unlikely to be close to UINT32_MAX, but let's just change
'size' to uint64_t to make the compiler happy without
having to add runtime checks.

Fixes: 873e8dad6f49 ("vhost: support packed ring in async datapath")
Cc: sta...@dpdk.org

Signed-off-by: Maxime Coquelin 


Forgot to report the R-by from David on v1:
Reviewed-by: David Marchand 


---
  lib/vhost/virtio_net.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 8f4d0f0502..b86fb26040 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -1780,7 +1780,7 @@ vhost_enqueue_async_packed(struct virtio_net *dev,
uint16_t buf_id = 0;
uint32_t len = 0;
uint16_t desc_count = 0;
-   uint32_t size = pkt->pkt_len + sizeof(struct virtio_net_hdr_mrg_rxbuf);
+   uint64_t size = pkt->pkt_len + sizeof(struct virtio_net_hdr_mrg_rxbuf);
uint32_t buffer_len[vq->size];
uint16_t buffer_buf_id[vq->size];
uint16_t buffer_desc_count[vq->size];




[PATCH] eal: fix return type of bsf safe functions

2022-10-05 Thread Thomas Monjalon
In a recent commit, changing return type from int to uint32_t,
I did a last minute change to functions rte_bsf32_safe and rte_bsf64_safe,
because thought they were forgotten.
Actually these functions are returning 0 or 1, so it should be int.
The return type is reverted to the original type for these 2 functions.

Fixes: 4b81c145ae37 ("eal: change return type of bsf/fls functions")

Signed-off-by: Thomas Monjalon 
---
 lib/eal/include/rte_common.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h
index 23d9e05cbb..15765b408d 100644
--- a/lib/eal/include/rte_common.h
+++ b/lib/eal/include/rte_common.h
@@ -660,7 +660,7 @@ rte_bsf32(uint32_t v)
  * @return
  * Returns 0 if ``v`` was 0, otherwise returns 1.
  */
-static inline uint32_t
+static inline int
 rte_bsf32_safe(uint32_t v, uint32_t *pos)
 {
if (v == 0)
@@ -739,7 +739,7 @@ rte_bsf64(uint64_t v)
  * @return
  * Returns 0 if ``v`` was 0, otherwise returns 1.
  */
-static inline uint32_t
+static inline int
 rte_bsf64_safe(uint64_t v, uint32_t *pos)
 {
if (v == 0)
-- 
2.36.1



Re: [PATCH] net/nfp: fix memory leak for receive function

2022-10-05 Thread Ferruh Yigit

On 9/22/2022 2:09 PM, Niklas Söderlund wrote:

From: Long Wu 

nfp_net_recv_pkts() should not return a value that less than 0 and the
inappropriate return value in receive loop also causes the memory leak.
Modify code to avoid return a value less than 0. Furthermore, When
nfp_net_recv_pkts() break out from the receive loop because of packet
problems, a rte_mbuf will not be freed and it will cause memory leak.
Free the rte_mbuf before break out.

Fixes: b812daadad0d ("nfp: add Rx and Tx")
Cc: louis.pe...@corigine.com
Cc: sta...@dpdk.org

Signed-off-by: Long Wu 
Reviewed-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 
Signed-off-by: Niklas Söderlund 


Applied to dpdk-next-net/main, thanks.



Re: [PATCH v4 0/6] add thread lifetime and attributes API

2022-10-05 Thread Tyler Retzlaff
hi David,

On Wed, Sep 21, 2022 at 10:15:36AM +0200, David Marchand wrote:
> On Mon, Jun 27, 2022 at 6:56 PM Tyler Retzlaff
>  wrote:
> >
> > add rte thread lifetime and attributes api. with these api additions
> > there is now sufficient platform abstracted thread api to remove the
> > use of pthread in the unit tests.
> >
> > v4:
> >   * update version.map to show api from series added in 22.11 instead
> > of 22.07.
> 
> Patch 1 still adds symbols in 22.07 section of the experimental block.

this looks like a goofup on my part, i will fix in the next revision.

> 
> Compilation is broken when stopping at patch 4.

this was because commit 72b452c5f2599f970f47fd17d3e8e5d60bfebe7a removed
#include of  from rte_common.h

i will fix this in the next revision.

> 
> Newly added code can go to eal_common_thread.c rather than introduce a
> new common/rte_thread.c file (or is there a rationale for this?).

i will make this change in the next revision. if anyone does object i
hope they will do so quickly.

> 
> There are guards for the new structs against RTE_HAS_CPUSET being defined.
> Can you elaborate why those guards are needed?

eal uses this guard in a couple of places so the best explanation i can
provide is an existing pattern was being followed. that being said it's
clear that this code won't work without RTE_HAS_CPUSET.

i will remove the guard and test if the build breaks for the platforms
i have toolchains available. if it builds cleanly i'll remove the guard
in the next revision.

> 
> Commitlogs / comments sentences must start with a capital letter.

i will fix in next revision, but ew caps.

> 
> Can you prepare a new revision for -rc1?

assuming the above actions to address feedback are acceptable the new
revision should be up today (PST).

> 
> Thanks.

thank you!

ty


[PATCH v8 0/3] add Intel uncore api to be called through l3fwd-power

2022-10-05 Thread Tadhg Kearney
This is targeting 22.11 and aims to add an API to DPDK power library to
allow uncore frequency adjustment. This will be called through the l3fwd-power
app and gives the ability to set the minimum and maximum uncore frequency to
both min, max or specific frequency index.

Signed-off-by: tadhgkearney 
Reviewed-by: David Hunt 
Acked-by: David Hunt 
---

v2:
Fix compilation warnings and errors.
v3:
Remove addition of x86 global macros.
Add 2 new API's for getting package and die numbers from system.
Address comments from mailing list.
Improve efficiency of code and code quality.
v4:
Fix compilation warnings and errors.
v5:
Improve error message for uncore access not working.
v6:
Fix uncore exit being called if uncore option not selected.
v7:
Address mailing list comments.
v8:
Address mailing list comments.

Tadhg Kearney (3):
  power: add Intel uncore frequency control API to power library
  l3fwd-power: add option to call uncore API
  test/power: add unit tests for uncore API

 app/test/meson.build  |   2 +
 app/test/test_power_intel_uncore.c| 301 
 doc/guides/prog_guide/power_man.rst   |  54 +++
 doc/guides/rel_notes/release_22_11.rst|   6 +
 .../sample_app_ug/l3_forward_power_man.rst|  29 ++
 examples/l3fwd-power/main.c   | 126 -
 lib/power/meson.build |   2 +
 lib/power/rte_power_intel_uncore.c| 451 ++
 lib/power/rte_power_intel_uncore.h| 194 
 lib/power/version.map |  11 +
 10 files changed, 1174 insertions(+), 2 deletions(-)
 create mode 100644 app/test/test_power_intel_uncore.c
 create mode 100644 lib/power/rte_power_intel_uncore.c
 create mode 100644 lib/power/rte_power_intel_uncore.h

-- 
2.25.1



[PATCH v8 1/3] power: add Intel uncore frequency control API to power library

2022-10-05 Thread Tadhg Kearney
Add API to allow uncore frequency adjustment. Uncore is a
term used by Intel to describe function of a microprocessor
that are closely connected to the core to achieve high
performance. This is done through manipulating related
uncore frequency control sysfs entries to adjust the
minimum and maximum uncore frequency values and works
on Linux for Intel hardware.

Signed-off-by: Tadhg Kearney 
Reviewed-by: David Hunt 
Acked-by: David Hunt 
---
 doc/guides/prog_guide/power_man.rst|  54 +++
 doc/guides/rel_notes/release_22_11.rst |   6 +
 lib/power/meson.build  |   2 +
 lib/power/rte_power_intel_uncore.c | 451 +
 lib/power/rte_power_intel_uncore.h | 194 +++
 lib/power/version.map  |  11 +
 6 files changed, 718 insertions(+)
 create mode 100644 lib/power/rte_power_intel_uncore.c
 create mode 100644 lib/power/rte_power_intel_uncore.h

diff --git a/doc/guides/prog_guide/power_man.rst 
b/doc/guides/prog_guide/power_man.rst
index 98cfd3c1f3..89bc23aa9d 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -276,6 +276,60 @@ API Overview for Ethernet PMD Power Management
 * **Set Scaling Max Freq**: Set the maximum frequency (kHz) to be used in 
Frequency
   Scaling mode.
 
+Intel Uncore API
+
+
+Abstract
+
+
+Uncore is a term used by Intel to describe the functions of a microprocessor 
that are
+not in the core, but which must be closely connected to the core to achieve 
high performance;
+L3 cache, on-die memory controller, etc.
+Significant power savings can be achieved by reducing the uncore frequency to 
its lowest value.
+
+The Linux kernel provides the driver ???intel-uncore-frequency" to control the 
uncore frequency limits
+for x86 platform. The driver is available from kernel version 5.6 and above.
+Also CONFIG_INTEL_UNCORE_FREQ_CONTROL will need to be enabled in the kernel, 
which was added in 5.6.
+This manipulates the contest of MSR 0x620, which sets min/max of the uncore 
for the SKU.
+
+
+API Overview for Intel Uncore
+~
+
+Overview of each function in the Intel Uncore API, with explanation of what
+they do. Each function should not be called in the fast path.
+
+* **Uncore Power Init**
+Initialize uncore power, populate frequency array and record
+original min & max for die on pkg.
+
+* **Uncore Power Exit**
+Exit uncore power, restoring original min & max for die on pkg.
+
+* **Get Uncore Power Freq**
+Get current uncore freq index for die on pkg.
+
+* **Set Uncore Power Freq**
+Set min & max uncore freq index for die on pkg to specified index value
+(min and max will be the same).
+
+* **Uncore Power Max**
+Set min & max uncore freq to maximum frequency index for die on pkg
+(min and max will be the same).
+
+* **Uncore Power Min**
+Set min & max uncore freq to minimum frequency index for die on pkg
+(min and max will be the same).
+
+* **Get Num Freqs**
+Get the number of frequencies in the index array.
+
+* **Get Num Pkgs**
+Get the number of packages (CPU's) on the system.
+
+* **Get Num Dies**
+Get the number of die's on a given package.
+
 References
 --
 
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index ac67e7e710..3f21a6126b 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -123,6 +123,12 @@ New Features
   into single event containing ``rte_event_vector``
   whose event type is ``RTE_EVENT_TYPE_CRYPTODEV_VECTOR``.
 
+* **Added Intel uncore frequency control API to the power library.**
+
+  Add API to allow uncore frequency adjustment. This is done through
+  manipulating related uncore frequency control sysfs entries to
+  adjust the minimum and maximum uncore frequency values, which works on
+  Linux with Intel hardware only.
 
 Removed Items
 -
diff --git a/lib/power/meson.build b/lib/power/meson.build
index ba8d66074b..c8de545b6c 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -21,12 +21,14 @@ sources = files(
 'rte_power.c',
 'rte_power_empty_poll.c',
 'rte_power_pmd_mgmt.c',
+'rte_power_intel_uncore.c',
 )
 headers = files(
 'rte_power.h',
 'rte_power_empty_poll.h',
 'rte_power_pmd_mgmt.h',
 'rte_power_guest_channel.h',
+'rte_power_intel_uncore.h',
 )
 if cc.has_argument('-Wno-cast-qual')
 cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_intel_uncore.c 
b/lib/power/rte_power_intel_uncore.c
new file mode 100644
index 00..4ba1ed7b4e
--- /dev/null
+++ b/lib/power/rte_power_intel_uncore.c
@@ -0,0 +1,451 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+#include "rte_power_intel_uncore.h"
+#include "power_common.h"
+
+#define MAX_UNCORE_FREQS 32
+#define MA

[PATCH v8 3/3] test/power: add unit tests for uncore API

2022-10-05 Thread Tadhg Kearney
Add basic unit tests covering all nine uncore API's.

Signed-off-by: Tadhg Kearney 
Reviewed-by: David Hunt 
Acked-by: David Hunt 
---
 app/test/meson.build   |   2 +
 app/test/test_power_intel_uncore.c | 301 +
 2 files changed, 303 insertions(+)
 create mode 100644 app/test/test_power_intel_uncore.c

diff --git a/app/test/meson.build b/app/test/meson.build
index d5cad72116..2ce923a6b3 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -101,6 +101,7 @@ test_sources = files(
 'test_power.c',
 'test_power_cpufreq.c',
 'test_power_kvm_vm.c',
+'test_power_intel_uncore.c',
 'test_prefetch.c',
 'test_rand_perf.c',
 'test_rawdev.c',
@@ -241,6 +242,7 @@ fast_tests = [
 ['power_cpufreq_autotest', false, true],
 ['power_autotest', true, true],
 ['power_kvm_vm_autotest', false, true],
+['power_uncore_autotest', true, true],
 ['reorder_autotest', true, true],
 ['service_autotest', true, true],
 ['thash_autotest', true, true],
diff --git a/app/test/test_power_intel_uncore.c 
b/app/test/test_power_intel_uncore.c
new file mode 100644
index 00..4615ce5d19
--- /dev/null
+++ b/app/test/test_power_intel_uncore.c
@@ -0,0 +1,301 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2010-2022 Intel Corporation
+ */
+
+#include "test.h"
+
+#ifndef RTE_LIB_POWER
+
+static int
+test_power_intel_uncore(void)
+{
+   printf("Power management library not supported, skipping test\n");
+   return TEST_SKIPPED;
+}
+
+#else
+#include 
+#include 
+
+#define MAX_UNCORE_FREQS 32
+
+#define VALID_PKG 0
+#define VALID_DIE 0
+#define INVALID_PKG (rte_power_uncore_get_num_pkgs() + 1)
+#define INVALID_DIE (rte_power_uncore_get_num_dies(VALID_PKG) + 1)
+#define VALID_INDEX 1
+#define INVALID_INDEX (MAX_UNCORE_FREQS + 1)
+
+static int check_power_uncore_init(void)
+{
+   int ret;
+
+   /* Test initialisation of uncore configuration*/
+   ret = rte_power_uncore_init(VALID_PKG, VALID_DIE);
+   if (ret < 0) {
+   printf("Cannot initialise uncore power management for pkg %u 
die %u, this "
+   "may occur if environment is not configured "
+   "correctly(APCI cpufreq) or operating in another valid "
+   "Power management environment\n", VALID_PKG, VALID_DIE);
+   return -1;
+   }
+
+   /* Unsuccessful Test */
+   ret = rte_power_uncore_init(INVALID_PKG, INVALID_DIE);
+   if (ret == 0) {
+   printf("Unexpectedly was able to initialise uncore power 
management "
+   "for pkg %u die %u\n", INVALID_PKG, INVALID_DIE);
+   return -1;
+   }
+
+   return 0;
+}
+
+static int
+check_power_get_uncore_freq(void)
+{
+   int ret;
+
+   /* Successfully get uncore freq */
+   ret = rte_power_get_uncore_freq(VALID_PKG, VALID_DIE);
+   if (ret < 0) {
+   printf("Failed to get uncore frequency for pkg %u die %u\n",
+   VALID_PKG, VALID_DIE);
+   return -1;
+   }
+
+   /* Unsuccessful Test */
+   ret = rte_power_get_uncore_freq(INVALID_PKG, INVALID_DIE);
+   if (ret >= 0) {
+   printf("Unexpectedly got invalid uncore frequency for pkg %u 
die %u\n",
+   INVALID_PKG, 
INVALID_DIE);
+   return -1;
+   }
+
+   return 0;
+}
+
+static int
+check_power_set_uncore_freq(void)
+{
+   int ret;
+
+   /* Successfully set uncore freq */
+   ret = rte_power_set_uncore_freq(VALID_PKG, VALID_DIE, VALID_INDEX);
+   if (ret < 0) {
+   printf("Failed to set uncore frequency for pkg %u die %u index 
%u\n",
+   VALID_PKG, VALID_DIE, 
VALID_INDEX);
+   return -1;
+   }
+
+   /* Try to unsuccessfully set invalid uncore freq index */
+   ret = rte_power_set_uncore_freq(VALID_PKG, VALID_DIE, INVALID_INDEX);
+   if (ret == 0) {
+   printf("Unexpectedly set invalid uncore index for pkg %u die %u 
index %u\n",
+   VALID_PKG, VALID_DIE, 
INVALID_INDEX);
+   return -1;
+   }
+
+   /* Unsuccessful Test */
+   ret = rte_power_set_uncore_freq(INVALID_PKG, INVALID_DIE, VALID_INDEX);
+   if (ret == 0) {
+   printf("Unexpectedly set invalid uncore frequency for pkg %u 
die %u index %u\n",
+   INVALID_PKG, 
INVALID_DIE, VALID_INDEX);
+   return -1;
+   }
+
+   return 0;
+}
+
+static int
+check_power_uncore_freq_max(void)
+{
+   int ret;
+
+   /* Successfully get max uncore freq */
+   ret = rte_power_uncore_freq_max(VALID_PKG, VALID_DIE);
+   if (ret < 0) {
+   pri

[PATCH v8 2/3] l3fwd-power: add option to call uncore API

2022-10-05 Thread Tadhg Kearney
Add option for setting uncore frequency min/max/index, through uncore API.
This will be set for each package and die on the SKU. On exit, uncore min
and max frequency will be reverted back to previous frequencies.

Signed-off-by: Tadhg Kearney 
Reviewed-by: David Hunt 
Acked-by: David Hunt 
---
 .../sample_app_ug/l3_forward_power_man.rst|  29 
 examples/l3fwd-power/main.c   | 126 +-
 2 files changed, 153 insertions(+), 2 deletions(-)

diff --git a/doc/guides/sample_app_ug/l3_forward_power_man.rst 
b/doc/guides/sample_app_ug/l3_forward_power_man.rst
index 8f6d906200..08ac8ef369 100644
--- a/doc/guides/sample_app_ug/l3_forward_power_man.rst
+++ b/doc/guides/sample_app_ug/l3_forward_power_man.rst
@@ -97,6 +97,12 @@ where,
 *   -P: Sets all ports to promiscuous mode so that packets are accepted 
regardless of the packet's Ethernet MAC destination address.
 Without this option, only packets with the Ethernet MAC destination 
address set to the Ethernet address of the port are accepted.
 
+*   -u: optional, sets uncore min/max frequency to minimum value.
+
+*   -U: optional, sets uncore min/max frequency to maximum value.
+
+*   -i (frequency index): optional, sets uncore frequency to frequency index 
value, by setting min and max values to be the same.
+
 *   --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues 
from which ports are mapped to which cores.
 
 *   --max-pkt-len: optional, maximum packet length in decimal (64-9600)
@@ -364,3 +370,26 @@ in the DPDK Programmer's Guide for more details on PMD 
power management.
 .. code-block:: console
 
 .//examples/dpdk-l3fwd-power -l 1-3 -- -p 0x0f 
--config="(0,0,2),(0,1,3)" --pmd-mgmt=scale
+
+Setting Uncore Values
+-
+
+Uncore frequency can be adjusted through manipulating related sysfs entries to 
adjust the minimum and maximum uncore values.
+This will be set for each package and die on the SKU. The driver for enabling 
this is available from kernel version 5.6 and above.
+Three options are available for setting uncore frequency:
+
+``-u``
+  This will set uncore minimum and maximum frequencies to minimum possible 
value.
+
+``-U``
+  This will set uncore minimum and maximum frequencies to maximum possible 
value.
+
+``-i``
+  This will allow you to set the specific uncore frequency index that you 
want, by setting
+  the uncore frequency to a frequency pointed by index. Frequency index's are 
set 100MHz apart from
+  maximum to minimum.
+  Frequency index values are in descending order, ie, index 0 is maximum 
frequency index.
+
+.. code-block:: console
+
+.//examples/dpdk-l3fwd-power -l 1-3 -- -p 0x0f 
--config="(0,0,2),(0,1,3)" -i 1
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 887c6eae3f..d6d71719e5 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -47,6 +47,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "perf_core.h"
 #include "main.h"
@@ -161,6 +162,9 @@ static struct rte_timer telemetry_timer;
 /* stats index returned by metrics lib */
 int telstats_index;
 
+/* flag to check if uncore option enabled */
+int enabled_uncore = -1;
+
 struct telstats_name {
char name[RTE_ETH_XSTATS_NAME_SIZE];
 };
@@ -179,6 +183,12 @@ enum busy_rate {
FULL = 100
 };
 
+enum uncore_choice {
+   UNCORE_MIN = 0,
+   UNCORE_MAX = 1,
+   UNCORE_IDX = 2
+};
+
 /* reference poll count to measure core busyness */
 #define DEFAULT_COUNT 1
 /*
@@ -1616,6 +1626,9 @@ print_usage(const char *prgname)
"  [--max-pkt-len PKTLEN]\n"
"  -p PORTMASK: hexadecimal bitmask of ports to configure\n"
"  -P: enable promiscuous mode\n"
+   "  -u: set min/max frequency for uncore to minimum value\n"
+   "  -U: set min/max frequency for uncore to maximum value\n"
+   "  -i (frequency index): set min/max frequency for uncore to 
specified frequency index\n"
"  --config (port,queue,lcore): rx queues configuration\n"
"  --high-perf-cores CORELIST: list of high performance cores\n"
"  --perf-config: similar as config, cores specified as indices"
@@ -1672,6 +1685,74 @@ static int parse_max_pkt_len(const char *pktlen)
return len;
 }
 
+static int
+parse_uncore_options(enum uncore_choice choice, const char *argument)
+{
+   unsigned int die, pkg, max_pkg, max_die;
+   int ret = 0;
+   max_pkg = rte_power_uncore_get_num_pkgs();
+   if (max_pkg == 0)
+   return -1;
+
+   for (pkg = 0; pkg < max_pkg; pkg++) {
+   max_die = rte_power_uncore_get_num_dies(pkg);
+   if (max_die == 0)
+   return -1;
+   for (die = 0; die < max_die; die++) {
+   ret = rte_power_uncore_init(pkg, die);
+   if (ret == -1) {
+   RTE_LOG(INFO, L

[PATCH] license/README: fix pathnames and add MIT

2022-10-05 Thread Stephen Hemminger
The pathnames in the license directory README are incorrect.
The current repository puts license text in license/ not licenses/.

Also, MIT license is used already in a couple of places so
add the necessary entry in the README.

Signed-off-by: Stephen Hemminger 
---
 license/README | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/license/README b/license/README
index 79dac86440a5..9def09058751 100644
--- a/license/README
+++ b/license/README
@@ -58,7 +58,7 @@ DPDK Governing Board. Steps for any exception approval:
 3. Technical Board then approach Governing Board for such limited approval for
the given contribution only.
 
-Any approvals shall be documented in "Licenses/exceptions.txt" with record
+Any approvals shall be documented in "license/exceptions.txt" with record
 dates.
 
 DPDK project supported licenses are:
@@ -66,12 +66,16 @@ DPDK project supported licenses are:
 1. BSD 3-clause "New" or "Revised" License
SPDX-License-Identifier: BSD-3-Clause
URL: http://spdx.org/licenses/BSD-3-Clause#licenseText
-   DPDK License text: licenses/bsd-3-clause.txt
+   DPDK License text: license/bsd-3-clause.txt
 2. GNU General Public License v2.0 only
SPDX-License-Identifier: GPL-2.0
URL: http://spdx.org/licenses/GPL-2.0.html#licenseText
-   DPDK License text: licenses/gpl-2.0.txt
+   DPDK License text: license/gpl-2.0.txt
 3. GNU Lesser General Public License v2.1
SPDX-License-Identifier: LGPL-2.1
URL: http://spdx.org/licenses/LGPL-2.1.html#licenseText
-   DPDK License text: licenses/lgpl-2.1.txt
+   DPDK License text: license/lgpl-2.1.txt
+4. MIT License
+   SPDX-License-Identifier: MIT
+   URL: http://spdx.org/licenses/MIT.html#licenseText
+   DPDK License text: license/mit.txt
-- 
2.35.1



Re: [PATCH v2] vhost: fix compilation issue in async path

2022-10-05 Thread David Marchand
On Wed, Oct 5, 2022 at 5:36 PM Maxime Coquelin
 wrote:
> On 10/5/22 17:11, Maxime Coquelin wrote:
> > This patch fixes a compilation issue met with GCC 12 on
> > LoongArch64:
> >
> > In function ‘mbuf_to_desc’,
> >  inlined from ‘vhost_enqueue_async_packed’ at 
> > ../../../dpdk/lib/vhost/virtio_net.c:1822:6,
> >  inlined from ‘virtio_dev_rx_async_packed’ at 
> > ../../../dpdk/lib/vhost/virtio_net.c:1836:6,
> >  inlined from ‘virtio_dev_rx_async_submit_packed’ at 
> > ../../../dpdk/lib/vhost/virtio_net.c:1895:7:
> > ../../../dpdk/lib/vhost/virtio_net.c:1159:18: error: ‘buf_vec[0].buf_addr’ 
> > may be used uninitialized [-Werror=maybe-uninitialized]
> >   1159 | buf_addr = buf_vec[vec_idx].buf_addr;
> >| ~^~~
> > ../../../dpdk/lib/vhost/virtio_net.c: In function 
> > ‘virtio_dev_rx_async_submit_packed’:
> > ../../../dpdk/lib/vhost/virtio_net.c:1834:27: note: ‘buf_vec’ declared here
> >   1834 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
> >|   ^~~
> >
> > It happens because the compiler assumes that 'size'
> > variable in vhost_enqueue_async_packed could wrap to 0 since
> > 'size' is uint32_t and pkt->pkt_len too.
> >
> > In practice, it would never happen since 'pkt->pkt_len' is
> > unlikely to be close to UINT32_MAX, but let's just change
> > 'size' to uint64_t to make the compiler happy without
> > having to add runtime checks.
> >
> > Fixes: 873e8dad6f49 ("vhost: support packed ring in async datapath")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Maxime Coquelin 
>
> Forgot to report the R-by from David on v1:
> Reviewed-by: David Marchand 

After retesting with other arches, I have an error for cross compiling
ARM, on another part of the code.

C compiler for the host machine: ccache aarch64-linux-gnu-gcc (gcc
12.0.1 "aarch64-linux-gnu-gcc (GCC) 12.0.1 20220205 (experimental)
[master revision f49b8d25b1ff96e9cd0932cc510b3a3accde]")

Same symptom:

[251/2797] Compiling C object lib/librte_vhost.a.p/vhost_virtio_net.c.o
FAILED: lib/librte_vhost.a.p/vhost_virtio_net.c.o
ccache aarch64-linux-gnu-gcc -Ilib/librte_vhost.a.p -Ilib
-I../../../git/pub/dpdk.org/main/lib -Ilib/vhost
-I../../../git/pub/dpdk.org/main/lib/vhost -I.
-I../../../git/pub/dpdk.org/main -Iconfig
-I../../../git/pub/dpdk.org/main/config -Ilib/eal/include
-I../../../git/pub/dpdk.org/main/lib/eal/include
-Ilib/eal/linux/include
-I../../../git/pub/dpdk.org/main/lib/eal/linux/include
-Ilib/eal/arm/include
-I../../../git/pub/dpdk.org/main/lib/eal/arm/include -Ilib/eal/common
-I../../../git/pub/dpdk.org/main/lib/eal/common -Ilib/eal
-I../../../git/pub/dpdk.org/main/lib/eal -Ilib/kvargs
-I../../../git/pub/dpdk.org/main/lib/kvargs -Ilib/metrics
-I../../../git/pub/dpdk.org/main/lib/metrics -Ilib/telemetry
-I../../../git/pub/dpdk.org/main/lib/telemetry -Ilib/ethdev
-I../../../git/pub/dpdk.org/main/lib/ethdev -Ilib/net
-I../../../git/pub/dpdk.org/main/lib/net -Ilib/mbuf
-I../../../git/pub/dpdk.org/main/lib/mbuf -Ilib/mempool
-I../../../git/pub/dpdk.org/main/lib/mempool -Ilib/ring
-I../../../git/pub/dpdk.org/main/lib/ring -Ilib/meter
-I../../../git/pub/dpdk.org/main/lib/meter -Ilib/cryptodev
-I../../../git/pub/dpdk.org/main/lib/cryptodev -Ilib/rcu
-I../../../git/pub/dpdk.org/main/lib/rcu -Ilib/hash
-I../../../git/pub/dpdk.org/main/lib/hash -Ilib/pci
-I../../../git/pub/dpdk.org/main/lib/pci -Ilib/dmadev
-I../../../git/pub/dpdk.org/main/lib/dmadev -fdiagnostics-color=always
-D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -O2 -g
-include rte_config.h -Wcast-qual -Wdeprecated -Wformat
-Wformat-nonliteral -Wformat-security -Wmissing-declarations
-Wmissing-prototypes -Wnested-externs -Wold-style-definition
-Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef
-Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned
-Wno-missing-field-initializers -Wno-zero-length-bounds -D_GNU_SOURCE
-fPIC -march=armv8-a+crc -moutline-atomics -DALLOW_EXPERIMENTAL_API
-DALLOW_INTERNAL_API -Wno-format-truncation -DVHOST_GCC_UNROLL_PRAGMA
-fno-strict-aliasing -DRTE_LOG_DEFAULT_LOGTYPE=lib.vhost -MD -MQ
lib/librte_vhost.a.p/vhost_virtio_net.c.o -MF
lib/librte_vhost.a.p/vhost_virtio_net.c.o.d -o
lib/librte_vhost.a.p/vhost_virtio_net.c.o -c
../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c
In function ‘mbuf_to_desc’,
inlined from ‘vhost_enqueue_single_packed’ at
../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1321:6,
inlined from ‘virtio_dev_rx_single_packed’ at
../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1506:6,
inlined from ‘virtio_dev_rx_packed’ at
../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1540:7:
../../../git/pub/dpdk.org/main/lib/vhost/virtio_net.c:1159:18: error:
‘buf_vec[0].buf_addr’ may be used uninitialized
[-Werror=maybe-uninitialized]
 1159 | buf_addr = buf_vec[vec_idx].buf_addr;
  | ~^~~
../../../git/pub/dpdk.org/main/lib/vhost/v

Re: [PATCH v4 0/6] add thread lifetime and attributes API

2022-10-05 Thread Tyler Retzlaff
On Wed, Oct 05, 2022 at 09:11:26AM -0700, Tyler Retzlaff wrote:
> hi David,
> 
> 
> > 
> > Newly added code can go to eal_common_thread.c rather than introduce a
> > new common/rte_thread.c file (or is there a rationale for this?).
> 
> i will make this change in the next revision. if anyone does object i
> hope they will do so quickly.

looking at this more closely i'm going to back away from making the
adjustment here. if Thomas and/or Dmitry could comment it would be
appreciated.

it appears that functions placed in eal_common_xxx files are consumed
internally by the eal where rte_xxx files are functions that are exposed
through public api.

since these additions are public api it seems they should remain in
rte_thread.c
 
i won't change this in the next revision, but please do correct me if
i'm still not on track.


Re: [PATCH 0/2] net/nfp: use a single struct eth_dev_ops

2022-10-05 Thread Ferruh Yigit

On 9/22/2022 2:03 PM, Niklas Söderlund wrote:

Hi all,

This small series aims to reduce code duplication by using a single
shared struct eth_dev_ops for NFDk and NFD3.

Patch 1/2 is a small bug fix that suck in when adding support for
NFP3800 where the error code is not correctly propagated.

Patch 2/2 is do the real work by merging the two existing structs and
adding the small glue in the setup step that differenced them.

Jin Liu (2):
   net/nfp: fix return value of nfp card init function
   net/nfp: optimize ethdev struct



Series applied to dpdk-next-net/main, thanks.



Re: [PATCH v10 00/13] preparation for the rte_flow offload of nfp PMD

2022-10-05 Thread Ferruh Yigit

On 10/5/2022 12:34 PM, Ferruh Yigit wrote:

On 9/26/2022 7:59 AM, Chaoyong He wrote:

This is the first patch series to add the support of rte_flow offload for
nfp PMD, includes:
Add the support of flower firmware application
Add the support of representor port
Add the flower service infrastructure
Add the cmsg interactive channels between pmd and fw

* Changes since v9
- Remove the use of rte_eth_tx_burst()
- Remove the logics rely on OvS

* Changes since v8
- Update the nfp.rst
- Fix the 'app_hw' to 'app_fw'
- Remove the ovs compatible header file
- Remove the use of 
rte_eth_dev_configure()/rte_eth_rx_burst()/rte_eth_dev_start() API


* Changes since v7
- Adjust the logics to make sure not break the pci probe process
- Change 'app' to 'app_fw' in all logics to avoid confuse
- Fix problem about log level

* Changes since v6
- Fix the compile error

* Changes since v5
- Compare integer with 0 explicitly
- Change helper macro to function
- Implement the dummy functions
- Remove some unnecessary logics

* Changes since v4
- Remove the unneeded '__rte_unused' attribute
- Fixup a potential memory leak problem

* Changes since v3
- Add the 'Depends-on' tag

* Changes since v2
- Remove the use of rte_panic()

* Changes since v1
- Fix the compile error

Depends-on: series-23707 ("Add support of NFP3800 chip and firmware 
with NFDk")


Chaoyong He (13):
   net/nfp: move app specific attributes to own struct
   net/nfp: simplify initialization and remove dead code
   net/nfp: move app specific init logic to own function
   net/nfp: add initial flower firmware support
   net/nfp: add flower PF setup logic
   net/nfp: add flower ctrl VNIC related logics
   net/nfp: move common rxtx function for flower use
   net/nfp: add flower ctrl VNIC rxtx logic
   net/nfp: add flower representor framework
   net/nfp: add flower PF related routines
   net/nfp: move rxtx function to header file
   net/nfp: add flower PF rxtx logic
   net/nfp: add the representor port rxtx logic



Series applied to dpdk-next-net/main, thanks.



Ahh, this has a small conflict with other set [1], and other one is fix.
To keep fix patch backportable to stable trees, I will get that series 
first and apply this set on top of it, by force push to next-net.


Please highlight these kind of dependencies in advance, in commit log etc.

And can you please double check final code in next-net head?

[1]
https://patches.dpdk.org/project/dpdk/patch/20220922130314.694790-2-niklas.soderl...@corigine.com/
[1/2] net/nfp: fix return value of nfp card init function




Re: [PATCH] examples/l2fwd-cat: fix build

2022-10-05 Thread David Marchand
On Wed, Oct 5, 2022 at 5:24 PM Kevin Traynor  wrote:
>
>  and  need to be included for the build
> since they were removed from .
>
> ../examples/l2fwd-cat/cat.c: In function ‘parse_set’:
> ../examples/l2fwd-cat/cat.c:66:16:
> warning: implicit declaration of function ‘isblank’
> [-Wimplicit-function-declaration]
>66 | while (isblank(*str))
>   |^~~
> ../examples/l2fwd-cat/cat.c:18:1:
> note: include ‘’ or provide a declaration of ‘isblank’
>17 | #include "cat.h"
>   +++ |+#include 
>18 |
> ../examples/l2fwd-cat/cat.c:70:15:
> warning: implicit declaration of function ‘isdigit’
> [-Wimplicit-function-declaration]
>70 | if ((!isdigit(*str) && *str != '(') || *str == '\0')
>   |   ^~~
> ../examples/l2fwd-cat/cat.c:70:15:
> note: include ‘’ or provide a declaration of ‘isdigit’
> ../examples/l2fwd-cat/cat.c:75:17:
> error: ‘errno’ undeclared (first use in this function)
>75 | errno = 0;
>   | ^
> ../examples/l2fwd-cat/cat.c:18:1:
> note: ‘errno’ is defined in header ‘’;
> did you forget to ‘#include ’?
>17 | #include "cat.h"
>   +++ |+#include 
>
> Fixes: 72b452c5f259 ("eal: remove unneeded includes from a public header")
>
> Signed-off-by: Kevin Traynor 
> ---
>  examples/l2fwd-cat/cat.c | 2 ++
>  1 file changed, 2 insertions(+)

There is another issue, caught when building this example out of the
meson build process (can be caught with
DPDK_BUILD_TEST_EXAMPLES=l2fwd-cat devtools/test-meson-builds.sh).
Not urgent, can you look at it?
(I added a rte_common.h inclusion in cat.h as a workaround, there may
be a bettter fix to do).


We could also enhance build coverage in GHA, since the
intel-cmt-cat/intel-cmd-cat-devel packages are available for
Ubuntu/Fedora distribs.


-- 
David Marchand



[PATCH v5 0/6] add thread lifetime and attributes API

2022-10-05 Thread Tyler Retzlaff
add rte thread lifetime and attributes api. with these api additions
there is now sufficient platform abstracted thread api to remove the
use of pthread in the unit tests.

v5:
  * include errno.h in rte_thread.c since errno.h is no longer included
in rte_common.h
  * move rte_thread_attr symbols from 22.07 to 22.11 section of
version.map.
  * remove RTE_HAS_CPUSET guards from rte_thread.h
  * change case of characters in commit message for patches {6,5,4} of
the series.

v4:
  * update version.map to show api from series added in 22.11 instead
of 22.07.
  * fix missing parameter name in rte_thread_func declaration causing
doxygen ci failure.

v3:
  * change rte_thread_func return type to uint32_t for exit value.
  * change rte_thread_join retval to be uint32_t (matched with the
return value from rte_thread_func).
  * introduce a wrapper for rte_thread_func on posix platforms to
adapt differences between rte_thread_func and pthread
start_routine.
  * remove interpretation / dereference of result from pthread_join
in posix implementation of rte_thread_join.
  * fix leak of dynamically allocated thread_routine_ctx on windows
in error paths.
  * don't cast and truncate NULL to integer value for rte_thread_join
when pthread_join returns no result.

v2:
  * split implementation of rte_thread_equal for windows / posix
and use pthread_equal for posix platforms.
  * remove parameter validation assertions and instead return
EINVAL for mandatory pointers to type that are NULL.
  * correct doxygen comment parameter name args -> arg

Tyler Retzlaff (6):
  eal: add thread attributes
  eal: add thread lifetime management
  eal: add basic rte thread ID equal API
  test/threads: add tests for thread lifetime API
  test/threads: add tests for thread attributes API
  test/threads: remove unit test use of pthread

 app/test/test_threads.c | 134 ++--
 lib/eal/common/meson.build  |   1 +
 lib/eal/common/rte_thread.c |  62 
 lib/eal/include/rte_thread.h| 185 -
 lib/eal/unix/rte_thread.c   | 141 ++
 lib/eal/version.map |  10 ++
 lib/eal/windows/include/sched.h |   2 +-
 lib/eal/windows/rte_thread.c| 219 
 8 files changed, 702 insertions(+), 52 deletions(-)
 create mode 100644 lib/eal/common/rte_thread.c

-- 
1.8.3.1



[PATCH v5 3/6] eal: add basic rte thread ID equal API

2022-10-05 Thread Tyler Retzlaff
Add rte_thread_equal() that tests if two rte_thread_id are equal.

Signed-off-by: Narcisa Vasile 
Signed-off-by: Tyler Retzlaff 
Acked-by: Chengwen Feng 
---
 lib/eal/unix/rte_thread.c| 6 ++
 lib/eal/version.map  | 1 +
 lib/eal/windows/rte_thread.c | 6 ++
 3 files changed, 13 insertions(+)

diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index d4c1a7f..37ebfcf 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -210,6 +210,12 @@ struct thread_routine_ctx {
return pthread_detach((pthread_t)thread_id.opaque_id);
 }
 
+int
+rte_thread_equal(rte_thread_t t1, rte_thread_t t2)
+{
+   return pthread_equal((pthread_t)t1.opaque_id, (pthread_t)t2.opaque_id);
+}
+
 rte_thread_t
 rte_thread_self(void)
 {
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 5fd84b6..2d64a25 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -438,6 +438,7 @@ EXPERIMENTAL {
rte_thread_attr_set_priority;
rte_thread_create;
rte_thread_detach;
+   rte_thread_equal;
rte_thread_join;
 };
 
diff --git a/lib/eal/windows/rte_thread.c b/lib/eal/windows/rte_thread.c
index e153278..1c1e9d0 100644
--- a/lib/eal/windows/rte_thread.c
+++ b/lib/eal/windows/rte_thread.c
@@ -289,6 +289,12 @@ struct thread_routine_ctx {
return 0;
 }
 
+int
+rte_thread_equal(rte_thread_t t1, rte_thread_t t2)
+{
+   return t1.opaque_id == t2.opaque_id;
+}
+
 rte_thread_t
 rte_thread_self(void)
 {
-- 
1.8.3.1



[PATCH v5 1/6] eal: add thread attributes

2022-10-05 Thread Tyler Retzlaff
Implement thread attributes for:

* thread affinity
* thread priority

Implement functions for managing thread attributes.

  Priority is represented through an enum that allows for two levels:

* RTE_THREAD_PRIORITY_NORMAL
* RTE_THREAD_PRIORITY_REALTIME_CRITICAL

  Affinity is described by the rte_cpuset_t type.

An rte_thread_attr_t object can be set to the default values
by calling rte_thread_attr_init().

Signed-off-by: Narcisa Vasile 
Signed-off-by: Tyler Retzlaff 
---
 lib/eal/common/meson.build   |   1 +
 lib/eal/common/rte_thread.c  |  62 
 lib/eal/include/rte_thread.h | 110 +--
 lib/eal/version.map  |   6 +++
 4 files changed, 176 insertions(+), 3 deletions(-)
 create mode 100644 lib/eal/common/rte_thread.c

diff --git a/lib/eal/common/meson.build b/lib/eal/common/meson.build
index 917758c..1e77dba 100644
--- a/lib/eal/common/meson.build
+++ b/lib/eal/common/meson.build
@@ -36,6 +36,7 @@ sources += files(
 'rte_random.c',
 'rte_reciprocal.c',
 'rte_service.c',
+'rte_thread.c',
 'rte_version.c',
 )
 if is_linux or is_windows
diff --git a/lib/eal/common/rte_thread.c b/lib/eal/common/rte_thread.c
new file mode 100644
index 000..760ad62
--- /dev/null
+++ b/lib/eal/common/rte_thread.c
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (C) 2022 Microsoft Corporation
+ */
+
+#include 
+
+#include 
+#include 
+
+int
+rte_thread_attr_init(rte_thread_attr_t *attr)
+{
+   if (attr == NULL)
+   return EINVAL;
+
+   CPU_ZERO(&attr->cpuset);
+   attr->priority = RTE_THREAD_PRIORITY_NORMAL;
+
+   return 0;
+}
+
+int
+rte_thread_attr_set_affinity(rte_thread_attr_t *thread_attr,
+   rte_cpuset_t *cpuset)
+{
+   if (thread_attr == NULL)
+   return EINVAL;
+
+   if (cpuset == NULL)
+   return EINVAL;
+
+   thread_attr->cpuset = *cpuset;
+
+   return 0;
+}
+
+int
+rte_thread_attr_get_affinity(rte_thread_attr_t *thread_attr,
+   rte_cpuset_t *cpuset)
+{
+   if (thread_attr == NULL)
+   return EINVAL;
+
+   if (cpuset == NULL)
+   return EINVAL;
+
+   *cpuset = thread_attr->cpuset;
+
+   return 0;
+}
+
+int
+rte_thread_attr_set_priority(rte_thread_attr_t *thread_attr,
+   enum rte_thread_priority priority)
+{
+   if (thread_attr == NULL)
+   return EINVAL;
+
+   thread_attr->priority = priority;
+
+   return 0;
+}
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index 9e261bf..a3802de 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -41,6 +41,14 @@ enum rte_thread_priority {
 };
 
 /**
+ * Representation for thread attributes.
+ */
+typedef struct {
+   enum rte_thread_priority priority; /**< thread priority */
+   rte_cpuset_t cpuset; /**< thread affinity */
+} rte_thread_attr_t;
+
+/**
  * TLS key type, an opaque pointer.
  */
 typedef struct eal_tls_key *rte_thread_key;
@@ -57,7 +65,105 @@ enum rte_thread_priority {
 __rte_experimental
 rte_thread_t rte_thread_self(void);
 
-#ifdef RTE_HAS_CPUSET
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Check if 2 thread ids are equal.
+ *
+ * @param t1
+ *   First thread id.
+ *
+ * @param t2
+ *   Second thread id.
+ *
+ * @return
+ *   If the ids are equal, return nonzero.
+ *   Otherwise, return 0.
+ */
+__rte_experimental
+int rte_thread_equal(rte_thread_t t1, rte_thread_t t2);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Initialize the attributes of a thread.
+ * These attributes can be passed to the rte_thread_create() function
+ * that will create a new thread and set its attributes according to attr.
+ *
+ * @param attr
+ *   Thread attributes to initialize.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_attr_init(rte_thread_attr_t *attr);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set the CPU affinity value in the thread attributes pointed to
+ * by 'thread_attr'.
+ *
+ * @param thread_attr
+ *   Points to the thread attributes in which affinity will be updated.
+ *
+ * @param cpuset
+ *   Points to the value of the affinity to be set.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_attr_set_affinity(rte_thread_attr_t *thread_attr,
+   rte_cpuset_t *cpuset);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Get the value of CPU affinity that is set in the thread attributes pointed
+ * to by 'thread_attr'.
+ *
+ * @param thread_attr
+ *   Points to the thread attributes from which affinity will be retrieved.
+

[PATCH v5 2/6] eal: add thread lifetime management

2022-10-05 Thread Tyler Retzlaff
The *rte_thread_create()* function can optionally receive an
rte_thread_attr_t object that will cause the thread to be created with
the affinity and priority described by the attributes object. If
no rte_thread_attr_t is passed (parameter is NULL), the default
affinity and priority are used.

On Windows, the function executed by a thread when the thread starts is
represeneted by a function pointer of type DWORD (*func) (void*).
On other platforms, the function pointer is a void* (*func) (void*).

Performing a cast between these two types of function pointers to
uniformize the API on all platforms may result in undefined behavior.
TO fix this issue, a wrapper that respects the signature required by
CreateThread() has been created on Windows.

Signed-off-by: Narcisa Vasile 
Signed-off-by: Tyler Retzlaff 
---
 lib/eal/include/rte_thread.h|  75 ++
 lib/eal/unix/rte_thread.c   | 135 +
 lib/eal/version.map |   3 +
 lib/eal/windows/include/sched.h |   2 +-
 lib/eal/windows/rte_thread.c| 213 
 5 files changed, 387 insertions(+), 41 deletions(-)

diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index a3802de..65cdcfa 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -31,6 +31,18 @@
 } rte_thread_t;
 
 /**
+ * Thread function
+ *
+ * Function pointer to thread start routine.
+ *
+ * @param arg
+ *   Argument passed to rte_thread_create().
+ * @return
+ *   Thread function exit value.
+ */
+typedef uint32_t (*rte_thread_func) (void *arg);
+
+/**
  * Thread priority values.
  */
 enum rte_thread_priority {
@@ -57,6 +69,69 @@ enum rte_thread_priority {
  * @warning
  * @b EXPERIMENTAL: this API may change without prior notice.
  *
+ * Create a new thread that will invoke the 'thread_func' routine.
+ *
+ * @param thread_id
+ *A pointer that will store the id of the newly created thread.
+ *
+ * @param thread_attr
+ *Attributes that are used at the creation of the new thread.
+ *
+ * @param thread_func
+ *The routine that the new thread will invoke when starting execution.
+ *
+ * @param arg
+ *Argument to be passed to the 'thread_func' routine.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_create(rte_thread_t *thread_id,
+   const rte_thread_attr_t *thread_attr,
+   rte_thread_func thread_func, void *arg);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Waits for the thread identified by 'thread_id' to terminate
+ *
+ * @param thread_id
+ *The identifier of the thread.
+ *
+ * @param value_ptr
+ *Stores the exit status of the thread.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_join(rte_thread_t thread_id, uint32_t *value_ptr);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Indicate that the return value of the thread is not needed and
+ * all thread resources should be release when the thread terminates.
+ *
+ * @param thread_id
+ *The id of the thread to be detached.
+ *
+ * @return
+ *   On success, return 0.
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_detach(rte_thread_t thread_id);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
  * Get the id of the calling thread.
  *
  * @return
diff --git a/lib/eal/unix/rte_thread.c b/lib/eal/unix/rte_thread.c
index 9126595..d4c1a7f 100644
--- a/lib/eal/unix/rte_thread.c
+++ b/lib/eal/unix/rte_thread.c
@@ -16,6 +16,11 @@ struct eal_tls_key {
pthread_key_t thread_index;
 };
 
+struct thread_routine_ctx {
+   rte_thread_func thread_func;
+   void *routine_args;
+};
+
 static int
 thread_map_priority_to_os_value(enum rte_thread_priority eal_pri, int *os_pri,
int *pol)
@@ -75,6 +80,136 @@ struct eal_tls_key {
return 0;
 }
 
+static void *
+thread_func_wrapper(void *arg)
+{
+   struct thread_routine_ctx ctx = *(struct thread_routine_ctx *)arg;
+
+   free(arg);
+
+   return (void *)(uintptr_t)ctx.thread_func(ctx.routine_args);
+}
+
+int
+rte_thread_create(rte_thread_t *thread_id,
+   const rte_thread_attr_t *thread_attr,
+   rte_thread_func thread_func, void *args)
+{
+   int ret = 0;
+   pthread_attr_t attr;
+   pthread_attr_t *attrp = NULL;
+   struct thread_routine_ctx *ctx;
+   struct sched_param param = {
+   .sched_priority = 0,
+   };
+   int policy = SCHED_OTHER;
+
+   ctx = calloc(1, sizeof(*ctx));
+   if (ctx == NULL) {
+   RTE_LOG(DEBUG, EAL, "Insufficient memory for thread context 
allocations\n");
+   ret = ENOMEM;
+   goto cleanup;
+ 

[PATCH v5 4/6] test/threads: add tests for thread lifetime API

2022-10-05 Thread Tyler Retzlaff
Test basic functionality and demonstrate use of following thread
lifetime api.

* rte_thread_create
* rte_thread_detach
* rte_thread_join

Signed-off-by: Narcisa Vasile 
Signed-off-by: Tyler Retzlaff 
---
 app/test/test_threads.c | 54 +++--
 1 file changed, 52 insertions(+), 2 deletions(-)

diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index b9d8b4e..1077373 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -14,7 +14,7 @@
 
 static uint32_t thread_id_ready;
 
-static void *
+static uint32_t
 thread_main(void *arg)
 {
*(rte_thread_t *)arg = rte_thread_self();
@@ -23,7 +23,55 @@
while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 1)
;
 
-   return NULL;
+   return 0;
+}
+
+static int
+test_thread_create_join(void)
+{
+   rte_thread_t thread_id;
+   rte_thread_t thread_main_id;
+
+   thread_id_ready = 0;
+   RTE_TEST_ASSERT(rte_thread_create(&thread_id, NULL, thread_main, 
&thread_main_id) == 0,
+   "Failed to create thread.");
+
+   while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
+   ;
+
+   RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
+   "Unexpected thread id.");
+
+   __atomic_store_n(&thread_id_ready, 2, __ATOMIC_RELEASE);
+
+   RTE_TEST_ASSERT(rte_thread_join(thread_id, NULL) == 0,
+   "Failed to join thread.");
+
+   return 0;
+}
+
+static int
+test_thread_create_detach(void)
+{
+   rte_thread_t thread_id;
+   rte_thread_t thread_main_id;
+
+   thread_id_ready = 0;
+   RTE_TEST_ASSERT(rte_thread_create(&thread_id, NULL, thread_main,
+   &thread_main_id) == 0, "Failed to create thread.");
+
+   while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
+   ;
+
+   RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
+   "Unexpected thread id.");
+
+   __atomic_store_n(&thread_id_ready, 2, __ATOMIC_RELEASE);
+
+   RTE_TEST_ASSERT(rte_thread_detach(thread_id) == 0,
+   "Failed to detach thread.");
+
+   return 0;
 }
 
 static int
@@ -123,6 +171,8 @@
.setup = NULL,
.teardown = NULL,
.unit_test_cases = {
+   TEST_CASE(test_thread_create_join),
+   TEST_CASE(test_thread_create_detach),
TEST_CASE(test_thread_affinity),
TEST_CASE(test_thread_priority),
TEST_CASES_END()
-- 
1.8.3.1



[PATCH v5 6/6] test/threads: remove unit test use of pthread

2022-10-05 Thread Tyler Retzlaff
Now that rte_thread provides thread lifetime functions stop using
pthread in unit tests.

Signed-off-by: Tyler Retzlaff 
---
 app/test/test_threads.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 3c22cec..e0f18e4 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -3,7 +3,6 @@
  */
 
 #include 
-#include 
 
 #include 
 #include 
@@ -79,12 +78,11 @@
 static int
 test_thread_priority(void)
 {
-   pthread_t id;
rte_thread_t thread_id;
enum rte_thread_priority priority;
 
thread_id_ready = 0;
-   RTE_TEST_ASSERT(pthread_create(&id, NULL, thread_main, &thread_id) == 0,
+   RTE_TEST_ASSERT(rte_thread_create(&thread_id, NULL, thread_main, NULL) 
== 0,
"Failed to create thread");
 
while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
@@ -131,13 +129,12 @@
 static int
 test_thread_affinity(void)
 {
-   pthread_t id;
rte_thread_t thread_id;
rte_cpuset_t cpuset0;
rte_cpuset_t cpuset1;
 
thread_id_ready = 0;
-   RTE_TEST_ASSERT(pthread_create(&id, NULL, thread_main, &thread_id) == 0,
+   RTE_TEST_ASSERT(rte_thread_create(&thread_id, NULL, thread_main, NULL) 
== 0,
"Failed to create thread");
 
while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
-- 
1.8.3.1



[PATCH v5 5/6] test/threads: add tests for thread attributes API

2022-10-05 Thread Tyler Retzlaff
Test basic functionality and demonstrate use of following thread
attributes api. Additionally, test attributes are processed when
supplied to rte_thread_create().

* rte_thread_attr_init
* rte_thread_attr_set_affinity
* rte_thread_attr_get_affinity
* rte_thread_attr_set_priority

Signed-off-by: Narcisa Vasile 
Signed-off-by: Tyler Retzlaff 
---
 app/test/test_threads.c | 73 -
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/app/test/test_threads.c b/app/test/test_threads.c
index 1077373..3c22cec 100644
--- a/app/test/test_threads.c
+++ b/app/test/test_threads.c
@@ -17,7 +17,9 @@
 static uint32_t
 thread_main(void *arg)
 {
-   *(rte_thread_t *)arg = rte_thread_self();
+   if (arg != NULL)
+   *(rte_thread_t *)arg = rte_thread_self();
+
__atomic_store_n(&thread_id_ready, 1, __ATOMIC_RELEASE);
 
while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 1)
@@ -166,6 +168,73 @@
return 0;
 }
 
+static int
+test_thread_attributes_affinity(void)
+{
+   rte_thread_t thread_id;
+   rte_thread_attr_t attr;
+   rte_cpuset_t cpuset0;
+   rte_cpuset_t cpuset1;
+
+   RTE_TEST_ASSERT(rte_thread_attr_init(&attr) == 0,
+   "Failed to initialize thread attributes");
+
+   CPU_ZERO(&cpuset0);
+   RTE_TEST_ASSERT(rte_thread_get_affinity_by_id(rte_thread_self(), 
&cpuset0) == 0,
+   "Failed to get thread affinity");
+   RTE_TEST_ASSERT(rte_thread_attr_set_affinity(&attr, &cpuset0) == 0,
+   "Failed to set thread attributes affinity");
+   RTE_TEST_ASSERT(rte_thread_attr_get_affinity(&attr, &cpuset1) == 0,
+   "Failed to get thread attributes affinity");
+   RTE_TEST_ASSERT(memcmp(&cpuset0, &cpuset1, sizeof(rte_cpuset_t)) == 0,
+   "Affinity should be stable");
+
+   thread_id_ready = 0;
+   RTE_TEST_ASSERT(rte_thread_create(&thread_id, &attr, thread_main, NULL) 
== 0,
+   "Failed to create attributes affinity thread.");
+
+   while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
+   ;
+
+   RTE_TEST_ASSERT(rte_thread_get_affinity_by_id(thread_id, &cpuset1) == 0,
+   "Failed to get attributes thread affinity");
+   RTE_TEST_ASSERT(memcmp(&cpuset0, &cpuset1, sizeof(rte_cpuset_t)) == 0,
+   "Failed to apply affinity attributes");
+
+   __atomic_store_n(&thread_id_ready, 2, __ATOMIC_RELEASE);
+
+   return 0;
+}
+
+static int
+test_thread_attributes_priority(void)
+{
+   rte_thread_t thread_id;
+   rte_thread_attr_t attr;
+   enum rte_thread_priority priority;
+
+   RTE_TEST_ASSERT(rte_thread_attr_init(&attr) == 0,
+   "Failed to initialize thread attributes");
+   RTE_TEST_ASSERT(rte_thread_attr_set_priority(&attr, 
RTE_THREAD_PRIORITY_NORMAL) == 0,
+   "Failed to set thread attributes priority");
+
+   thread_id_ready = 0;
+   RTE_TEST_ASSERT(rte_thread_create(&thread_id, &attr, thread_main, NULL) 
== 0,
+   "Failed to create attributes priority thread.");
+
+   while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
+   ;
+
+   RTE_TEST_ASSERT(rte_thread_get_priority(thread_id, &priority) == 0,
+   "Failed to get thread priority");
+   RTE_TEST_ASSERT(priority == RTE_THREAD_PRIORITY_NORMAL,
+   "Failed to apply priority attributes");
+
+   __atomic_store_n(&thread_id_ready, 2, __ATOMIC_RELEASE);
+
+   return 0;
+}
+
 static struct unit_test_suite threads_test_suite = {
.suite_name = "threads autotest",
.setup = NULL,
@@ -175,6 +244,8 @@
TEST_CASE(test_thread_create_detach),
TEST_CASE(test_thread_affinity),
TEST_CASE(test_thread_priority),
+   TEST_CASE(test_thread_attributes_affinity),
+   TEST_CASE(test_thread_attributes_priority),
TEST_CASES_END()
}
 };
-- 
1.8.3.1



[PATCH v2] sched: Fix subport profile id not set correctly.

2022-10-05 Thread Megha Ajmera
In rte_sched_subport_config() API, subport_profile_id is not set correctly.

Fixes: ac6fcb841b0f ("sched: update subport rate dynamically")
Cc: cristian.dumitre...@intel.com

Signed-off-by: Megha Ajmera 
---
 lib/sched/rte_sched.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/sched/rte_sched.c b/lib/sched/rte_sched.c
index c5fa9e4582..c91697131d 100644
--- a/lib/sched/rte_sched.c
+++ b/lib/sched/rte_sched.c
@@ -1257,8 +1257,6 @@ rte_sched_subport_config(struct rte_sched_port *port,
 
n_subports++;
 
-   subport_profile_id = 0;
-
/* Port */
port->subports[subport_id] = s;
 
-- 
2.25.1



Re: [Bug 1030] rte_malloc() and rte_free() get stuck when used with signal handler

2022-10-05 Thread Stephen Hemminger
On Mon, 13 Jun 2022 14:48:45 +0500
Sarosh Arif  wrote:

> Thank you for help, I'll do it this way.
> 
> On Sat, Jun 11, 2022 at 9:25 PM Mattias Rönnblom  
> wrote:
> >
> > On 2022-06-10 08:04, Sarosh Arif wrote:  
> > > On Thu, Jun 9, 2022 at 8:26 PM Stephen Hemminger
> > >  wrote:  
> > >>
> > >> On Thu, 09 Jun 2022 12:47:43 +
> > >> bugzi...@dpdk.org wrote:
> > >>  
> > >>> https://bugs.dpdk.org/show_bug.cgi?id=1030
> > >>>
> > >>>  Bug ID: 1030
> > >>> Summary: rte_malloc() and rte_free() get stuck when used 
> > >>> with
> > >>>  signal handler
> > >>> Product: DPDK
> > >>> Version: 22.03
> > >>>Hardware: All
> > >>>  OS: Linux
> > >>>  Status: UNCONFIRMED
> > >>>Severity: normal
> > >>>Priority: Normal
> > >>>   Component: core
> > >>>Assignee: dev@dpdk.org
> > >>>Reporter: sarosh.a...@emumba.com
> > >>>Target Milestone: ---
> > >>>
> > >>> Created attachment 205  
> > >>>--> https://bugs.dpdk.org/attachment.cgi?id=205&action=edit  
> > >>> calls rte_malloc and rte_free in the handler and main code
> > >>>
> > >>> I have a dpdk based application which uses rte_malloc() and rte_free()
> > >>> frequently in it's main code. The general method to close the 
> > >>> application is
> > >>> though sending SIGINT. The application has a signal handler written for 
> > >>> cleanup
> > >>> purposes before closing the application. The handler also uses 
> > >>> rte_free() to
> > >>> release some of the memory during cleanup. The application gets stuck 
> > >>> in a
> > >>> deadlock.
> > >>>
> > >>>
> > >>> Upon investigation I found out that both rte_free() and rte_malloc() use
> > >>> rte_spinlock_lock() function to place a lock on heap. While this lock 
> > >>> is placed
> > >>> and the application receives SIGINT, it goes into the handler without 
> > >>> releasing
> > >>> the lock. Since the handler itself calls rte_free() which tries to 
> > >>> acquire the
> > >>> lock it gets stuck.
> > >>>
> > >>>
> > >>> I have attached a sample application to reproduce this problem.
> > >>>
> > >>>
> > >>> Steps to reproduce this problem:
> > >>>
> > >>> 1. compile the code provided in attachment with any version of dpdk
> > >>> 2. run the compiled binary
> > >>> 3. press ctrl+c till the prints stop
> > >>>
> > >>> Actual Results:
> > >>> The application gets stuck in either rte_free() or rte_malloc()
> > >>>
> > >>> Expected Results:
> > >>> Application should allocate and free the memory without getting stuck
> > >>>  
> > >>
> > >> rte_malloc and rte_free are not async sigsafe()
> > >>  
> > > Oh, I did not know that. This should be mentioned in the documentation.  
> >
> > Is there anything except  that is/should be async-signal-safe?
> >  
> > >> but then again regular glibc is not either.  
> > > Memory allocated with glibc malloc() is freed by itself upon closing
> > > the application. My application runs as a secondary process, and it
> > > needs to use rte_malloc() specifically because the memory should be
> > > shared between the two processes. If I don't free it upon closure it
> > > would just be leaked. Is there any other solution for it?  
> >
> > The standard solution is that the signal handler using some appropriate,
> > async-signal-safe way talks to the main thread, which then goes on to
> > cleanly terminate the application.
> >
> > A write() to an fd, or an atomic store to a flag are two options.  

Patch is pending (why is it not merged?) to describe what is signal safe.
https://patchwork.dpdk.org/project/dpdk/patch/20220711230448.557715-1-step...@networkplumber.org/


[PATCH v2] sched: subport field is unused in hqos profile

2022-10-05 Thread Megha Ajmera
Removed ununsed subport field from profile.cfg
Correctly using subport profile id in subport config load.

Fixes: 802d214dc880 ("examples/qos_sched: update subport rate dynamically")
Cc: cristian.dumitre...@intel.com

Signed-off-by: Megha Ajmera 
---
 examples/qos_sched/cfg_file.c  | 2 +-
 examples/qos_sched/profile.cfg | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/examples/qos_sched/cfg_file.c b/examples/qos_sched/cfg_file.c
index 3d5d75fcf0..fa3e802363 100644
--- a/examples/qos_sched/cfg_file.c
+++ b/examples/qos_sched/cfg_file.c
@@ -157,7 +157,7 @@ cfg_load_subport_profile(struct rte_cfgfile *cfg,
 
profiles = rte_cfgfile_num_sections(cfg, "subport profile",
   sizeof("subport profile") - 1);
-   subport_params[0].n_pipe_profiles = profiles;
+   port_params.n_subport_profiles= profiles;
 
for (i = 0; i < profiles; i++) {
char sec_name[32];
diff --git a/examples/qos_sched/profile.cfg b/examples/qos_sched/profile.cfg
index c9ec187c93..e8de101b6c 100644
--- a/examples/qos_sched/profile.cfg
+++ b/examples/qos_sched/profile.cfg
@@ -26,8 +26,6 @@ number of subports per port = 1
 number of pipes per subport = 4096
 queue sizes = 64 64 64 64 64 64 64 64 64 64 64 64 64
 
-subport 0-8 = 0; These subports are configured with subport 
profile 0
-
 [subport profile 0]
 tb rate = 125000   ; Bytes per second
 tb size = 100  ; Bytes
-- 
2.25.1



[PATCH v3] sched: subport field is unused in hqos profile

2022-10-05 Thread Megha Ajmera
Removed unused subport field from profile.cfg
Correctly using subport profile id in subport config load.

Fixes: 802d214dc880 ("examples/qos_sched: update subport rate dynamically")
Cc: cristian.dumitre...@intel.com

Signed-off-by: Megha Ajmera 
---
 examples/qos_sched/cfg_file.c  | 2 +-
 examples/qos_sched/profile.cfg | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/examples/qos_sched/cfg_file.c b/examples/qos_sched/cfg_file.c
index 3d5d75fcf0..ca871d3287 100644
--- a/examples/qos_sched/cfg_file.c
+++ b/examples/qos_sched/cfg_file.c
@@ -157,7 +157,7 @@ cfg_load_subport_profile(struct rte_cfgfile *cfg,
 
profiles = rte_cfgfile_num_sections(cfg, "subport profile",
   sizeof("subport profile") - 1);
-   subport_params[0].n_pipe_profiles = profiles;
+   port_params.n_subport_profiles = profiles;
 
for (i = 0; i < profiles; i++) {
char sec_name[32];
diff --git a/examples/qos_sched/profile.cfg b/examples/qos_sched/profile.cfg
index c9ec187c93..e8de101b6c 100644
--- a/examples/qos_sched/profile.cfg
+++ b/examples/qos_sched/profile.cfg
@@ -26,8 +26,6 @@ number of subports per port = 1
 number of pipes per subport = 4096
 queue sizes = 64 64 64 64 64 64 64 64 64 64 64 64 64
 
-subport 0-8 = 0; These subports are configured with subport 
profile 0
-
 [subport profile 0]
 tb rate = 125000   ; Bytes per second
 tb size = 100  ; Bytes
-- 
2.25.1



rte_service unit test failing randomly

2022-10-05 Thread David Marchand
Hello,

The service_autotest unit test has been failing randomly.
This is not something new.
We have been fixing this unit test and the service code, here and there.
For some time we were "fine": the failures were rare.

But recenly (for the last two weeks at least), it started failing more
frequently in UNH lab.

The symptoms are linked to places where the unit test code is "waiting
for some time":

-  service_lcore_attr_get:
+ TestCase [ 5] : service_lcore_attr_get failed
EAL: Test assert service_lcore_attr_get line 422 failed: Service lcore
not stopped after waiting.


-  service_may_be_active:
+ TestCase [15] : service_may_be_active failed
...
EAL: Test assert service_may_be_active line 960 failed: Error: Service
not stopped after 100ms

Ideas?


Thanks.
-- 
David Marchand



RE: [PATCH v2] sched: Fix subport profile id not set correctly.

2022-10-05 Thread Dumitrescu, Cristian



> -Original Message-
> From: Ajmera, Megha 
> Sent: Wednesday, October 5, 2022 6:23 PM
> To: dev@dpdk.org; Singh, Jasvinder ;
> Dumitrescu, Cristian 
> Cc: sta...@dpdk.org
> Subject: [PATCH v2] sched: Fix subport profile id not set correctly.
> 
> In rte_sched_subport_config() API, subport_profile_id is not set correctly.
> 
> Fixes: ac6fcb841b0f ("sched: update subport rate dynamically")
> Cc: cristian.dumitre...@intel.com
> 
> Signed-off-by: Megha Ajmera 
> ---
>  lib/sched/rte_sched.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/lib/sched/rte_sched.c b/lib/sched/rte_sched.c
> index c5fa9e4582..c91697131d 100644
> --- a/lib/sched/rte_sched.c
> +++ b/lib/sched/rte_sched.c
> @@ -1257,8 +1257,6 @@ rte_sched_subport_config(struct rte_sched_port
> *port,
> 
>   n_subports++;
> 
> - subport_profile_id = 0;
> -
>   /* Port */
>   port->subports[subport_id] = s;
> 
> --
> 2.25.1

Hi Megha,

I noticed several patches from you that all fix small things, can you please 
put all of them into a series as opposed to isolated patches? This is to avoid 
apply issues due to dependency order. No need for a cover letter in this case.

Can you please also pay attention to the details in the patch title and 
description:
-title needs to have a lower case letter after "sched: "
-title needs to start with a verb
-title needs to be short and clear
-description needs to make sense

Thanks,
Cristian


Re: [PATCH] net/qede/base: fix 32-bit build with GCC 12

2022-10-05 Thread David Marchand
On Tue, Oct 4, 2022 at 1:18 PM Thomas Monjalon  wrote:
>
> A pointer is passed to a macro and it seems mistakenly referenced.
> This issue is seen only when compiling with GCC 12 for 32-bit:
>
> drivers/net/qede/base/ecore_init_fw_funcs.c:1418:25:
> error: array subscript 1 is outside array bounds of ‘u32[1]’
> {aka ‘unsigned int[1]’} [-Werror=array-bounds]
>  1418 | ecore_wr(dev, ptt, ((addr) + (4 * i)),  \
>   | ^
>  1419 |  ((u32 *)&(arr))[i]);   \
>   |  ~~~
> drivers/net/qede/base/ecore_init_fw_funcs.c:1465:17:
> note: in expansion of macro ‘ARR_REG_WR’
>  1465 | ARR_REG_WR(p_hwfn, p_ptt, addr, pData, len_in_dwords);
>   | ^~
> drivers/net/qede/base/ecore_init_fw_funcs.c:1439:35:
> note: at offset 4 into object ‘pData’ of size 4
>  1439 |  u32 *pData,
>   |  ~^
>
> Fixes: 3b307c55f2ac ("net/qede/base: update FW to 8.40.25.0")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Thomas Monjalon 

Reproduced and tested build fix.
Tested-by: David Marchand 

-- 
David Marchand



Re: [PATCH] raw/skeleton: remove useless check

2022-10-05 Thread David Marchand
On Tue, Oct 4, 2022 at 10:00 AM David Marchand
 wrote:
>
> As reported by Coverity, this check is pointless since dev is already
> dereferenced earlier. Besides, dev is passed by the rawdev layer and
> can't be NULL.
>
> Note: the issue was probably present before the incriminated commit.
> It is unclear why Coverity would start complaining about this now.
>
> Coverity issue: 380991
> Fixes: 8f1d23ece06a ("eal: deprecate RTE_FUNC_PTR_* macros")
>
> Signed-off-by: David Marchand 
Acked-by: Hemant Agrawal 

Applied, thanks.

-- 
David Marchand



Re: [PATCH] remove prefix to some local macros in apps and examples

2022-10-05 Thread David Marchand
On Tue, Oct 4, 2022 at 2:58 PM Ferruh Yigit  wrote:
>
> On 10/4/2022 9:01 AM, David Marchand wrote:
> > RTE_TEST_[RT]X_DESC_DEFAULT and RTE_TEST_[RT]X_DESC_MAX macros have been
> > copied in a lot of app/ and examples/ code.
> > Those macros are local to each program.
> >
> > They are not related to a DPDK public header/API, drop the RTE_TEST_
> > prefix.
> >
> > Signed-off-by: David Marchand
> Acked-by: Ferruh Yigit 

Applied, thanks.


-- 
David Marchand



  1   2   >