Re: [PATCH v6 1/3] eal: add rte thread create control API

2023-02-24 Thread David Marchand
On Wed, Feb 8, 2023 at 10:26 PM Tyler Retzlaff
 wrote:
>
> Add rte_thread_create_control API as a replacement for
> rte_ctrl_thread_create to allow deprecation of the use of platform
> specific types in DPDK public API.
>
> Add test from David Marchand to exercise the new API.
>
> Signed-off-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 
> Reviewed-by: Mattias Rönnblom 
> ---
>  app/test/test_threads.c| 26 
>  lib/eal/common/eal_common_thread.c | 85 
> ++
>  lib/eal/include/rte_thread.h   | 33 +++
>  lib/eal/version.map|  1 +
>  4 files changed, 137 insertions(+), 8 deletions(-)
>
> diff --git a/app/test/test_threads.c b/app/test/test_threads.c
> index e0f18e4..657ecad 100644
> --- a/app/test/test_threads.c
> +++ b/app/test/test_threads.c
> @@ -232,6 +232,31 @@
> return 0;
>  }
>
> +static int
> +test_thread_create_control_join(void)
> +{
> +   rte_thread_t thread_id;
> +   rte_thread_t thread_main_id;
> +
> +   thread_id_ready = 0;
> +   RTE_TEST_ASSERT(rte_thread_create_control(&thread_id, 
> "test_control_threads",
> +   NULL, thread_main, &thread_main_id) == 0,
> +   "Failed to create thread.");
> +
> +   while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
> +   ;
> +
> +   RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
> +   "Unexpected thread id.");
> +
> +   __atomic_store_n(&thread_id_ready, 2, __ATOMIC_RELEASE);
> +
> +   RTE_TEST_ASSERT(rte_thread_join(thread_id, NULL) == 0,
> +   "Failed to join thread.");
> +
> +   return 0;
> +}
> +
>  static struct unit_test_suite threads_test_suite = {
> .suite_name = "threads autotest",
> .setup = NULL,
> @@ -243,6 +268,7 @@
> TEST_CASE(test_thread_priority),
> TEST_CASE(test_thread_attributes_affinity),
> TEST_CASE(test_thread_attributes_priority),
> +   TEST_CASE(test_thread_create_control_join),
> TEST_CASES_END()
> }
>  };
> diff --git a/lib/eal/common/eal_common_thread.c 
> b/lib/eal/common/eal_common_thread.c
> index 3181515..4f83c97 100644
> --- a/lib/eal/common/eal_common_thread.c
> +++ b/lib/eal/common/eal_common_thread.c
> @@ -232,7 +232,10 @@ enum __rte_ctrl_thread_status {
>  };
>
>  struct rte_thread_ctrl_params {
> -   void *(*start_routine)(void *);
> +   union {
> +   void *(*ctrl_start_routine)(void *arg);
> +   rte_thread_func control_start_routine;
> +   } u;
> void *arg;
> int ret;
> /* Control thread status.
> @@ -241,27 +244,47 @@ struct rte_thread_ctrl_params {
> enum __rte_ctrl_thread_status ctrl_thread_status;
>  };
>
> -static void *ctrl_thread_init(void *arg)
> +static int ctrl_thread_init(void *arg)
>  {
> struct internal_config *internal_conf =
> eal_get_internal_configuration();
> rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
> struct rte_thread_ctrl_params *params = arg;
> -   void *(*start_routine)(void *) = params->start_routine;
> -   void *routine_arg = params->arg;
>
> __rte_thread_init(rte_lcore_id(), cpuset);
> params->ret = rte_thread_set_affinity_by_id(rte_thread_self(), 
> cpuset);
> if (params->ret != 0) {
> __atomic_store_n(&params->ctrl_thread_status,
> CTRL_THREAD_ERROR, __ATOMIC_RELEASE);
> -   return NULL;
> +   return params->ret;
> }
>
> __atomic_store_n(&params->ctrl_thread_status,
> CTRL_THREAD_RUNNING, __ATOMIC_RELEASE);
>
> -   return start_routine(routine_arg);
> +   return 0;
> +}
> +
> +static void *ctrl_thread_start(void *arg)
> +{
> +   struct rte_thread_ctrl_params *params = arg;
> +   void *(*start_routine)(void *) = params->u.ctrl_start_routine;
> +
> +   if (ctrl_thread_init(arg) != 0)
> +   return NULL;
> +
> +   return start_routine(params->arg);

Does this patch open a race on params->arg between the newly created
thread and the one that asked for creation?
Reading params->arg is not under the ctrl_thread_status
synchronisation point anymore (which was protecting against
free(params)).

WDYT?

(Note that I doubt this patch has anything to do with the recent bz
opened by Intel QE)


> +}
> +
> +static uint32_t control_thread_start(void *arg)
> +{
> +   struct rte_thread_ctrl_params *params = arg;
> +   rte_thread_func start_routine = params->u.control_start_routine;
> +
> +   if (ctrl_thread_init(arg) != 0)
> +   return params->ret;

Idem here for params->ret: it is read after the status store.


> +
> +   return start_routine(params->arg);
>  }
>
>  int
> @@ -277,12 +300,12 @@ static void *ctrl_thread_init(void *arg)
> if (!params)
> return -ENOMEM;
>
> -   params->start_routine = star

[PATCH 00/14] Enable lock annotations on most libraries and drivers

2023-02-24 Thread David Marchand
This is a follow-up to the series that introduced lock annotations.
I reworked and made annotations work in what seemed the easier cases.
In most cases, I chose to convert inline wrappers around the EAL lock
API to simple macros: I did not see much value in those wrappers and
this is way simpler than adding __rte_*lock_function tags everywhere.

A number of libraries and drivers still need more work, as their code
has non-obvious lock handling. For those components, the check is
opted out.
I leave it to their respective maintainers to enable the checks later.

Maintainers, please review.


-- 
David Marchand

David Marchand (14):
  malloc: rework heap lock handling
  mem: rework malloc heap init
  mem: annotate shared memory config locks
  hash: annotate cuckoo hash lock
  graph: annotate graph lock
  drivers: inherit lock annotations for Intel drivers
  net/cxgbe: inherit lock annotations
  net/fm10k: annotate mailbox lock
  net/sfc: rework locking in proxy code
  net/sfc: inherit lock annotations
  net/virtio: annotate lock for guest announce
  raw/ifpga: inherit lock annotations
  vdpa/sfc: inherit lock annotations
  enable lock check

 .../prog_guide/env_abstraction_layer.rst  |  5 +-
 drivers/bus/dpaa/meson.build  |  1 +
 drivers/common/cnxk/meson.build   |  1 +
 drivers/common/iavf/iavf_osdep.h  | 39 +++
 drivers/common/iavf/iavf_prototype.h  |  6 --
 drivers/common/idpf/base/idpf_osdep.h | 26 ++--
 drivers/common/mlx5/meson.build   |  1 +
 drivers/event/cnxk/meson.build|  1 +
 drivers/meson.build   |  2 +-
 drivers/net/bnx2x/meson.build |  1 +
 drivers/net/bnxt/meson.build  |  1 +
 drivers/net/cnxk/meson.build  |  1 +
 drivers/net/cxgbe/base/adapter.h  | 35 ++
 drivers/net/enic/meson.build  |  1 +
 drivers/net/fm10k/fm10k_ethdev.c  |  2 +
 drivers/net/hns3/meson.build  |  1 +
 drivers/net/i40e/base/i40e_osdep.h|  8 +--
 drivers/net/i40e/base/i40e_prototype.h|  5 --
 drivers/net/i40e/i40e_ethdev.c| 24 ---
 drivers/net/ice/base/ice_osdep.h  | 26 ++--
 drivers/net/mlx5/meson.build  |  1 +
 drivers/net/sfc/sfc.h | 41 ++--
 drivers/net/sfc/sfc_ev.c  |  6 +-
 drivers/net/sfc/sfc_repr.c| 38 ++-
 drivers/net/sfc/sfc_repr_proxy.c  | 59 +
 drivers/net/virtio/virtio_ethdev.c|  8 +--
 drivers/net/virtio/virtio_ethdev.h|  7 +-
 drivers/raw/ifpga/afu_pmd_core.c  | 17 +
 drivers/vdpa/sfc/sfc_vdpa.h   | 41 ++--
 drivers/vdpa/sfc/sfc_vdpa_ops.c   | 14 ++--
 lib/eal/common/eal_common_mcfg.c  | 66 +++
 lib/eal/common/eal_common_memory.c| 10 +--
 lib/eal/common/malloc_heap.c  | 55 +++-
 lib/eal/common/malloc_heap.h  |  3 +
 lib/eal/common/rte_malloc.c   | 10 ---
 lib/eal/freebsd/eal.c | 13 
 lib/eal/include/rte_eal_memconfig.h   | 63 ++
 lib/eal/linux/eal.c   | 13 
 lib/eal/version.map   |  4 ++
 lib/eal/windows/eal.c | 13 
 lib/graph/graph.c | 10 ++-
 lib/graph/graph_private.h | 10 ++-
 lib/hash/rte_cuckoo_hash.c|  8 +++
 lib/ipsec/meson.build |  1 +
 lib/meson.build   |  2 +-
 lib/timer/meson.build |  1 +
 lib/vhost/meson.build |  1 -
 47 files changed, 313 insertions(+), 389 deletions(-)

-- 
2.39.2



[PATCH 01/14] malloc: rework heap lock handling

2023-02-24 Thread David Marchand
Move all heap->lock manipulation into malloc_heap.c, so there is a
single location to look at, and higher level code is unaware of this
locking constraint.

The destroy helper has been reworked to zero the whole heap object but
leave the lock untouched. The heap lock is then released through the
standard API.
This makes the code less scary, even though, at this point of its
life, the heap object is probably referenced only by the caller.

Signed-off-by: David Marchand 
---
 lib/eal/common/malloc_heap.c | 45 +++-
 lib/eal/common/rte_malloc.c  | 10 
 2 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index d7c410b786..cc84dce672 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -1292,6 +1292,8 @@ int
 malloc_heap_add_external_memory(struct malloc_heap *heap,
struct rte_memseg_list *msl)
 {
+   rte_spinlock_lock(&heap->lock);
+
/* erase contents of new memory */
memset(msl->base_va, 0, msl->len);
 
@@ -1308,6 +1310,11 @@ malloc_heap_add_external_memory(struct malloc_heap *heap,
eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC,
msl->base_va, msl->len);
 
+   /* mark it as heap segment */
+   msl->heap = 1;
+
+   rte_spinlock_unlock(&heap->lock);
+
return 0;
 }
 
@@ -1315,7 +1322,12 @@ int
 malloc_heap_remove_external_memory(struct malloc_heap *heap, void *va_addr,
size_t len)
 {
-   struct malloc_elem *elem = heap->first;
+   struct malloc_elem *elem;
+   int ret = -1;
+
+   rte_spinlock_lock(&heap->lock);
+
+   elem = heap->first;
 
/* find element with specified va address */
while (elem != NULL && elem != va_addr) {
@@ -1323,20 +1335,24 @@ malloc_heap_remove_external_memory(struct malloc_heap 
*heap, void *va_addr,
/* stop if we've blown past our VA */
if (elem > (struct malloc_elem *)va_addr) {
rte_errno = ENOENT;
-   return -1;
+   goto out;
}
}
/* check if element was found */
if (elem == NULL || elem->msl->len != len) {
rte_errno = ENOENT;
-   return -1;
+   goto out;
}
/* if element's size is not equal to segment len, segment is busy */
if (elem->state == ELEM_BUSY || elem->size != len) {
rte_errno = EBUSY;
-   return -1;
+   goto out;
}
-   return destroy_elem(elem, len);
+   ret = destroy_elem(elem, len);
+
+out:
+   rte_spinlock_unlock(&heap->lock);
+   return ret;
 }
 
 int
@@ -1372,23 +1388,30 @@ malloc_heap_create(struct malloc_heap *heap, const char 
*heap_name)
 int
 malloc_heap_destroy(struct malloc_heap *heap)
 {
+   int ret = -1;
+
+   rte_spinlock_lock(&heap->lock);
+
if (heap->alloc_count != 0) {
RTE_LOG(ERR, EAL, "Heap is still in use\n");
rte_errno = EBUSY;
-   return -1;
+   goto fail;
}
if (heap->first != NULL || heap->last != NULL) {
RTE_LOG(ERR, EAL, "Heap still contains memory segments\n");
rte_errno = EBUSY;
-   return -1;
+   goto fail;
}
if (heap->total_size != 0)
RTE_LOG(ERR, EAL, "Total size not zero, heap is likely 
corrupt\n");
 
-   /* after this, the lock will be dropped */
-   memset(heap, 0, sizeof(*heap));
-
-   return 0;
+   RTE_BUILD_BUG_ON(offsetof(struct malloc_heap, lock) != 0);
+   memset(RTE_PTR_ADD(heap, sizeof(heap->lock)), 0,
+   sizeof(*heap) - sizeof(heap->lock));
+   ret = 0;
+fail:
+   rte_spinlock_unlock(&heap->lock);
+   return ret;
 }
 
 int
diff --git a/lib/eal/common/rte_malloc.c b/lib/eal/common/rte_malloc.c
index d39870bf3c..4f500892f2 100644
--- a/lib/eal/common/rte_malloc.c
+++ b/lib/eal/common/rte_malloc.c
@@ -436,10 +436,7 @@ rte_malloc_heap_memory_add(const char *heap_name, void 
*va_addr, size_t len,
goto unlock;
}
 
-   rte_spinlock_lock(&heap->lock);
ret = malloc_heap_add_external_memory(heap, msl);
-   msl->heap = 1; /* mark it as heap segment */
-   rte_spinlock_unlock(&heap->lock);
 
 unlock:
rte_mcfg_mem_write_unlock();
@@ -482,9 +479,7 @@ rte_malloc_heap_memory_remove(const char *heap_name, void 
*va_addr, size_t len)
goto unlock;
}
 
-   rte_spinlock_lock(&heap->lock);
ret = malloc_heap_remove_external_memory(heap, va_addr, len);
-   rte_spinlock_unlock(&heap->lock);
if (ret != 0)
goto unlock;
 
@@ -655,12 +650,7 @@ rte_malloc_heap_destroy(const char *heap_name)
goto unlock;
}
/* sanity checks done, now we can destroy the heap */
-   rt

[PATCH 03/14] mem: annotate shared memory config locks

2023-02-24 Thread David Marchand
Expose internal locks via some internal accessors.
Then annotate rte_mcfg_xxx_(read|write)_(|un)lock.

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_mcfg.c| 66 +
 lib/eal/include/rte_eal_memconfig.h | 63 +--
 lib/eal/version.map |  4 ++
 3 files changed, 91 insertions(+), 42 deletions(-)

diff --git a/lib/eal/common/eal_common_mcfg.c b/lib/eal/common/eal_common_mcfg.c
index cf4a279905..b60d41f7b6 100644
--- a/lib/eal/common/eal_common_mcfg.c
+++ b/lib/eal/common/eal_common_mcfg.c
@@ -69,102 +69,112 @@ eal_mcfg_update_from_internal(void)
mcfg->version = RTE_VERSION;
 }
 
+rte_rwlock_t *
+rte_mcfg_mem_get_lock(void)
+{
+   return &rte_eal_get_configuration()->mem_config->memory_hotplug_lock;
+}
+
 void
 rte_mcfg_mem_read_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+   rte_rwlock_read_lock(rte_mcfg_mem_get_lock());
 }
 
 void
 rte_mcfg_mem_read_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+   rte_rwlock_read_unlock(rte_mcfg_mem_get_lock());
 }
 
 void
 rte_mcfg_mem_write_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+   rte_rwlock_write_lock(rte_mcfg_mem_get_lock());
 }
 
 void
 rte_mcfg_mem_write_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+   rte_rwlock_write_unlock(rte_mcfg_mem_get_lock());
+}
+
+rte_rwlock_t *
+rte_mcfg_tailq_get_lock(void)
+{
+   return &rte_eal_get_configuration()->mem_config->qlock;
 }
 
 void
 rte_mcfg_tailq_read_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_lock(&mcfg->qlock);
+   rte_rwlock_read_lock(rte_mcfg_tailq_get_lock());
 }
 
 void
 rte_mcfg_tailq_read_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_unlock(&mcfg->qlock);
+   rte_rwlock_read_unlock(rte_mcfg_tailq_get_lock());
 }
 
 void
 rte_mcfg_tailq_write_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_lock(&mcfg->qlock);
+   rte_rwlock_write_lock(rte_mcfg_tailq_get_lock());
 }
 
 void
 rte_mcfg_tailq_write_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_unlock(&mcfg->qlock);
+   rte_rwlock_write_unlock(rte_mcfg_tailq_get_lock());
+}
+
+rte_rwlock_t *
+rte_mcfg_mempool_get_lock(void)
+{
+   return &rte_eal_get_configuration()->mem_config->mplock;
 }
 
 void
 rte_mcfg_mempool_read_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_lock(&mcfg->mplock);
+   rte_rwlock_read_lock(rte_mcfg_mempool_get_lock());
 }
 
 void
 rte_mcfg_mempool_read_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_unlock(&mcfg->mplock);
+   rte_rwlock_read_unlock(rte_mcfg_mempool_get_lock());
 }
 
 void
 rte_mcfg_mempool_write_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_lock(&mcfg->mplock);
+   rte_rwlock_write_lock(rte_mcfg_mempool_get_lock());
 }
 
 void
 rte_mcfg_mempool_write_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_unlock(&mcfg->mplock);
+   rte_rwlock_write_unlock(rte_mcfg_mempool_get_lock());
+}
+
+rte_spinlock_t *
+rte_mcfg_timer_get_lock(void)
+{
+   return &rte_eal_get_configuration()->mem_config->tlock;
 }
 
 void
 rte_mcfg_timer_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_spinlock_lock(&mcfg->tlock);
+   rte_spinlock_lock(rte_mcfg_timer_get_lock());
 }
 
 void
 rte_mcfg_timer_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_spinlock_unlock(&mcfg->tlock);
+   rte_spinlock_unlock(rte_mcfg_timer_get_lock());
 }
 
 bool
diff --git a/lib/eal/include/rte_eal_memconfig.h 
b/lib/eal/include/rte_eal_memconfig.h
index e175564647..c527f9aa29 100644
--- a/lib/eal/include/rte_eal_memconfig.h
+++ b/lib/eal/include/rte_eal_memconfig.h
@@ -7,6 +7,8 @@
 
 #include 
 
+#include 
+#include 
 
 /**
  * @file
@@ -18,89 +20,122 @@
 extern "C" {
 #endif
 
+/**
+ * Internal helpers used for lock annotations.
+ */
+__rte_internal
+rte_rwlock_t *
+rte_mcfg_mem_get_lock(void);
+
+__rte_internal
+rte_rwlock_t *
+rte_mcfg_tailq_get_lock(void);
+
+__rte_internal
+rte_rwlock_t *
+rte_mcfg_mempool_get_

[PATCH 02/14] mem: rework malloc heap init

2023-02-24 Thread David Marchand
rte_eal_memory_init() and rte_eal_malloc_heap_init() must be called in
a common section that takes rte_mcfg_mem_read_lock().
Split rte_eal_malloc_heap_init() in two, so that the mem lock is taken
in rte_eal_init(), making lock checks trivial (once annotated in the
next patch).

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_memory.c | 10 +-
 lib/eal/common/malloc_heap.c   | 10 ++
 lib/eal/common/malloc_heap.h   |  3 +++
 lib/eal/freebsd/eal.c  | 13 +
 lib/eal/linux/eal.c| 13 +
 lib/eal/windows/eal.c  | 13 +
 6 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/lib/eal/common/eal_common_memory.c 
b/lib/eal/common/eal_common_memory.c
index c917b981bc..5e162a1092 100644
--- a/lib/eal/common/eal_common_memory.c
+++ b/lib/eal/common/eal_common_memory.c
@@ -1078,18 +1078,11 @@ rte_eal_memory_detach(void)
 int
 rte_eal_memory_init(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
const struct internal_config *internal_conf =
eal_get_internal_configuration();
-
int retval;
-   RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
-
-   if (!mcfg)
-   return -1;
 
-   /* lock mem hotplug here, to prevent races while we init */
-   rte_mcfg_mem_read_lock();
+   RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
if (rte_eal_memseg_init() < 0)
goto fail;
@@ -1108,7 +1101,6 @@ rte_eal_memory_init(void)
 
return 0;
 fail:
-   rte_mcfg_mem_read_unlock();
return -1;
 }
 
diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index cc84dce672..7498a05ed3 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -1442,18 +1442,20 @@ rte_eal_malloc_heap_init(void)
}
}
 
-
if (register_mp_requests()) {
RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess 
actions\n");
-   rte_mcfg_mem_read_unlock();
return -1;
}
 
-   /* unlock mem hotplug here. it's safe for primary as no requests can
+   return 0;
+}
+
+int rte_eal_malloc_heap_populate(void)
+{
+   /* mem hotplug is unlocked here. it's safe for primary as no requests 
can
 * even come before primary itself is fully initialized, and secondaries
 * do not need to initialize the heap.
 */
-   rte_mcfg_mem_read_unlock();
 
/* secondary process does not need to initialize anything */
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/lib/eal/common/malloc_heap.h b/lib/eal/common/malloc_heap.h
index 3b2fbc0aa0..8f3ab57154 100644
--- a/lib/eal/common/malloc_heap.h
+++ b/lib/eal/common/malloc_heap.h
@@ -85,6 +85,9 @@ malloc_socket_to_heap_id(unsigned int socket_id);
 int
 rte_eal_malloc_heap_init(void);
 
+int
+rte_eal_malloc_heap_populate(void);
+
 void
 rte_eal_malloc_heap_cleanup(void);
 
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 7db4239c51..7daf22e314 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -754,13 +755,25 @@ rte_eal_init(int argc, char **argv)
return -1;
}
 
+   rte_mcfg_mem_read_lock();
+
if (rte_eal_memory_init() < 0) {
+   rte_mcfg_mem_read_unlock();
rte_eal_init_alert("Cannot init memory");
rte_errno = ENOMEM;
return -1;
}
 
if (rte_eal_malloc_heap_init() < 0) {
+   rte_mcfg_mem_read_unlock();
+   rte_eal_init_alert("Cannot init malloc heap");
+   rte_errno = ENODEV;
+   return -1;
+   }
+
+   rte_mcfg_mem_read_unlock();
+
+   if (rte_eal_malloc_heap_populate() < 0) {
rte_eal_init_alert("Cannot init malloc heap");
rte_errno = ENODEV;
return -1;
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index fabafbc39b..7242ee2c9a 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1197,7 +1198,10 @@ rte_eal_init(int argc, char **argv)
return -1;
}
 
+   rte_mcfg_mem_read_lock();
+
if (rte_eal_memory_init() < 0) {
+   rte_mcfg_mem_read_unlock();
rte_eal_init_alert("Cannot init memory");
rte_errno = ENOMEM;
return -1;
@@ -1207,6 +1211,15 @@ rte_eal_init(int argc, char **argv)
eal_hugedirs_unlock();
 
if (rte_eal_malloc_heap_init() < 0) {
+   rte_mcfg_mem_read_unlock();
+   rte_eal_init_alert("Cannot init malloc heap");
+   rte_errno = ENODEV;
+   return -1;
+   }
+
+   

[PATCH 04/14] hash: annotate cuckoo hash lock

2023-02-24 Thread David Marchand
__hash_rw_(reader|writer)_(|un)lock helpers take locks depending on
conditions that are fixed at rte_hash object initialisation.
So we can tell clang that those helpers unconditionally take/release
those locks (and waive the lock check on their implementation).

Signed-off-by: David Marchand 
---
 lib/hash/rte_cuckoo_hash.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index 829b79c89a..d92a903bb3 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -575,6 +575,8 @@ rte_hash_count(const struct rte_hash *h)
 /* Read write locks implemented using rte_rwlock */
 static inline void
 __hash_rw_writer_lock(const struct rte_hash *h)
+   __rte_exclusive_lock_function(&h->readwrite_lock)
+   __rte_no_thread_safety_analysis
 {
if (h->writer_takes_lock && h->hw_trans_mem_support)
rte_rwlock_write_lock_tm(h->readwrite_lock);
@@ -584,6 +586,8 @@ __hash_rw_writer_lock(const struct rte_hash *h)
 
 static inline void
 __hash_rw_reader_lock(const struct rte_hash *h)
+   __rte_shared_lock_function(&h->readwrite_lock)
+   __rte_no_thread_safety_analysis
 {
if (h->readwrite_concur_support && h->hw_trans_mem_support)
rte_rwlock_read_lock_tm(h->readwrite_lock);
@@ -593,6 +597,8 @@ __hash_rw_reader_lock(const struct rte_hash *h)
 
 static inline void
 __hash_rw_writer_unlock(const struct rte_hash *h)
+   __rte_unlock_function(&h->readwrite_lock)
+   __rte_no_thread_safety_analysis
 {
if (h->writer_takes_lock && h->hw_trans_mem_support)
rte_rwlock_write_unlock_tm(h->readwrite_lock);
@@ -602,6 +608,8 @@ __hash_rw_writer_unlock(const struct rte_hash *h)
 
 static inline void
 __hash_rw_reader_unlock(const struct rte_hash *h)
+   __rte_unlock_function(&h->readwrite_lock)
+   __rte_no_thread_safety_analysis
 {
if (h->readwrite_concur_support && h->hw_trans_mem_support)
rte_rwlock_read_unlock_tm(h->readwrite_lock);
-- 
2.39.2



[PATCH 05/14] graph: annotate graph lock

2023-02-24 Thread David Marchand
Export the internal lock and annotate the associated helpers.

Signed-off-by: David Marchand 
---
 lib/graph/graph.c | 10 --
 lib/graph/graph_private.h | 10 --
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index a839a2803b..5582631b53 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -30,16 +30,22 @@ graph_list_head_get(void)
return &graph_list;
 }
 
+rte_spinlock_t *
+graph_spinlock_get(void)
+{
+   return &graph_lock;
+}
+
 void
 graph_spinlock_lock(void)
 {
-   rte_spinlock_lock(&graph_lock);
+   rte_spinlock_lock(graph_spinlock_get());
 }
 
 void
 graph_spinlock_unlock(void)
 {
-   rte_spinlock_unlock(&graph_lock);
+   rte_spinlock_unlock(graph_spinlock_get());
 }
 
 static int
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index 7d1b30b8ac..eacdef45f0 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 
 #include "rte_graph.h"
 #include "rte_graph_worker.h"
@@ -148,20 +149,25 @@ STAILQ_HEAD(graph_head, graph);
  */
 struct graph_head *graph_list_head_get(void);
 
+rte_spinlock_t *
+graph_spinlock_get(void);
+
 /* Lock functions */
 /**
  * @internal
  *
  * Take a lock on the graph internal spin lock.
  */
-void graph_spinlock_lock(void);
+void graph_spinlock_lock(void)
+   __rte_exclusive_lock_function(graph_spinlock_get());
 
 /**
  * @internal
  *
  * Release a lock on the graph internal spin lock.
  */
-void graph_spinlock_unlock(void);
+void graph_spinlock_unlock(void)
+   __rte_unlock_function(graph_spinlock_get());
 
 /* Graph operations */
 /**
-- 
2.39.2



[PATCH 07/14] net/cxgbe: inherit lock annotations

2023-02-24 Thread David Marchand
Due to clang limitation, inline helpers don't inherit lock annotations
from the EAL lock API.
Replace them with macros.

Signed-off-by: David Marchand 
---
 drivers/net/cxgbe/base/adapter.h | 35 +++-
 1 file changed, 7 insertions(+), 28 deletions(-)

diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index 16cbc1a345..8f2ffa0eeb 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -351,28 +351,19 @@ struct adapter {
  * t4_os_rwlock_init - initialize rwlock
  * @lock: the rwlock
  */
-static inline void t4_os_rwlock_init(rte_rwlock_t *lock)
-{
-   rte_rwlock_init(lock);
-}
+#define t4_os_rwlock_init(lock) rte_rwlock_init(lock)
 
 /**
  * t4_os_write_lock - get a write lock
  * @lock: the rwlock
  */
-static inline void t4_os_write_lock(rte_rwlock_t *lock)
-{
-   rte_rwlock_write_lock(lock);
-}
+#define t4_os_write_lock(lock) rte_rwlock_write_lock(lock)
 
 /**
  * t4_os_write_unlock - unlock a write lock
  * @lock: the rwlock
  */
-static inline void t4_os_write_unlock(rte_rwlock_t *lock)
-{
-   rte_rwlock_write_unlock(lock);
-}
+#define t4_os_write_unlock(lock) rte_rwlock_write_unlock(lock)
 
 /**
  * ethdev2pinfo - return the port_info structure associated with a rte_eth_dev
@@ -678,37 +669,25 @@ static inline void t4_os_set_hw_addr(struct adapter 
*adapter, int port_idx,
  * t4_os_lock_init - initialize spinlock
  * @lock: the spinlock
  */
-static inline void t4_os_lock_init(rte_spinlock_t *lock)
-{
-   rte_spinlock_init(lock);
-}
+#define t4_os_lock_init(lock) rte_spinlock_init(lock)
 
 /**
  * t4_os_lock - spin until lock is acquired
  * @lock: the spinlock
  */
-static inline void t4_os_lock(rte_spinlock_t *lock)
-{
-   rte_spinlock_lock(lock);
-}
+#define t4_os_lock(lock) rte_spinlock_lock(lock)
 
 /**
  * t4_os_unlock - unlock a spinlock
  * @lock: the spinlock
  */
-static inline void t4_os_unlock(rte_spinlock_t *lock)
-{
-   rte_spinlock_unlock(lock);
-}
+#define t4_os_unlock(lock) rte_spinlock_unlock(lock)
 
 /**
  * t4_os_trylock - try to get a lock
  * @lock: the spinlock
  */
-static inline int t4_os_trylock(rte_spinlock_t *lock)
-{
-   return rte_spinlock_trylock(lock);
-}
+#define t4_os_trylock(lock) rte_spinlock_trylock(lock)
 
 /**
  * t4_os_init_list_head - initialize
-- 
2.39.2



[PATCH 06/14] drivers: inherit lock annotations for Intel drivers

2023-02-24 Thread David Marchand
Due to clang limitation, inline helpers don't inherit lock annotations
from the EAL lock API.
Replace them with macros.

Signed-off-by: David Marchand 
---
 drivers/common/iavf/iavf_osdep.h   | 39 ++
 drivers/common/iavf/iavf_prototype.h   |  6 
 drivers/common/idpf/base/idpf_osdep.h  | 26 +++--
 drivers/net/i40e/base/i40e_osdep.h |  8 +++---
 drivers/net/i40e/base/i40e_prototype.h |  5 
 drivers/net/i40e/i40e_ethdev.c | 24 
 drivers/net/ice/base/ice_osdep.h   | 26 +++--
 7 files changed, 20 insertions(+), 114 deletions(-)

diff --git a/drivers/common/iavf/iavf_osdep.h b/drivers/common/iavf/iavf_osdep.h
index 31d3d809f9..263d92400c 100644
--- a/drivers/common/iavf/iavf_osdep.h
+++ b/drivers/common/iavf/iavf_osdep.h
@@ -170,11 +170,6 @@ struct iavf_virt_mem {
u32 size;
 } __rte_packed;
 
-/* SW spinlock */
-struct iavf_spinlock {
-   rte_spinlock_t spinlock;
-};
-
 #define iavf_allocate_dma_mem(h, m, unused, s, a) \
iavf_allocate_dma_mem_d(h, m, s, a)
 #define iavf_free_dma_mem(h, m) iavf_free_dma_mem_d(h, m)
@@ -182,32 +177,14 @@ struct iavf_spinlock {
 #define iavf_allocate_virt_mem(h, m, s) iavf_allocate_virt_mem_d(h, m, s)
 #define iavf_free_virt_mem(h, m) iavf_free_virt_mem_d(h, m)
 
-static inline void
-iavf_init_spinlock_d(struct iavf_spinlock *sp)
-{
-   rte_spinlock_init(&sp->spinlock);
-}
-
-static inline void
-iavf_acquire_spinlock_d(struct iavf_spinlock *sp)
-{
-   rte_spinlock_lock(&sp->spinlock);
-}
-
-static inline void
-iavf_release_spinlock_d(struct iavf_spinlock *sp)
-{
-   rte_spinlock_unlock(&sp->spinlock);
-}
-
-static inline void
-iavf_destroy_spinlock_d(__rte_unused struct iavf_spinlock *sp)
-{
-}
+/* SW spinlock */
+struct iavf_spinlock {
+   rte_spinlock_t spinlock;
+};
 
-#define iavf_init_spinlock(_sp) iavf_init_spinlock_d(_sp)
-#define iavf_acquire_spinlock(_sp) iavf_acquire_spinlock_d(_sp)
-#define iavf_release_spinlock(_sp) iavf_release_spinlock_d(_sp)
-#define iavf_destroy_spinlock(_sp) iavf_destroy_spinlock_d(_sp)
+#define iavf_init_spinlock(sp) rte_spinlock_init(&(sp)->spinlock)
+#define iavf_acquire_spinlock(sp) rte_spinlock_lock(&(sp)->spinlock)
+#define iavf_release_spinlock(sp) rte_spinlock_unlock(&(sp)->spinlock)
+#define iavf_destroy_spinlock(sp) RTE_SET_USED(sp)
 
 #endif /* _IAVF_OSDEP_H_ */
diff --git a/drivers/common/iavf/iavf_prototype.h 
b/drivers/common/iavf/iavf_prototype.h
index b5124de5bf..ba78ec5169 100644
--- a/drivers/common/iavf/iavf_prototype.h
+++ b/drivers/common/iavf/iavf_prototype.h
@@ -76,12 +76,6 @@ STATIC INLINE struct iavf_rx_ptype_decoded 
decode_rx_desc_ptype(u8 ptype)
return iavf_ptype_lookup[ptype];
 }
 
-/* prototype for functions used for SW spinlocks */
-void iavf_init_spinlock(struct iavf_spinlock *sp);
-void iavf_acquire_spinlock(struct iavf_spinlock *sp);
-void iavf_release_spinlock(struct iavf_spinlock *sp);
-void iavf_destroy_spinlock(struct iavf_spinlock *sp);
-
 __rte_internal
 void iavf_vf_parse_hw_config(struct iavf_hw *hw,
 struct virtchnl_vf_resource *msg);
diff --git a/drivers/common/idpf/base/idpf_osdep.h 
b/drivers/common/idpf/base/idpf_osdep.h
index 99ae9cf60a..49bd7c4b21 100644
--- a/drivers/common/idpf/base/idpf_osdep.h
+++ b/drivers/common/idpf/base/idpf_osdep.h
@@ -210,28 +210,10 @@ struct idpf_lock {
rte_spinlock_t spinlock;
 };
 
-static inline void
-idpf_init_lock(struct idpf_lock *sp)
-{
-   rte_spinlock_init(&sp->spinlock);
-}
-
-static inline void
-idpf_acquire_lock(struct idpf_lock *sp)
-{
-   rte_spinlock_lock(&sp->spinlock);
-}
-
-static inline void
-idpf_release_lock(struct idpf_lock *sp)
-{
-   rte_spinlock_unlock(&sp->spinlock);
-}
-
-static inline void
-idpf_destroy_lock(__rte_unused struct idpf_lock *sp)
-{
-}
+#define idpf_init_lock(sp) rte_spinlock_init(&(sp)->spinlock)
+#define idpf_acquire_lock(sp) rte_spinlock_lock(&(sp)->spinlock)
+#define idpf_release_lock(sp) rte_spinlock_unlock(&(sp)->spinlock)
+#define idpf_destroy_lock(sp) RTE_SET_USED(sp)
 
 struct idpf_hw;
 
diff --git a/drivers/net/i40e/base/i40e_osdep.h 
b/drivers/net/i40e/base/i40e_osdep.h
index 51537c5cf3..aa5dc61841 100644
--- a/drivers/net/i40e/base/i40e_osdep.h
+++ b/drivers/net/i40e/base/i40e_osdep.h
@@ -215,10 +215,10 @@ struct i40e_spinlock {
rte_spinlock_t spinlock;
 };
 
-#define i40e_init_spinlock(_sp) i40e_init_spinlock_d(_sp)
-#define i40e_acquire_spinlock(_sp) i40e_acquire_spinlock_d(_sp)
-#define i40e_release_spinlock(_sp) i40e_release_spinlock_d(_sp)
-#define i40e_destroy_spinlock(_sp) i40e_destroy_spinlock_d(_sp)
+#define i40e_init_spinlock(sp) rte_spinlock_init(&(sp)->spinlock)
+#define i40e_acquire_spinlock(sp) rte_spinlock_lock(&(sp)->spinlock)
+#define i40e_release_spinlock(sp) rte_spinlock_unlock(&(sp)->spinlock)
+#define i40e_destroy_spinlock(sp) RTE_SET_USED(sp)
 
 #define I40E_NTOHS(a) rte_be_to_cpu_16(a)
 #define

[PATCH 08/14] net/fm10k: annotate mailbox lock

2023-02-24 Thread David Marchand
Expose requirements for helpers dealing with the
FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back) lock.

Signed-off-by: David Marchand 
---
 drivers/net/fm10k/fm10k_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 8b83063f0a..4d3c4c10cf 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -116,6 +116,7 @@ fm10k_mbx_initlock(struct fm10k_hw *hw)
 
 static void
 fm10k_mbx_lock(struct fm10k_hw *hw)
+   __rte_exclusive_lock_function(FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back))
 {
while (!rte_spinlock_trylock(FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back)))
rte_delay_us(FM10K_MBXLOCK_DELAY_US);
@@ -123,6 +124,7 @@ fm10k_mbx_lock(struct fm10k_hw *hw)
 
 static void
 fm10k_mbx_unlock(struct fm10k_hw *hw)
+   __rte_unlock_function(FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back))
 {
rte_spinlock_unlock(FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back));
 }
-- 
2.39.2



[PATCH 09/14] net/sfc: rework locking in proxy code

2023-02-24 Thread David Marchand
Remove one extra layer for proxy code: sfc_get_adapter_by_pf_port_id()
now only resolves the sa object, and sfc_adapter_(un)lock() calls are added
where necessary.

This will simplify lock checks later.

Signed-off-by: David Marchand 
---
 drivers/net/sfc/sfc_repr_proxy.c | 59 
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/drivers/net/sfc/sfc_repr_proxy.c b/drivers/net/sfc/sfc_repr_proxy.c
index 4b958ced61..4ba7683370 100644
--- a/drivers/net/sfc/sfc_repr_proxy.c
+++ b/drivers/net/sfc/sfc_repr_proxy.c
@@ -51,17 +51,9 @@ sfc_get_adapter_by_pf_port_id(uint16_t pf_port_id)
dev = &rte_eth_devices[pf_port_id];
sa = sfc_adapter_by_eth_dev(dev);
 
-   sfc_adapter_lock(sa);
-
return sa;
 }
 
-static void
-sfc_put_adapter(struct sfc_adapter *sa)
-{
-   sfc_adapter_unlock(sa);
-}
-
 static struct sfc_repr_proxy_port *
 sfc_repr_proxy_find_port(struct sfc_repr_proxy *rp, uint16_t repr_id)
 {
@@ -1289,6 +1281,7 @@ sfc_repr_proxy_add_port(uint16_t pf_port_id, uint16_t 
repr_id,
int rc;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1341,7 +1334,7 @@ sfc_repr_proxy_add_port(uint16_t pf_port_id, uint16_t 
repr_id,
}
 
sfc_log_init(sa, "done");
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return 0;
 
@@ -1352,7 +1345,7 @@ sfc_repr_proxy_add_port(uint16_t pf_port_id, uint16_t 
repr_id,
 fail_alloc_port:
 fail_port_exists:
sfc_log_init(sa, "failed: %s", rte_strerror(rc));
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return rc;
 }
@@ -1366,6 +1359,7 @@ sfc_repr_proxy_del_port(uint16_t pf_port_id, uint16_t 
repr_id)
int rc;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1393,14 +1387,14 @@ sfc_repr_proxy_del_port(uint16_t pf_port_id, uint16_t 
repr_id)
 
sfc_log_init(sa, "done");
 
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return 0;
 
 fail_port_remove:
 fail_no_port:
sfc_log_init(sa, "failed: %s", rte_strerror(rc));
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return rc;
 }
@@ -1416,6 +1410,7 @@ sfc_repr_proxy_add_rxq(uint16_t pf_port_id, uint16_t 
repr_id,
struct sfc_adapter *sa;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1423,14 +1418,14 @@ sfc_repr_proxy_add_rxq(uint16_t pf_port_id, uint16_t 
repr_id,
port = sfc_repr_proxy_find_port(rp, repr_id);
if (port == NULL) {
sfc_err(sa, "%s() failed: no such port", __func__);
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
return ENOENT;
}
 
rxq = &port->rxq[queue_id];
if (rp->dp_rxq[queue_id].mp != NULL && rp->dp_rxq[queue_id].mp != mp) {
sfc_err(sa, "multiple mempools per queue are not supported");
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
return ENOTSUP;
}
 
@@ -1440,7 +1435,7 @@ sfc_repr_proxy_add_rxq(uint16_t pf_port_id, uint16_t 
repr_id,
rp->dp_rxq[queue_id].ref_count++;
 
sfc_log_init(sa, "done");
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return 0;
 }
@@ -1455,6 +1450,7 @@ sfc_repr_proxy_del_rxq(uint16_t pf_port_id, uint16_t 
repr_id,
struct sfc_adapter *sa;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1462,7 +1458,7 @@ sfc_repr_proxy_del_rxq(uint16_t pf_port_id, uint16_t 
repr_id,
port = sfc_repr_proxy_find_port(rp, repr_id);
if (port == NULL) {
sfc_err(sa, "%s() failed: no such port", __func__);
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
return;
}
 
@@ -1475,7 +1471,7 @@ sfc_repr_proxy_del_rxq(uint16_t pf_port_id, uint16_t 
repr_id,
rp->dp_rxq[queue_id].mp = NULL;
 
sfc_log_init(sa, "done");
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 }
 
 int
@@ -1489,6 +1485,7 @@ sfc_repr_proxy_add_txq(uint16_t pf_port_id, uint16_t 
repr_id,
struct sfc_adapter *sa;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1496,7 +1493,7 @@ sfc_repr_proxy_add_txq(uint16_t pf_port_id, uint16_t 
repr_id,
port = sfc_repr_proxy_find_port(rp, repr_id);
if (port == NULL) {
sfc_err(sa, "%s() failed: no such port", __func__);
-   sfc_

[PATCH 10/14] net/sfc: inherit lock annotations

2023-02-24 Thread David Marchand
Due to a clang limitation, inline helpers don't inherit lock annotations
from the EAL lock API.
Replace them with macros.

One additional change is required in sfc_ev_qpoll() so that clang sees
that the same lock is being manipulated.

Signed-off-by: David Marchand 
---
 drivers/net/sfc/sfc.h  | 41 ++
 drivers/net/sfc/sfc_ev.c   |  6 --
 drivers/net/sfc/sfc_repr.c | 38 +--
 3 files changed, 15 insertions(+), 70 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 0a1e224fa2..730d054aea 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -333,41 +333,12 @@ sfc_sa2shared(struct sfc_adapter *sa)
  * change the lock in one place.
  */
 
-static inline void
-sfc_adapter_lock_init(struct sfc_adapter *sa)
-{
-   rte_spinlock_init(&sa->lock);
-}
-
-static inline int
-sfc_adapter_is_locked(struct sfc_adapter *sa)
-{
-   return rte_spinlock_is_locked(&sa->lock);
-}
-
-static inline void
-sfc_adapter_lock(struct sfc_adapter *sa)
-{
-   rte_spinlock_lock(&sa->lock);
-}
-
-static inline int
-sfc_adapter_trylock(struct sfc_adapter *sa)
-{
-   return rte_spinlock_trylock(&sa->lock);
-}
-
-static inline void
-sfc_adapter_unlock(struct sfc_adapter *sa)
-{
-   rte_spinlock_unlock(&sa->lock);
-}
-
-static inline void
-sfc_adapter_lock_fini(__rte_unused struct sfc_adapter *sa)
-{
-   /* Just for symmetry of the API */
-}
+#define sfc_adapter_lock_init(sa) rte_spinlock_init(&(sa)->lock)
+#define sfc_adapter_is_locked(sa) rte_spinlock_is_locked(&(sa)->lock)
+#define sfc_adapter_lock(sa) rte_spinlock_lock(&(sa)->lock)
+#define sfc_adapter_trylock(sa) rte_spinlock_trylock(&(sa)->lock)
+#define sfc_adapter_unlock(sa) rte_spinlock_unlock(&(sa)->lock)
+#define sfc_adapter_lock_fini(sa) RTE_SET_USED(sa)
 
 static inline unsigned int
 sfc_nb_counter_rxq(const struct sfc_adapter_shared *sas)
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index f949abbfc3..c0d58c9554 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -570,6 +570,8 @@ static const efx_ev_callbacks_t sfc_ev_callbacks_dp_tx = {
 void
 sfc_ev_qpoll(struct sfc_evq *evq)
 {
+   struct sfc_adapter *sa;
+
SFC_ASSERT(evq->init_state == SFC_EVQ_STARTED ||
   evq->init_state == SFC_EVQ_STARTING);
 
@@ -577,8 +579,8 @@ sfc_ev_qpoll(struct sfc_evq *evq)
 
efx_ev_qpoll(evq->common, &evq->read_ptr, evq->callbacks, evq);
 
-   if (unlikely(evq->exception) && sfc_adapter_trylock(evq->sa)) {
-   struct sfc_adapter *sa = evq->sa;
+   sa = evq->sa;
+   if (unlikely(evq->exception) && sfc_adapter_trylock(sa)) {
int rc;
 
if (evq->dp_rxq != NULL) {
diff --git a/drivers/net/sfc/sfc_repr.c b/drivers/net/sfc/sfc_repr.c
index 4b03b101d8..017c3fb247 100644
--- a/drivers/net/sfc/sfc_repr.c
+++ b/drivers/net/sfc/sfc_repr.c
@@ -112,39 +112,11 @@ sfc_repr_by_eth_dev(struct rte_eth_dev *eth_dev)
  * change the lock in one place.
  */
 
-static inline void
-sfc_repr_lock_init(struct sfc_repr *sr)
-{
-   rte_spinlock_init(&sr->lock);
-}
-
-#if defined(RTE_LIBRTE_SFC_EFX_DEBUG) || defined(RTE_ENABLE_ASSERT)
-
-static inline int
-sfc_repr_lock_is_locked(struct sfc_repr *sr)
-{
-   return rte_spinlock_is_locked(&sr->lock);
-}
-
-#endif
-
-static inline void
-sfc_repr_lock(struct sfc_repr *sr)
-{
-   rte_spinlock_lock(&sr->lock);
-}
-
-static inline void
-sfc_repr_unlock(struct sfc_repr *sr)
-{
-   rte_spinlock_unlock(&sr->lock);
-}
-
-static inline void
-sfc_repr_lock_fini(__rte_unused struct sfc_repr *sr)
-{
-   /* Just for symmetry of the API */
-}
+#define sfc_repr_lock_init(sr) rte_spinlock_init(&(sr)->lock)
+#define sfc_repr_lock_is_locked(sr) rte_spinlock_is_locked(&(sr)->lock)
+#define sfc_repr_lock(sr) rte_spinlock_lock(&(sr)->lock)
+#define sfc_repr_unlock(sr) rte_spinlock_unlock(&(sr)->lock)
+#define sfc_repr_lock_fini(sr) RTE_SET_USED(sr)
 
 static void
 sfc_repr_rx_queue_stop(void *queue)
-- 
2.39.2



[PATCH 11/14] net/virtio: annotate lock for guest announce

2023-02-24 Thread David Marchand
Expose requirements for helpers dealing with the
VIRTIO_DEV_TO_HW(dev)->state_lock lock.

Signed-off-by: David Marchand 
---
 drivers/net/virtio/virtio_ethdev.c | 8 
 drivers/net/virtio/virtio_ethdev.h | 7 +--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 0103d95920..a3de44958c 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1149,11 +1149,11 @@ virtio_dev_pause(struct rte_eth_dev *dev)
 {
struct virtio_hw *hw = dev->data->dev_private;
 
-   rte_spinlock_lock(&hw->state_lock);
+   rte_spinlock_lock(&VIRTIO_DEV_TO_HW(dev)->state_lock);
 
if (hw->started == 0) {
/* Device is just stopped. */
-   rte_spinlock_unlock(&hw->state_lock);
+   rte_spinlock_unlock(&VIRTIO_DEV_TO_HW(dev)->state_lock);
return -1;
}
hw->started = 0;
@@ -1174,7 +1174,7 @@ virtio_dev_resume(struct rte_eth_dev *dev)
struct virtio_hw *hw = dev->data->dev_private;
 
hw->started = 1;
-   rte_spinlock_unlock(&hw->state_lock);
+   rte_spinlock_unlock(&VIRTIO_DEV_TO_HW(dev)->state_lock);
 }
 
 /*
@@ -1217,7 +1217,7 @@ virtio_notify_peers(struct rte_eth_dev *dev)
}
 
/* If virtio port just stopped, no need to send RARP */
-   if (virtio_dev_pause(dev) < 0) {
+   if (virtio_dev_pause(dev) != 0) {
rte_pktmbuf_free(rarp_mbuf);
return;
}
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index c08f382791..ece0130603 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -112,8 +112,11 @@ int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
 
 void virtio_interrupt_handler(void *param);
 
-int virtio_dev_pause(struct rte_eth_dev *dev);
-void virtio_dev_resume(struct rte_eth_dev *dev);
+#define VIRTIO_DEV_TO_HW(dev) ((struct virtio_hw *)(dev)->data->dev_private)
+int virtio_dev_pause(struct rte_eth_dev *dev)
+   __rte_exclusive_trylock_function(0, &VIRTIO_DEV_TO_HW(dev)->state_lock);
+void virtio_dev_resume(struct rte_eth_dev *dev)
+   __rte_unlock_function(&VIRTIO_DEV_TO_HW(dev)->state_lock);
 int virtio_dev_stop(struct rte_eth_dev *dev);
 int virtio_dev_close(struct rte_eth_dev *dev);
 int virtio_inject_pkts(struct rte_eth_dev *dev, struct rte_mbuf **tx_pkts,
-- 
2.39.2



[PATCH 12/14] raw/ifpga: inherit lock annotations

2023-02-24 Thread David Marchand
The checks in those helpers are useless:
- all (start/stop/reset/test) callers ensure that dev != NULL,
- dev->sd can't be NULL either, as it would mean the application is calling
  those helpers for a dev pointer that did not pass initialisation.

Once the checks are removed, the only thing that remains is calls to the
rte_spinlock API, so simply use macros and inherit annotations from the
lock API.

Signed-off-by: David Marchand 
---
 drivers/raw/ifpga/afu_pmd_core.c | 17 ++---
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/drivers/raw/ifpga/afu_pmd_core.c b/drivers/raw/ifpga/afu_pmd_core.c
index ddf7a34f33..3ab1f47ac1 100644
--- a/drivers/raw/ifpga/afu_pmd_core.c
+++ b/drivers/raw/ifpga/afu_pmd_core.c
@@ -23,21 +23,8 @@ static struct rte_afu_uuid 
afu_pmd_uuid_map[AFU_RAWDEV_MAX_DRVS+1];
 TAILQ_HEAD(afu_drv_list, afu_rawdev_drv);
 static struct afu_drv_list afu_pmd_list = TAILQ_HEAD_INITIALIZER(afu_pmd_list);
 
-static inline int afu_rawdev_trylock(struct afu_rawdev *dev)
-{
-   if (!dev || !dev->sd)
-   return 0;
-
-   return rte_spinlock_trylock(&dev->sd->lock);
-}
-
-static inline void afu_rawdev_unlock(struct afu_rawdev *dev)
-{
-   if (!dev || !dev->sd)
-   return;
-
-   rte_spinlock_unlock(&dev->sd->lock);
-}
+#define afu_rawdev_trylock(dev) rte_spinlock_trylock(&(dev)->sd->lock)
+#define afu_rawdev_unlock(dev) rte_spinlock_unlock(&(dev)->sd->lock)
 
 static int afu_rawdev_configure(const struct rte_rawdev *rawdev,
rte_rawdev_obj_t config, size_t config_size)
-- 
2.39.2



[PATCH 13/14] vdpa/sfc: inherit lock annotations

2023-02-24 Thread David Marchand
Due to a clang limitation, inline helpers don't inherit lock annotations
from the EAL lock API.
Replace them with macros.

sfc_vdpa_ops.c was relying on an implicit cast of the dev_handle to a
vdpa adapter object. Add an explicit conversion.

Signed-off-by: David Marchand 
---
 drivers/vdpa/sfc/sfc_vdpa.h | 41 +
 drivers/vdpa/sfc/sfc_vdpa_ops.c | 14 +--
 2 files changed, 13 insertions(+), 42 deletions(-)

diff --git a/drivers/vdpa/sfc/sfc_vdpa.h b/drivers/vdpa/sfc/sfc_vdpa.h
index b25eb3a5fe..2b843e563d 100644
--- a/drivers/vdpa/sfc/sfc_vdpa.h
+++ b/drivers/vdpa/sfc/sfc_vdpa.h
@@ -122,40 +122,11 @@ sfc_vdpa_adapter_by_dev_handle(void *dev_handle)
  * Add wrapper functions to acquire/release lock to be able to remove or
  * change the lock in one place.
  */
-static inline void
-sfc_vdpa_adapter_lock_init(struct sfc_vdpa_adapter *sva)
-{
-   rte_spinlock_init(&sva->lock);
-}
-
-static inline int
-sfc_vdpa_adapter_is_locked(struct sfc_vdpa_adapter *sva)
-{
-   return rte_spinlock_is_locked(&sva->lock);
-}
-
-static inline void
-sfc_vdpa_adapter_lock(struct sfc_vdpa_adapter *sva)
-{
-   rte_spinlock_lock(&sva->lock);
-}
-
-static inline int
-sfc_vdpa_adapter_trylock(struct sfc_vdpa_adapter *sva)
-{
-   return rte_spinlock_trylock(&sva->lock);
-}
-
-static inline void
-sfc_vdpa_adapter_unlock(struct sfc_vdpa_adapter *sva)
-{
-   rte_spinlock_unlock(&sva->lock);
-}
-
-static inline void
-sfc_vdpa_adapter_lock_fini(__rte_unused struct sfc_vdpa_adapter *sva)
-{
-   /* Just for symmetry of the API */
-}
+#define sfc_vdpa_adapter_lock_init(sva) rte_spinlock_init(&(sva)->lock)
+#define sfc_vdpa_adapter_is_locked(sva) rte_spinlock_is_locked(&(sva)->lock)
+#define sfc_vdpa_adapter_lock(sva) rte_spinlock_lock(&(sva)->lock)
+#define sfc_vdpa_adapter_trylock(sva) rte_spinlock_trylock(&(sva)->lock)
+#define sfc_vdpa_adapter_unlock(sva) rte_spinlock_unlock(&(sva)->lock)
+#define sfc_vdpa_adapter_lock_fini(sva) RTE_SET_USED(sva)
 
 #endif  /* _SFC_VDPA_H */
diff --git a/drivers/vdpa/sfc/sfc_vdpa_ops.c b/drivers/vdpa/sfc/sfc_vdpa_ops.c
index 6401d4e16f..afa6b7e8ef 100644
--- a/drivers/vdpa/sfc/sfc_vdpa_ops.c
+++ b/drivers/vdpa/sfc/sfc_vdpa_ops.c
@@ -577,7 +577,7 @@ sfc_vdpa_notify_ctrl(void *arg)
if (ops_data == NULL)
return NULL;
 
-   sfc_vdpa_adapter_lock(ops_data->dev_handle);
+   sfc_vdpa_adapter_lock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
vid = ops_data->vid;
 
@@ -586,7 +586,7 @@ sfc_vdpa_notify_ctrl(void *arg)
  "vDPA (%s): Notifier could not get configured",
  ops_data->vdpa_dev->device->name);
 
-   sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+   sfc_vdpa_adapter_unlock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
return NULL;
 }
@@ -637,7 +637,7 @@ sfc_vdpa_dev_config(int vid)
 
ops_data->vid = vid;
 
-   sfc_vdpa_adapter_lock(ops_data->dev_handle);
+   sfc_vdpa_adapter_lock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
sfc_vdpa_log_init(ops_data->dev_handle, "configuring");
rc = sfc_vdpa_configure(ops_data);
@@ -653,7 +653,7 @@ sfc_vdpa_dev_config(int vid)
if (rc != 0)
goto fail_vdpa_notify;
 
-   sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+   sfc_vdpa_adapter_unlock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
sfc_vdpa_log_init(ops_data->dev_handle, "done");
 
@@ -666,7 +666,7 @@ sfc_vdpa_dev_config(int vid)
sfc_vdpa_close(ops_data);
 
 fail_vdpa_config:
-   sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+   sfc_vdpa_adapter_unlock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
return -1;
 }
@@ -688,7 +688,7 @@ sfc_vdpa_dev_close(int vid)
return -1;
}
 
-   sfc_vdpa_adapter_lock(ops_data->dev_handle);
+   sfc_vdpa_adapter_lock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
if (ops_data->is_notify_thread_started == true) {
void *status;
ret = pthread_cancel(ops_data->notify_tid);
@@ -710,7 +710,7 @@ sfc_vdpa_dev_close(int vid)
sfc_vdpa_stop(ops_data);
sfc_vdpa_close(ops_data);
 
-   sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+   sfc_vdpa_adapter_unlock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
return 0;
 }
-- 
2.39.2



[PATCH 14/14] enable lock check

2023-02-24 Thread David Marchand
Now that a lot of components can be compiled with the lock checks,
invert the logic and opt out for components not ready yet:
- drivers/bus/dpaa,
- drivers/common/cnxk,
- drivers/common/mlx5,
- drivers/event/cnxk,
- drivers/net/bnx2x,
- drivers/net/bnxt,
- drivers/net/cnxk,
- drivers/net/enic,
- drivers/net/hns3,
- drivers/net/mlx5,
- lib/ipsec,
- lib/timer,

Signed-off-by: David Marchand 
---
 doc/guides/prog_guide/env_abstraction_layer.rst | 5 +++--
 drivers/bus/dpaa/meson.build| 1 +
 drivers/common/cnxk/meson.build | 1 +
 drivers/common/mlx5/meson.build | 1 +
 drivers/event/cnxk/meson.build  | 1 +
 drivers/meson.build | 2 +-
 drivers/net/bnx2x/meson.build   | 1 +
 drivers/net/bnxt/meson.build| 1 +
 drivers/net/cnxk/meson.build| 1 +
 drivers/net/enic/meson.build| 1 +
 drivers/net/hns3/meson.build| 1 +
 drivers/net/mlx5/meson.build| 1 +
 lib/ipsec/meson.build   | 1 +
 lib/meson.build | 2 +-
 lib/timer/meson.build   | 1 +
 lib/vhost/meson.build   | 1 -
 16 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 3f33621e05..93c8a031be 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -550,8 +550,9 @@ Some general comments:
   waiving checks with ``__rte_no_thread_safety_analysis`` in your code, please
   discuss it on the mailing list,
 
-A DPDK library/driver can enable/disable the checks by setting
-``annotate_locks`` accordingly in its ``meson.build`` file.
+The checks are enabled by default for libraries and drivers.
+They can be disabled by setting ``annotate_locks`` to ``false`` in
+the concerned library/driver ``meson.build``.
 
 IOVA Mode Detection
 ~~~
diff --git a/drivers/bus/dpaa/meson.build b/drivers/bus/dpaa/meson.build
index 5506f2bffc..183b251459 100644
--- a/drivers/bus/dpaa/meson.build
+++ b/drivers/bus/dpaa/meson.build
@@ -29,3 +29,4 @@ if cc.has_argument('-Wno-pointer-arith')
 endif
 
 includes += include_directories('include', 'base/qbman')
+annotate_locks = false
diff --git a/drivers/common/cnxk/meson.build b/drivers/common/cnxk/meson.build
index 849735921c..9beb1cddba 100644
--- a/drivers/common/cnxk/meson.build
+++ b/drivers/common/cnxk/meson.build
@@ -88,3 +88,4 @@ sources += files('cnxk_telemetry_bphy.c',
 
 deps += ['bus_pci', 'net', 'telemetry']
 pmd_supports_disable_iova_as_pa = true
+annotate_locks = false
diff --git a/drivers/common/mlx5/meson.build b/drivers/common/mlx5/meson.build
index 60ccd95cbc..d38255dc82 100644
--- a/drivers/common/mlx5/meson.build
+++ b/drivers/common/mlx5/meson.build
@@ -40,3 +40,4 @@ endif
 mlx5_config = configuration_data()
 subdir(exec_env)
 configure_file(output: 'mlx5_autoconf.h', configuration: mlx5_config)
+annotate_locks = false
diff --git a/drivers/event/cnxk/meson.build b/drivers/event/cnxk/meson.build
index aa42ab3a90..20c6a0484a 100644
--- a/drivers/event/cnxk/meson.build
+++ b/drivers/event/cnxk/meson.build
@@ -480,3 +480,4 @@ endforeach
 
 deps += ['bus_pci', 'common_cnxk', 'net_cnxk', 'crypto_cnxk']
 pmd_supports_disable_iova_as_pa = true
+annotate_locks = false
diff --git a/drivers/meson.build b/drivers/meson.build
index 0618c31a69..d529980fc5 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -91,7 +91,7 @@ foreach subpath:subdirs
 build = true # set to false to disable, e.g. missing deps
 reason = '' # set if build == false to explain
 name = drv
-annotate_locks = false
+annotate_locks = true
 sources = []
 headers = []
 driver_sdk_headers = [] # public headers included by drivers
diff --git a/drivers/net/bnx2x/meson.build b/drivers/net/bnx2x/meson.build
index 156f97d31f..dbf9c7225d 100644
--- a/drivers/net/bnx2x/meson.build
+++ b/drivers/net/bnx2x/meson.build
@@ -21,3 +21,4 @@ sources = files(
 'ecore_sp.c',
 'elink.c',
 )
+annotate_locks = false
diff --git a/drivers/net/bnxt/meson.build b/drivers/net/bnxt/meson.build
index 09d494e90f..f43dbfc445 100644
--- a/drivers/net/bnxt/meson.build
+++ b/drivers/net/bnxt/meson.build
@@ -68,3 +68,4 @@ if arch_subdir == 'x86'
 elif arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64')
 sources += files('bnxt_rxtx_vec_neon.c')
 endif
+annotate_locks = false
diff --git a/drivers/net/cnxk/meson.build b/drivers/net/cnxk/meson.build
index c7ca24d437..86ed2d13dd 100644
--- a/drivers/net/cnxk/meson.build
+++ b/drivers/net/cnxk/meson.build
@@ -196,3 +196,4 @@ endforeach
 
 headers = files('rte_pmd_cnxk.h')
 pmd_supports_disable_iova_as_pa = true
+annotate_locks = false
diff --git a/drivers/net/enic/meson.build

Re: [PATCH v3 2/9] app/testpmd: fix packet count in ieee15888 engine

2023-02-24 Thread Singh, Aman Deep



On 2/21/2023 12:04 AM, David Marchand wrote:

Don't count a packet as transmitted before it is done.

Fixes: af75078fece3 ("first public release")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 


Acked-by: Aman Singh 





回复: [PATCH v3 1/3] ethdev: enable direct rearm with separate API

2023-02-24 Thread Feifei Wang
Sorry for my delayed reply.

> -邮件原件-
> 发件人: Morten Brørup 
> 发送时间: Wednesday, January 4, 2023 6:11 PM
> 收件人: Feifei Wang ; tho...@monjalon.net;
> Ferruh Yigit ; Andrew Rybchenko
> 
> 抄送: dev@dpdk.org; konstantin.v.anan...@yandex.ru; nd ;
> Honnappa Nagarahalli ; Ruifeng Wang
> ; nd 
> 主题: RE: [PATCH v3 1/3] ethdev: enable direct rearm with separate API
> 
> > From: Feifei Wang [mailto:feifei.wa...@arm.com]
> > Sent: Wednesday, 4 January 2023 09.51
> >
> > Hi, Morten
> >
> > > 发件人: Morten Brørup 
> > > 发送时间: Wednesday, January 4, 2023 4:22 PM
> > >
> > > > From: Feifei Wang [mailto:feifei.wa...@arm.com]
> > > > Sent: Wednesday, 4 January 2023 08.31
> > > >
> > > > Add 'tx_fill_sw_ring' and 'rx_flush_descriptor' API into direct
> > rearm
> > > > mode for separate Rx and Tx Operation. And this can support
> > different
> > > > multiple sources in direct rearm mode. For examples, Rx driver is
> > > > ixgbe, and Tx driver is i40e.
> > > >
> > > > Suggested-by: Honnappa Nagarahalli 
> > > > Suggested-by: Ruifeng Wang 
> > > > Signed-off-by: Feifei Wang 
> > > > Reviewed-by: Ruifeng Wang 
> > > > Reviewed-by: Honnappa Nagarahalli 
> > > > ---
> > >
> > > This feature looks very promising for performance. I am pleased to
> > see
> > > progress on it.
> > >
> > Thanks very much for your reviewing.
> >
> > > Please confirm that the fast path functions are still thread safe,
> > i.e. one EAL
> > > thread may be calling rte_eth_rx_burst() while another EAL thread is
> > calling
> > > rte_eth_tx_burst().
> > >
> > For the multiple threads safe, like we say in cover letter, current
> > direct-rearm support Rx and Tx in the same thread. If we consider
> > multiple threads like 'pipeline model', there need to add 'lock' in
> > the data path which can decrease the performance.
> > Thus, the first step we do is try to enable direct-rearm in the single
> > thread, and then we will consider to enable direct rearm in multiple
> > threads and improve the performance.
> 
> OK, doing it in steps is a good idea for a feature like this - makes it 
> easier to
> understand and review.
> 
> When proceeding to add support for the "pipeline model", perhaps the
> lockless principles from the rte_ring can be used in this feature too.
> 
> From a high level perspective, I'm somewhat worried that releasing a "work-
> in-progress" version of this feature in some DPDK version will cause API/ABI
> breakage discussions when progressing to the next steps of the
> implementation to make the feature more complete. Not only support for
> thread safety across simultaneous RX and TX, but also support for multiple
> mbuf pools per RX queue [1]. Marking the functions experimental should
> alleviate such discussions, but there is a risk of pushback to not break the
> API/ABI anyway.
> 
> [1]:
> https://elixir.bootlin.com/dpdk/v22.11.1/source/lib/ethdev/rte_ethdev.h#L1
> 105
> 

[Feifei] I think the subsequent upgrades will not significantly damage the
stability of the API we currently define.

For thread safety across simultaneous RX and TX, in the future the lockless
operation changes will happen in the PMD layer, such as CAS load/store for
the rxq queue index of the PMD. Thus, this cannot affect the stability of
the upper API.

For multiple mbuf pools per RX queue, direct-rearm just puts Tx buffers into
Rx buffers, and it does not care which mempool a buffer comes from.
Buffers from different mempools are eventually freed into their respective
sources in the no-FAST_FREE path.
I think this is a mistake in the cover letter. The previous direct-rearm
could only support FAST_FREE, so it constrained buffers to come from the
same mempool. Now the latest version can support the no-FAST_FREE path, but
we forgot to update the cover letter.
> [...]
> 
> > > > --- a/lib/ethdev/ethdev_driver.h
> > > > +++ b/lib/ethdev/ethdev_driver.h
> > > > @@ -59,6 +59,10 @@ struct rte_eth_dev {
> > > > eth_rx_descriptor_status_t rx_descriptor_status;
> > > > /** Check the status of a Tx descriptor */
> > > > eth_tx_descriptor_status_t tx_descriptor_status;
> > > > +   /** Fill Rx sw-ring with Tx buffers in direct rearm mode */
> > > > +   eth_tx_fill_sw_ring_t tx_fill_sw_ring;
> > >
> > > What is "Rx sw-ring"? Please confirm that this is not an Intel PMD
> > specific
> > > term and/or implementation detail, e.g. by providing a conceptual
> > > implementation for a non-Intel PMD, e.g. mlx5.
> > Rx sw_ring is used  to store mbufs in intel PMD. This is the same as
> > 'rxq->elts'
> > in mlx5.
> 
> Sounds good.
> 
> Then all we need is consensus on a generic name for this, unless "Rx sw-ring"
> already is the generic name. (I'm not a PMD developer, so I might be
> completely off track here.) Naming is often debatable, so I'll stop talking
> about it now - I only wanted to highlight that we should avoid vendor-
> specific terms in public APIs intended to be implemented by multiple vendors.
> On the other hand... if no other vendors raise their voi

Re: [PATCH v2 3/9] app/testpmd: rework ieee1588 engine fwd configuration

2023-02-24 Thread Singh, Aman Deep



On 2/20/2023 10:10 PM, David Marchand wrote:

This fwd engine currently ignores the forwarding configuration.
Force it explicitly when initialising the stream.
The code is then more consistent with other fwd engines (i.e. receiving
on fs->rx_port, transmitting on fs->tx_port).

Signed-off-by: David Marchand 


Acked-by: Aman Singh 


---






Re: Devtool test-meson-build.sh failed

2023-02-24 Thread Bruce Richardson
On Thu, Feb 23, 2023 at 11:48:47PM +, Ferruh Yigit wrote:
> On 2/23/2023 10:58 PM, Tanzeel Ahmed wrote:
> > Hello, devs!
> > I hope you all are doing well!
> > 
> > I am currently working on a new patch version for "[PATCH v2] lib/net:
> > add MPLS insert and strip functionality". However, I am facing an error
> > when running the devtools/test-meson-build.sh script. The error is
> > related to unused functions in rte_mpls.h that is causing warnings to be
> > treated as errors, leading to the build failure. Following is the error
> > snippet:
> > 
> > *[50/3060] Compiling C object lib/librte_net.a.p/net_rte_net.c.o
> > FAILED: lib/librte_net.a.p/net_rte_net.c.o
> > gcc -Ilib/librte_net.a.p -Ilib -I../../lib -Ilib/net -I../../lib/net -I.
> > -I../.. -Iconfig -I../../config -Ilib/eal/include
> > -I../../lib/eal/include -Ilib/eal/linux/include
> > -I../../lib/eal/linux/include -Ilib/eal/x86/include
> > -I../../lib/eal/x86/include -Ilib/eal/common -I../../lib/eal/common
> > -Ilib/eal -I../../lib/eal -Ilib/kvargs -I../../lib/kvargs -Ilib/metrics
> > -I../../lib/metrics -Ilib/telemetry -I../../lib/telemetry -Ilib/mbuf
> > -I../../lib/mbuf -Ilib/mempool -I../../lib/mempool -Ilib/ring
> > -I../../lib/ring -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch
> > -Wextra -Werror -O2 -g -include rte_config.h -Wcast-qual -Wdeprecated
> > -Wformat -Wformat-nonliteral -Wformat-security -Wmissing-declarations
> > -Wmissing-prototypes -Wnested-externs -Wold-style-definition
> > -Wpointer-arith -Wsign-compare -Wstrict-prototypes -Wundef
> > -Wwrite-strings -Wno-missing-field-initializers -D_GNU_SOURCE -fPIC
> > -march=native -DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API
> > -DCC_X86_64_SSE42_PCLMULQDQ_SUPPORT -DRTE_LOG_DEFAULT_LOGTYPE=lib.net
> >  -MD -MQ lib/librte_net.a.p/net_rte_net.c.o -MF
> > lib/librte_net.a.p/net_rte_net.c.o.d -o
> > lib/librte_net.a.p/net_rte_net.c.o -c ../../lib/net/rte_net.c
> > In file included from ../../lib/net/rte_net.c:16:0:
> > ../../lib/net/rte_mpls.h:61:1: error: ‘rte_mpls_push_over_l2’ defined
> > but not used [-Werror=unused-function]
> >  rte_mpls_push_over_l2(struct rte_mbuf **m, const struct rte_mpls_hdr *mp)
> >  ^
> > cc1: all warnings being treated as errors
> > [99/3060] Compiling C object lib/librte_vhost.a.p/vhost_virtio_net.c.o
> > ninja: build stopped: subcommand failed.
> 
> Can it be because 'rte_mpls_push_over_l2()' is defined in 'rte_mpls.h'
> but it is not inline? (just a guess).
> 
I believe this to be the cause. Static functions in a compilation unit must
be used or else you get a warning. This does not apply to static inline
functions, therefore all functions in headers must be static and inline to
avoid compiler warnings.
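The warning is easy to reproduce in isolation. A small sketch (hypothetical file names) showing that a plain `static` function defined in a header but never used is flagged by -Wall via -Wunused-function, while a `static inline` one is not:

```shell
# Write a header with one static and one static-inline unused helper.
cat > hdr_demo.h <<'EOF'
static int unused_helper(int x) { return x + 1; }
static inline int unused_inline_helper(int x) { return x + 2; }
EOF
cat > main_demo.c <<'EOF'
#include "hdr_demo.h"
int main(void) { return 0; }
EOF
# Compile with -Wall (no -Werror): only the non-inline helper is flagged.
cc -Wall -c main_demo.c -o main_demo.o 2> warnings.txt || true
cat warnings.txt
```

This is why the fix for rte_mpls.h is to make the helper `static inline`, matching the other functions in DPDK public headers.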

/Bruce


[PATCH v2 00/11] fixes and improvements to cnxk crypto PMD

2023-02-24 Thread Tejasree Kondoj
This series adds cn10k scatter-gather support and improvements
to the cnxk crypto PMD.

v2:
* Fixed clang build failure

Anoob Joseph (2):
  crypto/cnxk: use version field directly
  common/cnxk: ensure flush inval completion with CSR read

Gowrishankar Muthukrishnan (2):
  common/cnxk: fix incorrect auth key length
  crypto/cnxk: fix order of ECFPM params

Tejasree Kondoj (7):
  crypto/cnxk: make sg version check const
  crypto/cnxk: use direct mode for zero aad length
  crypto/cnxk: set ctx for AE
  common/cnxk: add errata function for CPT set ctx
  common/cnxk: replace CPT revision check with caps
  crypto/cnxk: support cn10k IPsec SG mode
  crypto/cnxk: add model check for pdcp chain

 drivers/common/cnxk/hw/cpt.h  |  14 +-
 drivers/common/cnxk/roc_cpt.c |  16 ++
 drivers/common/cnxk/roc_errata.h  |   9 +
 drivers/common/cnxk/roc_se.h  |   7 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c |  67 +++
 drivers/crypto/cnxk/cn10k_ipsec_la_ops.h  | 222 --
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c  |  20 +-
 drivers/crypto/cnxk/cn9k_ipsec_la_ops.h   |   4 +-
 drivers/crypto/cnxk/cnxk_ae.h |  69 +--
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c  |  44 +++--
 drivers/crypto/cnxk/cnxk_se.h |   6 +-
 drivers/crypto/cnxk/cnxk_sg.h |  23 +++
 drivers/event/cnxk/cn10k_eventdev.c   |   3 +-
 13 files changed, 389 insertions(+), 115 deletions(-)

-- 
2.25.1



[PATCH v2 01/11] common/cnxk: fix incorrect auth key length

2023-02-24 Thread Tejasree Kondoj
From: Gowrishankar Muthukrishnan 

Auth key length is stored as an 8-bit value in the SE context. It should
be large enough to accommodate the maximum supported auth key length of
1024 bytes, as in HMAC.

Fixes: a45859312ff ("common/cnxk: add SE definitions for symmetric crypto")

Signed-off-by: Gowrishankar Muthukrishnan 
---
 drivers/common/cnxk/roc_se.h | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/common/cnxk/roc_se.h b/drivers/common/cnxk/roc_se.h
index 6758142214..a0c97b26c5 100644
--- a/drivers/common/cnxk/roc_se.h
+++ b/drivers/common/cnxk/roc_se.h
@@ -288,16 +288,15 @@ struct roc_se_ctx {
uint64_t enc_cipher : 8;
uint64_t hash_type : 8;
uint64_t mac_len : 8;
-   uint64_t auth_key_len : 8;
+   uint64_t auth_key_len : 16;
uint64_t fc_type : 4;
uint64_t hmac : 1;
uint64_t zsk_flags : 3;
uint64_t k_ecb : 1;
uint64_t pdcp_ci_alg : 2;
uint64_t pdcp_auth_alg : 2;
-   uint16_t ciph_then_auth : 1;
-   uint16_t auth_then_ciph : 1;
-   uint64_t rsvd : 17;
+   uint64_t ciph_then_auth : 1;
+   uint64_t auth_then_ciph : 1;
union cpt_inst_w4 template_w4;
/* Below fields are accessed by hardware */
struct se_ctx_s {
-- 
2.25.1



[PATCH v2 02/11] crypto/cnxk: make sg version check const

2023-02-24 Thread Tejasree Kondoj
Remove sg_ver2 from the burst structure and pass it as a
const argument so the compiler can generate optimized code.

Signed-off-by: Tejasree Kondoj 
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 26 +++
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 4ee6b944c4..92f7002db9 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -38,7 +38,6 @@ struct ops_burst {
struct cn10k_sso_hws *ws;
struct cnxk_cpt_qp *qp;
uint16_t nb_ops;
-   bool is_sg_ver2;
 };
 
 /* Holds information required to send vector of operations */
@@ -489,7 +488,8 @@ cn10k_cpt_vec_submit(struct vec_request vec_tbl[], uint16_t 
vec_tbl_len, struct
 }
 
 static inline int
-ca_lmtst_vec_submit(struct ops_burst *burst, struct vec_request vec_tbl[], 
uint16_t *vec_tbl_len)
+ca_lmtst_vec_submit(struct ops_burst *burst, struct vec_request vec_tbl[], 
uint16_t *vec_tbl_len,
+   const bool is_sg_ver2)
 {
struct cpt_inflight_req *infl_reqs[PKTS_PER_LOOP];
uint64_t lmt_base, lmt_arg, io_addr;
@@ -537,7 +537,7 @@ ca_lmtst_vec_submit(struct ops_burst *burst, struct 
vec_request vec_tbl[], uint1
infl_req = infl_reqs[i];
infl_req->op_flags = 0;
 
-   ret = cn10k_cpt_fill_inst(qp, &burst->op[i], inst, infl_req, 
burst->is_sg_ver2);
+   ret = cn10k_cpt_fill_inst(qp, &burst->op[i], inst, infl_req, 
is_sg_ver2);
if (unlikely(ret != 1)) {
plt_cpt_dbg("Could not process op: %p", burst->op[i]);
if (i != 0)
@@ -620,7 +620,7 @@ next_op:;
 }
 
 static inline uint16_t
-ca_lmtst_burst_submit(struct ops_burst *burst)
+ca_lmtst_burst_submit(struct ops_burst *burst, const bool is_sg_ver2)
 {
struct cpt_inflight_req *infl_reqs[PKTS_PER_LOOP];
uint64_t lmt_base, lmt_arg, io_addr;
@@ -660,7 +660,7 @@ ca_lmtst_burst_submit(struct ops_burst *burst)
infl_req = infl_reqs[i];
infl_req->op_flags = 0;
 
-   ret = cn10k_cpt_fill_inst(qp, &burst->op[i], inst, infl_req, 
burst->is_sg_ver2);
+   ret = cn10k_cpt_fill_inst(qp, &burst->op[i], inst, infl_req, 
is_sg_ver2);
if (unlikely(ret != 1)) {
plt_dp_dbg("Could not process op: %p", burst->op[i]);
if (i != 0)
@@ -729,7 +729,6 @@ cn10k_cpt_crypto_adapter_enqueue(void *ws, struct rte_event 
ev[], uint16_t nb_ev
burst.ws = ws;
burst.qp = NULL;
burst.nb_ops = 0;
-   burst.is_sg_ver2 = is_sg_ver2;
 
for (i = 0; i < nb_events; i++) {
op = ev[i].event_ptr;
@@ -743,8 +742,8 @@ cn10k_cpt_crypto_adapter_enqueue(void *ws, struct rte_event 
ev[], uint16_t nb_ev
if (qp != burst.qp) {
if (burst.nb_ops) {
if (is_vector) {
-   submitted =
-   ca_lmtst_vec_submit(&burst, 
vec_tbl, &vec_tbl_len);
+   submitted = ca_lmtst_vec_submit(&burst, 
vec_tbl,
+   
&vec_tbl_len, is_sg_ver2);
/*
 * Vector submission is required on qp 
change, but not in
 * other cases, since we could send 
several vectors per
@@ -753,7 +752,7 @@ cn10k_cpt_crypto_adapter_enqueue(void *ws, struct rte_event 
ev[], uint16_t nb_ev
cn10k_cpt_vec_submit(vec_tbl, 
vec_tbl_len, burst.qp);
vec_tbl_len = 0;
} else {
-   submitted = 
ca_lmtst_burst_submit(&burst);
+   submitted = 
ca_lmtst_burst_submit(&burst, is_sg_ver2);
}
count += submitted;
if (unlikely(submitted != burst.nb_ops))
@@ -769,9 +768,10 @@ cn10k_cpt_crypto_adapter_enqueue(void *ws, struct 
rte_event ev[], uint16_t nb_ev
/* Max nb_ops per burst check */
if (++burst.nb_ops == PKTS_PER_LOOP) {
if (is_vector)
-   submitted = ca_lmtst_vec_submit(&burst, 
vec_tbl, &vec_tbl_len);
+   submitted = ca_lmtst_vec_submit(&burst, 
vec_tbl, &vec_tbl_len,
+   is_sg_ver2);
else
-   submitted = ca_lmtst_burst_submit(&burst);
+   submitted = ca_lmtst_burst_submit(&burst, 
is_sg_ver2);
count += submitted;
 

[PATCH v2 03/11] crypto/cnxk: use version field directly

2023-02-24 Thread Tejasree Kondoj
From: Anoob Joseph 

As the version field is available in rte_ipv4_hdr, use it directly instead
of masking version_ihl.

Signed-off-by: Anoob Joseph 
---
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c | 4 ++--
 drivers/crypto/cnxk/cn9k_ipsec_la_ops.h  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/cnxk/cn9k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
index 3a07842e4b..11541b6ab9 100644
--- a/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
@@ -546,10 +546,10 @@ cn9k_cpt_sec_post_process(struct rte_crypto_op *cop,
}
}
 
-   if (((ip->version_ihl & 0xf0) >> RTE_IPV4_IHL_MULTIPLIER) == 
IPVERSION) {
+   if (ip->version == IPVERSION) {
m_len = rte_be_to_cpu_16(ip->total_length);
} else {
-   PLT_ASSERT(((ip->version_ihl & 0xf0) >> 
RTE_IPV4_IHL_MULTIPLIER) == 6);
+   PLT_ASSERT((ip->version == 6));
ip6 = (struct rte_ipv6_hdr *)ip;
m_len = rte_be_to_cpu_16(ip6->payload_len) + 
sizeof(struct rte_ipv6_hdr);
}
diff --git a/drivers/crypto/cnxk/cn9k_ipsec_la_ops.h 
b/drivers/crypto/cnxk/cn9k_ipsec_la_ops.h
index 9df41bf65d..85aacb803f 100644
--- a/drivers/crypto/cnxk/cn9k_ipsec_la_ops.h
+++ b/drivers/crypto/cnxk/cn9k_ipsec_la_ops.h
@@ -28,13 +28,13 @@ ipsec_po_out_rlen_get(struct cn9k_sec_session *sess, 
uint32_t plen, struct rte_m
uintptr_t data = (uintptr_t)m_src->buf_addr + m_src->data_off;
struct rte_ipv4_hdr *ip = (struct rte_ipv4_hdr *)data;
 
-   if (unlikely(((ip->version_ihl & 0xf0) >> 
RTE_IPV4_IHL_MULTIPLIER) != IPVERSION)) {
+   if (unlikely(ip->version != IPVERSION)) {
struct rte_ipv6_hdr *ip6 = (struct rte_ipv6_hdr *)ip;
uint8_t *nxt_hdr = (uint8_t *)ip6;
uint8_t dest_op_cnt = 0;
int nh = ip6->proto;
 
-   PLT_ASSERT(((ip->version_ihl & 0xf0) >> 
RTE_IPV4_IHL_MULTIPLIER) == 6);
+   PLT_ASSERT(ip->version == 6);
 
adj_len = ROC_CPT_TUNNEL_IPV6_HDR_LEN;
nxt_hdr += ROC_CPT_TUNNEL_IPV6_HDR_LEN;
-- 
2.25.1



[PATCH v2 04/11] crypto/cnxk: use direct mode for zero aad length

2023-02-24 Thread Tejasree Kondoj
Use direct mode if the AAD length is zero.

Signed-off-by: Tejasree Kondoj 
---
 drivers/crypto/cnxk/cnxk_se.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/cnxk/cnxk_se.h b/drivers/crypto/cnxk/cnxk_se.h
index c16027ec75..69cd343eea 100644
--- a/drivers/crypto/cnxk/cnxk_se.h
+++ b/drivers/crypto/cnxk/cnxk_se.h
@@ -2258,9 +2258,9 @@ fill_fc_params(struct rte_crypto_op *cop, struct 
cnxk_se_sess *sess,
 
aad_data = sym_op->aead.aad.data;
aad_len = sess->aad_length;
-   if (likely((aad_data + aad_len) ==
-  rte_pktmbuf_mtod_offset(m_src, uint8_t *,
-  sym_op->aead.data.offset))) {
+   if (likely((aad_len == 0) ||
+  ((aad_data + aad_len) ==
+   rte_pktmbuf_mtod_offset(m_src, uint8_t *, 
sym_op->aead.data.offset {
d_offs = (d_offs - aad_len) | (d_offs << 16);
d_lens = (d_lens + aad_len) | (d_lens << 32);
} else {
-- 
2.25.1



[PATCH v2 05/11] crypto/cnxk: set ctx for AE

2023-02-24 Thread Tejasree Kondoj
Set ctx_val to 1 for asymmetric ops.

Signed-off-by: Tejasree Kondoj 
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 18 -
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c  | 16 +++
 drivers/crypto/cnxk/cnxk_ae.h | 21 +++
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c  | 33 ---
 4 files changed, 53 insertions(+), 35 deletions(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 92f7002db9..d1a43eaf13 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -158,10 +158,8 @@ cn10k_cpt_fill_inst(struct cnxk_cpt_qp *qp, struct 
rte_crypto_op *ops[], struct
 
if (op->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
asym_op = op->asym;
-   ae_sess = (struct cnxk_ae_sess *)
-   asym_op->session->sess_private_data;
-   ret = cnxk_ae_enqueue(qp, op, infl_req, &inst[0],
- ae_sess);
+   ae_sess = (struct cnxk_ae_sess *)asym_op->session;
+   ret = cnxk_ae_enqueue(qp, op, infl_req, &inst[0], 
ae_sess);
if (unlikely(ret))
return 0;
w7 = ae_sess->cpt_inst_w7;
@@ -330,10 +328,9 @@ cn10k_cpt_crypto_adapter_ev_mdata_set(struct rte_cryptodev 
*dev __rte_unused, vo
return -EINVAL;
} else if (op_type == RTE_CRYPTO_OP_TYPE_ASYMMETRIC) {
if (sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
-   struct rte_cryptodev_asym_session *asym_sess = sess;
struct cnxk_ae_sess *priv;
 
-   priv = (struct cnxk_ae_sess 
*)asym_sess->sess_private_data;
+   priv = (struct cnxk_ae_sess *)sess;
priv->qp = qp;
priv->cpt_inst_w2 = w2;
} else
@@ -381,11 +378,9 @@ cn10k_ca_meta_info_extract(struct rte_crypto_op *op, 
struct cnxk_cpt_qp **qp, ui
}
} else if (op->type == RTE_CRYPTO_OP_TYPE_ASYMMETRIC) {
if (op->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
-   struct rte_cryptodev_asym_session *asym_sess;
struct cnxk_ae_sess *priv;
 
-   asym_sess = op->asym->session;
-   priv = (struct cnxk_ae_sess 
*)asym_sess->sess_private_data;
+   priv = (struct cnxk_ae_sess *)op->asym->session;
*qp = priv->qp;
*w2 = priv->cpt_inst_w2;
} else
@@ -890,10 +885,7 @@ cn10k_cpt_dequeue_post_process(struct cnxk_cpt_qp *qp,
} else if (cop->type == RTE_CRYPTO_OP_TYPE_ASYMMETRIC) {
struct rte_crypto_asym_op *op = cop->asym;
uintptr_t *mdata = infl_req->mdata;
-   struct cnxk_ae_sess *sess;
-
-   sess = (struct cnxk_ae_sess *)
-   op->session->sess_private_data;
+   struct cnxk_ae_sess *sess = (struct cnxk_ae_sess 
*)op->session;
 
cnxk_ae_post_process(cop, sess, (uint8_t *)mdata[0]);
}
diff --git a/drivers/crypto/cnxk/cn9k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
index 11541b6ab9..34d40b07d4 100644
--- a/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
@@ -105,13 +105,10 @@ cn9k_cpt_inst_prep(struct cnxk_cpt_qp *qp, struct 
rte_crypto_op *op,
inst->w7.u64 = sess->cpt_inst_w7;
}
} else if (op->type == RTE_CRYPTO_OP_TYPE_ASYMMETRIC) {
-   struct rte_crypto_asym_op *asym_op;
struct cnxk_ae_sess *sess;
 
if (op->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
-   asym_op = op->asym;
-   sess = (struct cnxk_ae_sess *)
-   asym_op->session->sess_private_data;
+   sess = (struct cnxk_ae_sess *)op->asym->session;
ret = cnxk_ae_enqueue(qp, op, infl_req, inst, sess);
inst->w7.u64 = sess->cpt_inst_w7;
} else {
@@ -345,7 +342,7 @@ cn9k_cpt_crypto_adapter_ev_mdata_set(struct rte_cryptodev 
*dev __rte_unused,
struct rte_cryptodev_asym_session *asym_sess = sess;
struct cnxk_ae_sess *priv;
 
-   priv = (struct cnxk_ae_sess 
*)asym_sess->sess_private_data;
+   priv = (struct cnxk_ae_sess *)asym_sess;
priv->qp = qp;
priv->cpt_inst_w2 = w2;
} else
@@ -393,11 +390,9 @@ cn9k_ca_meta_info_extract(struct rte_crypto_op *op,
   

[PATCH v2 06/11] common/cnxk: ensure flush inval completion with CSR read

2023-02-24 Thread Tejasree Kondoj
From: Anoob Joseph 

If a CSR read is issued after a write, the read blocks until the write
operation is complete. This helps in determining when the
FLUSH+INVALIDATE operation has completed.

Signed-off-by: Anoob Joseph 
---
 drivers/common/cnxk/hw/cpt.h | 11 +++
 drivers/common/cnxk/roc_cpt.c| 16 
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c |  4 
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/common/cnxk/hw/cpt.h b/drivers/common/cnxk/hw/cpt.h
index d378a4eadd..44ff8b08b2 100644
--- a/drivers/common/cnxk/hw/cpt.h
+++ b/drivers/common/cnxk/hw/cpt.h
@@ -100,6 +100,17 @@ union cpt_lf_ctx_flush {
} s;
 };
 
+union cpt_lf_ctx_err {
+   uint64_t u;
+   struct {
+   uint64_t flush_st_flt : 1;
+   uint64_t busy_flr : 1;
+   uint64_t busy_sw_flush : 1;
+   uint64_t reload_faulted : 1;
+   uint64_t reserved_4_63 : 60;
+   } s;
+};
+
 union cpt_lf_ctx_reload {
uint64_t u;
struct {
diff --git a/drivers/common/cnxk/roc_cpt.c b/drivers/common/cnxk/roc_cpt.c
index cf514be69f..dff2fbf2a4 100644
--- a/drivers/common/cnxk/roc_cpt.c
+++ b/drivers/common/cnxk/roc_cpt.c
@@ -783,6 +783,7 @@ int
 roc_cpt_lf_ctx_flush(struct roc_cpt_lf *lf, void *cptr, bool inval)
 {
union cpt_lf_ctx_flush reg;
+   union cpt_lf_ctx_err err;
 
if (lf == NULL) {
plt_err("Could not trigger CTX flush");
@@ -795,6 +796,21 @@ roc_cpt_lf_ctx_flush(struct roc_cpt_lf *lf, void *cptr, 
bool inval)
 
plt_write64(reg.u, lf->rbase + CPT_LF_CTX_FLUSH);
 
+   plt_atomic_thread_fence(__ATOMIC_ACQ_REL);
+
+   /* Read a CSR to ensure that the FLUSH operation is complete */
+   err.u = plt_read64(lf->rbase + CPT_LF_CTX_ERR);
+
+   if (err.s.busy_sw_flush && inval) {
+   plt_err("CTX entry could not be invalidated due to active 
usage.");
+   return -EAGAIN;
+   }
+
+   if (err.s.flush_st_flt) {
+   plt_err("CTX flush could not complete due to store fault");
+   abort();
+   }
+
return 0;
 }
 
diff --git a/drivers/crypto/cnxk/cnxk_cryptodev_ops.c 
b/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
index f03646fe1a..67bd7e3243 100644
--- a/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
@@ -737,8 +737,6 @@ sym_session_clear(struct rte_cryptodev_sym_session *sess, 
bool is_session_less)
/* Trigger CTX flush + invalidate to remove from CTX_CACHE */
roc_cpt_lf_ctx_flush(sess_priv->lf, &sess_priv->roc_se_ctx.se_ctx, 
true);
 
-   plt_delay_ms(1);
-
if (sess_priv->roc_se_ctx.auth_key != NULL)
plt_free(sess_priv->roc_se_ctx.auth_key);
 
@@ -767,8 +765,6 @@ cnxk_ae_session_clear(struct rte_cryptodev *dev, struct 
rte_cryptodev_asym_sessi
/* Trigger CTX flush + invalidate to remove from CTX_CACHE */
roc_cpt_lf_ctx_flush(priv->lf, &priv->hw_ctx, true);
 
-   plt_delay_ms(1);
-
/* Free resources allocated in session_cfg */
cnxk_ae_free_session_parameters(priv);
 
-- 
2.25.1



[PATCH v2 07/11] common/cnxk: add errata function for CPT set ctx

2023-02-24 Thread Tejasree Kondoj
Add a function to the errata header file for the CPT ctx_val
errata and replace the CPT revision_id checks with calls to it.

Signed-off-by: Tejasree Kondoj 
---
 drivers/common/cnxk/roc_errata.h |  9 +
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c | 13 -
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/common/cnxk/roc_errata.h b/drivers/common/cnxk/roc_errata.h
index 36e6db467a..9954b7da32 100644
--- a/drivers/common/cnxk/roc_errata.h
+++ b/drivers/common/cnxk/roc_errata.h
@@ -4,6 +4,8 @@
 #ifndef _ROC_ERRATA_H_
 #define _ROC_ERRATA_H_
 
+#include "roc_model.h"
+
 /* Errata IPBUNIXRX-40129 */
 static inline bool
 roc_errata_nix_has_no_drop_re(void)
@@ -98,4 +100,11 @@ roc_errata_nix_sdp_send_has_mtu_size_16k(void)
roc_model_is_cn96_a0() || roc_model_is_cn96_b0());
 }
 
+/* Errata IPBUCPT-38753 */
+static inline bool
+roc_errata_cpt_hang_on_mixed_ctx_val(void)
+{
+   return roc_model_is_cn10ka_a0() || roc_model_is_cn10ka_a1();
+}
+
 #endif /* _ROC_ERRATA_H_ */
diff --git a/drivers/crypto/cnxk/cnxk_cryptodev_ops.c 
b/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
index 67bd7e3243..adc1c7652b 100644
--- a/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
@@ -8,6 +8,7 @@
 
 #include "roc_ae_fpm_tables.h"
 #include "roc_cpt.h"
+#include "roc_errata.h"
 #include "roc_ie_on.h"
 
 #include "cnxk_ae.h"
@@ -636,7 +637,7 @@ cnxk_cpt_inst_w7_get(struct cnxk_se_sess *sess, struct 
roc_cpt *roc_cpt)
 
inst_w7.s.cptr = (uint64_t)&sess->roc_se_ctx.se_ctx;
 
-   if (roc_cpt->cpt_revision == ROC_CPT_REVISION_ID_106XX)
+   if (roc_errata_cpt_hang_on_mixed_ctx_val())
inst_w7.s.ctx_val = 1;
else
inst_w7.s.cptr += 8;
@@ -709,7 +710,7 @@ sym_session_configure(struct roc_cpt *roc_cpt, struct 
rte_crypto_sym_xform *xfor
 
sess_priv->cpt_inst_w7 = cnxk_cpt_inst_w7_get(sess_priv, roc_cpt);
 
-   if (roc_cpt->cpt_revision == ROC_CPT_REVISION_ID_106XX)
+   if (roc_errata_cpt_hang_on_mixed_ctx_val())
roc_se_ctx_init(&sess_priv->roc_se_ctx);
 
return 0;
@@ -735,7 +736,8 @@ sym_session_clear(struct rte_cryptodev_sym_session *sess, 
bool is_session_less)
struct cnxk_se_sess *sess_priv = (struct cnxk_se_sess *)sess;
 
/* Trigger CTX flush + invalidate to remove from CTX_CACHE */
-   roc_cpt_lf_ctx_flush(sess_priv->lf, &sess_priv->roc_se_ctx.se_ctx, 
true);
+   if (roc_errata_cpt_hang_on_mixed_ctx_val())
+   roc_cpt_lf_ctx_flush(sess_priv->lf, 
&sess_priv->roc_se_ctx.se_ctx, true);
 
if (sess_priv->roc_se_ctx.auth_key != NULL)
plt_free(sess_priv->roc_se_ctx.auth_key);
@@ -763,7 +765,8 @@ cnxk_ae_session_clear(struct rte_cryptodev *dev, struct 
rte_cryptodev_asym_sessi
struct cnxk_ae_sess *priv = (struct cnxk_ae_sess *)sess;
 
/* Trigger CTX flush + invalidate to remove from CTX_CACHE */
-   roc_cpt_lf_ctx_flush(priv->lf, &priv->hw_ctx, true);
+   if (roc_errata_cpt_hang_on_mixed_ctx_val())
+   roc_cpt_lf_ctx_flush(priv->lf, &priv->hw_ctx, true);
 
/* Free resources allocated in session_cfg */
cnxk_ae_free_session_parameters(priv);
@@ -792,7 +795,7 @@ cnxk_ae_session_cfg(struct rte_cryptodev *dev, struct 
rte_crypto_asym_xform *xfo
w7.u64 = 0;
w7.s.egrp = roc_cpt->eng_grp[CPT_ENG_TYPE_AE];
 
-   if (vf->cpt.cpt_revision == ROC_CPT_REVISION_ID_106XX) {
+   if (roc_errata_cpt_hang_on_mixed_ctx_val()) {
hwc = &priv->hw_ctx;
hwc->w0.s.aop_valid = 1;
hwc->w0.s.ctx_hdr_size = 0;
-- 
2.25.1



[PATCH v2 08/11] common/cnxk: replace CPT revision check with caps

2023-02-24 Thread Tejasree Kondoj
Replace SG version revision check with capabilities
populated from microcode.

Signed-off-by: Tejasree Kondoj 
---
 drivers/common/cnxk/hw/cpt.h  | 3 ++-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 2 +-
 drivers/event/cnxk/cn10k_eventdev.c   | 3 ++-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/common/cnxk/hw/cpt.h b/drivers/common/cnxk/hw/cpt.h
index 44ff8b08b2..82ea076e4c 100644
--- a/drivers/common/cnxk/hw/cpt.h
+++ b/drivers/common/cnxk/hw/cpt.h
@@ -75,7 +75,8 @@ union cpt_eng_caps {
uint64_t __io mmul : 1;
uint64_t __io reserved_15_33 : 19;
uint64_t __io pdcp_chain : 1;
-   uint64_t __io reserved_35_63 : 29;
+   uint64_t __io sg_ver2 : 1;
+   uint64_t __io reserved_36_63 : 28;
};
 };
 
diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index d1a43eaf13..9f6fd4e411 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -1045,7 +1045,7 @@ cn10k_cpt_dequeue_burst(void *qptr, struct rte_crypto_op 
**ops, uint16_t nb_ops)
 void
 cn10k_cpt_set_enqdeq_fns(struct rte_cryptodev *dev, struct cnxk_cpt_vf *vf)
 {
-   if (vf->cpt.cpt_revision > ROC_CPT_REVISION_ID_106XX)
+   if (vf->cpt.hw_caps[CPT_ENG_TYPE_SE].sg_ver2 && 
vf->cpt.hw_caps[CPT_ENG_TYPE_IE].sg_ver2)
dev->enqueue_burst = cn10k_cpt_sg_ver2_enqueue_burst;
else
dev->enqueue_burst = cn10k_cpt_sg_ver1_enqueue_burst;
diff --git a/drivers/event/cnxk/cn10k_eventdev.c 
b/drivers/event/cnxk/cn10k_eventdev.c
index 8e74edff55..ee0428adc8 100644
--- a/drivers/event/cnxk/cn10k_eventdev.c
+++ b/drivers/event/cnxk/cn10k_eventdev.c
@@ -602,7 +602,8 @@ cn10k_sso_fp_fns_set(struct rte_eventdev *event_dev)
}
}
 
-   if ((cpt != NULL) && (cpt->cpt_revision > ROC_CPT_REVISION_ID_106XX))
+   if ((cpt != NULL) && cpt->hw_caps[CPT_ENG_TYPE_SE].sg_ver2 &&
+   cpt->hw_caps[CPT_ENG_TYPE_IE].sg_ver2)
event_dev->ca_enqueue = 
cn10k_cpt_sg_ver2_crypto_adapter_enqueue;
else
event_dev->ca_enqueue = 
cn10k_cpt_sg_ver1_crypto_adapter_enqueue;
-- 
2.25.1



[PATCH v2 09/11] crypto/cnxk: support cn10k IPsec SG mode

2023-02-24 Thread Tejasree Kondoj
Add support for scatter-gather mode in 103XX and
106XX lookaside IPsec.

Signed-off-by: Tejasree Kondoj 
---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c |  21 +-
 drivers/crypto/cnxk/cn10k_ipsec_la_ops.h  | 222 --
 drivers/crypto/cnxk/cnxk_sg.h |  23 +++
 3 files changed, 239 insertions(+), 27 deletions(-)

diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c 
b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 9f6fd4e411..e405a2ad9f 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -77,8 +77,8 @@ cn10k_cpt_sym_temp_sess_create(struct cnxk_cpt_qp *qp, struct 
rte_crypto_op *op)
 }
 
 static __rte_always_inline int __rte_hot
-cpt_sec_inst_fill(struct cnxk_cpt_qp *qp, struct rte_crypto_op *op,
- struct cn10k_sec_session *sess, struct cpt_inst_s *inst)
+cpt_sec_inst_fill(struct cnxk_cpt_qp *qp, struct rte_crypto_op *op, struct 
cn10k_sec_session *sess,
+ struct cpt_inst_s *inst, struct cpt_inflight_req *infl_req, 
const bool is_sg_ver2)
 {
struct rte_crypto_sym_op *sym_op = op->sym;
int ret;
@@ -88,15 +88,11 @@ cpt_sec_inst_fill(struct cnxk_cpt_qp *qp, struct 
rte_crypto_op *op,
return -ENOTSUP;
}
 
-   if (unlikely(!rte_pktmbuf_is_contiguous(sym_op->m_src))) {
-   plt_dp_err("Scatter Gather mode is not supported");
-   return -ENOTSUP;
-   }
-
if (sess->is_outbound)
-   ret = process_outb_sa(&qp->lf, op, sess, inst);
+   ret = process_outb_sa(&qp->lf, op, sess, &qp->meta_info, 
infl_req, inst,
+ is_sg_ver2);
else
-   ret = process_inb_sa(op, sess, inst);
+   ret = process_inb_sa(op, sess, inst, &qp->meta_info, infl_req, 
is_sg_ver2);
 
return ret;
 }
@@ -129,7 +125,7 @@ cn10k_cpt_fill_inst(struct cnxk_cpt_qp *qp, struct 
rte_crypto_op *ops[], struct
if (op->type == RTE_CRYPTO_OP_TYPE_SYMMETRIC) {
if (op->sess_type == RTE_CRYPTO_OP_SECURITY_SESSION) {
sec_sess = (struct cn10k_sec_session *)sym_op->session;
-   ret = cpt_sec_inst_fill(qp, op, sec_sess, &inst[0]);
+   ret = cpt_sec_inst_fill(qp, op, sec_sess, &inst[0], 
infl_req, is_sg_ver2);
if (unlikely(ret))
return 0;
w7 = sec_sess->inst.w7;
@@ -827,7 +823,10 @@ cn10k_cpt_sec_post_process(struct rte_crypto_op *cop, 
struct cpt_cn10k_res_s *re
cop->status = RTE_CRYPTO_OP_STATUS_ERROR;
return;
}
-   mbuf->data_len = m_len;
+
+   if (mbuf->next == NULL)
+   mbuf->data_len = m_len;
+
mbuf->pkt_len = m_len;
 }
 
diff --git a/drivers/crypto/cnxk/cn10k_ipsec_la_ops.h 
b/drivers/crypto/cnxk/cn10k_ipsec_la_ops.h
index f2761a55a5..8e208eb2ca 100644
--- a/drivers/crypto/cnxk/cn10k_ipsec_la_ops.h
+++ b/drivers/crypto/cnxk/cn10k_ipsec_la_ops.h
@@ -8,9 +8,13 @@
 #include 
 #include 
 
+#include "roc_ie.h"
+
 #include "cn10k_cryptodev.h"
 #include "cn10k_ipsec.h"
 #include "cnxk_cryptodev.h"
+#include "cnxk_cryptodev_ops.h"
+#include "cnxk_sg.h"
 
 static inline void
 ipsec_po_sa_iv_set(struct cn10k_sec_session *sess, struct rte_crypto_op *cop)
@@ -44,18 +48,14 @@ ipsec_po_sa_aes_gcm_iv_set(struct cn10k_sec_session *sess, 
struct rte_crypto_op
 
 static __rte_always_inline int
 process_outb_sa(struct roc_cpt_lf *lf, struct rte_crypto_op *cop, struct 
cn10k_sec_session *sess,
-   struct cpt_inst_s *inst)
+   struct cpt_qp_meta_info *m_info, struct cpt_inflight_req 
*infl_req,
+   struct cpt_inst_s *inst, const bool is_sg_ver2)
 {
struct rte_crypto_sym_op *sym_op = cop->sym;
struct rte_mbuf *m_src = sym_op->m_src;
uint64_t inst_w4_u64 = sess->inst.w4;
uint64_t dptr;
 
-   if (unlikely(rte_pktmbuf_tailroom(m_src) < sess->max_extended_len)) {
-   plt_dp_err("Not enough tail room");
-   return -ENOMEM;
-   }
-
RTE_SET_USED(lf);
 
 #ifdef LA_IPSEC_DEBUG
@@ -79,27 +79,217 @@ process_outb_sa(struct roc_cpt_lf *lf, struct 
rte_crypto_op *cop, struct cn10k_s
if (m_src->ol_flags & RTE_MBUF_F_TX_L4_MASK)
inst_w4_u64 &= ~BIT_ULL(32);
 
-   /* Prepare CPT instruction */
-   inst->w4.u64 = inst_w4_u64 | rte_pktmbuf_pkt_len(m_src);
-   dptr = rte_pktmbuf_mtod(m_src, uint64_t);
-   inst->dptr = dptr;
+   if (likely(m_src->next == NULL)) {
+   if (unlikely(rte_pktmbuf_tailroom(m_src) < 
sess->max_extended_len)) {
+   plt_dp_err("Not enough tail room");
+   return -ENOMEM;
+   }
+
+   /* Prepare CPT instruction */
+   inst->w4.u64 = inst_w4_u64 | rte_pktmbuf_pkt_len(m_src);
+   dptr = rte_pktmbuf_mtod(m_src, uint64_t

[PATCH v2 10/11] crypto/cnxk: fix order of ECFPM params

2023-02-24 Thread Tejasree Kondoj
From: Gowrishankar Muthukrishnan 

Fix the order of the ECFPM parameters according to the target board.

Fixes: 8e39b133235 ("crypto/cnxk: support fixed point multiplication")

Signed-off-by: Gowrishankar Muthukrishnan 
---
 drivers/crypto/cnxk/cnxk_ae.h | 48 ---
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/drivers/crypto/cnxk/cnxk_ae.h b/drivers/crypto/cnxk/cnxk_ae.h
index b7c13a9e01..47f000dd5e 100644
--- a/drivers/crypto/cnxk/cnxk_ae.h
+++ b/drivers/crypto/cnxk/cnxk_ae.h
@@ -700,7 +700,7 @@ static __rte_always_inline int
 cnxk_ae_ecfpm_prep(struct rte_crypto_ecpm_op_param *ecpm,
   struct roc_ae_buf_ptr *meta_buf, uint64_t *fpm_iova,
   struct roc_ae_ec_group *ec_grp, uint8_t curveid,
-  struct cpt_inst_s *inst)
+  struct cpt_inst_s *inst, int cpt_ver)
 {
uint16_t scalar_align, p_align;
uint16_t dlen, prime_len;
@@ -719,26 +719,33 @@ cnxk_ae_ecfpm_prep(struct rte_crypto_ecpm_op_param *ecpm,
scalar_align = RTE_ALIGN_CEIL(ecpm->scalar.length, 8);
 
/*
-* Set dlen = sum(ROUNDUP8(input point(x and y coordinates), prime,
-* scalar length),
+* Set dlen = sum(prime, scalar length, table address and
+* optionally ROUNDUP8(input point(x and y coordinates)).
 * Please note point length is equivalent to prime of the curve
 */
-   dlen = sizeof(fpm_table_iova) + 3 * p_align + scalar_align;
-
-   memset(dptr, 0, dlen);
-
-   *(uint64_t *)dptr = fpm_table_iova;
-   dptr += sizeof(fpm_table_iova);
-
-   /* Copy scalar, prime */
-   memcpy(dptr, ecpm->scalar.data, ecpm->scalar.length);
-   dptr += scalar_align;
-   memcpy(dptr, ec_grp->prime.data, ec_grp->prime.length);
-   dptr += p_align;
-   memcpy(dptr, ec_grp->consta.data, ec_grp->consta.length);
-   dptr += p_align;
-   memcpy(dptr, ec_grp->constb.data, ec_grp->constb.length);
-   dptr += p_align;
+   if (cpt_ver == ROC_CPT_REVISION_ID_96XX_C0) {
+   dlen = sizeof(fpm_table_iova) + 3 * p_align + scalar_align;
+   memset(dptr, 0, dlen);
+   *(uint64_t *)dptr = fpm_table_iova;
+   dptr += sizeof(fpm_table_iova);
+   memcpy(dptr, ecpm->scalar.data, ecpm->scalar.length);
+   dptr += scalar_align;
+   memcpy(dptr, ec_grp->prime.data, ec_grp->prime.length);
+   dptr += p_align;
+   memcpy(dptr, ec_grp->consta.data, ec_grp->consta.length);
+   dptr += p_align;
+   memcpy(dptr, ec_grp->constb.data, ec_grp->constb.length);
+   dptr += p_align;
+   } else {
+   dlen = sizeof(fpm_table_iova) + p_align + scalar_align;
+   memset(dptr, 0, dlen);
+   memcpy(dptr, ecpm->scalar.data, ecpm->scalar.length);
+   dptr += scalar_align;
+   memcpy(dptr, ec_grp->prime.data, ec_grp->prime.length);
+   dptr += p_align;
+   *(uint64_t *)dptr = fpm_table_iova;
+   dptr += sizeof(fpm_table_iova);
+   }
 
/* Setup opcodes */
w4.s.opcode_major = ROC_AE_MAJOR_OP_ECC;
@@ -969,7 +976,8 @@ cnxk_ae_enqueue(struct cnxk_cpt_qp *qp, struct 
rte_crypto_op *op,
ret = cnxk_ae_ecfpm_prep(&asym_op->ecpm, &meta_buf,
 sess->cnxk_fpm_iova,
 sess->ec_grp[sess->ec_ctx.curveid],
-sess->ec_ctx.curveid, inst);
+sess->ec_ctx.curveid, inst,
+sess->lf->roc_cpt->cpt_revision);
if (unlikely(ret))
goto req_fail;
break;
-- 
2.25.1



[PATCH v2 11/11] crypto/cnxk: add model check for pdcp chain

2023-02-24 Thread Tejasree Kondoj
Add a cn9k model check for pdcp_chain, as
it is not supported on cn10k.

Signed-off-by: Tejasree Kondoj 
---
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/cnxk/cnxk_cryptodev_ops.c 
b/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
index adc1c7652b..86efe75cc3 100644
--- a/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cnxk_cryptodev_ops.c
@@ -479,7 +479,7 @@ cnxk_sess_fill(struct roc_cpt *roc_cpt, struct 
rte_crypto_sym_xform *xform,
bool pdcp_chain_supported = false;
bool ciph_then_auth = false;
 
-   if (roc_cpt->hw_caps[CPT_ENG_TYPE_SE].pdcp_chain)
+   if (roc_model_is_cn9k() && 
(roc_cpt->hw_caps[CPT_ENG_TYPE_SE].pdcp_chain))
pdcp_chain_supported = true;
 
if (xform == NULL)
-- 
2.25.1



RE: [PATCH v11 21/22] hash: move rte_hash_set_alg out header

2023-02-24 Thread Ruifeng Wang
> -Original Message-
> From: Stephen Hemminger 
> Sent: Thursday, February 23, 2023 5:56 AM
> To: dev@dpdk.org
> Cc: Stephen Hemminger ; Yipeng Wang 
> ;
> Sameh Gobriel ; Bruce Richardson 
> ;
> Vladimir Medvedkin ; Ruifeng Wang 
> 
> Subject: [PATCH v11 21/22] hash: move rte_hash_set_alg out header
> 
> The code for setting algorithm for hash is not at all perf sensitive, and 
> doing it inline
> has a couple of problems. First, it means that if multiple files include the 
> header, then
> the initialization gets done multiple times. But also, it makes it harder to 
> fix usage of
> RTE_LOG().
> 
> Despite what the checking script says, this is not an ABI change: the previous
> version
> inlined the same code; therefore both old and new code will work the same.
> 
> Signed-off-by: Stephen Hemminger 
> ---
>  lib/hash/meson.build |  1 +
>  lib/hash/rte_crc_arm64.h |  8 ++---
>  lib/hash/rte_crc_x86.h   | 10 +++---
>  lib/hash/rte_hash_crc.c  | 68 
>  lib/hash/rte_hash_crc.h  | 48 ++--
>  lib/hash/version.map |  7 +
>  6 files changed, 88 insertions(+), 54 deletions(-)  create mode 100644
> lib/hash/rte_hash_crc.c
> 
Acked-by: Ruifeng Wang 



RE: [PATCH v3 1/3] ethdev: enable direct rearm with separate API

2023-02-24 Thread Feifei Wang
Hi, Konstantin

Thanks for your review, and sorry for my delayed response.
In response to your comments, we have put forward several improvement plans below.

Best Regards
Feifei

> -Original Message-
> From: Konstantin Ananyev 
> Sent: Thursday, February 2, 2023 10:33 PM
> To: Feifei Wang ; tho...@monjalon.net;
> Ferruh Yigit ; Andrew Rybchenko
> 
> Cc: dev@dpdk.org; nd ; Honnappa Nagarahalli
> ; Ruifeng Wang
> 
> Subject: Re: [PATCH v3 1/3] ethdev: enable direct rearm with separate API
> 
> Hi Feifei,
> 
> > Add 'tx_fill_sw_ring' and 'rx_flush_descriptor' API into direct rearm
> > mode for separate Rx and Tx Operation. And this can support different
> > multiple sources in direct rearm mode. For examples, Rx driver is
> > ixgbe, and Tx driver is i40e.
> 
> 
> Thanks for your effort and thanks for taking comments provided into
> consideration.
> That approach looks much better then previous ones.
> Few nits below.
> Konstantin
> 
> > Suggested-by: Honnappa Nagarahalli 
> > Suggested-by: Ruifeng Wang 
> > Signed-off-by: Feifei Wang 
> > Reviewed-by: Ruifeng Wang 
> > Reviewed-by: Honnappa Nagarahalli 
> > ---
> >   lib/ethdev/ethdev_driver.h   |  10 ++
> >   lib/ethdev/ethdev_private.c  |   2 +
> >   lib/ethdev/rte_ethdev.c  |  52 +++
> >   lib/ethdev/rte_ethdev.h  | 174
> +++
> >   lib/ethdev/rte_ethdev_core.h |  11 +++
> >   lib/ethdev/version.map   |   6 ++
> >   6 files changed, 255 insertions(+)
> >
> > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
> > index 6a550cfc83..bc539ec862 100644
> > --- a/lib/ethdev/ethdev_driver.h
> > +++ b/lib/ethdev/ethdev_driver.h
> > @@ -59,6 +59,10 @@ struct rte_eth_dev {
> > eth_rx_descriptor_status_t rx_descriptor_status;
> > /** Check the status of a Tx descriptor */
> > eth_tx_descriptor_status_t tx_descriptor_status;
> > +   /** Fill Rx sw-ring with Tx buffers in direct rearm mode */
> > +   eth_tx_fill_sw_ring_t tx_fill_sw_ring;
> > +   /** Flush Rx descriptor in direct rearm mode */
> > +   eth_rx_flush_descriptor_t rx_flush_descriptor;
> >
> > /**
> >  * Device data that is shared between primary and secondary
> > processes @@ -504,6 +508,10 @@ typedef void
> (*eth_rxq_info_get_t)(struct rte_eth_dev *dev,
> >   typedef void (*eth_txq_info_get_t)(struct rte_eth_dev *dev,
> > uint16_t tx_queue_id, struct rte_eth_txq_info *qinfo);
> >
> > +/**< @internal Get rearm data for a receive queue of an Ethernet
> > +device. */ typedef void (*eth_rxq_rearm_data_get_t)(struct rte_eth_dev
> *dev,
> > +   uint16_t tx_queue_id, struct rte_eth_rxq_rearm_data
> > +*rxq_rearm_data);
> > +
> >   typedef int (*eth_burst_mode_get_t)(struct rte_eth_dev *dev,
> > uint16_t queue_id, struct rte_eth_burst_mode *mode);
> >
> > @@ -1215,6 +1223,8 @@ struct eth_dev_ops {
> > eth_rxq_info_get_t rxq_info_get;
> > /** Retrieve Tx queue information */
> > eth_txq_info_get_t txq_info_get;
> > +   /** Get Rx queue rearm data */
> > +   eth_rxq_rearm_data_get_t   rxq_rearm_data_get;
> > eth_burst_mode_get_t   rx_burst_mode_get; /**< Get Rx burst
> mode */
> > eth_burst_mode_get_t   tx_burst_mode_get; /**< Get Tx burst
> mode */
> > eth_fw_version_get_t   fw_version_get; /**< Get firmware version
> */
> > diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
> > index 48090c879a..c5dd5e30f6 100644
> > --- a/lib/ethdev/ethdev_private.c
> > +++ b/lib/ethdev/ethdev_private.c
> > @@ -276,6 +276,8 @@ eth_dev_fp_ops_setup(struct rte_eth_fp_ops *fpo,
> > fpo->rx_queue_count = dev->rx_queue_count;
> > fpo->rx_descriptor_status = dev->rx_descriptor_status;
> > fpo->tx_descriptor_status = dev->tx_descriptor_status;
> > +   fpo->tx_fill_sw_ring = dev->tx_fill_sw_ring;
> > +   fpo->rx_flush_descriptor = dev->rx_flush_descriptor;
> >
> > fpo->rxq.data = dev->data->rx_queues;
> > fpo->rxq.clbk = (void **)(uintptr_t)dev->post_rx_burst_cbs;
> > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
> > 5d5e18db1e..2af5cb42fe 100644
> > --- a/lib/ethdev/rte_ethdev.c
> > +++ b/lib/ethdev/rte_ethdev.c
> > @@ -3282,6 +3282,21 @@
> rte_eth_dev_set_rx_queue_stats_mapping(uint16_t port_id, uint16_t
> rx_queue_id,
> > stat_idx, STAT_QMAP_RX));
> >   }
> >
> > +int
> > +rte_eth_dev_direct_rearm(uint16_t rx_port_id, uint16_t rx_queue_id,
> > +   uint16_t tx_port_id, uint16_t tx_rx_queue_id,
> > +   struct rte_eth_rxq_rearm_data *rxq_rearm_data) {
> > +   int nb_rearm = 0;
> > +
> > +   nb_rearm = rte_eth_tx_fill_sw_ring(tx_port_id, tx_rx_queue_id,
> > +rxq_rearm_data);
> > +
> > +   if (nb_rearm > 0)
> > +   return rte_eth_rx_flush_descriptor(rx_port_id, rx_queue_id,
> > +nb_rearm);
> > +
> > +   return 0;
> > +}
> > +
> >   int
> >   rte_eth_dev_fw_version_get(uint16_t port_id, char *fw_version, size_t
> fw_size)
> >   {
> > @@ -5323,6 +5338,43 @@ rte_eth_tx_queue_info_get(uint

Re: [PATCH v3 2/3] net/i40e: enable direct rearm with separate API

2023-02-24 Thread Feifei Wang


> -Original Message-
> From: Konstantin Ananyev 
> Sent: Thursday, February 2, 2023 10:38 PM
> To: Feifei Wang ; Yuying Zhang
> ; Beilei Xing ; Ruifeng
> Wang 
> Cc: dev@dpdk.org; nd ; Honnappa Nagarahalli
> 
> Subject: Re: [PATCH v3 2/3] net/i40e: enable direct rearm with separate API
> 
> On 04/01/2023 07:30, Feifei Wang wrote:
> > Add internal API to separate direct rearm operations between Rx and
> > Tx.
> >
> > Suggested-by: Honnappa Nagarahalli 
> > Signed-off-by: Feifei Wang 
> > Reviewed-by: Ruifeng Wang 
> > Reviewed-by: Honnappa Nagarahalli 
> > ---
> >   drivers/net/i40e/i40e_ethdev.c  |  1 +
> >   drivers/net/i40e/i40e_ethdev.h  |  2 +
> >   drivers/net/i40e/i40e_rxtx.c| 19 +
> >   drivers/net/i40e/i40e_rxtx.h|  4 ++
> >   drivers/net/i40e/i40e_rxtx_vec_common.h | 54
> +
> >   drivers/net/i40e/i40e_rxtx_vec_neon.c   | 42 +++
> >   6 files changed, 122 insertions(+)
> >
> > diff --git a/drivers/net/i40e/i40e_ethdev.c
> > b/drivers/net/i40e/i40e_ethdev.c index 7726a89d99..29c1ce2470 100644
> > --- a/drivers/net/i40e/i40e_ethdev.c
> > +++ b/drivers/net/i40e/i40e_ethdev.c
> > @@ -497,6 +497,7 @@ static const struct eth_dev_ops i40e_eth_dev_ops
> = {
> > .flow_ops_get = i40e_dev_flow_ops_get,
> > .rxq_info_get = i40e_rxq_info_get,
> > .txq_info_get = i40e_txq_info_get,
> > +   .rxq_rearm_data_get   = i40e_rxq_rearm_data_get,
> > .rx_burst_mode_get= i40e_rx_burst_mode_get,
> > .tx_burst_mode_get= i40e_tx_burst_mode_get,
> > .timesync_enable  = i40e_timesync_enable,
> > diff --git a/drivers/net/i40e/i40e_ethdev.h
> > b/drivers/net/i40e/i40e_ethdev.h index fe943a45ff..6a6a2a6d3c 100644
> > --- a/drivers/net/i40e/i40e_ethdev.h
> > +++ b/drivers/net/i40e/i40e_ethdev.h
> > @@ -1352,6 +1352,8 @@ void i40e_rxq_info_get(struct rte_eth_dev *dev,
> uint16_t queue_id,
> > struct rte_eth_rxq_info *qinfo);
> >   void i40e_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
> > struct rte_eth_txq_info *qinfo);
> > +void i40e_rxq_rearm_data_get(struct rte_eth_dev *dev, uint16_t
> queue_id,
> > +   struct rte_eth_rxq_rearm_data *rxq_rearm_data);
> >   int i40e_rx_burst_mode_get(struct rte_eth_dev *dev, uint16_t queue_id,
> >struct rte_eth_burst_mode *mode);
> >   int i40e_tx_burst_mode_get(struct rte_eth_dev *dev, uint16_t
> > queue_id, diff --git a/drivers/net/i40e/i40e_rxtx.c
> > b/drivers/net/i40e/i40e_rxtx.c index 788ffb51c2..d8d801acaf 100644
> > --- a/drivers/net/i40e/i40e_rxtx.c
> > +++ b/drivers/net/i40e/i40e_rxtx.c
> > @@ -3197,6 +3197,19 @@ i40e_txq_info_get(struct rte_eth_dev *dev,
> uint16_t queue_id,
> > qinfo->conf.offloads = txq->offloads;
> >   }
> >
> > +void
> > +i40e_rxq_rearm_data_get(struct rte_eth_dev *dev, uint16_t queue_id,
> > +   struct rte_eth_rxq_rearm_data *rxq_rearm_data) {
> > +   struct i40e_rx_queue *rxq;
> > +
> > +   rxq = dev->data->rx_queues[queue_id];
> > +
> > +   rxq_rearm_data->rx_sw_ring = rxq->sw_ring;
> > +   rxq_rearm_data->rearm_start = &rxq->rxrearm_start;
> > +   rxq_rearm_data->rearm_nb = &rxq->rxrearm_nb; }
> > +
> >   #ifdef RTE_ARCH_X86
> >   static inline bool
> >   get_avx_supported(bool request_avx512) @@ -3321,6 +3334,9 @@
> > i40e_set_rx_function(struct rte_eth_dev *dev)
> > PMD_INIT_LOG(DEBUG, "Using Vector Rx (port %d).",
> >  dev->data->port_id);
> > dev->rx_pkt_burst = i40e_recv_pkts_vec;
> > +#ifdef RTE_ARCH_ARM64
> > +   dev->rx_flush_descriptor =
> i40e_rx_flush_descriptor_vec; #endif
> > }
> >   #endif /* RTE_ARCH_X86 */
> > } else if (!dev->data->scattered_rx && ad->rx_bulk_alloc_allowed) {
> > @@ -3484,6 +3500,9 @@ i40e_set_tx_function(struct rte_eth_dev *dev)
> > PMD_INIT_LOG(DEBUG, "Using Vector Tx (port %d).",
> >  dev->data->port_id);
> > dev->tx_pkt_burst = i40e_xmit_pkts_vec;
> > +#ifdef RTE_ARCH_ARM64
> > +   dev->tx_fill_sw_ring = i40e_tx_fill_sw_ring; #endif
> 
> As far as I can see, tx_fill_sw_ring() is not ARM-specific; any reason to
> guard it with #ifdef ARM?
> Actually, same ask for rx_flush_descriptor() - can we have a generic version too?

Currently we only enable direct-rearm on the Arm architecture. I agree that
we need a generic version to avoid this; I will update it in the next
version.
 
> 
> >   #endif /* RTE_ARCH_X86 */
> > } else {
> > PMD_INIT_LOG(DEBUG, "Simple tx finally be used.");
> diff --git
> > a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h index
> > 5e6eecc501..8a29bd89df 100644
> > --- a/drivers/net/i40e/i40e_rxtx.h
> > +++ b/drivers/net/i40e/i40e_rxtx.h
> > @@ -233,6 +233,10 @@ uint32_t i40e_dev_rx_queue_count(void
> *rx_queue);
> >   int i40e_dev_rx

[PATCH v3] lib/net: add MPLS insert and strip functionality

2023-02-24 Thread Tanzeel-inline
From: Tanzeel Ahmed 

This patch is new version of [PATCH] lib/net: added push MPLS header API.
I have also added the MPLS strip functionality to address the question
asked in last patch.

> You should explain why you add this function.
None of the foundational NICs currently supports MPLS insertion and
stripping; this functionality can help users who rely on MPLS in their
network applications.

> I'm not sure it should be inline
I made it inline for performance in high-traffic applications.

Signed-off-by: Tanzeel Ahmed 

---
v3:
* fixed patch checks failure

v2:
* marked experimental
* coding style fixed
* changed rte_memcpy to memcpy
* mpls header marked as const in parameter
* added MPLS stripping functionality
---
 .mailmap   |  1 +
 lib/net/rte_mpls.h | 97 ++
 2 files changed, 98 insertions(+)

diff --git a/.mailmap b/.mailmap
index a9f4f28..2af4e0d 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1312,6 +1312,7 @@ Takeshi Yoshimura  

 Takuya Asada 
 Tal Avraham 
 Tal Shnaiderman  
+Tanzeel Ahmed 
 Tao Y Yang 
 Tao Zhu 
 Taripin Samuel 
diff --git a/lib/net/rte_mpls.h b/lib/net/rte_mpls.h
index 51523e7..14b55fe 100644
--- a/lib/net/rte_mpls.h
+++ b/lib/net/rte_mpls.h
@@ -13,6 +13,8 @@
 
 #include 
 #include 
+#include 
+#include 
 
 #ifdef __cplusplus
 extern "C" {
@@ -36,6 +38,101 @@ struct rte_mpls_hdr {
uint8_t  ttl;   /**< Time to live. */
 } __rte_packed;
 
+#define RTE_MPLS_HLEN 4 /**< Length of MPLS header. */
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Insert MPLS header into the packet.
+ * If it's first MPLS header to be inserted in the packet,
+ *  - Updates the ether type.
+ *  - Sets the MPLS bottom-of-stack bit to 1.
+ *
+ * @param m
+ *   The pointer to the mbuf.
+ * @param mp
+ *   The pointer to the MPLS header.
+ * @return
+ *   0 on success, negative errno on error
+ */
+__rte_experimental
+static inline int
+rte_mpls_push_over_l2(struct rte_mbuf **m, const struct rte_mpls_hdr *mp)
+{
+   struct rte_ether_hdr *oh, *nh;
+   struct rte_mpls_hdr *mph;
+
+   /* Can't insert header if mbuf is shared */
+   if (!RTE_MBUF_DIRECT(*m) || rte_mbuf_refcnt_read(*m) > 1)
+   return -EINVAL;
+
+   /* Can't insert header if ethernet frame doesn't exist */
+   if (rte_pktmbuf_data_len(*m) < RTE_ETHER_HDR_LEN)
+   return -EINVAL;
+
+   oh = rte_pktmbuf_mtod(*m, struct rte_ether_hdr *);
+   nh = (struct rte_ether_hdr *)(void *)
+   rte_pktmbuf_prepend(*m, sizeof(struct rte_mpls_hdr));
+   if (nh == NULL)
+   return -ENOSPC;
+
+   memmove(nh, oh, RTE_ETHER_HDR_LEN);
+
+   /* Copy the MPLS header after ethernet frame */
+   mph = rte_pktmbuf_mtod_offset(*m, struct rte_mpls_hdr*,
+   sizeof(struct rte_ether_hdr));
+   memcpy(mph, mp, RTE_MPLS_HLEN);
+
+   mph->tag_msb = rte_cpu_to_be_16(mp->tag_msb);
+
+   /* If first MPLS header, update ether type and bottom-of-stack bit */
+   if (nh->ether_type != rte_cpu_to_be_16(RTE_ETHER_TYPE_MPLS)) {
+   nh->ether_type = rte_cpu_to_be_16(RTE_ETHER_TYPE_MPLS);
+   mph->bs = 1;
+   } else {
+   mph->bs = 0;
+   }
+
+   return 0;
+}
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Strip all MPLS headers from the packet. Does not update the ether type.
+ *
+ * @param m
+ *   The pointer to the mbuf.
+ * @return
+ *   0 on success, -1 on error
+ */
+__rte_experimental
+static inline int
+rte_mpls_strip_over_l2(struct rte_mbuf *m)
+{
+   struct rte_ether_hdr *eh = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
+   struct rte_mpls_hdr *mph;
+   bool mpls_exist = true;
+
+   if (eh->ether_type != rte_cpu_to_be_16(RTE_ETHER_TYPE_MPLS))
+   return -1;
+
+   /* Stripping all MPLS header */
+   while (mpls_exist) {
+   mph = rte_pktmbuf_mtod_offset(m, struct rte_mpls_hdr*,
+   sizeof(struct rte_ether_hdr));
+   if (mph->bs & 1)
+   mpls_exist = false;
+   memmove(rte_pktmbuf_adj(m, sizeof(struct rte_mpls_hdr)),
+   eh, sizeof(struct rte_ether_hdr));
+   eh = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
+   }
+
+   return 0;
+}
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.8.3.1





RE: Bug in rte_mempool_do_generic_get?

2023-02-24 Thread Morten Brørup
> From: Harris, James R [mailto:james.r.har...@intel.com] 
> Sent: Friday, 24 February 2023 04.03
> 
> Hi,
> 
> I've tracked down a regression in SPDK to DPDK commit a2833ecc5 ("mempool: 
> fix get objects from mempool with cache").

The problem probably goes all the way back to the introduction of the cache 
flush threshold, which effectively increased the cache size to 1.5 times the 
configured cache size, in this commit:
http://git.dpdk.org/dpdk/commit/lib/librte_mempool/rte_mempool.h?id=ea5dd2744b90b330f07fd10f327ab99ef55c7266

It might even go further back.

> 
> Here's an example that demonstrates the problem:
> 
> Allocate mempool with 2048 buffers and cache size 256.
> Core 0 allocates 512 buffers.  Mempool pulls 512 + 256 buffers from backing 
> ring, returns 512 of them to caller, puts the other 256 in core 0 cache.  
> Backing ring now has 1280 buffers.
> Core 1 allocates 512 buffers.  Mempool pulls 512 + 256 buffers from backing 
> ring, returns 512 of them to caller, puts the other 256 in core 1 cache.  
> Backing ring now has 512 buffers.
> Core 2 allocates 512 buffers.  Mempool pulls remaining 512 buffers from 
> backing ring and returns all of them to caller.  Backing ring now has 0 
> buffers.
> Core 3 tries to allocate 512 buffers and it fails.
> 
> In the SPDK case, we don't really need or use the mempool cache in this case, 
> so changing the cache size to 0 fixes the problem and is what we're going to 
> move forward with.

If you are not making get/put requests smaller than the cache size, then yes, 
having no cache is the best solution.

> 
> But the behavior did cause a regression so I thought I'd mention it here.

Thank you.

> If you have a mempool with 2048 objects, shouldn't 4 cores each be able to do 
> a 512 buffer bulk get, regardless of the configured cache size?

No, the scenario you described above is the expected behavior. I think it is 
documented somewhere that objects in the caches are unavailable for other 
cores, but now I cannot find where this is documented.


Furthermore, since the effective per-core cache size is 1.5 * configured cache 
size, a configured cache size of 256 may leave up to 384 objects in each 
per-core cache.

With 4 cores, you can expect up to 3 * 384 = 1152 objects sitting in the caches 
of other cores. If you want to be able to pull 512 objects with each core, the 
pool size should be 4 * 512 + 1152 objects.



RE: Bug in rte_mempool_do_generic_get?

2023-02-24 Thread Honnappa Nagarahalli



> -Original Message-
> From: Morten Brørup 
> Sent: Friday, February 24, 2023 6:13 AM
> To: Harris, James R ; dev@dpdk.org
> Subject: RE: Bug in rte_mempool_do_generic_get?
> 
> > From: Harris, James R [mailto:james.r.har...@intel.com]
> > Sent: Friday, 24 February 2023 04.03
> >
> > Hi,
> >
> > I've tracked down a regression in SPDK to DPDK commit a2833ecc5
> ("mempool: fix get objects from mempool with cache").
> 
> The problem probably goes all the way back to the introduction of the cache
> flush threshold, which effectively increased the cache size to 1.5 times the
> configured cache size, in this commit:
> http://git.dpdk.org/dpdk/commit/lib/librte_mempool/rte_mempool.h?id=ea
> 5dd2744b90b330f07fd10f327ab99ef55c7266
> 
> It might even go further back.
> 
> >
> > Here's an example that demonstrates the problem:
> >
> > Allocate mempool with 2048 buffers and cache size 256.
> > Core 0 allocates 512 buffers.  Mempool pulls 512 + 256 buffers from
> backing ring, returns 512 of them to caller, puts the other 256 in core 0
> cache.  Backing ring now has 1280 buffers.
> > Core 1 allocates 512 buffers.  Mempool pulls 512 + 256 buffers from
> backing ring, returns 512 of them to caller, puts the other 256 in core 1
> cache.  Backing ring now has 512 buffers.
> > Core 2 allocates 512 buffers.  Mempool pulls remaining 512 buffers from
> backing ring and returns all of them to caller.  Backing ring now has 0 
> buffers.
> > Core 3 tries to allocate 512 buffers and it fails.
> >
> > In the SPDK case, we don't really need or use the mempool cache in this
> case, so changing the cache size to 0 fixes the problem and is what we're
> going to move forward with.
> 
> If you are not making get/put requests smaller than the cache size, then yes,
> having no cache is the best solution.
> 
> >
> > But the behavior did cause a regression so I thought I'd mention it here.
> 
> Thank you.
> 
> > If you have a mempool with 2048 objects, shouldn't 4 cores each be able to
> do a 512 buffer bulk get, regardless of the configured cache size?
> 
> No, the scenario you described above is the expected behavior. I think it is
> documented somewhere that objects in the caches are unavailable for other
> cores, but now I cannot find where this is documented.
> 
> 
> Furthermore, since the effective per-core cache size is 1.5 * configured cache
> size, a configured cache size of 256 may leave up to 384 objects in each per-
> core cache.
> 
> With 4 cores, you can expect up to 3 * 384 = 1152 objects sitting in the
> caches of other cores. If you want to be able to pull 512 objects with each
> core, the pool size should be 4 * 512 + 1152 objects.
Maybe we should document this in the mempool library?



[PATCH] app/testeventdev: add cipher alg option for cryptodev

2023-02-24 Thread Aakash Sasidharan
Testeventdev crypto adapter symmetric tests attempt only the NULL cipher
algorithm. This limits crypto adapter usage to PMDs that support the
NULL cipher algorithm. Also, since the NULL cipher algorithm doesn't
perform any crypto operation, the reported performance numbers may not
reflect the actual crypto capabilities of the device.
Extend the application to support non-NULL cipher algorithms.

Signed-off-by: Aakash Sasidharan 
---
 app/test-eventdev/evt_common.h   |  7 +++
 app/test-eventdev/evt_options.c  | 68 
 app/test-eventdev/evt_options.h  | 10 
 app/test-eventdev/parser.c   |  4 +-
 app/test-eventdev/parser.h   |  2 +-
 app/test-eventdev/test_perf_common.c | 57 +--
 6 files changed, 141 insertions(+), 7 deletions(-)

diff --git a/app/test-eventdev/evt_common.h b/app/test-eventdev/evt_common.h
index 15e9c34e2c..fcb3571438 100644
--- a/app/test-eventdev/evt_common.h
+++ b/app/test-eventdev/evt_common.h
@@ -47,9 +47,12 @@ enum evt_prod_type {
 
 struct evt_options {
 #define EVT_TEST_NAME_MAX_LEN 32
+#define EVT_CRYPTO_MAX_KEY_SIZE   256
+#define EVT_CRYPTO_MAX_IV_SIZE16
char test_name[EVT_TEST_NAME_MAX_LEN];
bool plcores[RTE_MAX_LCORE];
bool wlcores[RTE_MAX_LCORE];
+   bool crypto_cipher_bit_mode;
int pool_sz;
int socket_id;
int nb_stages;
@@ -64,12 +67,14 @@ struct evt_options {
uint16_t wkr_deq_dep;
uint16_t vector_size;
uint16_t eth_queues;
+   uint16_t crypto_cipher_iv_sz;
uint32_t nb_flows;
uint32_t tx_first;
uint16_t tx_pkt_sz;
uint32_t max_pkt_sz;
uint32_t prod_enq_burst_sz;
uint32_t deq_tmo_nsec;
+   uint32_t crypto_cipher_key_sz;
uint32_t q_priority:1;
uint32_t fwd_latency:1;
uint32_t ena_vector : 1;
@@ -83,6 +88,8 @@ struct evt_options {
enum evt_prod_type prod_type;
enum rte_event_crypto_adapter_mode crypto_adptr_mode;
enum rte_crypto_op_type crypto_op_type;
+   enum rte_crypto_cipher_algorithm crypto_cipher_alg;
+   uint8_t crypto_cipher_key[EVT_CRYPTO_MAX_KEY_SIZE];
 };
 
 static inline bool
diff --git a/app/test-eventdev/evt_options.c b/app/test-eventdev/evt_options.c
index 6c3e0e5b6a..b175c067cd 100644
--- a/app/test-eventdev/evt_options.c
+++ b/app/test-eventdev/evt_options.c
@@ -40,6 +40,8 @@ evt_options_default(struct evt_options *opt)
opt->vector_size = 64;
opt->vector_tmo_nsec = 100E3;
opt->crypto_op_type = RTE_CRYPTO_OP_TYPE_SYMMETRIC;
+   opt->crypto_cipher_alg = RTE_CRYPTO_CIPHER_NULL;
+   opt->crypto_cipher_key_sz = 0;
 }
 
 typedef int (*option_parser_t)(struct evt_options *opt,
@@ -176,6 +178,61 @@ evt_parse_crypto_op_type(struct evt_options *opt, const 
char *arg)
return ret;
 }
 
+static bool
+cipher_alg_is_bit_mode(enum rte_crypto_cipher_algorithm alg)
+{
+   return (alg == RTE_CRYPTO_CIPHER_SNOW3G_UEA2 ||
+   alg == RTE_CRYPTO_CIPHER_ZUC_EEA3 ||
+   alg == RTE_CRYPTO_CIPHER_KASUMI_F8);
+}
+
+static int
+evt_parse_crypto_cipher_alg(struct evt_options *opt, const char *arg)
+{
+   enum rte_crypto_cipher_algorithm cipher_alg;
+
+   if (rte_cryptodev_get_cipher_algo_enum(&cipher_alg, arg) < 0) {
+   RTE_LOG(ERR, USER1, "Invalid cipher algorithm specified\n");
+   return -1;
+   }
+
+   opt->crypto_cipher_alg = cipher_alg;
+   opt->crypto_cipher_bit_mode = cipher_alg_is_bit_mode(cipher_alg);
+
+   return 0;
+}
+
+static int
+evt_parse_crypto_cipher_key(struct evt_options *opt, const char *arg)
+{
+   opt->crypto_cipher_key_sz = EVT_CRYPTO_MAX_KEY_SIZE;
+   if (parse_hex_string(arg, opt->crypto_cipher_key,
+(uint32_t *)&opt->crypto_cipher_key_sz)) {
+   RTE_LOG(ERR, USER1, "Invalid cipher key specified\n");
+   return -1;
+   }
+
+   return 0;
+}
+
+static int
+evt_parse_crypto_cipher_iv_sz(struct evt_options *opt, const char *arg)
+{
+   uint16_t iv_sz = 0;
+   int ret;
+
+   ret = parser_read_uint16(&iv_sz, arg);
+   if (ret != 0 || iv_sz > EVT_CRYPTO_MAX_IV_SIZE) {
+   RTE_LOG(ERR, USER1,
+   "Unsupported cipher IV length [%d] specified\n",
+   iv_sz);
+   return -1;
+   }
+
+   opt->crypto_cipher_iv_sz = iv_sz;
+   return 0;
+}
+
 static int
 evt_parse_test_name(struct evt_options *opt, const char *arg)
 {
@@ -404,6 +461,11 @@ usage(char *program)
"\t  1 for OP_FORWARD mode.\n"
"\t--crypto_op_type   : 0 for SYM ops (default) and\n"
"\t 1 for ASYM ops.\n"
+   "\t--crypto_cipher_alg : cipher algorithm to be used\n"
+   "\t  default algorithm is NULL.\n"
+   "\t--crypto_cipher_key : key for the cip

Re: [PATCH v1] eventdev/crypto: wrong crypto enqueue count stats

2023-02-24 Thread Zhang, Fan

On 11/30/2022 3:11 PM, Ganapati Kundapura wrote:

crypto_enq_count was updated even when ops failed to enqueue to the cryptodev.

Update crypto_enq_count only on successful enqueue of ops to the cryptodev.

Signed-off-by: Ganapati Kundapura 

diff --git a/lib/eventdev/rte_event_crypto_adapter.c 
b/lib/eventdev/rte_event_crypto_adapter.c
index c293a62..e1a0367 100644
--- a/lib/eventdev/rte_event_crypto_adapter.c
+++ b/lib/eventdev/rte_event_crypto_adapter.c
@@ -485,6 +485,9 @@ eca_enq_to_cryptodev(struct event_crypto_adapter *adapter, 
struct rte_event *ev,
cdev_id,
qp_id,
&nb_enqueued);
+   stats->crypto_enq_count += nb_enqueued;
+   n += nb_enqueued;
+
/**
 * If some crypto ops failed to flush to cdev and
 * space for another batch is not available, stop
@@ -495,9 +498,6 @@ eca_enq_to_cryptodev(struct event_crypto_adapter *adapter, 
struct rte_event *ev,
&qp_info->cbuf)))
adapter->stop_enq_to_cryptodev = true;
}
-
-   stats->crypto_enq_count += nb_enqueued;
-   n += nb_enqueued;
}
  
  	return n;

Acked-by: Fan Zhang 


Re: [PATCH v2] vhost: fix madvise arguments alignment

2023-02-24 Thread Patrick Robb
UNH CI detected an ABI failure for this patch which it did not report due to
a bug on our end, so I'm manually reporting it now. I see Maxime you already
predicted the issue though!

1 function with some indirect sub-type change:

  [C] 'function int rte_vhost_get_mem_table(int, rte_vhost_memory**)' at vhost.c:922:1 has some indirect sub-type changes:
    parameter 2 of type 'rte_vhost_memory**' has sub-type changes:
      in pointed to type 'rte_vhost_memory*':
        in pointed to type 'struct rte_vhost_memory' at rte_vhost.h:145:1:
          type size hasn't changed
          1 data member change:
            type of 'rte_vhost_mem_region regions[]' changed:
              array element type 'struct rte_vhost_mem_region' changed:
                type size changed from 448 to 512 (in bits)
                1 data member insertion:
                  'uint64_t alignment', at offset 448 (in bits) at rte_vhost.h:139:1
              type size hasn't changed

Error: ABI issue reported for abidiff --suppr dpdk/devtools/libabigail.abignore --no-added-syms --headers-dir1 reference/include --headers-dir2 build_install/include reference/dump/librte_vhost.dump build_install/dump/librte_vhost.dump
ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue).


On Thu, Feb 23, 2023 at 11:57 AM Mike Pattrick  wrote:

> On Thu, Feb 23, 2023 at 11:12 AM Maxime Coquelin
>  wrote:
> >
> > Hi Mike,
> >
> > Thanks for  looking into this issue.
> >
> > On 2/23/23 05:35, Mike Pattrick wrote:
> > > The arguments passed to madvise should be aligned to the alignment of
> > > the backing memory. Now we keep track of each regions alignment and use
> > > then when setting coredump preferences. To facilitate this, a new
> member
> > > was added to rte_vhost_mem_region. A new function was added to easily
> > > translate memory address back to region alignment. Unneeded calls to
> > > madvise were reduced, as the cache removal case should already be
> > > covered by the cache insertion case. The previously inline function
> > > mem_set_dump was removed from a header file and made not inline.
> > >
> > > Fixes: 338ad77c9ed3 ("vhost: exclude VM hugepages from coredumps")
> > >
> > > Signed-off-by: Mike Pattrick 
> > > ---
> > > Since v1:
> > >   - Corrected a cast for 32bit compiles
> > > ---
> > >   lib/vhost/iotlb.c  |  9 +++---
> > >   lib/vhost/rte_vhost.h  |  1 +
> > >   lib/vhost/vhost.h  | 12 ++--
> > >   lib/vhost/vhost_user.c | 63
> +++---
> > >   4 files changed, 60 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/lib/vhost/iotlb.c b/lib/vhost/iotlb.c
> > > index a0b8fd7302..5293507b63 100644
> > > --- a/lib/vhost/iotlb.c
> > > +++ b/lib/vhost/iotlb.c
> > > @@ -149,7 +149,6 @@ vhost_user_iotlb_cache_remove_all(struct
> vhost_virtqueue *vq)
> > >   rte_rwlock_write_lock(&vq->iotlb_lock);
> > >
> > >   RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) {
> > > - mem_set_dump((void *)(uintptr_t)node->uaddr, node->size,
> true);
> >
> > Hmm, it should have been called with enable=false here since we are
> > removing the entry from the IOTLB cache. It should be kept in order to
> > "DONTDUMP" pages evicted from the cache.
>
> Here I was thinking that if we add an entry and then remove a
> different entry, they could be in the same page. But you're right, I
> should have kept an enable=false in remove_all().
>
> And now that I think about it again, I could just check if there are
> any active cache entries in the page on every evict/remove, they're
> sorted so that should be an easy check. Unless there are any
> objections I'll go forward with that.
>
> >
> > >   TAILQ_REMOVE(&vq->iotlb_list, node, next);
> > >   vhost_user_iotlb_pool_put(vq, node);
> > >   }
> > > @@ -171,7 +170,6 @@ vhost_user_iotlb_cache_random_evict(struct
> vhost_virtqueue *vq)
> > >
> > >   RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) {
> > >   if (!entry_idx) {
> > > - mem_set_dump((void *)(uintptr_t)node->uaddr,
> node->size, true);
> >
> > Same here.
> >
> > >   TAILQ_REMOVE(&vq->iotlb_list, node, next);
> > >   vhost_user_iotlb_pool_put(vq, node);
> > >   vq->iotlb_cache_nr--;
> > > @@ -224,14 +222,16 @@ vhost_user_iotlb_cache_insert(struct virtio_net
> *dev, struct vhost_virtqueue *vq
> > >   vhost_user_iotlb_pool_put(vq, new_node);
> > >   goto unlock;
> > >   } else if (node->iova > new_node->iova) {
> > > - mem_set_dump((void *)(uintptr_t)node->uaddr,
> node->size, true);
> > > + mem_set_dump((void *)(uintptr_t)new_node->uaddr,
> new_node->size, true,
> > 

[PATCH v2 01/20] malloc: rework heap lock handling

2023-02-24 Thread David Marchand
Move all heap->lock manipulation into malloc_heap.c so there is a single
location to look at, and higher level code is unaware of this locking
constraint.

The destroy helper has been reworked to zero the whole heap object but
leave the lock untouched. The heap lock is then released through the
standard API.
This makes the code less scary even though, at this point of its life,
the heap object is probably referenced only by the caller.

Signed-off-by: David Marchand 
---
 lib/eal/common/malloc_heap.c | 45 +++-
 lib/eal/common/rte_malloc.c  | 10 
 2 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index d7c410b786..cc84dce672 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -1292,6 +1292,8 @@ int
 malloc_heap_add_external_memory(struct malloc_heap *heap,
struct rte_memseg_list *msl)
 {
+   rte_spinlock_lock(&heap->lock);
+
/* erase contents of new memory */
memset(msl->base_va, 0, msl->len);
 
@@ -1308,6 +1310,11 @@ malloc_heap_add_external_memory(struct malloc_heap *heap,
eal_memalloc_mem_event_notify(RTE_MEM_EVENT_ALLOC,
msl->base_va, msl->len);
 
+   /* mark it as heap segment */
+   msl->heap = 1;
+
+   rte_spinlock_unlock(&heap->lock);
+
return 0;
 }
 
@@ -1315,7 +1322,12 @@ int
 malloc_heap_remove_external_memory(struct malloc_heap *heap, void *va_addr,
size_t len)
 {
-   struct malloc_elem *elem = heap->first;
+   struct malloc_elem *elem;
+   int ret = -1;
+
+   rte_spinlock_lock(&heap->lock);
+
+   elem = heap->first;
 
/* find element with specified va address */
while (elem != NULL && elem != va_addr) {
@@ -1323,20 +1335,24 @@ malloc_heap_remove_external_memory(struct malloc_heap 
*heap, void *va_addr,
/* stop if we've blown past our VA */
if (elem > (struct malloc_elem *)va_addr) {
rte_errno = ENOENT;
-   return -1;
+   goto out;
}
}
/* check if element was found */
if (elem == NULL || elem->msl->len != len) {
rte_errno = ENOENT;
-   return -1;
+   goto out;
}
/* if element's size is not equal to segment len, segment is busy */
if (elem->state == ELEM_BUSY || elem->size != len) {
rte_errno = EBUSY;
-   return -1;
+   goto out;
}
-   return destroy_elem(elem, len);
+   ret = destroy_elem(elem, len);
+
+out:
+   rte_spinlock_unlock(&heap->lock);
+   return ret;
 }
 
 int
@@ -1372,23 +1388,30 @@ malloc_heap_create(struct malloc_heap *heap, const char 
*heap_name)
 int
 malloc_heap_destroy(struct malloc_heap *heap)
 {
+   int ret = -1;
+
+   rte_spinlock_lock(&heap->lock);
+
if (heap->alloc_count != 0) {
RTE_LOG(ERR, EAL, "Heap is still in use\n");
rte_errno = EBUSY;
-   return -1;
+   goto fail;
}
if (heap->first != NULL || heap->last != NULL) {
RTE_LOG(ERR, EAL, "Heap still contains memory segments\n");
rte_errno = EBUSY;
-   return -1;
+   goto fail;
}
if (heap->total_size != 0)
RTE_LOG(ERR, EAL, "Total size not zero, heap is likely 
corrupt\n");
 
-   /* after this, the lock will be dropped */
-   memset(heap, 0, sizeof(*heap));
-
-   return 0;
+   RTE_BUILD_BUG_ON(offsetof(struct malloc_heap, lock) != 0);
+   memset(RTE_PTR_ADD(heap, sizeof(heap->lock)), 0,
+   sizeof(*heap) - sizeof(heap->lock));
+   ret = 0;
+fail:
+   rte_spinlock_unlock(&heap->lock);
+   return ret;
 }
 
 int
diff --git a/lib/eal/common/rte_malloc.c b/lib/eal/common/rte_malloc.c
index d39870bf3c..4f500892f2 100644
--- a/lib/eal/common/rte_malloc.c
+++ b/lib/eal/common/rte_malloc.c
@@ -436,10 +436,7 @@ rte_malloc_heap_memory_add(const char *heap_name, void 
*va_addr, size_t len,
goto unlock;
}
 
-   rte_spinlock_lock(&heap->lock);
ret = malloc_heap_add_external_memory(heap, msl);
-   msl->heap = 1; /* mark it as heap segment */
-   rte_spinlock_unlock(&heap->lock);
 
 unlock:
rte_mcfg_mem_write_unlock();
@@ -482,9 +479,7 @@ rte_malloc_heap_memory_remove(const char *heap_name, void 
*va_addr, size_t len)
goto unlock;
}
 
-   rte_spinlock_lock(&heap->lock);
ret = malloc_heap_remove_external_memory(heap, va_addr, len);
-   rte_spinlock_unlock(&heap->lock);
if (ret != 0)
goto unlock;
 
@@ -655,12 +650,7 @@ rte_malloc_heap_destroy(const char *heap_name)
goto unlock;
}
/* sanity checks done, now we can destroy the heap */
-   rt

[PATCH v2 00/20] Enable lock annotations on most libraries and drivers

2023-02-24 Thread David Marchand
This is a followup of the series that introduced lock annotations.
I reworked the code and made the annotations work in what seemed to be
the easier cases. In most cases, I chose to convert inline wrappers
around the EAL lock API to simple macros: I did not see much value in
those wrappers, and this is way simpler than adding
__rte_*lock_function tags everywhere.

A list of libraries and drivers still needs more work as their code has
non-obvious lock handling. For those components, the check is opted
out.
I leave it to their respective maintainers to enable the checks later.

FreeBSD libc pthread API has lock annotations while Linux glibc has
none.
We could simply disable the check on FreeBSD, but thanks to this check,
a few issues were caught in drivers that are built on FreeBSD.
For now, I went with a simple #ifdef FreeBSD for pthread mutex related
annotations in our code.

Maintainers, please review.


-- 
David Marchand

Changes since v1:
- annotate code relying on pthread mutexes for FreeBSD libc that
  put annotations on its pthread API,
- annotate Windows alarm code,


David Marchand (20):
  malloc: rework heap lock handling
  mem: rework malloc heap init
  mem: annotate shared memory config locks
  hash: annotate cuckoo hash lock
  graph: annotate graph lock
  drivers: inherit lock annotations for Intel drivers
  net/cxgbe: inherit lock annotations
  net/fm10k: annotate mailbox lock
  net/sfc: rework locking in proxy code
  net/sfc: inherit lock annotations
  net/virtio: annotate lock for guest announce
  raw/ifpga: inherit lock annotations
  vdpa/sfc: inherit lock annotations
  ipc: annotate pthread mutex
  ethdev: annotate pthread mutex
  net/failsafe: fix mutex locking
  net/failsafe: annotate pthread mutex
  net/hinic: annotate pthread mutex
  eal/windows: disable lock check on alarm code
  enable lock check

 .../prog_guide/env_abstraction_layer.rst  |   5 +-
 drivers/bus/dpaa/meson.build  |   1 +
 drivers/common/cnxk/meson.build   |   1 +
 drivers/common/iavf/iavf_osdep.h  |  39 +
 drivers/common/iavf/iavf_prototype.h  |   6 -
 drivers/common/idpf/base/idpf_osdep.h |  26 +---
 drivers/common/mlx5/meson.build   |   1 +
 drivers/event/cnxk/meson.build|   1 +
 drivers/meson.build   |   2 +-
 drivers/net/bnx2x/meson.build |   1 +
 drivers/net/bnxt/meson.build  |   1 +
 drivers/net/cnxk/meson.build  |   1 +
 drivers/net/cxgbe/base/adapter.h  |  35 +
 drivers/net/enic/meson.build  |   1 +
 drivers/net/failsafe/failsafe_ether.c |   3 +-
 drivers/net/failsafe/failsafe_flow.c  |  23 ++-
 drivers/net/failsafe/failsafe_ops.c   | 142 +-
 drivers/net/failsafe/failsafe_private.h   |   6 +
 drivers/net/fm10k/fm10k_ethdev.c  |   2 +
 drivers/net/hinic/base/hinic_compat.h |   6 +
 drivers/net/hns3/meson.build  |   1 +
 drivers/net/i40e/base/i40e_osdep.h|   8 +-
 drivers/net/i40e/base/i40e_prototype.h|   5 -
 drivers/net/i40e/i40e_ethdev.c|  24 ---
 drivers/net/ice/base/ice_osdep.h  |  26 +---
 drivers/net/mlx5/meson.build  |   1 +
 drivers/net/sfc/sfc.h |  41 +
 drivers/net/sfc/sfc_ev.c  |   6 +-
 drivers/net/sfc/sfc_repr.c|  38 +
 drivers/net/sfc/sfc_repr_proxy.c  |  59 
 drivers/net/virtio/virtio_ethdev.c|   8 +-
 drivers/net/virtio/virtio_ethdev.h|   7 +-
 drivers/raw/ifpga/afu_pmd_core.c  |  17 +--
 drivers/vdpa/sfc/sfc_vdpa.h   |  41 +
 drivers/vdpa/sfc/sfc_vdpa_ops.c   |  14 +-
 lib/eal/common/eal_common_mcfg.c  |  66 
 lib/eal/common/eal_common_memory.c|  10 +-
 lib/eal/common/eal_common_proc.c  |   3 +
 lib/eal/common/malloc_heap.c  |  55 +--
 lib/eal/common/malloc_heap.h  |   3 +
 lib/eal/common/rte_malloc.c   |  10 --
 lib/eal/freebsd/eal.c |  13 ++
 lib/eal/include/rte_eal_memconfig.h   |  63 ++--
 lib/eal/linux/eal.c   |  13 ++
 lib/eal/version.map   |   4 +
 lib/eal/windows/eal.c |  13 ++
 lib/eal/windows/eal_alarm.c   |   2 +
 lib/ethdev/rte_flow.c |   8 +
 lib/graph/graph.c |  10 +-
 lib/graph/graph_private.h |  10 +-
 lib/hash/rte_cuckoo_hash.c|   8 +
 lib/ipsec/meson.build |   1 +
 lib/meson.build   |   2 +-
 lib/timer/meson.build |   1 +
 lib/vhost/meson.build |   1 -
 55 files changed, 461 insertions(+), 434 deletion

[PATCH v2 03/20] mem: annotate shared memory config locks

2023-02-24 Thread David Marchand
Expose internal locks via some internal accessors.
Then annotate rte_mcfg_xxx_(read|write)_(|un)lock.

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_mcfg.c| 66 +
 lib/eal/include/rte_eal_memconfig.h | 63 +--
 lib/eal/version.map |  4 ++
 3 files changed, 91 insertions(+), 42 deletions(-)

diff --git a/lib/eal/common/eal_common_mcfg.c b/lib/eal/common/eal_common_mcfg.c
index cf4a279905..b60d41f7b6 100644
--- a/lib/eal/common/eal_common_mcfg.c
+++ b/lib/eal/common/eal_common_mcfg.c
@@ -69,102 +69,112 @@ eal_mcfg_update_from_internal(void)
mcfg->version = RTE_VERSION;
 }
 
+rte_rwlock_t *
+rte_mcfg_mem_get_lock(void)
+{
+   return &rte_eal_get_configuration()->mem_config->memory_hotplug_lock;
+}
+
 void
 rte_mcfg_mem_read_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+   rte_rwlock_read_lock(rte_mcfg_mem_get_lock());
 }
 
 void
 rte_mcfg_mem_read_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
+   rte_rwlock_read_unlock(rte_mcfg_mem_get_lock());
 }
 
 void
 rte_mcfg_mem_write_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+   rte_rwlock_write_lock(rte_mcfg_mem_get_lock());
 }
 
 void
 rte_mcfg_mem_write_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+   rte_rwlock_write_unlock(rte_mcfg_mem_get_lock());
+}
+
+rte_rwlock_t *
+rte_mcfg_tailq_get_lock(void)
+{
+   return &rte_eal_get_configuration()->mem_config->qlock;
 }
 
 void
 rte_mcfg_tailq_read_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_lock(&mcfg->qlock);
+   rte_rwlock_read_lock(rte_mcfg_tailq_get_lock());
 }
 
 void
 rte_mcfg_tailq_read_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_unlock(&mcfg->qlock);
+   rte_rwlock_read_unlock(rte_mcfg_tailq_get_lock());
 }
 
 void
 rte_mcfg_tailq_write_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_lock(&mcfg->qlock);
+   rte_rwlock_write_lock(rte_mcfg_tailq_get_lock());
 }
 
 void
 rte_mcfg_tailq_write_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_unlock(&mcfg->qlock);
+   rte_rwlock_write_unlock(rte_mcfg_tailq_get_lock());
+}
+
+rte_rwlock_t *
+rte_mcfg_mempool_get_lock(void)
+{
+   return &rte_eal_get_configuration()->mem_config->mplock;
 }
 
 void
 rte_mcfg_mempool_read_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_lock(&mcfg->mplock);
+   rte_rwlock_read_lock(rte_mcfg_mempool_get_lock());
 }
 
 void
 rte_mcfg_mempool_read_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_read_unlock(&mcfg->mplock);
+   rte_rwlock_read_unlock(rte_mcfg_mempool_get_lock());
 }
 
 void
 rte_mcfg_mempool_write_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_lock(&mcfg->mplock);
+   rte_rwlock_write_lock(rte_mcfg_mempool_get_lock());
 }
 
 void
 rte_mcfg_mempool_write_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_rwlock_write_unlock(&mcfg->mplock);
+   rte_rwlock_write_unlock(rte_mcfg_mempool_get_lock());
+}
+
+rte_spinlock_t *
+rte_mcfg_timer_get_lock(void)
+{
+   return &rte_eal_get_configuration()->mem_config->tlock;
 }
 
 void
 rte_mcfg_timer_lock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_spinlock_lock(&mcfg->tlock);
+   rte_spinlock_lock(rte_mcfg_timer_get_lock());
 }
 
 void
 rte_mcfg_timer_unlock(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-   rte_spinlock_unlock(&mcfg->tlock);
+   rte_spinlock_unlock(rte_mcfg_timer_get_lock());
 }
 
 bool
diff --git a/lib/eal/include/rte_eal_memconfig.h 
b/lib/eal/include/rte_eal_memconfig.h
index e175564647..c527f9aa29 100644
--- a/lib/eal/include/rte_eal_memconfig.h
+++ b/lib/eal/include/rte_eal_memconfig.h
@@ -7,6 +7,8 @@
 
 #include 
 
+#include 
+#include 
 
 /**
  * @file
@@ -18,89 +20,122 @@
 extern "C" {
 #endif
 
+/**
+ * Internal helpers used for lock annotations.
+ */
+__rte_internal
+rte_rwlock_t *
+rte_mcfg_mem_get_lock(void);
+
+__rte_internal
+rte_rwlock_t *
+rte_mcfg_tailq_get_lock(void);
+
+__rte_internal
+rte_rwlock_t *
+rte_mcfg_mempool_get_

[PATCH v2 02/20] mem: rework malloc heap init

2023-02-24 Thread David Marchand
rte_eal_memory_init() and rte_eal_malloc_heap_init() must be called in
a common section taking rte_mcfg_mem_read_lock().
Split rte_eal_malloc_heap_init() in two so that the mem lock is taken
in rte_eal_init() making lock checks trivial (once annotated in the next
patch).

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_memory.c | 10 +-
 lib/eal/common/malloc_heap.c   | 10 ++
 lib/eal/common/malloc_heap.h   |  3 +++
 lib/eal/freebsd/eal.c  | 13 +
 lib/eal/linux/eal.c| 13 +
 lib/eal/windows/eal.c  | 13 +
 6 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/lib/eal/common/eal_common_memory.c 
b/lib/eal/common/eal_common_memory.c
index c917b981bc..5e162a1092 100644
--- a/lib/eal/common/eal_common_memory.c
+++ b/lib/eal/common/eal_common_memory.c
@@ -1078,18 +1078,11 @@ rte_eal_memory_detach(void)
 int
 rte_eal_memory_init(void)
 {
-   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
const struct internal_config *internal_conf =
eal_get_internal_configuration();
-
int retval;
-   RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
-
-   if (!mcfg)
-   return -1;
 
-   /* lock mem hotplug here, to prevent races while we init */
-   rte_mcfg_mem_read_lock();
+   RTE_LOG(DEBUG, EAL, "Setting up physically contiguous memory...\n");
 
if (rte_eal_memseg_init() < 0)
goto fail;
@@ -1108,7 +1101,6 @@ rte_eal_memory_init(void)
 
return 0;
 fail:
-   rte_mcfg_mem_read_unlock();
return -1;
 }
 
diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index cc84dce672..7498a05ed3 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -1442,18 +1442,20 @@ rte_eal_malloc_heap_init(void)
}
}
 
-
if (register_mp_requests()) {
RTE_LOG(ERR, EAL, "Couldn't register malloc multiprocess 
actions\n");
-   rte_mcfg_mem_read_unlock();
return -1;
}
 
-   /* unlock mem hotplug here. it's safe for primary as no requests can
+   return 0;
+}
+
+int rte_eal_malloc_heap_populate(void)
+{
+   /* mem hotplug is unlocked here. it's safe for primary as no requests 
can
 * even come before primary itself is fully initialized, and secondaries
 * do not need to initialize the heap.
 */
-   rte_mcfg_mem_read_unlock();
 
/* secondary process does not need to initialize anything */
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/lib/eal/common/malloc_heap.h b/lib/eal/common/malloc_heap.h
index 3b2fbc0aa0..8f3ab57154 100644
--- a/lib/eal/common/malloc_heap.h
+++ b/lib/eal/common/malloc_heap.h
@@ -85,6 +85,9 @@ malloc_socket_to_heap_id(unsigned int socket_id);
 int
 rte_eal_malloc_heap_init(void);
 
+int
+rte_eal_malloc_heap_populate(void);
+
 void
 rte_eal_malloc_heap_cleanup(void);
 
diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index 7db4239c51..7daf22e314 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -754,13 +755,25 @@ rte_eal_init(int argc, char **argv)
return -1;
}
 
+   rte_mcfg_mem_read_lock();
+
if (rte_eal_memory_init() < 0) {
+   rte_mcfg_mem_read_unlock();
rte_eal_init_alert("Cannot init memory");
rte_errno = ENOMEM;
return -1;
}
 
if (rte_eal_malloc_heap_init() < 0) {
+   rte_mcfg_mem_read_unlock();
+   rte_eal_init_alert("Cannot init malloc heap");
+   rte_errno = ENODEV;
+   return -1;
+   }
+
+   rte_mcfg_mem_read_unlock();
+
+   if (rte_eal_malloc_heap_populate() < 0) {
rte_eal_init_alert("Cannot init malloc heap");
rte_errno = ENODEV;
return -1;
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index fabafbc39b..7242ee2c9a 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1197,7 +1198,10 @@ rte_eal_init(int argc, char **argv)
return -1;
}
 
+   rte_mcfg_mem_read_lock();
+
if (rte_eal_memory_init() < 0) {
+   rte_mcfg_mem_read_unlock();
rte_eal_init_alert("Cannot init memory");
rte_errno = ENOMEM;
return -1;
@@ -1207,6 +1211,15 @@ rte_eal_init(int argc, char **argv)
eal_hugedirs_unlock();
 
if (rte_eal_malloc_heap_init() < 0) {
+   rte_mcfg_mem_read_unlock();
+   rte_eal_init_alert("Cannot init malloc heap");
+   rte_errno = ENODEV;
+   return -1;
+   }
+
+   

[PATCH v2 04/20] hash: annotate cuckoo hash lock

2023-02-24 Thread David Marchand
__hash_rw_(reader|writer)_(|un)lock helpers take locks depending on
conditions that are fixed at the rte_hash object initialisation.
So we can tell clang that those helpers unconditionally take/release
those locks (and waive the lock check on their implementation).

Signed-off-by: David Marchand 
---
 lib/hash/rte_cuckoo_hash.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index 829b79c89a..d92a903bb3 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -575,6 +575,8 @@ rte_hash_count(const struct rte_hash *h)
 /* Read write locks implemented using rte_rwlock */
 static inline void
 __hash_rw_writer_lock(const struct rte_hash *h)
+   __rte_exclusive_lock_function(&h->readwrite_lock)
+   __rte_no_thread_safety_analysis
 {
if (h->writer_takes_lock && h->hw_trans_mem_support)
rte_rwlock_write_lock_tm(h->readwrite_lock);
@@ -584,6 +586,8 @@ __hash_rw_writer_lock(const struct rte_hash *h)
 
 static inline void
 __hash_rw_reader_lock(const struct rte_hash *h)
+   __rte_shared_lock_function(&h->readwrite_lock)
+   __rte_no_thread_safety_analysis
 {
if (h->readwrite_concur_support && h->hw_trans_mem_support)
rte_rwlock_read_lock_tm(h->readwrite_lock);
@@ -593,6 +597,8 @@ __hash_rw_reader_lock(const struct rte_hash *h)
 
 static inline void
 __hash_rw_writer_unlock(const struct rte_hash *h)
+   __rte_unlock_function(&h->readwrite_lock)
+   __rte_no_thread_safety_analysis
 {
if (h->writer_takes_lock && h->hw_trans_mem_support)
rte_rwlock_write_unlock_tm(h->readwrite_lock);
@@ -602,6 +608,8 @@ __hash_rw_writer_unlock(const struct rte_hash *h)
 
 static inline void
 __hash_rw_reader_unlock(const struct rte_hash *h)
+   __rte_unlock_function(&h->readwrite_lock)
+   __rte_no_thread_safety_analysis
 {
if (h->readwrite_concur_support && h->hw_trans_mem_support)
rte_rwlock_read_unlock_tm(h->readwrite_lock);
-- 
2.39.2



[PATCH v2 05/20] graph: annotate graph lock

2023-02-24 Thread David Marchand
Export the internal lock and annotate the associated helpers.

Signed-off-by: David Marchand 
---
 lib/graph/graph.c | 10 --
 lib/graph/graph_private.h | 10 --
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/lib/graph/graph.c b/lib/graph/graph.c
index a839a2803b..5582631b53 100644
--- a/lib/graph/graph.c
+++ b/lib/graph/graph.c
@@ -30,16 +30,22 @@ graph_list_head_get(void)
return &graph_list;
 }
 
+rte_spinlock_t *
+graph_spinlock_get(void)
+{
+   return &graph_lock;
+}
+
 void
 graph_spinlock_lock(void)
 {
-   rte_spinlock_lock(&graph_lock);
+   rte_spinlock_lock(graph_spinlock_get());
 }
 
 void
 graph_spinlock_unlock(void)
 {
-   rte_spinlock_unlock(&graph_lock);
+   rte_spinlock_unlock(graph_spinlock_get());
 }
 
 static int
diff --git a/lib/graph/graph_private.h b/lib/graph/graph_private.h
index 7d1b30b8ac..eacdef45f0 100644
--- a/lib/graph/graph_private.h
+++ b/lib/graph/graph_private.h
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 
 #include "rte_graph.h"
 #include "rte_graph_worker.h"
@@ -148,20 +149,25 @@ STAILQ_HEAD(graph_head, graph);
  */
 struct graph_head *graph_list_head_get(void);
 
+rte_spinlock_t *
+graph_spinlock_get(void);
+
 /* Lock functions */
 /**
  * @internal
  *
  * Take a lock on the graph internal spin lock.
  */
-void graph_spinlock_lock(void);
+void graph_spinlock_lock(void)
+   __rte_exclusive_lock_function(graph_spinlock_get());
 
 /**
  * @internal
  *
  * Release a lock on the graph internal spin lock.
  */
-void graph_spinlock_unlock(void);
+void graph_spinlock_unlock(void)
+   __rte_unlock_function(graph_spinlock_get());
 
 /* Graph operations */
 /**
-- 
2.39.2



[PATCH v2 06/20] drivers: inherit lock annotations for Intel drivers

2023-02-24 Thread David Marchand
Due to clang limitation, inline helpers don't inherit lock annotations
from the EAL lock API.
Replace them with macros.

Signed-off-by: David Marchand 
---
 drivers/common/iavf/iavf_osdep.h   | 39 ++
 drivers/common/iavf/iavf_prototype.h   |  6 
 drivers/common/idpf/base/idpf_osdep.h  | 26 +++--
 drivers/net/i40e/base/i40e_osdep.h |  8 +++---
 drivers/net/i40e/base/i40e_prototype.h |  5 
 drivers/net/i40e/i40e_ethdev.c | 24 
 drivers/net/ice/base/ice_osdep.h   | 26 +++--
 7 files changed, 20 insertions(+), 114 deletions(-)

diff --git a/drivers/common/iavf/iavf_osdep.h b/drivers/common/iavf/iavf_osdep.h
index 31d3d809f9..263d92400c 100644
--- a/drivers/common/iavf/iavf_osdep.h
+++ b/drivers/common/iavf/iavf_osdep.h
@@ -170,11 +170,6 @@ struct iavf_virt_mem {
u32 size;
 } __rte_packed;
 
-/* SW spinlock */
-struct iavf_spinlock {
-   rte_spinlock_t spinlock;
-};
-
 #define iavf_allocate_dma_mem(h, m, unused, s, a) \
iavf_allocate_dma_mem_d(h, m, s, a)
 #define iavf_free_dma_mem(h, m) iavf_free_dma_mem_d(h, m)
@@ -182,32 +177,14 @@ struct iavf_spinlock {
 #define iavf_allocate_virt_mem(h, m, s) iavf_allocate_virt_mem_d(h, m, s)
 #define iavf_free_virt_mem(h, m) iavf_free_virt_mem_d(h, m)
 
-static inline void
-iavf_init_spinlock_d(struct iavf_spinlock *sp)
-{
-   rte_spinlock_init(&sp->spinlock);
-}
-
-static inline void
-iavf_acquire_spinlock_d(struct iavf_spinlock *sp)
-{
-   rte_spinlock_lock(&sp->spinlock);
-}
-
-static inline void
-iavf_release_spinlock_d(struct iavf_spinlock *sp)
-{
-   rte_spinlock_unlock(&sp->spinlock);
-}
-
-static inline void
-iavf_destroy_spinlock_d(__rte_unused struct iavf_spinlock *sp)
-{
-}
+/* SW spinlock */
+struct iavf_spinlock {
+   rte_spinlock_t spinlock;
+};
 
-#define iavf_init_spinlock(_sp) iavf_init_spinlock_d(_sp)
-#define iavf_acquire_spinlock(_sp) iavf_acquire_spinlock_d(_sp)
-#define iavf_release_spinlock(_sp) iavf_release_spinlock_d(_sp)
-#define iavf_destroy_spinlock(_sp) iavf_destroy_spinlock_d(_sp)
+#define iavf_init_spinlock(sp) rte_spinlock_init(&(sp)->spinlock)
+#define iavf_acquire_spinlock(sp) rte_spinlock_lock(&(sp)->spinlock)
+#define iavf_release_spinlock(sp) rte_spinlock_unlock(&(sp)->spinlock)
+#define iavf_destroy_spinlock(sp) RTE_SET_USED(sp)
 
 #endif /* _IAVF_OSDEP_H_ */
diff --git a/drivers/common/iavf/iavf_prototype.h 
b/drivers/common/iavf/iavf_prototype.h
index b5124de5bf..ba78ec5169 100644
--- a/drivers/common/iavf/iavf_prototype.h
+++ b/drivers/common/iavf/iavf_prototype.h
@@ -76,12 +76,6 @@ STATIC INLINE struct iavf_rx_ptype_decoded 
decode_rx_desc_ptype(u8 ptype)
return iavf_ptype_lookup[ptype];
 }
 
-/* prototype for functions used for SW spinlocks */
-void iavf_init_spinlock(struct iavf_spinlock *sp);
-void iavf_acquire_spinlock(struct iavf_spinlock *sp);
-void iavf_release_spinlock(struct iavf_spinlock *sp);
-void iavf_destroy_spinlock(struct iavf_spinlock *sp);
-
 __rte_internal
 void iavf_vf_parse_hw_config(struct iavf_hw *hw,
 struct virtchnl_vf_resource *msg);
diff --git a/drivers/common/idpf/base/idpf_osdep.h 
b/drivers/common/idpf/base/idpf_osdep.h
index 99ae9cf60a..49bd7c4b21 100644
--- a/drivers/common/idpf/base/idpf_osdep.h
+++ b/drivers/common/idpf/base/idpf_osdep.h
@@ -210,28 +210,10 @@ struct idpf_lock {
rte_spinlock_t spinlock;
 };
 
-static inline void
-idpf_init_lock(struct idpf_lock *sp)
-{
-   rte_spinlock_init(&sp->spinlock);
-}
-
-static inline void
-idpf_acquire_lock(struct idpf_lock *sp)
-{
-   rte_spinlock_lock(&sp->spinlock);
-}
-
-static inline void
-idpf_release_lock(struct idpf_lock *sp)
-{
-   rte_spinlock_unlock(&sp->spinlock);
-}
-
-static inline void
-idpf_destroy_lock(__rte_unused struct idpf_lock *sp)
-{
-}
+#define idpf_init_lock(sp) rte_spinlock_init(&(sp)->spinlock)
+#define idpf_acquire_lock(sp) rte_spinlock_lock(&(sp)->spinlock)
+#define idpf_release_lock(sp) rte_spinlock_unlock(&(sp)->spinlock)
+#define idpf_destroy_lock(sp) RTE_SET_USED(sp)
 
 struct idpf_hw;
 
diff --git a/drivers/net/i40e/base/i40e_osdep.h 
b/drivers/net/i40e/base/i40e_osdep.h
index 51537c5cf3..aa5dc61841 100644
--- a/drivers/net/i40e/base/i40e_osdep.h
+++ b/drivers/net/i40e/base/i40e_osdep.h
@@ -215,10 +215,10 @@ struct i40e_spinlock {
rte_spinlock_t spinlock;
 };
 
-#define i40e_init_spinlock(_sp) i40e_init_spinlock_d(_sp)
-#define i40e_acquire_spinlock(_sp) i40e_acquire_spinlock_d(_sp)
-#define i40e_release_spinlock(_sp) i40e_release_spinlock_d(_sp)
-#define i40e_destroy_spinlock(_sp) i40e_destroy_spinlock_d(_sp)
+#define i40e_init_spinlock(sp) rte_spinlock_init(&(sp)->spinlock)
+#define i40e_acquire_spinlock(sp) rte_spinlock_lock(&(sp)->spinlock)
+#define i40e_release_spinlock(sp) rte_spinlock_unlock(&(sp)->spinlock)
+#define i40e_destroy_spinlock(sp) RTE_SET_USED(sp)
 
 #define I40E_NTOHS(a) rte_be_to_cpu_16(a)
 #define

[PATCH v2 08/20] net/fm10k: annotate mailbox lock

2023-02-24 Thread David Marchand
Expose requirements for helpers dealing with the
FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back) lock.

Signed-off-by: David Marchand 
---
 drivers/net/fm10k/fm10k_ethdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index 8b83063f0a..4d3c4c10cf 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -116,6 +116,7 @@ fm10k_mbx_initlock(struct fm10k_hw *hw)
 
 static void
 fm10k_mbx_lock(struct fm10k_hw *hw)
+   __rte_exclusive_lock_function(FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back))
 {
while (!rte_spinlock_trylock(FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back)))
rte_delay_us(FM10K_MBXLOCK_DELAY_US);
@@ -123,6 +124,7 @@ fm10k_mbx_lock(struct fm10k_hw *hw)
 
 static void
 fm10k_mbx_unlock(struct fm10k_hw *hw)
+   __rte_unlock_function(FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back))
 {
rte_spinlock_unlock(FM10K_DEV_PRIVATE_TO_MBXLOCK(hw->back));
 }
-- 
2.39.2



[PATCH v2 07/20] net/cxgbe: inherit lock annotations

2023-02-24 Thread David Marchand
Due to clang limitation, inline helpers don't inherit lock annotations
from the EAL lock API.
Replace them with macros.

Signed-off-by: David Marchand 
---
 drivers/net/cxgbe/base/adapter.h | 35 +++-
 1 file changed, 7 insertions(+), 28 deletions(-)

diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index 16cbc1a345..8f2ffa0eeb 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -351,28 +351,19 @@ struct adapter {
  * t4_os_rwlock_init - initialize rwlock
  * @lock: the rwlock
  */
-static inline void t4_os_rwlock_init(rte_rwlock_t *lock)
-{
-   rte_rwlock_init(lock);
-}
+#define t4_os_rwlock_init(lock) rte_rwlock_init(lock)
 
 /**
  * t4_os_write_lock - get a write lock
  * @lock: the rwlock
  */
-static inline void t4_os_write_lock(rte_rwlock_t *lock)
-{
-   rte_rwlock_write_lock(lock);
-}
+#define t4_os_write_lock(lock) rte_rwlock_write_lock(lock)
 
 /**
  * t4_os_write_unlock - unlock a write lock
  * @lock: the rwlock
  */
-static inline void t4_os_write_unlock(rte_rwlock_t *lock)
-{
-   rte_rwlock_write_unlock(lock);
-}
+#define t4_os_write_unlock(lock) rte_rwlock_write_unlock(lock)
 
 /**
  * ethdev2pinfo - return the port_info structure associated with a rte_eth_dev
@@ -678,37 +669,25 @@ static inline void t4_os_set_hw_addr(struct adapter 
*adapter, int port_idx,
  * t4_os_lock_init - initialize spinlock
  * @lock: the spinlock
  */
-static inline void t4_os_lock_init(rte_spinlock_t *lock)
-{
-   rte_spinlock_init(lock);
-}
+#define t4_os_lock_init(lock) rte_spinlock_init(lock)
 
 /**
  * t4_os_lock - spin until lock is acquired
  * @lock: the spinlock
  */
-static inline void t4_os_lock(rte_spinlock_t *lock)
-{
-   rte_spinlock_lock(lock);
-}
+#define t4_os_lock(lock) rte_spinlock_lock(lock)
 
 /**
  * t4_os_unlock - unlock a spinlock
  * @lock: the spinlock
  */
-static inline void t4_os_unlock(rte_spinlock_t *lock)
-{
-   rte_spinlock_unlock(lock);
-}
+#define t4_os_unlock(lock) rte_spinlock_unlock(lock)
 
 /**
  * t4_os_trylock - try to get a lock
  * @lock: the spinlock
  */
-static inline int t4_os_trylock(rte_spinlock_t *lock)
-{
-   return rte_spinlock_trylock(lock);
-}
+#define t4_os_trylock(lock) rte_spinlock_trylock(lock)
 
 /**
  * t4_os_init_list_head - initialize
-- 
2.39.2



[PATCH v2 09/20] net/sfc: rework locking in proxy code

2023-02-24 Thread David Marchand
Remove one extra layer in the proxy code: sfc_get_adapter_by_pf_port_id()
now only resolves the sa object, and sfc_adapter_(|un)lock() calls are
added where necessary.

This will simplify lock checks later.

Signed-off-by: David Marchand 
---
 drivers/net/sfc/sfc_repr_proxy.c | 59 
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/drivers/net/sfc/sfc_repr_proxy.c b/drivers/net/sfc/sfc_repr_proxy.c
index 4b958ced61..4ba7683370 100644
--- a/drivers/net/sfc/sfc_repr_proxy.c
+++ b/drivers/net/sfc/sfc_repr_proxy.c
@@ -51,17 +51,9 @@ sfc_get_adapter_by_pf_port_id(uint16_t pf_port_id)
dev = &rte_eth_devices[pf_port_id];
sa = sfc_adapter_by_eth_dev(dev);
 
-   sfc_adapter_lock(sa);
-
return sa;
 }
 
-static void
-sfc_put_adapter(struct sfc_adapter *sa)
-{
-   sfc_adapter_unlock(sa);
-}
-
 static struct sfc_repr_proxy_port *
 sfc_repr_proxy_find_port(struct sfc_repr_proxy *rp, uint16_t repr_id)
 {
@@ -1289,6 +1281,7 @@ sfc_repr_proxy_add_port(uint16_t pf_port_id, uint16_t 
repr_id,
int rc;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1341,7 +1334,7 @@ sfc_repr_proxy_add_port(uint16_t pf_port_id, uint16_t 
repr_id,
}
 
sfc_log_init(sa, "done");
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return 0;
 
@@ -1352,7 +1345,7 @@ sfc_repr_proxy_add_port(uint16_t pf_port_id, uint16_t 
repr_id,
 fail_alloc_port:
 fail_port_exists:
sfc_log_init(sa, "failed: %s", rte_strerror(rc));
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return rc;
 }
@@ -1366,6 +1359,7 @@ sfc_repr_proxy_del_port(uint16_t pf_port_id, uint16_t 
repr_id)
int rc;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1393,14 +1387,14 @@ sfc_repr_proxy_del_port(uint16_t pf_port_id, uint16_t 
repr_id)
 
sfc_log_init(sa, "done");
 
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return 0;
 
 fail_port_remove:
 fail_no_port:
sfc_log_init(sa, "failed: %s", rte_strerror(rc));
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return rc;
 }
@@ -1416,6 +1410,7 @@ sfc_repr_proxy_add_rxq(uint16_t pf_port_id, uint16_t 
repr_id,
struct sfc_adapter *sa;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1423,14 +1418,14 @@ sfc_repr_proxy_add_rxq(uint16_t pf_port_id, uint16_t 
repr_id,
port = sfc_repr_proxy_find_port(rp, repr_id);
if (port == NULL) {
sfc_err(sa, "%s() failed: no such port", __func__);
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
return ENOENT;
}
 
rxq = &port->rxq[queue_id];
if (rp->dp_rxq[queue_id].mp != NULL && rp->dp_rxq[queue_id].mp != mp) {
sfc_err(sa, "multiple mempools per queue are not supported");
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
return ENOTSUP;
}
 
@@ -1440,7 +1435,7 @@ sfc_repr_proxy_add_rxq(uint16_t pf_port_id, uint16_t repr_id,
rp->dp_rxq[queue_id].ref_count++;
 
sfc_log_init(sa, "done");
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 
return 0;
 }
@@ -1455,6 +1450,7 @@ sfc_repr_proxy_del_rxq(uint16_t pf_port_id, uint16_t repr_id,
struct sfc_adapter *sa;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1462,7 +1458,7 @@ sfc_repr_proxy_del_rxq(uint16_t pf_port_id, uint16_t repr_id,
port = sfc_repr_proxy_find_port(rp, repr_id);
if (port == NULL) {
sfc_err(sa, "%s() failed: no such port", __func__);
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
return;
}
 
@@ -1475,7 +1471,7 @@ sfc_repr_proxy_del_rxq(uint16_t pf_port_id, uint16_t repr_id,
rp->dp_rxq[queue_id].mp = NULL;
 
sfc_log_init(sa, "done");
-   sfc_put_adapter(sa);
+   sfc_adapter_unlock(sa);
 }
 
 int
@@ -1489,6 +1485,7 @@ sfc_repr_proxy_add_txq(uint16_t pf_port_id, uint16_t repr_id,
struct sfc_adapter *sa;
 
sa = sfc_get_adapter_by_pf_port_id(pf_port_id);
+   sfc_adapter_lock(sa);
rp = sfc_repr_proxy_by_adapter(sa);
 
sfc_log_init(sa, "entry");
@@ -1496,7 +1493,7 @@ sfc_repr_proxy_add_txq(uint16_t pf_port_id, uint16_t repr_id,
port = sfc_repr_proxy_find_port(rp, repr_id);
if (port == NULL) {
sfc_err(sa, "%s() failed: no such port", __func__);
-   sfc_

[PATCH v2 10/20] net/sfc: inherit lock annotations

2023-02-24 Thread David Marchand
Due to a clang limitation, inline helpers don't inherit lock annotations
from the EAL lock API.
Replace them with macros.

One additional change is required in sfc_ev_qpoll() so that clang sees
that the same lock is being manipulated.
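
For context, here is a stand-alone sketch of that limitation (hypothetical names, not the DPDK code): clang's -Wthread-safety checks each function against its own annotations, so an unannotated inline wrapper hides the lock operation from its callers, while a macro expands at the call site to the annotated function itself. The attribute macros collapse to nothing outside clang, and a plain flag stands in for the spinlock, so this builds anywhere.

```c
#if defined(__clang__)
#define LOCK_FN(m)   __attribute__((exclusive_lock_function(m)))
#define UNLOCK_FN(m) __attribute__((unlock_function(m)))
#else
#define LOCK_FN(m)
#define UNLOCK_FN(m)
#endif

struct adapter {
	int lock; /* stand-in for rte_spinlock_t */
};

/* Annotated primitives, playing the role of the EAL spinlock API. */
static void adapter_lock(struct adapter *a) LOCK_FN(a->lock);
static void adapter_unlock(struct adapter *a) UNLOCK_FN(a->lock);

static void adapter_lock(struct adapter *a)
{
	a->lock = 1;
}

static void adapter_unlock(struct adapter *a)
{
	a->lock = 0;
}

/* Inline wrapper: it carries no annotation of its own, so the analysis
 * cannot connect a wrapper_lock() call site with a later unlock. */
static inline void wrapper_lock(struct adapter *a)
{
	adapter_lock(a);
}

/* Macro: the call site expands to the annotated function directly,
 * which is the shape this patch gives to sfc_adapter_lock() etc. */
#define macro_lock(a) adapter_lock(a)
```
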

Signed-off-by: David Marchand 
---
 drivers/net/sfc/sfc.h  | 41 ++
 drivers/net/sfc/sfc_ev.c   |  6 --
 drivers/net/sfc/sfc_repr.c | 38 +--
 3 files changed, 15 insertions(+), 70 deletions(-)

diff --git a/drivers/net/sfc/sfc.h b/drivers/net/sfc/sfc.h
index 0a1e224fa2..730d054aea 100644
--- a/drivers/net/sfc/sfc.h
+++ b/drivers/net/sfc/sfc.h
@@ -333,41 +333,12 @@ sfc_sa2shared(struct sfc_adapter *sa)
  * change the lock in one place.
  */
 
-static inline void
-sfc_adapter_lock_init(struct sfc_adapter *sa)
-{
-   rte_spinlock_init(&sa->lock);
-}
-
-static inline int
-sfc_adapter_is_locked(struct sfc_adapter *sa)
-{
-   return rte_spinlock_is_locked(&sa->lock);
-}
-
-static inline void
-sfc_adapter_lock(struct sfc_adapter *sa)
-{
-   rte_spinlock_lock(&sa->lock);
-}
-
-static inline int
-sfc_adapter_trylock(struct sfc_adapter *sa)
-{
-   return rte_spinlock_trylock(&sa->lock);
-}
-
-static inline void
-sfc_adapter_unlock(struct sfc_adapter *sa)
-{
-   rte_spinlock_unlock(&sa->lock);
-}
-
-static inline void
-sfc_adapter_lock_fini(__rte_unused struct sfc_adapter *sa)
-{
-   /* Just for symmetry of the API */
-}
+#define sfc_adapter_lock_init(sa) rte_spinlock_init(&(sa)->lock)
+#define sfc_adapter_is_locked(sa) rte_spinlock_is_locked(&(sa)->lock)
+#define sfc_adapter_lock(sa) rte_spinlock_lock(&(sa)->lock)
+#define sfc_adapter_trylock(sa) rte_spinlock_trylock(&(sa)->lock)
+#define sfc_adapter_unlock(sa) rte_spinlock_unlock(&(sa)->lock)
+#define sfc_adapter_lock_fini(sa) RTE_SET_USED(sa)
 
 static inline unsigned int
 sfc_nb_counter_rxq(const struct sfc_adapter_shared *sas)
diff --git a/drivers/net/sfc/sfc_ev.c b/drivers/net/sfc/sfc_ev.c
index f949abbfc3..c0d58c9554 100644
--- a/drivers/net/sfc/sfc_ev.c
+++ b/drivers/net/sfc/sfc_ev.c
@@ -570,6 +570,8 @@ static const efx_ev_callbacks_t sfc_ev_callbacks_dp_tx = {
 void
 sfc_ev_qpoll(struct sfc_evq *evq)
 {
+   struct sfc_adapter *sa;
+
SFC_ASSERT(evq->init_state == SFC_EVQ_STARTED ||
   evq->init_state == SFC_EVQ_STARTING);
 
@@ -577,8 +579,8 @@ sfc_ev_qpoll(struct sfc_evq *evq)
 
efx_ev_qpoll(evq->common, &evq->read_ptr, evq->callbacks, evq);
 
-   if (unlikely(evq->exception) && sfc_adapter_trylock(evq->sa)) {
-   struct sfc_adapter *sa = evq->sa;
+   sa = evq->sa;
+   if (unlikely(evq->exception) && sfc_adapter_trylock(sa)) {
int rc;
 
if (evq->dp_rxq != NULL) {
diff --git a/drivers/net/sfc/sfc_repr.c b/drivers/net/sfc/sfc_repr.c
index 4b03b101d8..017c3fb247 100644
--- a/drivers/net/sfc/sfc_repr.c
+++ b/drivers/net/sfc/sfc_repr.c
@@ -112,39 +112,11 @@ sfc_repr_by_eth_dev(struct rte_eth_dev *eth_dev)
  * change the lock in one place.
  */
 
-static inline void
-sfc_repr_lock_init(struct sfc_repr *sr)
-{
-   rte_spinlock_init(&sr->lock);
-}
-
-#if defined(RTE_LIBRTE_SFC_EFX_DEBUG) || defined(RTE_ENABLE_ASSERT)
-
-static inline int
-sfc_repr_lock_is_locked(struct sfc_repr *sr)
-{
-   return rte_spinlock_is_locked(&sr->lock);
-}
-
-#endif
-
-static inline void
-sfc_repr_lock(struct sfc_repr *sr)
-{
-   rte_spinlock_lock(&sr->lock);
-}
-
-static inline void
-sfc_repr_unlock(struct sfc_repr *sr)
-{
-   rte_spinlock_unlock(&sr->lock);
-}
-
-static inline void
-sfc_repr_lock_fini(__rte_unused struct sfc_repr *sr)
-{
-   /* Just for symmetry of the API */
-}
+#define sfc_repr_lock_init(sr) rte_spinlock_init(&(sr)->lock)
+#define sfc_repr_lock_is_locked(sr) rte_spinlock_is_locked(&(sr)->lock)
+#define sfc_repr_lock(sr) rte_spinlock_lock(&(sr)->lock)
+#define sfc_repr_unlock(sr) rte_spinlock_unlock(&(sr)->lock)
+#define sfc_repr_lock_fini(sr) RTE_SET_USED(sr)
 
 static void
 sfc_repr_rx_queue_stop(void *queue)
-- 
2.39.2



[PATCH v2 12/20] raw/ifpga: inherit lock annotations

2023-02-24 Thread David Marchand
The checks in those helpers are useless:
- all (start/stop/reset/test) callers ensure that dev != NULL,
- dev->sd can't be NULL either, as that would mean the application is calling
  those helpers for a dev pointer that did not pass initialisation.

Once the checks are removed, the only thing that remains is calls to the
rte_spinlock API, so simply use macros and inherit annotations from the
lock API.

Signed-off-by: David Marchand 
---
 drivers/raw/ifpga/afu_pmd_core.c | 17 ++---
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/drivers/raw/ifpga/afu_pmd_core.c b/drivers/raw/ifpga/afu_pmd_core.c
index ddf7a34f33..3ab1f47ac1 100644
--- a/drivers/raw/ifpga/afu_pmd_core.c
+++ b/drivers/raw/ifpga/afu_pmd_core.c
@@ -23,21 +23,8 @@ static struct rte_afu_uuid afu_pmd_uuid_map[AFU_RAWDEV_MAX_DRVS+1];
 TAILQ_HEAD(afu_drv_list, afu_rawdev_drv);
 static struct afu_drv_list afu_pmd_list = TAILQ_HEAD_INITIALIZER(afu_pmd_list);
 
-static inline int afu_rawdev_trylock(struct afu_rawdev *dev)
-{
-   if (!dev || !dev->sd)
-   return 0;
-
-   return rte_spinlock_trylock(&dev->sd->lock);
-}
-
-static inline void afu_rawdev_unlock(struct afu_rawdev *dev)
-{
-   if (!dev || !dev->sd)
-   return;
-
-   rte_spinlock_unlock(&dev->sd->lock);
-}
+#define afu_rawdev_trylock(dev) rte_spinlock_trylock(&dev->sd->lock)
+#define afu_rawdev_unlock(dev) rte_spinlock_unlock(&dev->sd->lock)
 
 static int afu_rawdev_configure(const struct rte_rawdev *rawdev,
rte_rawdev_obj_t config, size_t config_size)
-- 
2.39.2



[PATCH v2 11/20] net/virtio: annotate lock for guest announce

2023-02-24 Thread David Marchand
Expose requirements for helpers dealing with the
VIRTIO_DEV_TO_HW(dev)->state_lock lock.

Signed-off-by: David Marchand 
---
 drivers/net/virtio/virtio_ethdev.c | 8 
 drivers/net/virtio/virtio_ethdev.h | 7 +--
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c
index 0103d95920..a3de44958c 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1149,11 +1149,11 @@ virtio_dev_pause(struct rte_eth_dev *dev)
 {
struct virtio_hw *hw = dev->data->dev_private;
 
-   rte_spinlock_lock(&hw->state_lock);
+   rte_spinlock_lock(&VIRTIO_DEV_TO_HW(dev)->state_lock);
 
if (hw->started == 0) {
/* Device is just stopped. */
-   rte_spinlock_unlock(&hw->state_lock);
+   rte_spinlock_unlock(&VIRTIO_DEV_TO_HW(dev)->state_lock);
return -1;
}
hw->started = 0;
@@ -1174,7 +1174,7 @@ virtio_dev_resume(struct rte_eth_dev *dev)
struct virtio_hw *hw = dev->data->dev_private;
 
hw->started = 1;
-   rte_spinlock_unlock(&hw->state_lock);
+   rte_spinlock_unlock(&VIRTIO_DEV_TO_HW(dev)->state_lock);
 }
 
 /*
@@ -1217,7 +1217,7 @@ virtio_notify_peers(struct rte_eth_dev *dev)
}
 
/* If virtio port just stopped, no need to send RARP */
-   if (virtio_dev_pause(dev) < 0) {
+   if (virtio_dev_pause(dev) != 0) {
rte_pktmbuf_free(rarp_mbuf);
return;
}
diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h
index c08f382791..ece0130603 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -112,8 +112,11 @@ int eth_virtio_dev_init(struct rte_eth_dev *eth_dev);
 
 void virtio_interrupt_handler(void *param);
 
-int virtio_dev_pause(struct rte_eth_dev *dev);
-void virtio_dev_resume(struct rte_eth_dev *dev);
+#define VIRTIO_DEV_TO_HW(dev) ((struct virtio_hw *)(dev)->data->dev_private)
+int virtio_dev_pause(struct rte_eth_dev *dev)
+   __rte_exclusive_trylock_function(0, &VIRTIO_DEV_TO_HW(dev)->state_lock);
+void virtio_dev_resume(struct rte_eth_dev *dev)
+   __rte_unlock_function(&VIRTIO_DEV_TO_HW(dev)->state_lock);
 int virtio_dev_stop(struct rte_eth_dev *dev);
 int virtio_dev_close(struct rte_eth_dev *dev);
 int virtio_inject_pkts(struct rte_eth_dev *dev, struct rte_mbuf **tx_pkts,
-- 
2.39.2



[PATCH v2 13/20] vdpa/sfc: inherit lock annotations

2023-02-24 Thread David Marchand
Due to a clang limitation, inline helpers don't inherit lock annotations
from the EAL lock API.
Replace them with macros.

sfc_vdpa_ops.c was relying on an implicit cast of the dev_handle to a
vdpa adapter object. Add an explicit conversion.

Signed-off-by: David Marchand 
---
 drivers/vdpa/sfc/sfc_vdpa.h | 41 +
 drivers/vdpa/sfc/sfc_vdpa_ops.c | 14 +--
 2 files changed, 13 insertions(+), 42 deletions(-)

diff --git a/drivers/vdpa/sfc/sfc_vdpa.h b/drivers/vdpa/sfc/sfc_vdpa.h
index b25eb3a5fe..2b843e563d 100644
--- a/drivers/vdpa/sfc/sfc_vdpa.h
+++ b/drivers/vdpa/sfc/sfc_vdpa.h
@@ -122,40 +122,11 @@ sfc_vdpa_adapter_by_dev_handle(void *dev_handle)
  * Add wrapper functions to acquire/release lock to be able to remove or
  * change the lock in one place.
  */
-static inline void
-sfc_vdpa_adapter_lock_init(struct sfc_vdpa_adapter *sva)
-{
-   rte_spinlock_init(&sva->lock);
-}
-
-static inline int
-sfc_vdpa_adapter_is_locked(struct sfc_vdpa_adapter *sva)
-{
-   return rte_spinlock_is_locked(&sva->lock);
-}
-
-static inline void
-sfc_vdpa_adapter_lock(struct sfc_vdpa_adapter *sva)
-{
-   rte_spinlock_lock(&sva->lock);
-}
-
-static inline int
-sfc_vdpa_adapter_trylock(struct sfc_vdpa_adapter *sva)
-{
-   return rte_spinlock_trylock(&sva->lock);
-}
-
-static inline void
-sfc_vdpa_adapter_unlock(struct sfc_vdpa_adapter *sva)
-{
-   rte_spinlock_unlock(&sva->lock);
-}
-
-static inline void
-sfc_vdpa_adapter_lock_fini(__rte_unused struct sfc_vdpa_adapter *sva)
-{
-   /* Just for symmetry of the API */
-}
+#define sfc_vdpa_adapter_lock_init(sva) rte_spinlock_init(&(sva)->lock)
+#define sfc_vdpa_adapter_is_locked(sva) rte_spinlock_is_locked(&(sva)->lock)
+#define sfc_vdpa_adapter_lock(sva) rte_spinlock_lock(&(sva)->lock)
+#define sfc_vdpa_adapter_trylock(sva) rte_spinlock_trylock(&(sva)->lock)
+#define sfc_vdpa_adapter_unlock(sva) rte_spinlock_unlock(&(sva)->lock)
+#define sfc_vdpa_adapter_lock_fini(sva) RTE_SET_USED(sva)
 
 #endif  /* _SFC_VDPA_H */
diff --git a/drivers/vdpa/sfc/sfc_vdpa_ops.c b/drivers/vdpa/sfc/sfc_vdpa_ops.c
index 6401d4e16f..afa6b7e8ef 100644
--- a/drivers/vdpa/sfc/sfc_vdpa_ops.c
+++ b/drivers/vdpa/sfc/sfc_vdpa_ops.c
@@ -577,7 +577,7 @@ sfc_vdpa_notify_ctrl(void *arg)
if (ops_data == NULL)
return NULL;
 
-   sfc_vdpa_adapter_lock(ops_data->dev_handle);
+   sfc_vdpa_adapter_lock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
vid = ops_data->vid;
 
@@ -586,7 +586,7 @@ sfc_vdpa_notify_ctrl(void *arg)
  "vDPA (%s): Notifier could not get configured",
  ops_data->vdpa_dev->device->name);
 
-   sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+   sfc_vdpa_adapter_unlock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
return NULL;
 }
@@ -637,7 +637,7 @@ sfc_vdpa_dev_config(int vid)
 
ops_data->vid = vid;
 
-   sfc_vdpa_adapter_lock(ops_data->dev_handle);
+   sfc_vdpa_adapter_lock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
sfc_vdpa_log_init(ops_data->dev_handle, "configuring");
rc = sfc_vdpa_configure(ops_data);
@@ -653,7 +653,7 @@ sfc_vdpa_dev_config(int vid)
if (rc != 0)
goto fail_vdpa_notify;
 
-   sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+   sfc_vdpa_adapter_unlock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
sfc_vdpa_log_init(ops_data->dev_handle, "done");
 
@@ -666,7 +666,7 @@ sfc_vdpa_dev_config(int vid)
sfc_vdpa_close(ops_data);
 
 fail_vdpa_config:
-   sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+   sfc_vdpa_adapter_unlock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
return -1;
 }
@@ -688,7 +688,7 @@ sfc_vdpa_dev_close(int vid)
return -1;
}
 
-   sfc_vdpa_adapter_lock(ops_data->dev_handle);
+   sfc_vdpa_adapter_lock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
if (ops_data->is_notify_thread_started == true) {
void *status;
ret = pthread_cancel(ops_data->notify_tid);
@@ -710,7 +710,7 @@ sfc_vdpa_dev_close(int vid)
sfc_vdpa_stop(ops_data);
sfc_vdpa_close(ops_data);
 
-   sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+   sfc_vdpa_adapter_unlock(sfc_vdpa_adapter_by_dev_handle(ops_data->dev_handle));
 
return 0;
 }
-- 
2.39.2



[PATCH v2 14/20] ipc: annotate pthread mutex

2023-02-24 Thread David Marchand
pthread_cond_timedwait() requires pending_requests.lock being taken.

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_proc.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 1fc1d6c53b..aa060ae4dc 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -908,6 +908,9 @@ mp_request_async(const char *dst, struct rte_mp_msg *req,
 static int
 mp_request_sync(const char *dst, struct rte_mp_msg *req,
   struct rte_mp_reply *reply, const struct timespec *ts)
+#ifdef RTE_EXEC_ENV_FREEBSD
+   __rte_exclusive_locks_required(pending_requests.lock)
+#endif
 {
int ret;
pthread_condattr_t attr;
-- 
2.39.2



[PATCH v2 15/20] ethdev: annotate pthread mutex

2023-02-24 Thread David Marchand
fts_enter/exit take mutexes depending on device flags set at
initialisation.
Annotate those helpers and, since clang does not support conditional
locking, waive the lock check on their implementation.
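
A stripped-down sketch of that pattern (hypothetical names, plain clang attributes standing in for the `__rte_*` wrappers, a flag standing in for the mutex): the helpers are annotated as if the lock were taken unconditionally, and their bodies opt out of the analysis because clang cannot follow the flag test.

```c
#if defined(__clang__)
#define LOCK_FN(m)   __attribute__((exclusive_lock_function(m)))
#define UNLOCK_FN(m) __attribute__((unlock_function(m)))
#define NO_TSA       __attribute__((no_thread_safety_analysis))
#else
#define LOCK_FN(m)
#define UNLOCK_FN(m)
#define NO_TSA
#endif

#define OPS_THREAD_SAFE 0x1

struct dev {
	unsigned int flags; /* set once at initialisation */
	int mutex;          /* stand-in for pthread_mutex_t */
};

/* Annotated as unconditional lock/unlock; bodies waive the analysis. */
static void enter(struct dev *d) LOCK_FN(d->mutex) NO_TSA;
static void leave(struct dev *d) UNLOCK_FN(d->mutex) NO_TSA;

static void enter(struct dev *d)
{
	if (!(d->flags & OPS_THREAD_SAFE))
		d->mutex = 1; /* pthread_mutex_lock() in the real code */
}

static void leave(struct dev *d)
{
	if (!(d->flags & OPS_THREAD_SAFE))
		d->mutex = 0; /* pthread_mutex_unlock() in the real code */
}
```

Callers still get checked for balanced enter()/leave() pairs; only the conditional bodies escape the analysis.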

Signed-off-by: David Marchand 
---
 lib/ethdev/rte_flow.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index 69e6e749f7..c66625d1fe 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -297,6 +297,10 @@ rte_flow_dynf_metadata_register(void)
 
 static inline void
 fts_enter(struct rte_eth_dev *dev)
+#ifdef RTE_EXEC_ENV_FREEBSD
+   __rte_exclusive_lock_function(dev->data->flow_ops_mutex)
+   __rte_no_thread_safety_analysis
+#endif
 {
if (!(dev->data->dev_flags & RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE))
pthread_mutex_lock(&dev->data->flow_ops_mutex);
@@ -304,6 +308,10 @@ fts_enter(struct rte_eth_dev *dev)
 
 static inline void
 fts_exit(struct rte_eth_dev *dev)
+#ifdef RTE_EXEC_ENV_FREEBSD
+   __rte_unlock_function(dev->data->flow_ops_mutex)
+   __rte_no_thread_safety_analysis
+#endif
 {
if (!(dev->data->dev_flags & RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE))
pthread_mutex_unlock(&dev->data->flow_ops_mutex);
-- 
2.39.2



[PATCH v2 16/20] net/failsafe: fix mutex locking

2023-02-24 Thread David Marchand
The pthread mutex API describes cases where locking might fail.
Check fts_enter wrapper return code.
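
A minimal sketch of the pattern this patch applies throughout (made-up names, a flag standing in for the hotplug mutex): a lock helper that can fail, like pthread_mutex_lock(), must have its return code checked and propagated rather than assumed to succeed.

```c
static int hotplug_mutex; /* 0 = free, non-zero = held */

/* Stand-in for fs_lock(): returns 0 on success, non-zero on failure. */
static int fake_lock(void)
{
	if (hotplug_mutex != 0)
		return 16; /* a pthread-style error code, e.g. EBUSY */
	hotplug_mutex = 1;
	return 0;
}

static void fake_unlock(void)
{
	hotplug_mutex = 0;
}

/* Caller following the fixed pattern: bail out on lock failure
 * instead of proceeding into the critical section unlocked. */
static int do_flow_op(void)
{
	int ret = fake_lock();

	if (ret != 0)
		return ret;
	/* ... critical section ... */
	fake_unlock();
	return 0;
}
```
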

Signed-off-by: David Marchand 
---
 drivers/net/failsafe/failsafe_ether.c |   3 +-
 drivers/net/failsafe/failsafe_flow.c  |  23 +++--
 drivers/net/failsafe/failsafe_ops.c   | 142 +++---
 3 files changed, 123 insertions(+), 45 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 10b90fd837..031f3eb13f 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -592,7 +592,8 @@ failsafe_eth_rmv_event_callback(uint16_t port_id __rte_unused,
 {
struct sub_device *sdev = cb_arg;
 
-   fs_lock(fs_dev(sdev), 0);
+   if (fs_lock(fs_dev(sdev), 0) != 0)
+   return -1;
/* Switch as soon as possible tx_dev. */
fs_switch_dev(fs_dev(sdev), sdev);
/* Use safe bursts in any case. */
diff --git a/drivers/net/failsafe/failsafe_flow.c b/drivers/net/failsafe/failsafe_flow.c
index 354f9fec20..707e6c63b5 100644
--- a/drivers/net/failsafe/failsafe_flow.c
+++ b/drivers/net/failsafe/failsafe_flow.c
@@ -72,7 +72,9 @@ fs_flow_validate(struct rte_eth_dev *dev,
uint8_t i;
int ret;
 
-   fs_lock(dev, 0);
+   ret = fs_lock(dev, 0);
+   if (ret != 0)
+   return ret;
FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
DEBUG("Calling rte_flow_validate on sub_device %d", i);
ret = rte_flow_validate(PORT_ID(sdev),
@@ -99,7 +101,8 @@ fs_flow_create(struct rte_eth_dev *dev,
struct rte_flow *flow;
uint8_t i;
 
-   fs_lock(dev, 0);
+   if (fs_lock(dev, 0) != 0)
+   return NULL;
flow = fs_flow_allocate(attr, patterns, actions);
FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
flow->flows[i] = rte_flow_create(PORT_ID(sdev),
@@ -137,8 +140,9 @@ fs_flow_destroy(struct rte_eth_dev *dev,
ERROR("Invalid flow");
return -EINVAL;
}
-   ret = 0;
-   fs_lock(dev, 0);
+   ret = fs_lock(dev, 0);
+   if (ret != 0)
+   return ret;
FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
int local_ret;
 
@@ -169,7 +173,9 @@ fs_flow_flush(struct rte_eth_dev *dev,
uint8_t i;
int ret;
 
-   fs_lock(dev, 0);
+   ret = fs_lock(dev, 0);
+   if (ret != 0)
+   return ret;
FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
DEBUG("Calling rte_flow_flush on sub_device %d", i);
ret = rte_flow_flush(PORT_ID(sdev), error);
@@ -197,7 +203,8 @@ fs_flow_query(struct rte_eth_dev *dev,
 {
struct sub_device *sdev;
 
-   fs_lock(dev, 0);
+   if (fs_lock(dev, 0) != 0)
+   return -1;
sdev = TX_SUBDEV(dev);
if (sdev != NULL) {
int ret = rte_flow_query(PORT_ID(sdev),
@@ -223,7 +230,9 @@ fs_flow_isolate(struct rte_eth_dev *dev,
uint8_t i;
int ret;
 
-   fs_lock(dev, 0);
+   ret = fs_lock(dev, 0);
+   if (ret != 0)
+   return ret;
FOREACH_SUBDEV(sdev, i, dev) {
if (sdev->state < DEV_PROBED)
continue;
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index d357e1bc83..35649b6244 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -28,7 +28,9 @@ fs_dev_configure(struct rte_eth_dev *dev)
uint8_t i;
int ret;
 
-   fs_lock(dev, 0);
+   ret = fs_lock(dev, 0);
+   if (ret != 0)
+   return ret;
FOREACH_SUBDEV(sdev, i, dev) {
int rmv_interrupt = 0;
int lsc_interrupt = 0;
@@ -129,7 +131,9 @@ fs_dev_start(struct rte_eth_dev *dev)
uint8_t i;
int ret;
 
-   fs_lock(dev, 0);
+   ret = fs_lock(dev, 0);
+   if (ret != 0)
+   return ret;
ret = failsafe_rx_intr_install(dev);
if (ret) {
fs_unlock(dev, 0);
@@ -189,7 +193,9 @@ fs_dev_stop(struct rte_eth_dev *dev)
uint8_t i;
int ret;
 
-   fs_lock(dev, 0);
+   ret = fs_lock(dev, 0);
+   if (ret != 0)
+   return ret;
PRIV(dev)->state = DEV_STARTED - 1;
FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_STARTED) {
ret = rte_eth_dev_stop(PORT_ID(sdev));
@@ -217,7 +223,9 @@ fs_dev_set_link_up(struct rte_eth_dev *dev)
uint8_t i;
int ret;
 
-   fs_lock(dev, 0);
+   ret = fs_lock(dev, 0);
+   if (ret != 0)
+   return ret;
FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
DEBUG("Calling rte_eth_dev_set_link_up on sub_device %d", i);
ret = rte_eth_dev_set_link_up(PORT_ID(sdev));
@@ -239,7 +247,9 @@ fs_dev_set_link_down(struct rte_eth_dev *dev)
uint8_t i;
int ret;
 
-   fs

[PATCH v2 18/20] net/hinic: annotate pthread mutex

2023-02-24 Thread David Marchand
Annotate wrappers on top of the pthread mutex API.

Signed-off-by: David Marchand 
---
 drivers/net/hinic/base/hinic_compat.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/hinic/base/hinic_compat.h b/drivers/net/hinic/base/hinic_compat.h
index aea332046e..89732ca11a 100644
--- a/drivers/net/hinic/base/hinic_compat.h
+++ b/drivers/net/hinic/base/hinic_compat.h
@@ -224,6 +224,9 @@ static inline int hinic_mutex_destroy(pthread_mutex_t *pthreadmutex)
 }
 
 static inline int hinic_mutex_lock(pthread_mutex_t *pthreadmutex)
+#ifdef RTE_EXEC_ENV_FREEBSD
+   __rte_exclusive_trylock_function(0, *pthreadmutex)
+#endif
 {
int err;
struct timespec tout;
@@ -239,6 +242,9 @@ static inline int hinic_mutex_lock(pthread_mutex_t *pthreadmutex)
 }
 
 static inline int hinic_mutex_unlock(pthread_mutex_t *pthreadmutex)
+#ifdef RTE_EXEC_ENV_FREEBSD
+   __rte_unlock_function(*pthreadmutex)
+#endif
 {
return pthread_mutex_unlock(pthreadmutex);
 }
-- 
2.39.2



[PATCH v2 17/20] net/failsafe: annotate pthread mutex

2023-02-24 Thread David Marchand
Annotate wrappers on top of the pthread mutex API.

Signed-off-by: David Marchand 
---
 drivers/net/failsafe/failsafe_private.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 53a451c1b1..97b6f5300d 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -403,6 +403,9 @@ fs_dev(struct sub_device *sdev) {
  */
 static inline int
 fs_lock(struct rte_eth_dev *dev, unsigned int is_alarm)
+#ifdef RTE_EXEC_ENV_FREEBSD
+   __rte_exclusive_trylock_function(0, PRIV(dev)->hotplug_mutex)
+#endif
 {
int ret;
 
@@ -430,6 +433,9 @@ fs_lock(struct rte_eth_dev *dev, unsigned int is_alarm)
  */
 static inline void
 fs_unlock(struct rte_eth_dev *dev, unsigned int is_alarm)
+#ifdef RTE_EXEC_ENV_FREEBSD
+   __rte_unlock_function(PRIV(dev)->hotplug_mutex)
+#endif
 {
int ret;
 
-- 
2.39.2



[PATCH v2 20/20] enable lock check

2023-02-24 Thread David Marchand
Now that a lot of components can be compiled with the lock checks,
invert the logic and opt out for components not ready yet:
- drivers/bus/dpaa,
- drivers/common/cnxk,
- drivers/common/mlx5,
- drivers/event/cnxk,
- drivers/net/bnx2x,
- drivers/net/bnxt,
- drivers/net/cnxk,
- drivers/net/enic,
- drivers/net/hns3,
- drivers/net/mlx5,
- lib/ipsec,
- lib/timer,

Signed-off-by: David Marchand 
---
 doc/guides/prog_guide/env_abstraction_layer.rst | 5 +++--
 drivers/bus/dpaa/meson.build| 1 +
 drivers/common/cnxk/meson.build | 1 +
 drivers/common/mlx5/meson.build | 1 +
 drivers/event/cnxk/meson.build  | 1 +
 drivers/meson.build | 2 +-
 drivers/net/bnx2x/meson.build   | 1 +
 drivers/net/bnxt/meson.build| 1 +
 drivers/net/cnxk/meson.build| 1 +
 drivers/net/enic/meson.build| 1 +
 drivers/net/hns3/meson.build| 1 +
 drivers/net/mlx5/meson.build| 1 +
 lib/ipsec/meson.build   | 1 +
 lib/meson.build | 2 +-
 lib/timer/meson.build   | 1 +
 lib/vhost/meson.build   | 1 -
 16 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index 3f33621e05..93c8a031be 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -550,8 +550,9 @@ Some general comments:
   waiving checks with ``__rte_no_thread_safety_analysis`` in your code, please
   discuss it on the mailing list,
 
-A DPDK library/driver can enable/disable the checks by setting
-``annotate_locks`` accordingly in its ``meson.build`` file.
+The checks are enabled by default for libraries and drivers.
+They can be disabled by setting ``annotate_locks`` to ``false`` in
+the concerned library/driver ``meson.build``.
 
 IOVA Mode Detection
 ~~~
diff --git a/drivers/bus/dpaa/meson.build b/drivers/bus/dpaa/meson.build
index 5506f2bffc..183b251459 100644
--- a/drivers/bus/dpaa/meson.build
+++ b/drivers/bus/dpaa/meson.build
@@ -29,3 +29,4 @@ if cc.has_argument('-Wno-pointer-arith')
 endif
 
 includes += include_directories('include', 'base/qbman')
+annotate_locks = false
diff --git a/drivers/common/cnxk/meson.build b/drivers/common/cnxk/meson.build
index 849735921c..9beb1cddba 100644
--- a/drivers/common/cnxk/meson.build
+++ b/drivers/common/cnxk/meson.build
@@ -88,3 +88,4 @@ sources += files('cnxk_telemetry_bphy.c',
 
 deps += ['bus_pci', 'net', 'telemetry']
 pmd_supports_disable_iova_as_pa = true
+annotate_locks = false
diff --git a/drivers/common/mlx5/meson.build b/drivers/common/mlx5/meson.build
index 60ccd95cbc..d38255dc82 100644
--- a/drivers/common/mlx5/meson.build
+++ b/drivers/common/mlx5/meson.build
@@ -40,3 +40,4 @@ endif
 mlx5_config = configuration_data()
 subdir(exec_env)
 configure_file(output: 'mlx5_autoconf.h', configuration: mlx5_config)
+annotate_locks = false
diff --git a/drivers/event/cnxk/meson.build b/drivers/event/cnxk/meson.build
index aa42ab3a90..20c6a0484a 100644
--- a/drivers/event/cnxk/meson.build
+++ b/drivers/event/cnxk/meson.build
@@ -480,3 +480,4 @@ endforeach
 
 deps += ['bus_pci', 'common_cnxk', 'net_cnxk', 'crypto_cnxk']
 pmd_supports_disable_iova_as_pa = true
+annotate_locks = false
diff --git a/drivers/meson.build b/drivers/meson.build
index 0618c31a69..d529980fc5 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -91,7 +91,7 @@ foreach subpath:subdirs
 build = true # set to false to disable, e.g. missing deps
 reason = '' # set if build == false to explain
 name = drv
-annotate_locks = false
+annotate_locks = true
 sources = []
 headers = []
 driver_sdk_headers = [] # public headers included by drivers
diff --git a/drivers/net/bnx2x/meson.build b/drivers/net/bnx2x/meson.build
index 156f97d31f..dbf9c7225d 100644
--- a/drivers/net/bnx2x/meson.build
+++ b/drivers/net/bnx2x/meson.build
@@ -21,3 +21,4 @@ sources = files(
 'ecore_sp.c',
 'elink.c',
 )
+annotate_locks = false
diff --git a/drivers/net/bnxt/meson.build b/drivers/net/bnxt/meson.build
index 09d494e90f..f43dbfc445 100644
--- a/drivers/net/bnxt/meson.build
+++ b/drivers/net/bnxt/meson.build
@@ -68,3 +68,4 @@ if arch_subdir == 'x86'
 elif arch_subdir == 'arm' and dpdk_conf.get('RTE_ARCH_64')
 sources += files('bnxt_rxtx_vec_neon.c')
 endif
+annotate_locks = false
diff --git a/drivers/net/cnxk/meson.build b/drivers/net/cnxk/meson.build
index c7ca24d437..86ed2d13dd 100644
--- a/drivers/net/cnxk/meson.build
+++ b/drivers/net/cnxk/meson.build
@@ -196,3 +196,4 @@ endforeach
 
 headers = files('rte_pmd_cnxk.h')
 pmd_supports_disable_iova_as_pa = true
+annotate_locks = false
diff --git a/drivers/net/enic/meson.build

[PATCH v2 19/20] eal/windows: disable lock check on alarm code

2023-02-24 Thread David Marchand
This code uses locks to implement synchronisation between two threads.
There seems to be nothing wrong with it; just silence the clang lock check.

Signed-off-by: David Marchand 
---
 lib/eal/windows/eal_alarm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/eal/windows/eal_alarm.c b/lib/eal/windows/eal_alarm.c
index 48203a2870..34b52380ce 100644
--- a/lib/eal/windows/eal_alarm.c
+++ b/lib/eal/windows/eal_alarm.c
@@ -224,6 +224,7 @@ struct intr_task {
 
 static void
 intr_thread_entry(void *arg)
+   __rte_no_thread_safety_analysis
 {
struct intr_task *task = arg;
task->func(task->arg);
@@ -232,6 +233,7 @@ intr_thread_entry(void *arg)
 
 static int
 intr_thread_exec_sync(void (*func)(void *arg), void *arg)
+   __rte_no_thread_safety_analysis
 {
struct intr_task task;
int ret;
-- 
2.39.2



Re: [PATCH v2 16/20] net/failsafe: fix mutex locking

2023-02-24 Thread Gaëtan Rivet
On Fri, Feb 24, 2023, at 16:11, David Marchand wrote:
> The pthread mutex API describes cases where locking might fail.
> Check fts_enter wrapper return code.
>
> Signed-off-by: David Marchand 

Hello David,

Thanks for the fix,
Acked-by: Gaetan Rivet 

-- 
Gaetan Rivet


Re: [PATCH v2 00/20] Enable lock annotations on most libraries and drivers

2023-02-24 Thread Gaëtan Rivet
On Fri, Feb 24, 2023, at 16:11, David Marchand wrote:
> This is a followup of the series that introduced lock annotations.
> I reworked and made annotations work in what seemed the easier cases.
> In most cases, I chose to convert inline wrappers around the EAL lock
> API to simple macro: I did not see much value in those wrappers and this
> is way simpler than adding __rte_*lock_function tags everywhere.
>
> A list of libraries and drivers still need more work as their code have
> non obvious locks handling. For those components, the check is opted
> out.
> I leave it to their respective maintainers to enable the checks later.
>
> FreeBSD libc pthread API has lock annotations while Linux glibc has
> none.
> We could simply disable the check on FreeBSD, but with this check in place,
> a few issues were caught in drivers that are built on FreeBSD.
> For now, I went with a simple #ifdef FreeBSD for pthread mutex related
> annotations in our code.
>

Hi David,

This is a great change, thanks for doing it.
However I am not sure I understand the logic regarding the '#ifdef FREEBSD'.

These annotations provide static hints to clang's thread safety analysis.
What is the dependency on FreeBSD glibc?

-- 
Gaetan Rivet


Re: [PATCH v3] lib/net: add MPLS insert and strip functionality

2023-02-24 Thread Stephen Hemminger
On Fri, 24 Feb 2023 16:25:21 +0500
Tanzeel-inline  wrote:

> + oh = rte_pktmbuf_mtod(*m, struct rte_ether_hdr *);
> + nh = (struct rte_ether_hdr *)(void *)
> + rte_pktmbuf_prepend(*m, sizeof(struct rte_mpls_hdr));

Don't need void * cast. Can cast result of prepend (char *) to ether header 
directly.

Note: rte_pktmbuf_append/prepend should have been written initially to
return void * to reduce the number of casts.
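
A minimal illustration of the point (stand-in types and a stand-in prepend helper, not the rte_ prototypes): since the prepend helper returns `char *`, one direct cast to the header type is enough and the intermediate `(void *)` only adds noise.

```c
#include <stddef.h>

/* Stand-in header types; the exact layouts are irrelevant here. */
struct ether_hdr { unsigned char dst[6], src[6]; unsigned short type; };
struct mpls_hdr { unsigned int tag; };

static char headroom[128];

/* Stand-in for rte_pktmbuf_prepend(), which returns char * in DPDK. */
static char *pkt_prepend(size_t len)
{
	return &headroom[sizeof(headroom) / 2 - len];
}

static struct ether_hdr *insert_mpls(void)
{
	/* One cast from char * to the header type; no (void *) detour. */
	return (struct ether_hdr *)pkt_prepend(sizeof(struct mpls_hdr));
}
```
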


Re: Bug in rte_mempool_do_generic_get?

2023-02-24 Thread Harris, James R


> On Feb 24, 2023, at 6:56 AM, Honnappa Nagarahalli 
>  wrote:
> 
> 
> 
>> -Original Message-
>> From: Morten Brørup 
>> Sent: Friday, February 24, 2023 6:13 AM
>> To: Harris, James R ; dev@dpdk.org
>> Subject: RE: Bug in rte_mempool_do_generic_get?
>> 
>>> 
>> 
>>> If you have a mempool with 2048 objects, shouldn't 4 cores each be able to 
>>> do a 512 buffer bulk get, regardless of the configured cache size?
>> 
>> No, the scenario you described above is the expected behavior. I think it is
>> documented somewhere that objects in the caches are unavailable for other
>> cores, but now I cannot find where this is documented.
>> 

Thanks Morten.

Yeah, I think it is documented somewhere, but I also couldn’t find it.  I was 
aware of cores not being able to allocate from another core’s cache.  My 
surprise was that, in a pristine new mempool, 4 cores could not each do one 
initial 512 buffer bulk get.  But I also see that even before the a2833ecc5 
patch, the cache would get populated on gets less than cache size, in addition 
to the buffers requested by the user.  So if cache size is 256, and bulk get is 
for 128 buffers, it pulls 384 buffers from backing pool - 128 for the caller, 
another 256 to prefill the cache.  Your patch makes this cache filling 
consistent between less-than-cache-size and greater-than-or-equal-to-cache-size 
cases.

>> Furthermore, since the effective per-core cache size is 1.5 * configured 
>> cache
>> size, a configured cache size of 256 may leave up to 384 objects in each per-
>> core cache.
>> 
>> With 4 cores, you can expect up to 3 * 384 = 1152 objects sitting in the
>> caches of other cores. If you want to be able to pull 512 objects with each
>> core, the pool size should be 4 * 512 + 1152 objects.
> May be we should document this in mempool library?
> 

Maybe.  But the case I described here is a bit wonky - SPDK should never have 
been specifying a non-zero cache in this case.  We only noticed this change in 
behavior because we were creating the mempool with a cache when we shouldn’t 
have.

-Jim
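
The worst-case sizing rule quoted above can be sketched as a small helper (name made up; the 1.5 factor reflects the per-core cache being allowed to grow to 1.5 × the configured cache size before it is flushed):

```c
/* Objects cached on other cores are unavailable to the allocating core,
 * so to guarantee every core can bulk-get `bulk` objects:
 *   pool_size >= n_cores * bulk + (n_cores - 1) * 1.5 * cache_size */
static unsigned int
min_pool_size(unsigned int n_cores, unsigned int bulk, unsigned int cache_size)
{
	unsigned int per_core_cache_max = cache_size + cache_size / 2;

	return n_cores * bulk + (n_cores - 1) * per_core_cache_max;
}
```

With the numbers from this thread (4 cores, 512-object bulk gets, cache size 256), this gives 4 * 512 + 3 * 384 = 3200 objects, matching the 2048 + 1152 figure above.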




RE: [PATCH v2 04/16] test/bbdev: add timeout for latency tests

2023-02-24 Thread Vargas, Hernan
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin 
> Sent: Thursday, February 23, 2023 2:32 AM
> To: Vargas, Hernan ; dev@dpdk.org;
> gak...@marvell.com; Rix, Tom 
> Cc: Chautru, Nicolas ; Zhang, Qi Z
> 
> Subject: Re: [PATCH v2 04/16] test/bbdev: add timeout for latency tests
> 
> 
> 
> On 2/22/23 22:13, Vargas, Hernan wrote:
> > Hi Maxime,
> >
> >> -Original Message-
> >> From: Maxime Coquelin 
> >> Sent: Monday, February 20, 2023 10:33 AM
> >> To: Vargas, Hernan ; dev@dpdk.org;
> >> gak...@marvell.com; Rix, Tom 
> >> Cc: Chautru, Nicolas ; Zhang, Qi Z
> >> 
> >> Subject: Re: [PATCH v2 04/16] test/bbdev: add timeout for latency
> >> tests
> >>
> >>
> >>
> >> On 2/15/23 18:09, Hernan Vargas wrote:
> >>> Add a timeout to force exit the latency tests in case dequeue never
> >>> happens.
> >>>
> >>> Signed-off-by: Hernan Vargas 
> >>> ---
> >>>app/test-bbdev/test_bbdev_perf.c | 24 +++-
> >>>1 file changed, 19 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/app/test-bbdev/test_bbdev_perf.c
> >>> b/app/test-bbdev/test_bbdev_perf.c
> >>> index 19b9a5b119..dede0f900e 100644
> >>> --- a/app/test-bbdev/test_bbdev_perf.c
> >>> +++ b/app/test-bbdev/test_bbdev_perf.c
> >>> @@ -26,6 +26,7 @@
> >>>
> >>>#define MAX_QUEUES RTE_MAX_LCORE
> >>>#define TEST_REPETITIONS 100
> >>> +#define TIME_OUT_POLL 1e8
> >>>#define WAIT_OFFLOAD_US 1000
> >>>
> >>>#ifdef RTE_BASEBAND_FPGA_LTE_FEC
> >>> @@ -4546,6 +4547,7 @@ latency_test_ldpc_dec(struct rte_mempool
> >>> *mempool,
> >>>
> >>>   for (i = 0, dequeued = 0; dequeued < num_to_process; ++i) {
> >>>   uint16_t enq = 0, deq = 0;
> >>> + uint32_t time_out = 0;
> >>>   bool first_time = true;
> >>>   last_time = 0;
> >>>
> >>> @@ -4597,7 +4599,8 @@ latency_test_ldpc_dec(struct rte_mempool
> >> *mempool,
> >>>   last_time = rte_rdtsc_precise() - 
> >>> start_time;
> >>>   first_time = false;
> >>>   }
> >>> - } while (unlikely(burst_sz != deq));
> >>> + time_out++;
> >>> + } while ((burst_sz != deq) && (time_out < TIME_OUT_POLL));
> >>>
> >>>   *max_time = RTE_MAX(*max_time, last_time);
> >>>   *min_time = RTE_MIN(*min_time, last_time); @@ -4606,7
> >> +4609,12 @@
> >>> latency_test_ldpc_dec(struct rte_mempool *mempool,
> >>>   if (extDdr)
> >>>   retrieve_harq_ddr(dev_id, queue_id, ops_enq,
> >> burst_sz);
> >>>
> >>> - if (test_vector.op_type != RTE_BBDEV_OP_NONE) {
> >>> + if (burst_sz != deq) {
> >>> + struct rte_bbdev_info info;
> >>> + ret = TEST_FAILED;
> >>> + rte_bbdev_info_get(dev_id, &info);
> >>
> >> What is the point of calling rte_bbdev_info_get() here and below?
> >> info is not used afterwards.
> >
> > The reason for calling this function is to check the device status and if 
> > there
> is something wrong the PMD would display it to standard output.
> 
> What kind of info exactly, I don't see much meaningful logs in
> rte_bbdev_info_get() except printing error if dev_info == NULL.

rte_bbdev_info_get() calls the device's info_get function that is specified in 
the PMD.
For example, for ACC100, acc100_dev_info_get() gets called to check the device 
status.

Thanks,
Hernan
> 
> Regards,
> Maxime
> 
> >>
> >>> + TEST_ASSERT_SUCCESS(ret, "Dequeue timeout!");
> >>> + } else if (test_vector.op_type != RTE_BBDEV_OP_NONE) {
> >>>   ret = validate_ldpc_dec_op(ops_deq, burst_sz,
> >> ref_op,
> >>>   vector_mask);
> >>>   TEST_ASSERT_SUCCESS(ret, "Validation failed!"); 
> >>> @@
> >> -4632,6
> >>> +4640,7 @@ latency_test_enc(struct rte_mempool *mempool,
> >>>
> >>>   for (i = 0, dequeued = 0; dequeued < num_to_process; ++i) {
> >>>   uint16_t enq = 0, deq = 0;
> >>> + uint32_t time_out = 0;
> >>>   bool first_time = true;
> >>>   last_time = 0;
> >>>
> >>> @@ -4667,13 +4676,18 @@ latency_test_enc(struct rte_mempool
> >> *mempool,
> >>>   last_time += rte_rdtsc_precise() - 
> >>> start_time;
> >>>   first_time = false;
> >>>   }
> >>> - } while (unlikely(burst_sz != deq));
> >>> + time_out++;
> >>> + } while ((burst_sz != deq) && (time_out < TIME_OUT_POLL));
> >>>
> >>>   *max_time = RTE_MAX(*max_time, last_time);
> >>>   *min_time = RTE_MIN(*min_time, last_time);
> >>>   *total_time += last_time;
> >>> -
> >>> - if (test_vector.op_type != RTE_BBDEV_OP_NONE) {
> >>> + if (burst_sz != deq) {
> >>> + struct rte_bbdev_info in

Re: [PATCH] net/mana: use RTE_LOG_DP for logs on datapath

2023-02-24 Thread Stephen Hemminger
On Thu, 23 Feb 2023 10:09:17 -0800
Stephen Hemminger  wrote:

> On Thu, 23 Feb 2023 14:07:25 +
> Ferruh Yigit  wrote:
> 
> > Overall I am not sure if anyone is interested in driver datapath logs
> > other than driver developers themselves.
> > 
> > For datapath logging I think there are two concerns,
> > 1) It should not eat *any* cycles unless explicitly enabled
> > 2) Capability of enable/disable them because of massive amount of log it
> > can generate
> > 
> > 
> > Currently there are two existing approaches for driver datapath logging:
> > i)  Controlled by 'RTE_ETHDEV_DEBUG_RX/TX' compile time flag,
> > when enabled 'rte_log()' is used with Rx/Tx specific log type.
> > ii) 'RTE_LOG_DP' ', compile time control per logtype via
> > 'RTE_LOG_DP_LEVEL',
> >  when enabled 'rte_log()' is used with PMD logtype.
> > 
> > 
> > In (ii), need to re-compile code when you need to increase the log
> > verbosity, and it leaks to production code as mentioned above.
> > 
> > For (i), developer compiles once enabling debug, later can fine grain
> > log level dynamically. This is more DPDK developer focused approach.
> > 
> > 
> > [1]
> > According above, what do you think to retire 'RTE_LOG_DP', (at least
> > within ethdev datapath), and chose (i) as preferred datapath logging?  
> 
> I agree, the current tx/rx logging is a mess.
> Each driver is different, each driver has to have something to enable it;
> and it really isn't useful beyond the driver developer.
> 
> Using tracing seems like a much better option. Could we agree on a common
> set of trace points for drivers and fix all drivers to use the same thing.
> Probably will cause some upset among driver developers:
> "where did my nice printf's go, now I have to learn tracing"
> but DPDK has a good facility here, lets use it.
> 
> My proposal would be:
>   - agree on common set of trace points
>   - apply to all drivers
>   - remove RTE_LOG_DP()
>   - remove per driver RX/TX options
>   - side effect, more uses of RTE_LOGTYPE_PMD go away.

Here is an example of using tracepoints instead.
Compile tested for example only.

Note: using tracepoints it is possible to keep some of the tracepoints
even if fastpath is not enabled.  Things like running out of Tx or mbuf
resources are not perf critical, but would be good for the application
to see.


---
 drivers/net/ixgbe/ixgbe_ethdev.c |   7 --
 drivers/net/ixgbe/ixgbe_logs.h   |  18 -
 drivers/net/ixgbe/ixgbe_rxtx.c   |  95 +++---
 drivers/net/ixgbe/ixgbe_trace.h  | 110 +++
 4 files changed, 133 insertions(+), 97 deletions(-)
 create mode 100644 drivers/net/ixgbe/ixgbe_trace.h

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 88118bc30560..07f8ccdd97cb 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -8317,10 +8317,3 @@ RTE_PMD_REGISTER_PARAM_STRING(net_ixgbe_vf,
 
 RTE_LOG_REGISTER_SUFFIX(ixgbe_logtype_init, init, NOTICE);
 RTE_LOG_REGISTER_SUFFIX(ixgbe_logtype_driver, driver, NOTICE);
-
-#ifdef RTE_ETHDEV_DEBUG_RX
-RTE_LOG_REGISTER_SUFFIX(ixgbe_logtype_rx, rx, DEBUG);
-#endif
-#ifdef RTE_ETHDEV_DEBUG_TX
-RTE_LOG_REGISTER_SUFFIX(ixgbe_logtype_tx, tx, DEBUG);
-#endif
diff --git a/drivers/net/ixgbe/ixgbe_logs.h b/drivers/net/ixgbe/ixgbe_logs.h
index 00ef797ca103..f70fa1fd4afd 100644
--- a/drivers/net/ixgbe/ixgbe_logs.h
+++ b/drivers/net/ixgbe/ixgbe_logs.h
@@ -12,24 +12,6 @@ extern int ixgbe_logtype_init;
 
 #define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
 
-#ifdef RTE_ETHDEV_DEBUG_RX
-extern int ixgbe_logtype_rx;
-#define PMD_RX_LOG(level, fmt, args...)\
-   rte_log(RTE_LOG_ ## level, ixgbe_logtype_rx,\
-   "%s(): " fmt "\n", __func__, ## args)
-#else
-#define PMD_RX_LOG(level, fmt, args...) do { } while (0)
-#endif
-
-#ifdef RTE_ETHDEV_DEBUG_TX
-extern int ixgbe_logtype_tx;
-#define PMD_TX_LOG(level, fmt, args...)\
-   rte_log(RTE_LOG_ ## level, ixgbe_logtype_tx,\
-   "%s(): " fmt "\n", __func__, ## args)
-#else
-#define PMD_TX_LOG(level, fmt, args...) do { } while (0)
-#endif
-
 extern int ixgbe_logtype_driver;
 #define PMD_DRV_LOG_RAW(level, fmt, args...) \
rte_log(RTE_LOG_ ## level, ixgbe_logtype_driver, "%s(): " fmt, \
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index c9d6ca9efea4..f9f0ffebee8e 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -46,6 +46,7 @@
 #include 
 
 #include "ixgbe_logs.h"
+#include "ixgbe_trace.h"
 #include "base/ixgbe_api.h"
 #include "base/ixgbe_vf.h"
 #include "ixgbe_ethdev.h"
@@ -581,11 +582,8 @@ ixgbe_xmit_cleanup(struct ixgbe_tx_queue *txq)
desc_to_clean_to = sw_ring[desc_to_clean_to].last_id;
status = txr[desc_to_clean_to].wb.status;
if (!(status & rte_cpu_to_le_32(IXGBE_TXD_STAT_DD))) {
-   PMD_TX_LOG(DEBUG,
-  

[PATCH] service: split tests to perf and autotest to avoid spurious CI failures

2023-02-24 Thread Harry van Haaren
On some CI runs, some service-cores tests spuriously fail as the
service lcore thread is not actually scheduled by the OS in the
given amount of time.

Increasing timeouts has not resolved the issue in the CI, so the
solution in this patch is to move them to a separate perf test
suite.

Signed-off-by: Harry van Haaren 

---

See DPDK ML discussion in this thread:
http://mails.dpdk.org/archives/dev/2023-February/263523.html
---
 app/test/meson.build  |  1 +
 app/test/test_service_cores.c | 32 +++-
 2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index f34d19e3c3..2db5ccf4ff 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -287,6 +287,7 @@ perf_test_names = [
 'pie_perf',
 'distributor_perf_autotest',
 'pmd_perf_autotest',
+'service_perf_autotest',
 'stack_perf_autotest',
 'stack_lf_perf_autotest',
 'rand_perf_autotest',
diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index 637fcd7cf9..06653dfdef 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -1022,17 +1022,12 @@ static struct unit_test_suite service_tests  = {
TEST_CASE_ST(dummy_register, NULL, service_name),
TEST_CASE_ST(dummy_register, NULL, service_get_by_name),
TEST_CASE_ST(dummy_register, NULL, service_dump),
-   TEST_CASE_ST(dummy_register, NULL, service_attr_get),
-   TEST_CASE_ST(dummy_register, NULL, service_lcore_attr_get),
TEST_CASE_ST(dummy_register, NULL, service_probe_capability),
TEST_CASE_ST(dummy_register, NULL, service_start_stop),
TEST_CASE_ST(dummy_register, NULL, service_lcore_add_del),
-   TEST_CASE_ST(dummy_register, NULL, service_lcore_start_stop),
TEST_CASE_ST(dummy_register, NULL, service_lcore_en_dis_able),
TEST_CASE_ST(dummy_register, NULL, service_mt_unsafe_poll),
TEST_CASE_ST(dummy_register, NULL, service_mt_safe_poll),
-   TEST_CASE_ST(dummy_register, NULL, service_app_lcore_mt_safe),
-   TEST_CASE_ST(dummy_register, NULL, service_app_lcore_mt_unsafe),
TEST_CASE_ST(dummy_register, NULL, service_may_be_active),
TEST_CASE_ST(dummy_register, NULL, service_active_two_cores),
TEST_CASES_END() /**< NULL terminate unit test array */
@@ -1046,3 +1041,30 @@ test_service_common(void)
 }
 
 REGISTER_TEST_COMMAND(service_autotest, test_service_common);
+
+
+/* The tests below have been split from the auto-test suite, as when
+ * they are run in a cloud CI environment they can give false-positive
+ * errors, due to the service-cores not being scheduled by the OS.
+ */
+static struct unit_test_suite service_perf_tests  = {
+   .suite_name = "service core test suite",
+   .setup = testsuite_setup,
+   .teardown = testsuite_teardown,
+   .unit_test_cases = {
+   TEST_CASE_ST(dummy_register, NULL, service_attr_get),
+   TEST_CASE_ST(dummy_register, NULL, service_lcore_attr_get),
+   TEST_CASE_ST(dummy_register, NULL, service_lcore_start_stop),
+   TEST_CASE_ST(dummy_register, NULL, service_app_lcore_mt_safe),
+   TEST_CASE_ST(dummy_register, NULL, service_app_lcore_mt_unsafe),
+   TEST_CASES_END() /**< NULL terminate unit test array */
+   }
+};
+
+static int
+test_service_perf(void)
+{
+   return unit_test_suite_runner(&service_perf_tests);
+}
+
+REGISTER_TEST_COMMAND(service_perf_autotest, test_service_perf);
-- 
2.34.1



[PATCH v10 0/2] zero-copy get and put functions

2023-02-24 Thread Kamalakshitha Aligeri
This version combines the following two patches from Morten Brørup and
Kamalakshitha Aligeri
1. 
https://patches.dpdk.org/project/dpdk/patch/20230213122437.122858-1...@smartsharesystems.com/
2. 
https://patches.dpdk.org/project/dpdk/patch/20230221055205.22984-3-kamalakshitha.alig...@arm.com/
The squashed patch has zero-copy get and put functions in mempool cache
and the mempool test cases with those new API's

As requested on the mailing list, I have removed the reviewed-by and
acked-by tags. Please provide your tags again for patch 1/2 (mempool
cache: add zero copy get and put function)

Kamalakshitha Aligeri (1):
  net/i40e: replace put function

Morten Brørup (1):
  mempool cache: add zero-copy get and put functions

 .mailmap|   1 +
 app/test/test_mempool.c |  81 +---
 drivers/net/i40e/i40e_rxtx_vec_common.h |  27 ++-
 lib/mempool/mempool_trace_points.c  |   9 +
 lib/mempool/rte_mempool.h   | 239 +---
 lib/mempool/rte_mempool_trace_fp.h  |  23 +++
 lib/mempool/version.map |   9 +
 7 files changed, 334 insertions(+), 55 deletions(-)

--
2.25.1



[PATCH v10 1/2] mempool cache: add zero-copy get and put functions

2023-02-24 Thread Kamalakshitha Aligeri
From: Morten Brørup 

Zero-copy access to mempool caches is beneficial for PMD performance, and
must be provided by the mempool library to fix [Bug 1052] without a
performance regression.

[Bug 1052]: https://bugs.dpdk.org/show_bug.cgi?id=1052

Bugzilla ID: 1052

Signed-off-by: Morten Brørup 
Signed-off-by: Kamalakshitha Aligeri 
---
v10:
* Added mempool test cases with zero-copy API's
v9:
* Also set rte_errno in zero-copy put function, if returning NULL.
  (Honnappa)
* Revert v3 comparison to prevent overflow if n is really huge and len is
  non-zero. (Olivier)
v8:
* Actually include the rte_errno header file.
  Note to self: The changes only take effect on the disk after the file in
  the text editor has been saved.
v7:
* Fix typo in function description. (checkpatch)
* Zero-copy functions may set rte_errno; include rte_errno header file.
  (ci/loongarch-compilation)
v6:
* Improve description of the 'n' parameter to the zero-copy get function.
  (Konstantin, Bruce)
* The caches used for zero-copy may not be user-owned, so remove this word
  from the function descriptions. (Kamalakshitha)
v5:
* Bugfix: Compare zero-copy get request to the cache size instead of the
  flush threshold; otherwise refill could overflow the memory allocated
  for the cache. (Andrew)
* Split the zero-copy put function into an internal function doing the
  work, and a public function with trace.
* Avoid code duplication by rewriting rte_mempool_do_generic_put() to use
  the internal zero-copy put function. (Andrew)
* Corrected the return type of rte_mempool_cache_zc_put_bulk() from void *
  to void **; it returns a pointer to an array of objects.
* Fix coding style: Add missing curly brackets. (Andrew)
v4:
* Fix checkpatch warnings.
v3:
* Bugfix: Respect the cache size; compare to the flush threshold instead
  of RTE_MEMPOOL_CACHE_MAX_SIZE.
* Added 'rewind' function for incomplete 'put' operations. (Konstantin)
* Replace RTE_ASSERTs with runtime checks of the request size.
  Instead of failing, return NULL if the request is too big. (Konstantin)
* Modified comparison to prevent overflow if n is really huge and len is
  non-zero. (Andrew)
* Updated the comments in the code.
v2:
* Fix checkpatch warnings.
* Fix missing registration of trace points.
* The functions are inline, so they don't go into the map file.
v1 changes from the RFC:
* Removed run-time parameter checks. (Honnappa)
  This is a hot fast path function; requiring correct application
  behaviour, i.e. function parameters must be valid.
* Added RTE_ASSERT for parameters instead.
  Code for this is only generated if built with RTE_ENABLE_ASSERT.
* Removed fallback when 'cache' parameter is not set. (Honnappa)
* Chose the simple get function; i.e. do not move the existing objects in
  the cache to the top of the new stack, just leave them at the bottom.
* Renamed the functions. Other suggestions are welcome, of course. ;-)
* Updated the function descriptions.
* Added the functions to trace_fp and version.map.

 app/test/test_mempool.c|  81 +++---
 lib/mempool/mempool_trace_points.c |   9 ++
 lib/mempool/rte_mempool.h  | 239 +
 lib/mempool/rte_mempool_trace_fp.h |  23 +++
 lib/mempool/version.map|   9 ++
 5 files changed, 311 insertions(+), 50 deletions(-)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 8e493eda47..6d29f5bc7b 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -74,7 +74,7 @@ my_obj_init(struct rte_mempool *mp, __rte_unused void *arg,

 /* basic tests (done on one core) */
 static int
-test_mempool_basic(struct rte_mempool *mp, int use_external_cache)
+test_mempool_basic(struct rte_mempool *mp, int use_external_cache, int 
use_zc_api)
 {
uint32_t *objnum;
void **objtable;
@@ -84,6 +84,7 @@ test_mempool_basic(struct rte_mempool *mp, int 
use_external_cache)
unsigned i, j;
int offset;
struct rte_mempool_cache *cache;
+   void **cache_objs;

if (use_external_cache) {
/* Create a user-owned mempool cache. */
@@ -100,8 +101,13 @@ test_mempool_basic(struct rte_mempool *mp, int 
use_external_cache)
rte_mempool_dump(stdout, mp);

printf("get an object\n");
-   if (rte_mempool_generic_get(mp, &obj, 1, cache) < 0)
-   GOTO_ERR(ret, out);
+   if (use_zc_api) {
+   cache_objs = rte_mempool_cache_zc_get_bulk(cache, mp, 1);
+   obj = *cache_objs;
+   } else {
+   if (rte_mempool_generic_get(mp, &obj, 1, cache) < 0)
+   GOTO_ERR(ret, out);
+   }
rte_mempool_dump(stdout, mp);

/* tests that improve coverage */
@@ -123,21 +129,41 @@ test_mempool_basic(struct rte_mempool *mp, int 
use_external_cache)
 #endif

printf("put the object back\n");
-   rte_mempool_generic_put(mp, &obj, 1, cache);
+   if (use_zc_api) {
+   cache_objs = rte_mempool_cache_zc_put_bulk(cach

[PATCH v10 2/2] net/i40e: replace put function

2023-02-24 Thread Kamalakshitha Aligeri
Integrated zero-copy put API in mempool cache in i40e PMD.
On Ampere Altra server, l3fwd single core's performance improves by 5%
with the new API

Signed-off-by: Kamalakshitha Aligeri 
Reviewed-by: Ruifeng Wang 
Reviewed-by: Feifei Wang 
Reviewed-by: Morten Brørup 
---
v3:
Fixed the way mbufs are accessed from txep (Morten Brorup)
v2:
Fixed the code for n > RTE_MEMPOOL_CACHE_MAX_SIZE (Morten Brorup)
v1:
1. Integrated the zc_put API in i40e PMD
2. Added mempool test cases with the zero-copy APIs

.mailmap|  1 +
 drivers/net/i40e/i40e_rxtx_vec_common.h | 27 -
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/.mailmap b/.mailmap
index a9f4f28fba..2581d0efe7 100644
--- a/.mailmap
+++ b/.mailmap
@@ -677,6 +677,7 @@ Kai Ji 
 Kaiwen Deng 
 Kalesh AP 
 Kamalakannan R 
+Kamalakshitha Aligeri 
 Kamil Bednarczyk 
 Kamil Chalupnik 
 Kamil Rytarowski 
diff --git a/drivers/net/i40e/i40e_rxtx_vec_common.h 
b/drivers/net/i40e/i40e_rxtx_vec_common.h
index fe1a6ec75e..35cdb31b2e 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_common.h
+++ b/drivers/net/i40e/i40e_rxtx_vec_common.h
@@ -95,18 +95,35 @@ i40e_tx_free_bufs(struct i40e_tx_queue *txq)

n = txq->tx_rs_thresh;

-/* first buffer to free from S/W ring is at index
- * tx_next_dd - (tx_rs_thresh-1)
- */
+   /* first buffer to free from S/W ring is at index
+* tx_next_dd - (tx_rs_thresh-1)
+*/
txep = &txq->sw_ring[txq->tx_next_dd - (n - 1)];

if (txq->offloads & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) {
+   struct rte_mempool *mp = txep[0].mbuf->pool;
+   struct rte_mempool_cache *cache = rte_mempool_default_cache(mp, 
rte_lcore_id());
+   void **cache_objs;
+
+   if (unlikely(!cache))
+   goto fallback;
+
+   cache_objs = rte_mempool_cache_zc_put_bulk(cache, mp, n);
+   if (unlikely(!cache_objs))
+   goto fallback;
+
for (i = 0; i < n; i++) {
-   free[i] = txep[i].mbuf;
+   cache_objs[i] = txep[i].mbuf;
/* no need to reset txep[i].mbuf in vector path */
}
-   rte_mempool_put_bulk(free[0]->pool, (void **)free, n);
goto done;
+
+fallback:
+   for (i = 0; i < n; i++)
+   free[i] = txep[i].mbuf;
+   rte_mempool_generic_put(mp, (void **)free, n, cache);
+   goto done;
+
}

m = rte_pktmbuf_prefree_seg(txep[0].mbuf);
--
2.25.1



[PATCH v1 0/1] doc: PMD update

2023-02-24 Thread Nicolas Chautru
Hi Maxime, 
Just one small change to the documentation. I should not have used the
acronym in a few places.
Thanks
Nic

Nicolas Chautru (1):
  doc: update to Intel vRAN Boost PMD

 doc/guides/bbdevs/vrb1.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.34.1



[PATCH v1 1/1] doc: update to Intel vRAN Boost PMD

2023-02-24 Thread Nicolas Chautru
Cosmetic change to refer explicitly to the full name of the PMD,
Intel vRAN Boost v1.
VRB1 acronym is only used as a prefix within the code.

Signed-off-by: Nicolas Chautru 
---
 doc/guides/bbdevs/vrb1.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/guides/bbdevs/vrb1.rst b/doc/guides/bbdevs/vrb1.rst
index f5a9dff804..9c48d30964 100644
--- a/doc/guides/bbdevs/vrb1.rst
+++ b/doc/guides/bbdevs/vrb1.rst
@@ -104,7 +104,7 @@ Initialization
 --
 
 When the device first powers up, its PCI Physical Functions (PF)
-can be listed through these commands for VRB1:
+can be listed through these commands for Intel vRAN Boost v1:
 
 .. code-block:: console
 
@@ -123,7 +123,7 @@ Install the DPDK igb_uio driver, bind it with the PF PCI 
device ID and use
 ``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO driver.
 
 The igb_uio driver may be bound to the PF PCI device using one of two methods
-for VRB1:
+for Intel vRAN Boost v1:
 
 #. PCI functions (physical or virtual, depending on the use case) can be bound
 to the UIO driver by repeating this command for every function.
@@ -252,7 +252,7 @@ from the VF and not only limited to the PF as captured 
above.
 
 See for more details: https://github.com/intel/pf-bb-config
 
-Specifically for the bbdev VRB1 PMD, the command below can be used
+Specifically for the bbdev Intel vRAN Boost v1 PMD, the command below can be 
used
 (note that ACC200 was used previously to refer to VRB1):
 
 .. code-block:: console
-- 
2.34.1



Re: [PATCH v6 1/3] eal: add rte thread create control API

2023-02-24 Thread Tyler Retzlaff
On Fri, Feb 24, 2023 at 09:13:56AM +0100, David Marchand wrote:
> On Wed, Feb 8, 2023 at 10:26 PM Tyler Retzlaff
>  wrote:
> >
> > Add rte_thread_create_control API as a replacement for
> > rte_ctrl_thread_create to allow deprecation of the use of platform
> > specific types in DPDK public API.
> >
> > Add test from David Marchand to exercise the new API.
> >
> > Signed-off-by: Tyler Retzlaff 
> > Acked-by: Morten Brørup 
> > Reviewed-by: Mattias Rönnblom 
> > ---
> >  app/test/test_threads.c| 26 
> >  lib/eal/common/eal_common_thread.c | 85 
> > ++
> >  lib/eal/include/rte_thread.h   | 33 +++
> >  lib/eal/version.map|  1 +
> >  4 files changed, 137 insertions(+), 8 deletions(-)
> >
> > diff --git a/app/test/test_threads.c b/app/test/test_threads.c
> > index e0f18e4..657ecad 100644
> > --- a/app/test/test_threads.c
> > +++ b/app/test/test_threads.c
> > @@ -232,6 +232,31 @@
> > return 0;
> >  }
> >
> > +static int
> > +test_thread_create_control_join(void)
> > +{
> > +   rte_thread_t thread_id;
> > +   rte_thread_t thread_main_id;
> > +
> > +   thread_id_ready = 0;
> > +   RTE_TEST_ASSERT(rte_thread_create_control(&thread_id, 
> > "test_control_threads",
> > +   NULL, thread_main, &thread_main_id) == 0,
> > +   "Failed to create thread.");
> > +
> > +   while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
> > +   ;
> > +
> > +   RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
> > +   "Unexpected thread id.");
> > +
> > +   __atomic_store_n(&thread_id_ready, 2, __ATOMIC_RELEASE);
> > +
> > +   RTE_TEST_ASSERT(rte_thread_join(thread_id, NULL) == 0,
> > +   "Failed to join thread.");
> > +
> > +   return 0;
> > +}
> > +
> >  static struct unit_test_suite threads_test_suite = {
> > .suite_name = "threads autotest",
> > .setup = NULL,
> > @@ -243,6 +268,7 @@
> > TEST_CASE(test_thread_priority),
> > TEST_CASE(test_thread_attributes_affinity),
> > TEST_CASE(test_thread_attributes_priority),
> > +   TEST_CASE(test_thread_create_control_join),
> > TEST_CASES_END()
> > }
> >  };
> > diff --git a/lib/eal/common/eal_common_thread.c 
> > b/lib/eal/common/eal_common_thread.c
> > index 3181515..4f83c97 100644
> > --- a/lib/eal/common/eal_common_thread.c
> > +++ b/lib/eal/common/eal_common_thread.c
> > @@ -232,7 +232,10 @@ enum __rte_ctrl_thread_status {
> >  };
> >
> >  struct rte_thread_ctrl_params {
> > -   void *(*start_routine)(void *);
> > +   union {
> > +   void *(*ctrl_start_routine)(void *arg);
> > +   rte_thread_func control_start_routine;
> > +   } u;
> > void *arg;
> > int ret;
> > /* Control thread status.
> > @@ -241,27 +244,47 @@ struct rte_thread_ctrl_params {
> > enum __rte_ctrl_thread_status ctrl_thread_status;
> >  };
> >
> > -static void *ctrl_thread_init(void *arg)
> > +static int ctrl_thread_init(void *arg)
> >  {
> > struct internal_config *internal_conf =
> > eal_get_internal_configuration();
> > rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
> > struct rte_thread_ctrl_params *params = arg;
> > -   void *(*start_routine)(void *) = params->start_routine;
> > -   void *routine_arg = params->arg;
> >
> > __rte_thread_init(rte_lcore_id(), cpuset);
> > params->ret = rte_thread_set_affinity_by_id(rte_thread_self(), 
> > cpuset);
> > if (params->ret != 0) {
> > __atomic_store_n(¶ms->ctrl_thread_status,
> > CTRL_THREAD_ERROR, __ATOMIC_RELEASE);
> > -   return NULL;
> > +   return params->ret;
> > }
> >
> > __atomic_store_n(¶ms->ctrl_thread_status,
> > CTRL_THREAD_RUNNING, __ATOMIC_RELEASE);
> >
> > -   return start_routine(routine_arg);
> > +   return 0;
> > +}
> > +
> > +static void *ctrl_thread_start(void *arg)
> > +{
> > +   struct rte_thread_ctrl_params *params = arg;
> > +   void *(*start_routine)(void *) = params->u.ctrl_start_routine;
> > +
> > +   if (ctrl_thread_init(arg) != 0)
> > +   return NULL;
> > +
> > +   return start_routine(params->arg);
> 
> Does this patch open a race on params->arg between the newly created
> thread, and the one that asked for creation?
> Reading params->arg is not under the ctrl_thread_status
> synchronisation point anymore (which was protecting against
> free(params)).
> 
> WDYT?

i will take a look, thanks for letting me know.

> 
> (Note that I doubt this patch has anything to do with the recent bz
> opened by Intel QE)
> 
> 
> > +}
> > +
> > +static uint32_t control_thread_start(void *arg)
> > +{
> > +   struct rte_thread_ctrl_params *params = arg;
> > +   rte_thre

Re: [PATCH v6 1/3] eal: add rte thread create control API

2023-02-24 Thread Tyler Retzlaff
On Fri, Feb 24, 2023 at 09:13:56AM +0100, David Marchand wrote:
> On Wed, Feb 8, 2023 at 10:26 PM Tyler Retzlaff
>  wrote:
> >
> > Add rte_thread_create_control API as a replacement for
> > rte_ctrl_thread_create to allow deprecation of the use of platform
> > specific types in DPDK public API.
> >
> > Add test from David Marchand to exercise the new API.
> >
> > Signed-off-by: Tyler Retzlaff 
> > Acked-by: Morten Brørup 
> > Reviewed-by: Mattias Rönnblom 
> > ---
> >  app/test/test_threads.c| 26 
> >  lib/eal/common/eal_common_thread.c | 85 
> > ++
> >  lib/eal/include/rte_thread.h   | 33 +++
> >  lib/eal/version.map|  1 +
> >  4 files changed, 137 insertions(+), 8 deletions(-)
> >
> > diff --git a/app/test/test_threads.c b/app/test/test_threads.c
> > index e0f18e4..657ecad 100644
> > --- a/app/test/test_threads.c
> > +++ b/app/test/test_threads.c
> > @@ -232,6 +232,31 @@
> > return 0;
> >  }
> >
> > +static int
> > +test_thread_create_control_join(void)
> > +{
> > +   rte_thread_t thread_id;
> > +   rte_thread_t thread_main_id;
> > +
> > +   thread_id_ready = 0;
> > +   RTE_TEST_ASSERT(rte_thread_create_control(&thread_id, 
> > "test_control_threads",
> > +   NULL, thread_main, &thread_main_id) == 0,
> > +   "Failed to create thread.");
> > +
> > +   while (__atomic_load_n(&thread_id_ready, __ATOMIC_ACQUIRE) == 0)
> > +   ;
> > +
> > +   RTE_TEST_ASSERT(rte_thread_equal(thread_id, thread_main_id) != 0,
> > +   "Unexpected thread id.");
> > +
> > +   __atomic_store_n(&thread_id_ready, 2, __ATOMIC_RELEASE);
> > +
> > +   RTE_TEST_ASSERT(rte_thread_join(thread_id, NULL) == 0,
> > +   "Failed to join thread.");
> > +
> > +   return 0;
> > +}
> > +
> >  static struct unit_test_suite threads_test_suite = {
> > .suite_name = "threads autotest",
> > .setup = NULL,
> > @@ -243,6 +268,7 @@
> > TEST_CASE(test_thread_priority),
> > TEST_CASE(test_thread_attributes_affinity),
> > TEST_CASE(test_thread_attributes_priority),
> > +   TEST_CASE(test_thread_create_control_join),
> > TEST_CASES_END()
> > }
> >  };
> > diff --git a/lib/eal/common/eal_common_thread.c 
> > b/lib/eal/common/eal_common_thread.c
> > index 3181515..4f83c97 100644
> > --- a/lib/eal/common/eal_common_thread.c
> > +++ b/lib/eal/common/eal_common_thread.c
> > @@ -232,7 +232,10 @@ enum __rte_ctrl_thread_status {
> >  };
> >
> >  struct rte_thread_ctrl_params {
> > -   void *(*start_routine)(void *);
> > +   union {
> > +   void *(*ctrl_start_routine)(void *arg);
> > +   rte_thread_func control_start_routine;
> > +   } u;
> > void *arg;
> > int ret;
> > /* Control thread status.
> > @@ -241,27 +244,47 @@ struct rte_thread_ctrl_params {
> > enum __rte_ctrl_thread_status ctrl_thread_status;
> >  };
> >
> > -static void *ctrl_thread_init(void *arg)
> > +static int ctrl_thread_init(void *arg)
> >  {
> > struct internal_config *internal_conf =
> > eal_get_internal_configuration();
> > rte_cpuset_t *cpuset = &internal_conf->ctrl_cpuset;
> > struct rte_thread_ctrl_params *params = arg;
> > -   void *(*start_routine)(void *) = params->start_routine;
> > -   void *routine_arg = params->arg;
> >
> > __rte_thread_init(rte_lcore_id(), cpuset);
> > params->ret = rte_thread_set_affinity_by_id(rte_thread_self(), 
> > cpuset);
> > if (params->ret != 0) {
> > __atomic_store_n(¶ms->ctrl_thread_status,
> > CTRL_THREAD_ERROR, __ATOMIC_RELEASE);
> > -   return NULL;
> > +   return params->ret;

here too, though a narrow window.

> > }
> >
> > __atomic_store_n(¶ms->ctrl_thread_status,
> > CTRL_THREAD_RUNNING, __ATOMIC_RELEASE);
> >
> > -   return start_routine(routine_arg);
> > +   return 0;
> > +}
> > +
> > +static void *ctrl_thread_start(void *arg)
> > +{
> > +   struct rte_thread_ctrl_params *params = arg;
> > +   void *(*start_routine)(void *) = params->u.ctrl_start_routine;
> > +
> > +   if (ctrl_thread_init(arg) != 0)
> > +   return NULL;
> > +
> > +   return start_routine(params->arg);
> 
> Does this patch open a race on params->arg between the newly created
> thread, and the one that asked for creation?
> Reading params->arg is not under the ctrl_thread_status
> synchronisation point anymore (which was protecting against
> free(params)).
> 
> WDYT?

yeah, this is definitely wrong. i'm preparing a patch now. both
params->ret and params->arg are loaded unsynchronized in multiple
places.

sorry about the mistake. fix coming soon.

> 
> (Note that I doubt this patch has anything to do with the recent bz
> opened 

Re: [PATCH 2/2] ethdev: fix race condition in fast-path ops setup

2023-02-24 Thread fengchengwen
On 2023/2/23 21:31, Konstantin Ananyev wrote:
> 
> 
> If ethdev enqueue or dequeue function is called during
> eth_dev_fp_ops_setup(), it may get pre-empted after setting the
> function pointers, but before setting the pointer to port data.
> In this case the newly registered enqueue/dequeue function will
> use dummy port data and end up in seg fault.
>
> This patch moves the update of each data pointer before the update of
> the corresponding function pointer.
>
> Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate
> structure")
> Cc: sta...@dpdk.org
>>
>> Why is something calling enqueue/dequeue when device is not fully
>> started.
>> A correctly written application would not call rx/tx burst until
>> after ethdev start had finished.
>
> Please refer to eb0d471a894 (ethdev: add proactive error handling
> mode): when the driver recovers itself, the application may still invoke
 the enqueue/dequeue API.

 Right now DPDK ethdev layer *does not* provide synchronization
 mechanisms between data-path and control-path functions.
 That was a deliberate design choice. If we want to change that rule, then I
 suppose we need a community consensus for it.
 I think that if the driver wants to provide some sort of error recovery
 procedure, then it has to provide some synchronization mechanism inside it
 between data-path and control-path functions.
 Actually, looking at eb0d471a894 (ethdev: add proactive error handling
 mode) and the following patches, I wonder how it crept in?
 It seems we just introduced a loophole for a race condition with this
 approach...
>>
>> Could you try to describe the specific scenario of the loophole?
> 
> Ok, as I understand the existing mechanism: 
> 
> When the PMD wants to start a recovery it has to:
>  - invoke rte_eth_dev_callback_process(RTE_ETH_EVENT_ERR_RECOVERING);
>    that is supposed to call the user-provided callback. After the callback
>    finishes, the PMD assumes the user is aware that recovery is about to
>    start and should take some precautions.
>  - when recovery is finished it invokes another callback:
>    RTE_ETH_EVENT_RECOVERY_(SUCCESS/FAILED). After that the user can either
>    continue to use the port or has to treat it as faulty.
> 
> The idea is ok in principle, but there is a problem.
> 
> lib/ethdev/rte_ethdev.h:
>  
>  /** Port recovering from a hardware or firmware error.
>  * If PMD supports proactive error recovery,
>  * it should trigger this event to notify application
>  * that it detected an error and the recovery is being started.
> 
> <<< !
>  * Upon receiving the event, the application should not invoke any 
> control path API
>  * (such as rte_eth_dev_configure/rte_eth_dev_stop...) until receiving
>  * RTE_ETH_EVENT_RECOVERY_SUCCESS or RTE_ETH_EVENT_RECOVERY_FAILED 
> event.
>  * The PMD will set the data path pointers to dummy functions,
>  * and re-set the data path pointers to non-dummy functions
>  * before reporting RTE_ETH_EVENT_RECOVERY_SUCCESS event.
> <<< !
> 
> That part is just wrong I believe.
> It should be:
> Upon receiving the event, the application should not invoke *any control-path
> or data-path* API until receiving the RTE_ETH_EVENT_RECOVERY_SUCCESS or
> RTE_ETH_EVENT_RECOVERY_FAILED event.
> Resetting data path pointers to dummy functions by PMD *before* invoking
> rte_eth_dev_callback_process(RTE_ETH_EVENT_ERR_RECOVERING); 
> introduces a race condition with data-path threads: such a thread could
> already be inside an RX/TX function, or could already have read the RX/TX
> function/data pointers and be about to use them.

Current practice: the PMDs already add some delay after setting the Rx/Tx
callbacks to dummy, and since the DPDK worker threads are busy-polling, the
probability of occurrence in reality is close to zero. But theoretically the
above race condition does exist.
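For illustration, the publication ordering the patch establishes — data pointer first, then the function pointer with release/acquire semantics — can be modelled like this (a self-contained sketch with invented names, not the actual ethdev code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

static int dummy_rx(void *data) { (void)data; return 0; }
static int real_rx(void *data) { return *(int *)data; }

struct fp_ops {
	void *data;                  /* port's queue data */
	int (*_Atomic rx)(void *);   /* fast-path entry point */
};

static struct fp_ops ops = { .data = NULL, .rx = dummy_rx };
static int port_data = 7;

/* Control path: store the data pointer BEFORE the function pointer; the
 * release store on rx makes the data store visible to any thread that
 * acquire-loads rx and sees real_rx. */
void fp_ops_setup(void)
{
	ops.data = &port_data;
	atomic_store_explicit(&ops.rx, real_rx, memory_order_release);
}

/* Fast path: acquire-load the function pointer, then use the data.
 * If rx is still dummy_rx, the (possibly NULL) data is never touched. */
int rx_burst(void)
{
	int (*rx)(void *) = atomic_load_explicit(&ops.rx,
	    memory_order_acquire);
	return rx(ops.data);
}
```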

> And right now rte_ethdev layer doesn't provide any mechanism to check it or 
> wait when they'll finish, etc.

Yes

> 
> So, probably the simplest way to fix it with existing DPDK design:
> - the user-level RTE_ETH_EVENT_ERR_RECOVERING callback should return only
>   after it ensures that *all* application threads (and processes) have
>   stopped using both control- and data-path functions for that port

Agree

>   (yes, it means that an application that wants to use this feature has to
>   provide its own synchronization mechanism around the data-path (RX/TX)
>   functions it is going to use).
> - after that PMD is safe to reset rte_eth_fp_ops[] values to dummy ones.
> 
> And message to all PMD developers:
> *please stop updating rte_eth_fp_ops[] on your own*.
> That's bad practice; things are not supposed to be done that way.
> There is a special API provided for these purposes:
> eth_dev_fp_ops_reset(), eth_dev_fp_ops_setup(), so use it.

This two f