Re: [PATCH v2 11/11] eventdev: RFC clarify docs on event object fields

2024-02-02 Thread Mattias Rönnblom

On 2024-02-01 18:02, Bruce Richardson wrote:

On Wed, Jan 24, 2024 at 12:34:50PM +0100, Mattias Rönnblom wrote:

On 2024-01-19 18:43, Bruce Richardson wrote:

Clarify the meaning of the NEW, FORWARD and RELEASE event types.
For the fields in "rte_event" struct, enhance the comments on each to
clarify the field's use, and whether it is preserved between enqueue and
dequeue, and its role, if any, in scheduling.

Signed-off-by: Bruce Richardson 
---

As with the previous patch, please review this patch to ensure that the
expected semantics of the various event types and event fields have not
changed in an unexpected way.
---
   lib/eventdev/rte_eventdev.h | 105 ++--
   1 file changed, 77 insertions(+), 28 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index cb13602ffb..4eff1c4958 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h




   /**
@@ -1473,53 +1475,100 @@ struct rte_event {
/**< Targeted flow identifier for the enqueue and
 * dequeue operation.
 * The value must be in the range of
-* [0, nb_event_queue_flows - 1] which
+* [0, @ref rte_event_dev_config.nb_event_queue_flows - 1] which


The same comment as I had before about ranges for unsigned types.


Actually, is this correct, does a range actually apply here?

I thought that the number of queue flows supported was a guide as to how
internal HW resources were to be allocated, and that the flow_id was always
a 20-bit value, where it was up to the scheduler to work out how to map
that to internal atomic locks (when combined with queue ids etc.). It
should not be up to the app to have to do the range limiting itself!



Indeed, I also operated under this belief, which is reflected in DSW,
which just takes the flow_id and (pseudo-)hashes and masks it into the
appropriate range.


Leaving it to the app allows app-level attempts to avoid collisions 
between large flows, I guess. Not sure I think apps will (or even 
should) really do this.
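
(As an illustration of the hash+mask approach described above; this is
not code from DSW, and the function name and hash constant are the
editor's. It assumes a power-of-two flow count.)

	static inline uint32_t
	fold_flow_id(uint32_t flow_id, uint32_t nb_flows)
	{
		/* Pseudo-hash to spread clustered flow_ids, then mask the
		 * result into [0, nb_flows - 1]. */
		uint32_t h = flow_id * 0x9E3779B9u; /* golden-ratio multiplier */

		return h & (nb_flows - 1);
	}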


Re: [PATCH v2] app/testpmd: support updating flow rule actions

2024-02-02 Thread Thomas Monjalon
01/02/2024 10:59, Oleksandr Kolomeiets:
> "flow actions_update" updates a flow rule specified by a rule ID with a
> new action list by making a call to "rte_flow_actions_update()":
> 
> flow actions_update {port_id} {rule_id}
> actions {action} [/ {action} [...]] / end [user_id]
> 
> Creating, updating and destroying a flow rule:
> 
> testpmd> flow create 0 group 1 pattern eth / end actions drop / end
> Flow rule #0 created
> testpmd> flow actions_update 0 0 actions queue index 1 / end
> Flow rule #0 updated with new actions
> testpmd> flow destroy 0 rule 0
> Flow rule #0 destroyed

Why not a simple "flow update" command name?




Re: [RFC v3] eal: add bitset type

2024-02-02 Thread Mattias Rönnblom

On 2024-02-01 09:04, Morten Brørup wrote:

From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
Sent: Wednesday, 31 January 2024 19.46

On 2024-01-31 17:06, Stephen Hemminger wrote:

On Wed, 31 Jan 2024 14:13:01 +0100
Mattias Rönnblom  wrote:


[...]


FYI - the Linux kernel has a similar but more complete set of operations.

It might be more efficient to use unsigned long rather than requiring
the elements to be uint64_t. Thinking of the few 32 bit platforms.



Keeping it 64-bit avoids a popcount-related #ifdef. DPDK doesn't have an
equivalent to __builtin_popcountl().

How much do we need to care about 32-bit ISA performance?


At the 2023 DPDK Summit I talked to someone at a very well known network 
equipment vendor using 32 bit CPUs in some of their products; some sort of CPE, 
IIRC. 32 bit CPUs are still out there, and 32-bit CPU support has not been 
deprecated in DPDK.

For the bitset parameter to functions, you could either use "unsigned long*" (as 
suggested by Stephen), or "void*" (followed by type casting inside the functions).

If only using this library for the command line argument parser and similar, 
performance is irrelevant. If we foresee using it in the fast path, e.g. with 
the htimer library, we shouldn't tie its API tightly to 64 bit.



I'm not even sure performance will be that much worse. Sure, two
popcounts instead of one. What is probably worse is older ISAs (32- or
64-bit, e.g. original x86_64) that lack machine instructions for
counting set bits of *any* word size.


That said, the only real concern I have about going "unsigned long" ->
"uint64_t" is that I might feel I need to go fix <rte_bitops.h> first.




I'll go through the below API and some other APIs to see if there's
something obvious missing.

When I originally wrote this code there were a few potential features
where I wasn't sure to what extent they were useful. One example was the
shift operation. Any input is appreciated.


Start off with what you already have. If we need more operations, they can 
always be added later.




Also, what, if any, thread safety guarantees? Or atomics?



Currently, it's all MT unsafe.

An atomic set and get/test would make sense, and maybe other operations
would as well.

Bringing in atomicity into the design makes it much less obvious:

Would the atomic operations imply some memory ordering, or be "relaxed"?
I would lean toward relaxed, but then shouldn't bit-level atomics be
consistent with the core DPDK atomics API? With that in mind, memory
ordering should be user-configurable.

If the code needs to be pure C11 atomics-wise, the words that make up
the bitset must be _Atomic uint64_t. Then you need to be careful, or you
end up with "lock"-prefixed instructions when you manipulate the bitset
words. Just a plain words[N] = 0; gives you a mov+mfence on x86, for
example, plus all the fun of memory_order_seq_cst in terms of preventing
compiler-level optimizations. So you definitely can't have the bitset
always use _Atomic uint64_t, since that would penalize non-shared use
cases. You could have a variant, I guess. Just duplicate the whole
thing, or something with macros.


It seems like MT unsafe suffices for the near term use cases.

We can add an atomic variant of the library later, if the need should arise.



Agreed. The only concern I have here is that you end up wanting to 
change the original design, to better be able to fit atomic bit operations.




With GCC C11 builtins, you can both have the atomic cake and eat it, in
that you can access the same data both non-atomically/normally and in an
atomic manner.


Yep. And we care quite a lot about performance, so we are likely to keep using 
those until the compilers offer similar performance for C11 standard atomics.
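
(An editor's illustration of the point above, not code from the RFC:
with GCC's __atomic builtins the same plain uint64_t word can be
accessed both normally and atomically, whereas declaring it _Atomic
forces atomic semantics on every access. The function names are
hypothetical.)

	#include <stddef.h>
	#include <stdint.h>

	/* Atomically set one bit, relaxed memory ordering. */
	static inline void
	bitset_set_atomic(uint64_t *words, size_t bit)
	{
		__atomic_fetch_or(&words[bit / 64], UINT64_C(1) << (bit % 64),
				  __ATOMIC_RELAXED);
	}

	/* Plain, non-atomic set of the same word: no "lock" prefix, no fence. */
	static inline void
	bitset_set(uint64_t *words, size_t bit)
	{
		words[bit / 64] |= UINT64_C(1) << (bit % 64);
	}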



Re: [dpdk-dev] [PATCH v4] eal: refactor rte_eal_init into sub-functions

2024-02-02 Thread Thomas Monjalon
29/01/2024 08:55, David Marchand:
> On Mon, Jan 29, 2024 at 6:35 AM Rahul Gupta
>  wrote:
> > > Looking at what this patch does.. I am under the impression all you
> > > really need is rte_eal_init without initial probing.
> > > Such behavior can probably be achieved with an allowlist set to a
> > > non-existent device (like for example "-a 0000:00:00.0"), then later, use
> > > device hotplug.
> > The patch will be useful to all the adapters irrespective of their
> > hotplug support.
> 
> I did not say hotplug support is needed.
> If what I described already works, this patch adds nothing.

I agree with David.
Disabling initial probing should provide what you want.
Did you test his proposal?
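
(For illustration only, a minimal sketch of the suggested approach; the
PCI addresses are examples, not from the thread.)

	#include <rte_eal.h>
	#include <rte_dev.h>

	static int
	init_without_probe(void)
	{
		/* Allowlist a non-existent PCI address so nothing is probed. */
		char *argv[] = { "app", "-a", "0000:00:00.0" };
		int ret = rte_eal_init(3, argv);

		if (ret < 0)
			return ret;
		/* Later, attach the real device via hotplug. */
		return rte_eal_hotplug_add("pci", "0000:03:00.0", "");
	}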





Re: [PATCH v2 04/11] eventdev: cleanup doxygen comments on info structure

2024-02-02 Thread Bruce Richardson
On Fri, Feb 02, 2024 at 10:24:54AM +0100, Mattias Rönnblom wrote:
> On 2024-01-31 15:37, Bruce Richardson wrote:
> > On Wed, Jan 24, 2024 at 12:51:03PM +0100, Mattias Rönnblom wrote:
> > > On 2024-01-23 10:43, Bruce Richardson wrote:
> > > > On Tue, Jan 23, 2024 at 10:35:02AM +0100, Mattias Rönnblom wrote:
> > > > > On 2024-01-19 18:43, Bruce Richardson wrote:
> > > > > > Some small rewording changes to the doxygen comments on struct
> > > > > > rte_event_dev_info.
> > > > > > 
> > > > > > Signed-off-by: Bruce Richardson 
> > > > > > ---
> > > > > > lib/eventdev/rte_eventdev.h | 46 
> > > > > > -
> > > > > > 1 file changed, 25 insertions(+), 21 deletions(-)
> > > > > > 
> > > > > > diff --git a/lib/eventdev/rte_eventdev.h 
> > > > > > b/lib/eventdev/rte_eventdev.h
> > > > > > index 57a2791946..872f241df2 100644
> > > > > > --- a/lib/eventdev/rte_eventdev.h
> > > > > > +++ b/lib/eventdev/rte_eventdev.h
> > > > > > @@ -482,54 +482,58 @@ struct rte_event_dev_info {
> > > > > > const char *driver_name;/**< Event driver name */
> > > > > > struct rte_device *dev; /**< Device information */
> > > > > > uint32_t min_dequeue_timeout_ns;
> > > > > > -   /**< Minimum supported global dequeue timeout(ns) by this device */
> > > > > > +   /**< Minimum global dequeue timeout(ns) supported by this device */
> > > > > 
> > > > > Are we missing a bunch of "." here and in the other fields?
> > > > > 
> > > > > > uint32_t max_dequeue_timeout_ns;
> > > > > > -   /**< Maximum supported global dequeue timeout(ns) by this device */
> > > > > > +   /**< Maximum global dequeue timeout(ns) supported by this device */
> > > > > > uint32_t dequeue_timeout_ns;
> > > > > > /**< Configured global dequeue timeout(ns) for this device */
> > > > > > uint8_t max_event_queues;
> > > > > > -   /**< Maximum event_queues supported by this device */
> > > > > > +   /**< Maximum event queues supported by this device */
> > > > > > uint32_t max_event_queue_flows;
> > > > > > -   /**< Maximum supported flows in an event queue by this device*/
> > > > > > +   /**< Maximum number of flows within an event queue supported by this device*/
> > > > > > uint8_t max_event_queue_priority_levels;
> > > > > > /**< Maximum number of event queue priority levels by this device.
> > > > > > -* Valid when the device has RTE_EVENT_DEV_CAP_QUEUE_QOS capability
> > > > > > +* Valid when the device has RTE_EVENT_DEV_CAP_QUEUE_QOS capability.
> > > > > > +* The priority levels are evenly distributed between
> > > > > > +* @ref RTE_EVENT_DEV_PRIORITY_HIGHEST and @ref RTE_EVENT_DEV_PRIORITY_LOWEST.
> > > > > 
> > > > > This is a change of the API, in the sense it's defining something
> > > > > previously left undefined?
> > > > > 
> > > > 
> > > > Well, undefined is pretty useless for app writers, no?
> > > > However, agreed that the range between HIGHEST and LOWEST is an assumption
> > > > on my part, chosen because it matches what happens to the event priorities
> > > > which are documented in event struct as "The implementation shall normalize
> > > > the requested priority to supported priority value" - which, while better
> > > > than nothing, does technically leave the details of how normalization
> > > > occurs up to the implementation.
> > > > 
> > > > > If you need 6 different priority levels in an app, how do you go about
> > > > > making sure you find the correct (distinct) Eventdev levels on any event
> > > > > device supporting >= 6 levels?
> > > > > 
> > > > > #define NUM_MY_LEVELS 6
> > > > > 
> > > > > #define MY_LEVEL_TO_EVENTDEV_LEVEL(my_level) \
> > > > >     (((my_level) * (RTE_EVENT_DEV_PRIORITY_HIGHEST - RTE_EVENT_DEV_PRIORITY_LOWEST)) / NUM_MY_LEVELS)
> > > > > 
> > > > > This way? One would worry a bit exactly what "evenly" means, in terms of
> > > > > rounding errors. If you have an event device with 255 priority levels of
> > > > > (say) 256 levels available in the API, which two levels are the same
> > > > > priority?
> > > > 
> > > > Yes, rounding etc. will be an issue in cases of non-powers-of-2.
> > > > However, I think we do need to clarify this behaviour, so I'm open to
> > > > alternative suggestions as to how to update this.
> > > > 
> > > 
> > > In retrospect, maybe it would have been better to just express the number of
> > > priority levels an event device supported, only allow [0, max_levels - 1] in
> > > the prio field, and leave it to the app to do the conversion/normalization.
> > > 
> > 
> > Yes, in many ways that would be better.
> > > Maybe a new  helper macro would at least suggest to the PMD

Re: [PATCH v2 11/11] eventdev: RFC clarify docs on event object fields

2024-02-02 Thread Bruce Richardson
On Fri, Feb 02, 2024 at 10:45:34AM +0100, Mattias Rönnblom wrote:
> On 2024-02-01 18:02, Bruce Richardson wrote:
> > On Wed, Jan 24, 2024 at 12:34:50PM +0100, Mattias Rönnblom wrote:
> > > On 2024-01-19 18:43, Bruce Richardson wrote:
> > > > Clarify the meaning of the NEW, FORWARD and RELEASE event types.
> > > > For the fields in "rte_event" struct, enhance the comments on each to
> > > > clarify the field's use, and whether it is preserved between enqueue and
> > > > dequeue, and its role, if any, in scheduling.
> > > > 
> > > > Signed-off-by: Bruce Richardson 
> > > > ---
> > > > 
> > > > As with the previous patch, please review this patch to ensure that the
> > > > expected semantics of the various event types and event fields have not
> > > > changed in an unexpected way.
> > > > ---
> > > >    lib/eventdev/rte_eventdev.h | 105 ++--
> > > >1 file changed, 77 insertions(+), 28 deletions(-)
> > > > 
> > > > diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> > > > index cb13602ffb..4eff1c4958 100644
> > > > --- a/lib/eventdev/rte_eventdev.h
> > > > +++ b/lib/eventdev/rte_eventdev.h
> > 
> > 
> > > >/**
> > > > @@ -1473,53 +1475,100 @@ struct rte_event {
> > > > /**< Targeted flow identifier for the enqueue and
> > > >  * dequeue operation.
> > > >  * The value must be in the range of
> > > > -* [0, nb_event_queue_flows - 1] which
> > > > +* [0, @ref rte_event_dev_config.nb_event_queue_flows - 1] which
> > > 
> > > The same comment as I had before about ranges for unsigned types.
> > > 
> > Actually, is this correct, does a range actually apply here?
> > 
> > I thought that the number of queue flows supported was a guide as to how
> > internal HW resources were to be allocated, and that the flow_id was always
> > a 20-bit value, where it was up to the scheduler to work out how to map
> > that to internal atomic locks (when combined with queue ids etc.). It
> > should not be up to the app to have to do the range limiting itself!
> > 
> 
> Indeed, I also operated under this belief, which is reflected in DSW, which
> just takes the flow_id and (pseudo-)hashes and masks it into the appropriate range.
> 
> Leaving it to the app allows app-level attempts to avoid collisions between
> large flows, I guess. Not sure I think apps will (or even should) really do
> this.

I'm just going to drop this restriction from v3.


Re: [PATCH] net/iavf: fix access to null value

2024-02-02 Thread Bruce Richardson
On Wed, Jan 24, 2024 at 02:05:55AM +, Mingjin Ye wrote:
> The "vsi" may be null, so it needs to be used after checking.
> 
> Fixes: ab28aad9c24f ("net/iavf: fix Rx Tx burst in multi-process")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Mingjin Ye 
> ---
>  drivers/net/iavf/iavf_rxtx.c | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
This looks safer with the checks this way.

Acked-by: Bruce Richardson 


Re: [PATCH v3 2/2] net/octeon_ep: add Rx NEON routine

2024-02-02 Thread Jerin Jacob
On Fri, Feb 2, 2024 at 2:54 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> Add Rx ARM NEON SIMD routine.
>
> Signed-off-by: Pavan Nikhilesh 
> ---
>  drivers/net/octeon_ep/cnxk_ep_rx_neon.c | 148 
>  drivers/net/octeon_ep/meson.build   |   6 +-
>  drivers/net/octeon_ep/otx_ep_ethdev.c   |   5 +-
>  drivers/net/octeon_ep/otx_ep_rxtx.h |   6 +
>  4 files changed, 163 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/net/octeon_ep/cnxk_ep_rx_neon.c

Please fix


### [PATCH] net/octeon_ep: add Rx NEON routine

CHECK:BRACES: Blank lines aren't necessary after an open brace '{'
#44: FILE: drivers/net/octeon_ep/cnxk_ep_rx_neon.c:29:
+   while (pkts < new_pkts) {
+

total: 0 errors, 0 warnings, 1 checks, 189 lines checked


[PATCH] cnxk: fix representor stats

2024-02-02 Thread Amit Prakash Shukla
Representor stats were not matching the PF/VF stats as seen
in the Linux kernel. This patch fixes that.

Depends-on: series-30966 ("support for port representors")

Signed-off-by: Amit Prakash Shukla 
---
 drivers/common/cnxk/roc_mbox.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/common/cnxk/roc_mbox.h b/drivers/common/cnxk/roc_mbox.h
index 39c1132792..634694ad79 100644
--- a/drivers/common/cnxk/roc_mbox.h
+++ b/drivers/common/cnxk/roc_mbox.h
@@ -1856,6 +1856,7 @@ struct nix_get_lf_stats_req {
 
 struct nix_lf_stats_rsp {
struct mbox_msghdr hdr;
+   uint16_t __io pcifunc;
struct {
uint64_t __io octs;
uint64_t __io ucast;
-- 
2.34.1



Re: [PATCH v2 03/11] eventdev: update documentation on device capability flags

2024-02-02 Thread Bruce Richardson
On Fri, Feb 02, 2024 at 09:58:25AM +0100, Mattias Rönnblom wrote:
> On 2024-01-31 15:09, Bruce Richardson wrote:
> > On Tue, Jan 23, 2024 at 10:18:53AM +0100, Mattias Rönnblom wrote:
> > > On 2024-01-19 18:43, Bruce Richardson wrote:
> > > > Update the device capability docs, to:
> > > > 
> > > > * include more cross-references
> > > > * split longer text into paragraphs, in most cases with each flag having
> > > > a single-line summary at the start of the doc block
> > > > * general comment rewording and clarification as appropriate
> > > > 
> > > > Signed-off-by: Bruce Richardson 
> > > > ---
> > > >lib/eventdev/rte_eventdev.h | 130 
> > > > ++--
> > > >1 file changed, 93 insertions(+), 37 deletions(-)
> > > > 
> > 
> > > > * If this capability is not set, the queue only supports events of 
> > > > the
> > > > - *  *RTE_SCHED_TYPE_* type that it was created with.
> > > > + * *RTE_SCHED_TYPE_* type that it was created with.
> > > > + * Any events of other types scheduled to the queue will be handled in an
> > > > + * implementation-dependent manner. They may be dropped by the
> > > > + * event device, or enqueued with the scheduling type adjusted to the
> > > > + * correct/supported value.
> > > 
> > > Having the application set sched_type when it was already set at the
> > > level of the queue never made sense to me.
> > > 
> > > I can't see any reasons why this field shouldn't be ignored by the event
> > > device on non-RTE_EVENT_QUEUE_CFG_ALL_TYPES queues.
> > > 
> > > If the behavior is indeed undefined, I think it's better to just say
> > > "undefined" rather than the above speculation.
> > > 
> > 
> > Updating in v3 to just say it's undefined.
> > 
> > > > *
> > > > - * @see RTE_SCHED_TYPE_* values
> > 
> > > >#define RTE_EVENT_DEV_CAP_RUNTIME_QUEUE_ATTR (1ULL << 11)
> > > >    /**< Event device is capable of changing the queue attributes at runtime i.e
> > > > - * after rte_event_queue_setup() or rte_event_start() call sequence. If this
> > > > - * flag is not set, eventdev queue attributes can only be configured during
> > > > + * after rte_event_queue_setup() or rte_event_dev_start() call sequence.
> > > > + *
> > > > + * If this flag is not set, eventdev queue attributes can only be configured during
> > > > * rte_event_queue_setup().
> > > 
> > > "event queue" or just "queue".
> > > 
> > Ack.
> > 
> > > > + *
> > > > + * @see rte_event_queue_setup
> > > > */
> > > >#define RTE_EVENT_DEV_CAP_PROFILE_LINK (1ULL << 12)
> > > > -/**< Event device is capable of supporting multiple link profiles per event port
> > > > - * i.e., the value of `rte_event_dev_info::max_profiles_per_port` is greater
> > > > - * than one.
> > > > +/**< Event device is capable of supporting multiple link profiles per event port.
> > > > + *
> > > > + *
> > > > + * When set, the value of `rte_event_dev_info::max_profiles_per_port` is greater
> > > > + * than one, and multiple profiles may be configured and then switched at runtime.
> > > > + * If not set, only a single profile may be configured, which may itself be
> > > > + * runtime adjustable (if @ref RTE_EVENT_DEV_CAP_RUNTIME_PORT_LINK is set).
> > > > + *
> > > > + * @see rte_event_port_profile_links_set rte_event_port_profile_links_get
> > > > + * @see rte_event_port_profile_switch
> > > > + * @see RTE_EVENT_DEV_CAP_RUNTIME_PORT_LINK
> > > > */
> > > >/* Event device priority levels */
> > > >#define RTE_EVENT_DEV_PRIORITY_HIGHEST   0
> > > > -/**< Highest priority expressed across eventdev subsystem
> > > > +/**< Highest priority expressed across eventdev subsystem.
> > > 
> > > "The highest priority an event device may support."
> > > or
> > > "The highest priority any event device may support."
> > > 
> > > Maybe this is a further improvement, beyond punctuation? "across eventdev
> > > subsystem" sounds awkward.
> > > 
> > 
> > Still not very clear. Talking about device support implies that it's
> > possible some devices may not support it. How about:
> > "highest priority level for events and queues".
> > 
> 
> Sounds good. I guess it's totally, 100% obvious highest means most urgent?
> 
> Otherwise, "highest (i.e., most urgent) priority level for events queues"

I think it's clear enough that highest priority is most urgent.


[PATCH v4 1/2] net/octeon_ep: improve Rx performance

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Use mempool API instead of pktmbuf alloc to avoid mbuf reset
as it will be done by rearm on receive.
Reorder refill to avoid unnecessary write commits on mbuf data.

Signed-off-by: Pavan Nikhilesh 
---
 v2 Changes:
 - Fix compilation with distro gcc.
 v3 Changes:
 - Fix aarch32 compilation.
 v4 Changes:
 - Fix checkpatch.

 drivers/net/octeon_ep/cnxk_ep_rx.c |  4 +--
 drivers/net/octeon_ep/cnxk_ep_rx.h | 13 ++---
 drivers/net/octeon_ep/cnxk_ep_rx_avx.c | 20 +++---
 drivers/net/octeon_ep/cnxk_ep_rx_sse.c | 38 ++
 drivers/net/octeon_ep/otx_ep_rxtx.h|  2 +-
 5 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.c b/drivers/net/octeon_ep/cnxk_ep_rx.c
index f3e4fb27d1..7465e0a017 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.c
@@ -76,12 +76,12 @@ cnxk_ep_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
uint16_t new_pkts;

new_pkts = cnxk_ep_rx_pkts_to_process(droq, nb_pkts);
-   cnxk_ep_process_pkts_scalar(rx_pkts, droq, new_pkts);
-
/* Refill RX buffers */
if (droq->refill_count >= DROQ_REFILL_THRESHOLD)
cnxk_ep_rx_refill(droq);

+   cnxk_ep_process_pkts_scalar(rx_pkts, droq, new_pkts);
+
return new_pkts;
 }

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.h b/drivers/net/octeon_ep/cnxk_ep_rx.h
index e71fc0de5c..61263e651e 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.h
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.h
@@ -21,13 +21,16 @@ cnxk_ep_rx_refill_mbuf(struct otx_ep_droq *droq, uint32_t count)
uint32_t i;
int rc;

-	rc = rte_pktmbuf_alloc_bulk(droq->mpool, &recv_buf_list[refill_idx], count);
+	rc = rte_mempool_get_bulk(droq->mpool, (void **)&recv_buf_list[refill_idx], count);
	if (unlikely(rc)) {
		droq->stats.rx_alloc_failure++;
		return rc;
	}

	for (i = 0; i < count; i++) {
+		rte_prefetch_non_temporal(&desc_ring[(refill_idx + 1) & 3]);
+		if (i < count - 1)
+			rte_prefetch_non_temporal(recv_buf_list[refill_idx + 1]);
		buf = recv_buf_list[refill_idx];
		desc_ring[refill_idx].buffer_ptr = rte_mbuf_data_iova_default(buf);
		refill_idx++;
@@ -42,9 +45,9 @@ cnxk_ep_rx_refill_mbuf(struct otx_ep_droq *droq, uint32_t count)
 static inline void
 cnxk_ep_rx_refill(struct otx_ep_droq *droq)
 {
-   uint32_t desc_refilled = 0, count;
-   uint32_t nb_desc = droq->nb_desc;
+   const uint32_t nb_desc = droq->nb_desc;
uint32_t refill_idx = droq->refill_idx;
+   uint32_t desc_refilled = 0, count;
int rc;

if (unlikely(droq->read_idx == refill_idx))
@@ -128,6 +131,8 @@ cnxk_ep_rx_pkts_to_process(struct otx_ep_droq *droq, uint16_t nb_pkts)
	return RTE_MIN(nb_pkts, droq->pkts_pending);
 }

+#define cnxk_pktmbuf_mtod(m, t) ((t)(void *)((char *)(m)->buf_addr + RTE_PKTMBUF_HEADROOM))
+
 static __rte_always_inline void
 cnxk_ep_process_pkts_scalar(struct rte_mbuf **rx_pkts, struct otx_ep_droq *droq, uint16_t new_pkts)
 {
@@ -147,7 +152,7 @@ cnxk_ep_process_pkts_scalar(struct rte_mbuf **rx_pkts, struct otx_ep_droq *droq,
		  void *));

mbuf = recv_buf_list[read_idx];
-   info = rte_pktmbuf_mtod(mbuf, struct otx_ep_droq_info *);
+   info = cnxk_pktmbuf_mtod(mbuf, struct otx_ep_droq_info *);
read_idx = otx_ep_incr_index(read_idx, 1, nb_desc);
pkt_len = rte_bswap16(info->length >> 48);
mbuf->pkt_len = pkt_len;
diff --git a/drivers/net/octeon_ep/cnxk_ep_rx_avx.c b/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
index ae4615e6da..47eb1d2ef7 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
@@ -49,7 +49,7 @@ cnxk_ep_process_pkts_vec_avx(struct rte_mbuf **rx_pkts, struct otx_ep_droq *droq
/* Load rearm data and packet length for shuffle. */
for (i = 0; i < CNXK_EP_OQ_DESC_PER_LOOP_AVX; i++)
data[i] = _mm256_set_epi64x(0,
-				rte_pktmbuf_mtod(m[i], struct otx_ep_droq_info *)->length >> 16,
+				cnxk_pktmbuf_mtod(m[i], struct otx_ep_droq_info *)->length >> 16,
				0, rearm_data);

	/* Shuffle data to its place and sum the packet length. */
@@ -81,15 +81,15 @@ cnxk_ep_recv_pkts_avx(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkt
struct otx_ep_droq *droq = (struct otx_ep_droq *)rx_queue;
uint16_t new_pkts, vpkts;

+   /* Refill RX buffers */
+   if (droq->refill_count >= DROQ_REFILL_THRESHOLD)
+   cnxk_ep_rx_refill(droq);
+
new_pkts = cnxk_ep_rx_pkts_to_process(droq, nb_pkts);
vpkts = RTE_ALIGN_FLOOR(new_pkts, CNXK_E

[PATCH v4 2/2] net/octeon_ep: add Rx NEON routine

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Add Rx ARM NEON SIMD routine.

Signed-off-by: Pavan Nikhilesh 
---
 drivers/net/octeon_ep/cnxk_ep_rx_neon.c | 147 
 drivers/net/octeon_ep/meson.build   |   6 +-
 drivers/net/octeon_ep/otx_ep_ethdev.c   |   5 +-
 drivers/net/octeon_ep/otx_ep_rxtx.h |   6 +
 4 files changed, 162 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/octeon_ep/cnxk_ep_rx_neon.c

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx_neon.c b/drivers/net/octeon_ep/cnxk_ep_rx_neon.c
new file mode 100644
index 00..4c46a7ea08
--- /dev/null
+++ b/drivers/net/octeon_ep/cnxk_ep_rx_neon.c
@@ -0,0 +1,147 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Marvell.
+ */
+
+#include "cnxk_ep_rx.h"
+
+static __rte_always_inline void
+cnxk_ep_process_pkts_vec_neon(struct rte_mbuf **rx_pkts, struct otx_ep_droq *droq,
+			      uint16_t new_pkts)
+{
+   const uint8x16_t mask0 = {0, 1, 0xff, 0xff, 0, 1, 0xff, 0xff,
+ 4, 5, 0xff, 0xff, 4, 5, 0xff, 0xff};
+   const uint8x16_t mask1 = {8,  9,  0xff, 0xff, 8,  9,  0xff, 0xff,
+ 12, 13, 0xff, 0xff, 12, 13, 0xff, 0xff};
+   struct rte_mbuf **recv_buf_list = droq->recv_buf_list;
+   uint32_t pidx0, pidx1, pidx2, pidx3;
+   struct rte_mbuf *m0, *m1, *m2, *m3;
+   uint32_t read_idx = droq->read_idx;
+   uint16_t nb_desc = droq->nb_desc;
+   uint32_t idx0, idx1, idx2, idx3;
+   uint64x2_t s01, s23;
+   uint32x4_t bytes;
+   uint16_t pkts = 0;
+
+   idx0 = read_idx;
+   s01 = vdupq_n_u64(0);
+   bytes = vdupq_n_u32(0);
+   while (pkts < new_pkts) {
+   idx1 = otx_ep_incr_index(idx0, 1, nb_desc);
+   idx2 = otx_ep_incr_index(idx1, 1, nb_desc);
+   idx3 = otx_ep_incr_index(idx2, 1, nb_desc);
+
+   if (new_pkts - pkts > 4) {
+   pidx0 = otx_ep_incr_index(idx3, 1, nb_desc);
+   pidx1 = otx_ep_incr_index(pidx0, 1, nb_desc);
+   pidx2 = otx_ep_incr_index(pidx1, 1, nb_desc);
+   pidx3 = otx_ep_incr_index(pidx2, 1, nb_desc);
+
+			rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx0], void *));
+			rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx1], void *));
+			rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx2], void *));
+			rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx3], void *));
+   }
+
+   m0 = recv_buf_list[idx0];
+   m1 = recv_buf_list[idx1];
+   m2 = recv_buf_list[idx2];
+   m3 = recv_buf_list[idx3];
+
+   /* Load packet size big-endian. */
+		s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m0, struct otx_ep_droq_info *)->length >> 48,
+				     s01, 0);
+		s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m1, struct otx_ep_droq_info *)->length >> 48,
+				     s01, 1);
+		s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m2, struct otx_ep_droq_info *)->length >> 48,
+				     s01, 2);
+		s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m3, struct otx_ep_droq_info *)->length >> 48,
+				     s01, 3);
+   /* Convert to little-endian. */
+   s01 = vrev16q_u8(s01);
+
+   /* Vertical add, consolidate outside the loop. */
+   bytes += vaddq_u32(bytes, s01);
+   /* Segregate to packet length and data length. */
+   s23 = vqtbl1q_u8(s01, mask1);
+   s01 = vqtbl1q_u8(s01, mask0);
+
+   /* Store packet length and data length to mbuf. */
+   *(uint64_t *)&m0->pkt_len = vgetq_lane_u64(s01, 0);
+   *(uint64_t *)&m1->pkt_len = vgetq_lane_u64(s01, 1);
+   *(uint64_t *)&m2->pkt_len = vgetq_lane_u64(s23, 0);
+   *(uint64_t *)&m3->pkt_len = vgetq_lane_u64(s23, 1);
+
+   /* Reset rearm data. */
+   *(uint64_t *)&m0->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m1->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m2->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m3->rearm_data = droq->rearm_data;
+
+   rx_pkts[pkts++] = m0;
+   rx_pkts[pkts++] = m1;
+   rx_pkts[pkts++] = m2;
+   rx_pkts[pkts++] = m3;
+   idx0 = otx_ep_incr_index(idx3, 1, nb_desc);
+   }
+   droq->read_idx = idx0;
+
+   droq->refill_count += new_pkts;
+   droq->pkts_pending -= new_pkts;
+   /* Stats */
+   droq->stats.pkts_received += new_pkts;
+#if defined(RTE_ARCH_32)
+   droq->stats.bytes_received += vgetq_lane_u32(bytes, 0);
+   droq->stats.bytes_received += vgetq_lane_u32(bytes

Re: [PATCH v2 11/11] eventdev: RFC clarify docs on event object fields

2024-02-02 Thread Bruce Richardson
On Fri, Feb 02, 2024 at 10:38:10AM +0100, Mattias Rönnblom wrote:
> On 2024-02-01 17:59, Bruce Richardson wrote:
> > On Wed, Jan 24, 2024 at 12:34:50PM +0100, Mattias Rönnblom wrote:
> > > On 2024-01-19 18:43, Bruce Richardson wrote:
> > > > Clarify the meaning of the NEW, FORWARD and RELEASE event types.
> > > > For the fields in "rte_event" struct, enhance the comments on each to
> > > > clarify the field's use, and whether it is preserved between enqueue and
> > > > > dequeue, and its role, if any, in scheduling.
> > > > 
> > > > Signed-off-by: Bruce Richardson 
> > > > ---
> > > > 

> > > Is it the normalized or unnormalized value that is preserved?
> > > 
> > Very good point. It's the normalized & then denormalized version that is
> > guaranteed to be preserved, I suspect. SW eventdevs probably preserve
> > as-is, but HW eventdevs may lose precision. Rather than making this
> > "implementation defined" or "not preserved" which would be annoying for
> > apps, I think, I'm going to document this as "preserved, but with possible
> > loss of precision".
> > 
> 
> This makes me again think it may be worth noting that Eventdev -> API
> priority normalization is (event.priority * PMD_LEVELS) / EVENTDEV_LEVELS
> (rounded down) - assuming that's how it's supposed to be done - or something
> to that effect.
> 
Following my comment on the thread on the other patch about looking at
numbers of bits of priority being valid, I did a quick check of the evdev PMDs
by using grep for "max_event_priority_levels" in each driver. According to
that (and resolving some #defines), I see:

0 - dpaa, dpaa2
1 - cnxk, dsw, octeontx, opdl
4 - sw
8 - dlb2, skeleton

So it looks like switching to a bit-scheme is workable, where we measure
supported event levels in powers-of-two only. [And we can cut down priority
fields if we like].

Question for confirmation. For cases where the eventdev does not support
per-event prioritization, I suppose we should say that the priority field
is not preserved, as well as being ignored?

/Bruce
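
(Editor's illustration, not from the thread: the normalization formula
quoted above, where 256 is the number of levels representable in the
8-bit eventdev priority field and pmd_levels is what the PMD advertises.
The function name is hypothetical.)

	static inline uint8_t
	normalize_prio(uint8_t ev_prio, unsigned int pmd_levels)
	{
		/* (event.priority * PMD_LEVELS) / EVENTDEV_LEVELS, rounded down */
		return (uint8_t)(((unsigned int)ev_prio * pmd_levels) / 256);
	}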


[PATCH 1/5] net/mlx5/hws: add support for resizable matchers

2024-02-02 Thread Gregory Etelson
From: Yevgeny Kliteynik 

Add support for matcher resize with the following new API calls:
 - mlx5dr_matcher_resize_set_target
 - mlx5dr_matcher_resize_rule_move

The first function links two matchers and allows moving rules from src
matcher to dst matcher. Both matchers should have the same characteristics
(e.g. same mt, same at). It is the user's responsibility to make sure that
the dst matcher has enough space for the moved rules.
After this function, the user can move rules from the src into the dst matcher,
and is no longer allowed to insert rules to the src matcher.

The second function is used to move the rule from matcher that is being
resized to a bigger matcher. Moving a single rule includes creating a new
rule in the destination matcher, and deleting the rule from the source
matcher. This operation creates a single completion.
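
(A hypothetical usage sketch of the flow described above; error handling
and completion polling are elided, and the variable names are the
editor's, not part of the patch.)

	/* Link the matchers; inserting into src is no longer allowed. */
	if (mlx5dr_matcher_resize_set_target(src_matcher, dst_matcher))
		return -1;

	/* Move one rule; each move generates a single completion. */
	if (mlx5dr_matcher_resize_rule_move(src_matcher, rule, &rule_attr))
		return -1;
	/* ...then poll for the completion before reusing the rule handle. */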

Signed-off-by: Yevgeny Kliteynik 
---
 drivers/net/mlx5/hws/mlx5dr.h |  39 +
 drivers/net/mlx5/hws/mlx5dr_definer.c |   5 +-
 drivers/net/mlx5/hws/mlx5dr_definer.h |   3 +
 drivers/net/mlx5/hws/mlx5dr_matcher.c | 181 +++-
 drivers/net/mlx5/hws/mlx5dr_matcher.h |  21 +++
 drivers/net/mlx5/hws/mlx5dr_rule.c| 229 --
 drivers/net/mlx5/hws/mlx5dr_rule.h|  34 +++-
 drivers/net/mlx5/hws/mlx5dr_send.c|  45 +
 drivers/net/mlx5/mlx5_flow.h  |   2 +
 9 files changed, 537 insertions(+), 22 deletions(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr.h b/drivers/net/mlx5/hws/mlx5dr.h
index d88f73ab57..9d8f8e13dc 100644
--- a/drivers/net/mlx5/hws/mlx5dr.h
+++ b/drivers/net/mlx5/hws/mlx5dr.h
@@ -139,6 +139,8 @@ struct mlx5dr_matcher_attr {
/* Define the insertion and distribution modes for this matcher */
enum mlx5dr_matcher_insert_mode insert_mode;
enum mlx5dr_matcher_distribute_mode distribute_mode;
+	/* Define whether the created matcher supports resizing into a bigger matcher */
+   bool resizable;
union {
struct {
uint8_t sz_row_log;
@@ -419,6 +421,43 @@ int mlx5dr_matcher_destroy(struct mlx5dr_matcher *matcher);
 int mlx5dr_matcher_attach_at(struct mlx5dr_matcher *matcher,
 struct mlx5dr_action_template *at);
 
+/* Link two matchers and enable moving rules from src matcher to dst matcher.
+ * Both matchers must be in the same table type, must be created with 'resizable'
+ * property, and should have the same characteristics (e.g. same mt, same at).
+ *
+ * It is the user's responsibility to make sure that the dst matcher
+ * was allocated with the appropriate size.
+ *
+ * Once the function is completed, the user is:
+ *  - allowed to move rules from src into dst matcher
+ *  - no longer allowed to insert rules to the src matcher
+ *
+ * The user is always allowed to insert rules to the dst matcher and
+ * to delete rules from any matcher.
+ *
+ * @param[in] src_matcher
+ * source matcher for moving rules from
+ * @param[in] dst_matcher
+ * destination matcher for moving rules to
+ * @return zero on successful move, non zero otherwise.
+ */
+int mlx5dr_matcher_resize_set_target(struct mlx5dr_matcher *src_matcher,
+struct mlx5dr_matcher *dst_matcher);
+
+/* Enqueue moving rule operation: moving rule from src matcher to a dst matcher
+ *
+ * @param[in] src_matcher
+ * matcher that the rule belongs to
+ * @param[in] rule
+ * the rule to move
+ * @param[in] attr
+ * rule attributes
+ * @return zero on success, non zero otherwise.
+ */
+int mlx5dr_matcher_resize_rule_move(struct mlx5dr_matcher *src_matcher,
+   struct mlx5dr_rule *rule,
+   struct mlx5dr_rule_attr *attr);
+
 /* Get the size of the rule handle (mlx5dr_rule) to be used on rule creation.
  *
  * @return size in bytes of rule handle struct.
diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c b/drivers/net/mlx5/hws/mlx5dr_definer.c
index 0b60479406..6703c233bb 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.c
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
@@ -2919,9 +2919,8 @@ int mlx5dr_definer_get_id(struct mlx5dr_definer *definer)
return definer->obj->id;
 }
 
-static int
-mlx5dr_definer_compare(struct mlx5dr_definer *definer_a,
-  struct mlx5dr_definer *definer_b)
+int mlx5dr_definer_compare(struct mlx5dr_definer *definer_a,
+  struct mlx5dr_definer *definer_b)
 {
int i;
 
diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.h b/drivers/net/mlx5/hws/mlx5dr_definer.h
index 6f1c99e37a..9c3db53ff3 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.h
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.h
@@ -673,4 +673,7 @@ int mlx5dr_definer_init_cache(struct mlx5dr_definer_cache **cache);
 
 void mlx5dr_definer_uninit_cache(struct mlx5dr_definer_cache *cache);
 
+int mlx5dr_definer_compare(struct mlx5dr_definer *definer_a,
+  struct mlx5dr_definer *definer_b);
+
 #

[PATCH 2/5] net/mlx5: add resize function to ipool

2024-02-02 Thread Gregory Etelson
From: Maayan Kashani 

Before this patch, the ipool size could be fixed by setting max_idx in
mlx5_indexed_pool_config upon ipool creation, or it could be auto-resized
to the maximum limit by setting max_idx to zero upon ipool creation, in
which case the saved value is the maximum index possible.
This patch adds an ipool_resize API that enables updating the value of
max_idx in case it is not set to the maximum, meaning not in auto-resize
mode. It enables the allocation of new trunks, when using malloc/zmalloc,
up to the max_idx limit. Note that the number of entries to resize by
should be divisible by trunk_size.
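
(A hypothetical usage sketch based on the description above; the variable
names are the editor's, and the amount added must be a multiple of the
pool's configured trunk_size.)

	/* Grow the pool by two trunks' worth of entries. */
	uint32_t grow = 2 * pool->cfg.trunk_size;

	if (mlx5_ipool_resize(pool, grow) != 0)
		DRV_LOG(ERR, "ipool resize rejected");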

Signed-off-by: Maayan Kashani 
---
 drivers/net/mlx5/mlx5_utils.c | 29 +
 drivers/net/mlx5/mlx5_utils.h | 16 
 2 files changed, 45 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_utils.c b/drivers/net/mlx5/mlx5_utils.c
index 4db738785f..e28db2ec43 100644
--- a/drivers/net/mlx5/mlx5_utils.c
+++ b/drivers/net/mlx5/mlx5_utils.c
@@ -809,6 +809,35 @@ mlx5_ipool_get_next(struct mlx5_indexed_pool *pool, 
uint32_t *pos)
return NULL;
 }
 
+int
+mlx5_ipool_resize(struct mlx5_indexed_pool *pool, uint32_t num_entries)
+{
+   uint32_t cur_max_idx;
+   uint32_t max_index = mlx5_trunk_idx_offset_get(pool, TRUNK_MAX_IDX + 1);
+
+   if (num_entries % pool->cfg.trunk_size) {
+   DRV_LOG(ERR, "num_entries param should be trunk_size(=%u) 
multiplication\n",
+   pool->cfg.trunk_size);
+   return -EINVAL;
+   }
+
+   mlx5_ipool_lock(pool);
+   cur_max_idx = pool->cfg.max_idx + num_entries;
+   /* If the ipool max idx is above maximum or uint overflow occurred. */
+   if (cur_max_idx > max_index || cur_max_idx < num_entries) {
+   DRV_LOG(ERR, "Ipool resize failed\n");
+   DRV_LOG(ERR, "Adding %u entries to existing %u entries, will 
cross max limit(=%u)\n",
+   num_entries, cur_max_idx, max_index);
+   mlx5_ipool_unlock(pool);
+   return -EINVAL;
+   }
+
+   /* Update maximum entries number. */
+   pool->cfg.max_idx = cur_max_idx;
+   mlx5_ipool_unlock(pool);
+   return 0;
+}
+
 void
 mlx5_ipool_dump(struct mlx5_indexed_pool *pool)
 {
diff --git a/drivers/net/mlx5/mlx5_utils.h b/drivers/net/mlx5/mlx5_utils.h
index 82e8298781..f3c0d76a6d 100644
--- a/drivers/net/mlx5/mlx5_utils.h
+++ b/drivers/net/mlx5/mlx5_utils.h
@@ -427,6 +427,22 @@ void mlx5_ipool_flush_cache(struct mlx5_indexed_pool *pool);
  */
 void *mlx5_ipool_get_next(struct mlx5_indexed_pool *pool, uint32_t *pos);
 
+/**
+ * This function resizes the ipool.
+ *
+ * @param pool
+ *   Pointer to the index memory pool handler.
+ * @param num_entries
+ *   Number of entries to be added to the pool.
+ *   This number should be divisible by trunk_size.
+ *
+ * @return
+ *   - non-zero value on error.
+ *   - 0 on success.
+ *
+ */
+int mlx5_ipool_resize(struct mlx5_indexed_pool *pool, uint32_t num_entries);
+
 /**
  * This function allocates new empty Three-level table.
  *
-- 
2.39.2



[PATCH 3/5] net/mlx5: fix parameters verification in HWS table create

2024-02-02 Thread Gregory Etelson
Modified the conditionals in `flow_hw_table_create()` to use bitwise
AND instead of equality checks when assessing the
`table_cfg->attr->specialize` bitmask.
This will allow for greater flexibility as the bitmask may encapsulate
multiple flags.
The patch maintains the previous behavior with single flag values,
while providing support for multiple flags.

Fixes: 592d5367b5e4 ("net/mlx5: enable hint in async flow table")
Signed-off-by: Gregory Etelson 
---
 drivers/net/mlx5/mlx5_flow_hw.c | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index da873ae2e2..3125500641 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -4368,12 +4368,23 @@ flow_hw_table_create(struct rte_eth_dev *dev,
matcher_attr.rule.num_log = rte_log2_u32(nb_flows);
/* Parse hints information. */
if (attr->specialize) {
-		if (attr->specialize == RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG)
-			matcher_attr.optimize_flow_src = MLX5DR_MATCHER_FLOW_SRC_WIRE;
-		else if (attr->specialize == RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG)
-			matcher_attr.optimize_flow_src = MLX5DR_MATCHER_FLOW_SRC_VPORT;
-		else
-			DRV_LOG(INFO, "Unsupported hint value %x", attr->specialize);
+   uint32_t val = RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG |
+  RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG;
+
+   if ((attr->specialize & val) == val) {
+   DRV_LOG(INFO, "Invalid hint value %x",
+   attr->specialize);
+   rte_errno = EINVAL;
+   goto it_error;
+   }
+   if (attr->specialize &
+   RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_WIRE_ORIG)
+   matcher_attr.optimize_flow_src =
+   MLX5DR_MATCHER_FLOW_SRC_WIRE;
+   else if (attr->specialize &
+RTE_FLOW_TABLE_SPECIALIZE_TRANSFER_VPORT_ORIG)
+   matcher_attr.optimize_flow_src =
+   MLX5DR_MATCHER_FLOW_SRC_VPORT;
}
/* Build the item template. */
for (i = 0; i < nb_item_templates; i++) {
-- 
2.39.2



[PATCH 0/5] net/mlx5: add support for flow table resizing

2024-02-02 Thread Gregory Etelson
Gregory Etelson (3):
  net/mlx5: fix parameters verification in HWS table create
  net/mlx5: move multi-pattern actions management to table level
  net/mlx5: add support for flow table resizing

Maayan Kashani (1):
  net/mlx5: add resize function to ipool

Yevgeny Kliteynik (1):
  net/mlx5/hws: add support for resizable matchers

 drivers/net/mlx5/hws/mlx5dr.h |  39 ++
 drivers/net/mlx5/hws/mlx5dr_definer.c |   5 +-
 drivers/net/mlx5/hws/mlx5dr_definer.h |   3 +
 drivers/net/mlx5/hws/mlx5dr_matcher.c | 181 ++-
 drivers/net/mlx5/hws/mlx5dr_matcher.h |  21 +
 drivers/net/mlx5/hws/mlx5dr_rule.c| 229 +++-
 drivers/net/mlx5/hws/mlx5dr_rule.h|  34 +-
 drivers/net/mlx5/hws/mlx5dr_send.c|  45 ++
 drivers/net/mlx5/mlx5.h   |   5 +
 drivers/net/mlx5/mlx5_flow.c  |  51 ++
 drivers/net/mlx5/mlx5_flow.h  | 103 +++-
 drivers/net/mlx5/mlx5_flow_hw.c   | 748 +++---
 drivers/net/mlx5/mlx5_host.c  | 211 
 drivers/net/mlx5/mlx5_utils.c |  29 +
 drivers/net/mlx5/mlx5_utils.h |  16 +
 15 files changed, 1498 insertions(+), 222 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_host.c

Depends-on: series-30952 ([v2] ethdev: add template table resize API)

-- 
2.39.2



[PATCH 4/5] net/mlx5: move multi-pattern actions management to table level

2024-02-02 Thread Gregory Etelson
The multi-pattern actions related structures and management code
have been moved to the table level.
That code refactor is required for the upcoming table resize feature.

Signed-off-by: Gregory Etelson 
---
 drivers/net/mlx5/mlx5_flow.h|  73 +-
 drivers/net/mlx5/mlx5_flow_hw.c | 229 +++-
 2 files changed, 177 insertions(+), 125 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index b003e97dc9..497d4b0f0c 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1390,7 +1390,6 @@ struct mlx5_hw_encap_decap_action {
/* Is header_reformat action shared across flows in table. */
uint32_t shared:1;
uint32_t multi_pattern:1;
-   volatile uint32_t *multi_pattern_refcnt;
size_t data_size; /* Action metadata size. */
uint8_t data[]; /* Action data. */
 };
@@ -1413,7 +1412,6 @@ struct mlx5_hw_modify_header_action {
/* Is MODIFY_HEADER action shared across flows in table. */
uint32_t shared:1;
uint32_t multi_pattern:1;
-   volatile uint32_t *multi_pattern_refcnt;
/* Amount of modification commands stored in the precompiled buffer. */
uint32_t mhdr_cmds_num;
/* Precompiled modification commands. */
@@ -1467,6 +1465,76 @@ struct mlx5_flow_group {
 #define MLX5_HW_TBL_MAX_ITEM_TEMPLATE 2
 #define MLX5_HW_TBL_MAX_ACTION_TEMPLATE 32
 
+#define MLX5_MULTIPATTERN_ENCAP_NUM 5
+#define MLX5_MAX_TABLE_RESIZE_NUM 64
+
+struct mlx5_multi_pattern_segment {
+   uint32_t capacity;
+   uint32_t head_index;
+   struct mlx5dr_action *mhdr_action;
+   struct mlx5dr_action *reformat_action[MLX5_MULTIPATTERN_ENCAP_NUM];
+};
+
+struct mlx5_tbl_multi_pattern_ctx {
+	struct {
+		uint32_t elements_num;
+		struct mlx5dr_action_reformat_header reformat_hdr[MLX5_HW_TBL_MAX_ACTION_TEMPLATE];
+		/**
+		 * insert_header structure is larger than reformat_header.
+		 * Enclosing these structures with union will cause a gap between
+		 * reformat_hdr array elements.
+		 * mlx5dr_action_create_reformat() expects adjacent array elements.
+		 */
+		struct mlx5dr_action_insert_header insert_hdr[MLX5_HW_TBL_MAX_ACTION_TEMPLATE];
+	} reformat[MLX5_MULTIPATTERN_ENCAP_NUM];
+
+	struct {
+		uint32_t elements_num;
+		struct mlx5dr_action_mh_pattern pattern[MLX5_HW_TBL_MAX_ACTION_TEMPLATE];
+	} mh;
+	struct mlx5_multi_pattern_segment segments[MLX5_MAX_TABLE_RESIZE_NUM];
+};
+
+static __rte_always_inline void
+mlx5_multi_pattern_activate(struct mlx5_tbl_multi_pattern_ctx *mpctx)
+{
+   mpctx->segments[0].head_index = 1;
+}
+
+static __rte_always_inline bool
+mlx5_is_multi_pattern_active(const struct mlx5_tbl_multi_pattern_ctx *mpctx)
+{
+   return mpctx->segments[0].head_index == 1;
+}
+
+static __rte_always_inline struct mlx5_multi_pattern_segment *
+mlx5_multi_pattern_segment_get_next(struct mlx5_tbl_multi_pattern_ctx *mpctx)
+{
+   int i;
+
+   for (i = 0; i < MLX5_MAX_TABLE_RESIZE_NUM; i++) {
+   if (!mpctx->segments[i].capacity)
+   return &mpctx->segments[i];
+   }
+   return NULL;
+}
+
+static __rte_always_inline struct mlx5_multi_pattern_segment *
+mlx5_multi_pattern_segment_find(struct mlx5_tbl_multi_pattern_ctx *mpctx,
+   uint32_t flow_resource_ix)
+{
+   int i;
+
+   for (i = 0; i < MLX5_MAX_TABLE_RESIZE_NUM; i++) {
+   uint32_t limit = mpctx->segments[i].head_index +
+mpctx->segments[i].capacity;
+
+   if (flow_resource_ix < limit)
+   return &mpctx->segments[i];
+   }
+   return NULL;
+}
+
 struct mlx5_flow_template_table_cfg {
	struct rte_flow_template_table_attr attr; /* Table attributes passed through flow API. */
	bool external; /* True if created by flow API, false if table is internal to PMD. */
@@ -1487,6 +1555,7 @@ struct rte_flow_template_table {
uint8_t nb_item_templates; /* Item template number. */
uint8_t nb_action_templates; /* Action template number. */
uint32_t refcnt; /* Table reference counter. */
+   struct mlx5_tbl_multi_pattern_ctx mpctx;
 };
 
 #endif
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 3125500641..e5c770c6fc 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -74,41 +74,14 @@ struct mlx5_indlst_legacy {
 #define MLX5_CONST_ENCAP_ITEM(encap_type, ptr) \
 (((const struct encap_type *)(ptr))->definition)
 
-struct mlx5_multi_pattern_ctx {
-   union {
-   struct mlx5dr_action_reformat_header reformat_hdr;
-   struct mlx5dr_action_mh_pattern mh_pattern;
-   };
-   union {
-   /* action template auxiliary 

[PATCH 5/5] net/mlx5: add support for flow table resizing

2024-02-02 Thread Gregory Etelson
Support the template table resize API in the PMD.
The patch allows increasing the capacity of an existing table.

Signed-off-by: Gregory Etelson 
---
 drivers/net/mlx5/mlx5.h |   5 +
 drivers/net/mlx5/mlx5_flow.c|  51 
 drivers/net/mlx5/mlx5_flow.h|  84 --
 drivers/net/mlx5/mlx5_flow_hw.c | 512 +++-
 drivers/net/mlx5/mlx5_host.c| 211 +
 5 files changed, 758 insertions(+), 105 deletions(-)
 create mode 100644 drivers/net/mlx5/mlx5_host.c

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index f2e2e04429..ff0ca7fa42 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -380,6 +380,9 @@ enum mlx5_hw_job_type {
MLX5_HW_Q_JOB_TYPE_UPDATE, /* Flow update job type. */
MLX5_HW_Q_JOB_TYPE_QUERY, /* Flow query job type. */
MLX5_HW_Q_JOB_TYPE_UPDATE_QUERY, /* Flow update and query job type. */
+	MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_CREATE, /* Non-optimized flow create job type. */
+	MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_DESTROY, /* Non-optimized flow destroy job type. */
+	MLX5_HW_Q_JOB_TYPE_RSZTBL_FLOW_MOVE, /* Move flow after table resize. */
 };
 
 enum mlx5_hw_indirect_type {
@@ -422,6 +425,8 @@ struct mlx5_hw_q {
struct mlx5_hw_q_job **job; /* LIFO header. */
struct rte_ring *indir_cq; /* Indirect action SW completion queue. */
struct rte_ring *indir_iq; /* Indirect action SW in progress queue. */
+   struct rte_ring *flow_transfer_pending;
+   struct rte_ring *flow_transfer_completed;
 } __rte_cache_aligned;
 
 
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index 85e8c77c81..521119e138 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -1198,6 +1198,20 @@ mlx5_flow_calc_table_hash(struct rte_eth_dev *dev,
  uint8_t pattern_template_index,
  uint32_t *hash, struct rte_flow_error *error);
 
+static int
+mlx5_template_table_resize(struct rte_eth_dev *dev,
+  struct rte_flow_template_table *table,
+  uint32_t nb_rules, struct rte_flow_error *error);
+static int
+mlx5_flow_async_update_resized(struct rte_eth_dev *dev, uint32_t queue,
+  const struct rte_flow_op_attr *attr,
+  struct rte_flow *rule, void *user_data,
+  struct rte_flow_error *error);
+static int
+mlx5_table_resize_complete(struct rte_eth_dev *dev,
+  struct rte_flow_template_table *table,
+  struct rte_flow_error *error);
+
 static const struct rte_flow_ops mlx5_flow_ops = {
.validate = mlx5_flow_validate,
.create = mlx5_flow_create,
@@ -1253,6 +1267,9 @@ static const struct rte_flow_ops mlx5_flow_ops = {
.async_action_list_handle_query_update =
mlx5_flow_async_action_list_handle_query_update,
.flow_calc_table_hash = mlx5_flow_calc_table_hash,
+   .flow_template_table_resize = mlx5_template_table_resize,
+   .flow_update_resized = mlx5_flow_async_update_resized,
+   .flow_template_table_resize_complete = mlx5_table_resize_complete,
 };
 
 /* Tunnel information. */
@@ -5,6 +11132,40 @@ mlx5_flow_calc_table_hash(struct rte_eth_dev *dev,
  hash, error);
 }
 
+static int
+mlx5_template_table_resize(struct rte_eth_dev *dev,
+  struct rte_flow_template_table *table,
+  uint32_t nb_rules, struct rte_flow_error *error)
+{
+   const struct mlx5_flow_driver_ops *fops;
+
+   MLX5_DRV_FOPS_OR_ERR(dev, fops, table_resize, ENOTSUP);
+   return fops->table_resize(dev, table, nb_rules, error);
+}
+
+static int
+mlx5_table_resize_complete(struct rte_eth_dev *dev,
+  struct rte_flow_template_table *table,
+  struct rte_flow_error *error)
+{
+   const struct mlx5_flow_driver_ops *fops;
+
+   MLX5_DRV_FOPS_OR_ERR(dev, fops, table_resize_complete, ENOTSUP);
+   return fops->table_resize_complete(dev, table, error);
+}
+
+static int
+mlx5_flow_async_update_resized(struct rte_eth_dev *dev, uint32_t queue,
+  const struct rte_flow_op_attr *op_attr,
+  struct rte_flow *rule, void *user_data,
+  struct rte_flow_error *error)
+{
+   const struct mlx5_flow_driver_ops *fops;
+
+   MLX5_DRV_FOPS_OR_ERR(dev, fops, flow_update_resized, ENOTSUP);
+	return fops->flow_update_resized(dev, queue, op_attr, rule, user_data, error);
+}
+
 /**
  * Destroy all indirect actions (shared RSS).
  *
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 497d4b0f0c..c7d84af659 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1210,6 +1210,7 @@ struct rte_flow {
uint32_t tunnel:1;
uint32_t meter:24

Re: [PATCH v2 11/11] eventdev: RFC clarify docs on event object fields

2024-02-02 Thread Bruce Richardson
On Fri, Feb 02, 2024 at 11:33:19AM +, Bruce Richardson wrote:
> On Fri, Feb 02, 2024 at 10:38:10AM +0100, Mattias Rönnblom wrote:
> > On 2024-02-01 17:59, Bruce Richardson wrote:
> > > On Wed, Jan 24, 2024 at 12:34:50PM +0100, Mattias Rönnblom wrote:
> > > > On 2024-01-19 18:43, Bruce Richardson wrote:
> > > > > Clarify the meaning of the NEW, FORWARD and RELEASE event types.
> > > > > For the fields in "rte_event" struct, enhance the comments on each to
> > > > > clarify the field's use, and whether it is preserved between enqueue and
> > > > > dequeue, and its role, if any, in scheduling.
> > > > > 
> > > > > Signed-off-by: Bruce Richardson 
> > > > > ---
> > > > > 
> 
> > > > Is it the normalized or unnormalized value that is preserved?
> > > > 
> > > Very good point. It's the normalized & then denormalized version that is
> > > guaranteed to be preserved, I suspect. SW eventdevs probably preserve
> > > as-is, but HW eventdevs may lose precision. Rather than making this
> > > "implementation defined" or "not preserved" which would be annoying for
> > > apps, I think, I'm going to document this as "preserved, but with possible
> > > loss of precision".
> > > 
> > 
> > This makes me again think it may be worth noting that Eventdev -> API
> > priority normalization is (event.priority * PMD_LEVELS) / EVENTDEV_LEVELS
> > (rounded down) - assuming that's how it's supposed to be done - or something
> > to that effect.
> > 
> Following my comment on the thread on the other patch about looking at
> numbers of bits of priority being valid, I did a quick check of the evdev PMDs
> by using grep for "max_event_priority_levels" in each driver. According to
> that (and resolving some #defines), I see:
> 
> 0 - dpaa, dpaa2
> 1 - cnxk, dsw, octeontx, opdl
> 4 - sw
> 8 - dlb2, skeleton
> 
> So it looks like switching to a bit-scheme is workable, where we measure
> supported event levels in powers-of-two only. [And we can cut down priority
> fields if we like].
> 
And just for reference, the advertized values for
max_event_queue_priority_levels are:

1 - dsw, opdl
8 - cnxk, dlb2, dpaa, dpaa2, octeontx, skeleton
255 - sw [though this should really be 256, it's an off-by-one error due to
  the range of uint8_t type. SW evdev just sorts queues by priority
  using the whole priority value specified.]

So I think we can treat queue priority similarly to event priority - giving
the number of bits which are valid. Also, if we decide to cut the event
priority level range to e.g. 0-15, I think we can do the same for the queue
priority levels, so that the ranges are similar, and then we can adjust the
min-max definitions to match.

/Bruce
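
(Editor's sketch of the bit-scheme idea discussed above: if a device
advertises N valid priority bits, only the top N bits of the 8-bit field
would be significant. The function name is hypothetical.)

	static inline uint8_t
	effective_prio(uint8_t prio, unsigned int valid_bits)
	{
		/* Keep the top 'valid_bits' bits; the rest are ignored. */
		uint8_t mask = (uint8_t)~(0xffu >> valid_bits);

		return prio & mask;
	}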


Re: [dpdk-dev] [PATCH] net/cnxk: fix aged flows query

2024-02-02 Thread Jerin Jacob
On Fri, Feb 2, 2024 at 10:51 AM  wrote:
>
> From: Satheesh Paul 
>
> After all aged flows are destroyed, the aged_flows bitmap
> is freed. Querying aged flows then tries to access this bitmap,
> resulting in a segmentation fault. Fix this by not accessing
> the bitmap if no aged flows are present.
>
> Fixes: 357f5ebc8a24 ("common/cnxk: support flow aging")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Satheesh Paul 
> Reviewed-by: Kiran Kumar K 

Applied to dpdk-next-net-mrvl/for-main. Thanks


[PATCH v3 00/11] improve eventdev API specification/documentation

2024-02-02 Thread Bruce Richardson
This patchset makes rewording improvements to the eventdev doxygen
documentation to try and ensure that it is as clear as possible,
describes the implementation as accurately as possible, and is
consistent within itself.

Most changes are just minor rewordings, along with plenty of changes to
change references into doxygen links/cross-references.

In tightening up the definitions, there may be subtle changes in meaning
which should be checked for carefully by reviewers. Where there was
ambiguity, the behaviour of existing code is documented so as to avoid
breaking existing apps.

V3:
* major cleanup following review by Mattias and on-list discussions
* old patch 7 split in two and merged with other changes in the same
  area rather than being standalone.
* new patch 11 added at end of series.

V2:
* additional cleanup and changes
* remove "escaped" accidental change to .c file

Bruce Richardson (11):
  eventdev: improve doxygen introduction text
  eventdev: move text on driver internals to proper section
  eventdev: update documentation on device capability flags
  eventdev: cleanup doxygen comments on info structure
  eventdev: improve function documentation for query fns
  eventdev: improve doxygen comments on configure struct
  eventdev: improve doxygen comments on config fns
  eventdev: improve doxygen comments for control APIs
  eventdev: improve comments on scheduling types
  eventdev: clarify docs on event object fields and op types
  eventdev: drop comment for anon union from doxygen

 lib/eventdev/rte_eventdev.h | 952 +++-
 1 file changed, 620 insertions(+), 332 deletions(-)

--
2.40.1



[PATCH v3 01/11] eventdev: improve doxygen introduction text

2024-02-02 Thread Bruce Richardson
Make some textual improvements to the introduction to eventdev and event
devices in the eventdev header file. This text appears in the doxygen
output for the header file, and introduces the key concepts, for
example: events, event devices, queues, ports and scheduling.

This patch makes the following improvements:
* small textual fixups, e.g. correcting use of singular/plural
* rewrites of some sentences to improve clarity
* using doxygen markdown to split the whole large block up into
  sections, thereby making it easier to read.

No large-scale changes are made, and blocks are not reordered.

Signed-off-by: Bruce Richardson 

---
V3: reworked following feedback from Mattias
---
 lib/eventdev/rte_eventdev.h | 132 ++--
 1 file changed, 81 insertions(+), 51 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index ec9b02455d..a741832e8e 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -12,25 +12,33 @@
  * @file
  *
  * RTE Event Device API
+ * ====================
  *
- * In a polling model, lcores poll ethdev ports and associated rx queues
- * directly to look for packet. In an event driven model, by contrast, lcores
- * call the scheduler that selects packets for them based on programmer
- * specified criteria. Eventdev library adds support for event driven
- * programming model, which offer applications automatic multicore scaling,
- * dynamic load balancing, pipelining, packet ingress order maintenance and
- * synchronization services to simplify application packet processing.
+ * In a traditional run-to-completion application model, lcores pick up packets
+ * from Ethdev ports and associated RX queues, run the packet processing to 
completion,
+ * and enqueue the completed packets to a TX queue. NIC-level receive-side 
scaling (RSS)
+ * may be used to balance the load across multiple CPU cores.
+ *
+ * In contrast, in an event-driven model, as supported by this "eventdev" 
library,
+ * incoming packets are fed into an event device, which schedules those 
packets across
+ * the available lcores, in accordance with its configuration.
+ * This event-driven programming model offers applications automatic multicore 
scaling,
+ * dynamic load balancing, pipelining, packet order maintenance, 
synchronization,
+ * and prioritization/quality of service.
  *
  * The Event Device API is composed of two parts:
  *
  * - The application-oriented Event API that includes functions to setup
  *   an event device (configure it, setup its queues, ports and start it), to
- *   establish the link between queues to port and to receive events, and so 
on.
+ *   establish the links between queues and ports to receive events, and so on.
  *
  * - The driver-oriented Event API that exports a function allowing
- *   an event poll Mode Driver (PMD) to simultaneously register itself as
+ *   an event poll Mode Driver (PMD) to register itself as
  *   an event device driver.
  *
+ * Application-oriented Event API
+ * ------------------------------
+ *
  * Event device components:
  *
  * +-+
@@ -75,27 +83,39 @@
  *|   |
  *+---+
  *
- * Event device: A hardware or software-based event scheduler.
+ * **Event device**: A hardware or software-based event scheduler.
  *
- * Event: A unit of scheduling that encapsulates a packet or other datatype
- * like SW generated event from the CPU, Crypto work completion notification,
- * Timer expiry event notification etc as well as metadata.
- * The metadata includes flow ID, scheduling type, event priority, event_type,
- * sub_event_type etc.
+ * **Event**: Represents an item of work and is the smallest unit of 
scheduling.
+ * An event carries metadata, such as queue ID, scheduling type, and event 
priority,
+ * and data such as one or more packets or other kinds of buffers.
+ * Some examples of events are:
+ * - a software-generated item of work originating from a lcore,
+ *   perhaps carrying a packet to be processed,
+ * - a crypto work completion notification
+ * - a timer expiry notification.
  *
- * Event queue: A queue containing events that are scheduled by the event dev.
+ * **Event queue**: A queue containing events that are scheduled by the event 
device.
  * An event queue contains events of different flows associated with scheduling
  * types, such as atomic, ordered, or parallel.
+ * Each event given to an event device must have a valid event queue id field 
in the metadata,
+ * to specify on which event queue in the device the event must be placed,
+ * for later scheduling.
  *
- * Event port: An application's interface into the event dev for enqueue and
+ * **Event port**: An application's interface into the event dev for enqueue 
and
  * dequeue operations. Each event port can be linked with one or more
  * event q

[PATCH v3 02/11] eventdev: move text on driver internals to proper section

2024-02-02 Thread Bruce Richardson
Inside the doxygen introduction text, some internal details of how
eventdev works was mixed in with application-relevant details. Move
these details on probing etc. to the driver-relevant section.

Signed-off-by: Bruce Richardson 
---
 lib/eventdev/rte_eventdev.h | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index a741832e8e..37493464f9 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -122,22 +122,6 @@
  * In all functions of the Event API, the Event device is
  * designated by an integer >= 0 named the device identifier *dev_id*
  *
- * At the Event driver level, Event devices are represented by a generic
- * data structure of type *rte_event_dev*.
- *
- * Event devices are dynamically registered during the PCI/SoC device probing
- * phase performed at EAL initialization time.
- * When an Event device is being probed, an *rte_event_dev* structure is 
allocated
- * for it and the event_dev_init() function supplied by the Event driver
- * is invoked to properly initialize the device.
- *
- * The role of the device init function is to reset the device hardware or
- * to initialize the software event driver implementation.
- *
- * If the device init operation is successful, the device is assigned a device
- * id (dev_id) for application use.
- * Otherwise, the *rte_event_dev* structure is freed.
- *
  * The functions exported by the application Event API to setup a device
  * must be invoked in the following order:
  * - rte_event_dev_configure()
@@ -173,6 +157,22 @@
  * Driver-Oriented Event API
 * -------------------------
  *
+ * At the Event driver level, Event devices are represented by a generic
+ * data structure of type *rte_event_dev*.
+ *
+ * Event devices are dynamically registered during the PCI/SoC device probing
+ * phase performed at EAL initialization time.
+ * When an Event device is being probed, an *rte_event_dev* structure is 
allocated
+ * for it and the event_dev_init() function supplied by the Event driver
+ * is invoked to properly initialize the device.
+ *
+ * The role of the device init function is to reset the device hardware or
+ * to initialize the software event driver implementation.
+ *
+ * If the device init operation is successful, the device is assigned a device
+ * id (dev_id) for application use.
+ * Otherwise, the *rte_event_dev* structure is freed.
+ *
  * Each function of the application Event API invokes a specific function
  * of the PMD that controls the target device designated by its device
  * identifier.
-- 
2.40.1



[PATCH v3 03/11] eventdev: update documentation on device capability flags

2024-02-02 Thread Bruce Richardson
Update the device capability docs, to:

* include more cross-references
* split longer text into paragraphs, in most cases with each flag having
  a single-line summary at the start of the doc block
* general comment rewording and clarification as appropriate

Signed-off-by: Bruce Richardson 
---
V3: Updated following feedback from Mattias
---
 lib/eventdev/rte_eventdev.h | 130 +---
 1 file changed, 92 insertions(+), 38 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 37493464f9..a33024479d 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -253,143 +253,197 @@ struct rte_event;
 /* Event device capability bitmap flags */
 #define RTE_EVENT_DEV_CAP_QUEUE_QOS   (1ULL << 0)
 /**< Event scheduling prioritization is based on the priority and weight
- * associated with each event queue. Events from a queue with highest priority
- * is scheduled first. If the queues are of same priority, weight of the queues
+ * associated with each event queue.
+ *
+ * Events from a queue with highest priority
+ * are scheduled first. If the queues are of same priority, weight of the 
queues
  * are considered to select a queue in a weighted round robin fashion.
  * Subsequent dequeue calls from an event port could see events from the same
  * event queue, if the queue is configured with an affinity count. Affinity
  * count is the number of subsequent dequeue calls, in which an event port
  * should use the same event queue if the queue is non-empty
  *
+ * NOTE: A device may use both queue prioritization and event prioritization
+ * (@ref RTE_EVENT_DEV_CAP_EVENT_QOS capability) when making packet scheduling 
decisions.
+ *
  *  @see rte_event_queue_setup(), rte_event_queue_attr_set()
  */
 #define RTE_EVENT_DEV_CAP_EVENT_QOS   (1ULL << 1)
 /**< Event scheduling prioritization is based on the priority associated with
- *  each event. Priority of each event is supplied in *rte_event* structure
+ *  each event.
+ *
+ *  Priority of each event is supplied in *rte_event* structure
  *  on each enqueue operation.
+ *  If this capability is not set, the priority field of the event structure
+ *  is ignored for each event.
  *
+ * NOTE: A device may use both queue prioritization (@ref 
RTE_EVENT_DEV_CAP_QUEUE_QOS capability)
+ * and event prioritization when making packet scheduling decisions.
+ *
  *  @see rte_event_enqueue_burst()
  */
 #define RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED   (1ULL << 2)
 /**< Event device operates in distributed scheduling mode.
+ *
  * In distributed scheduling mode, event scheduling happens in HW or
- * rte_event_dequeue_burst() or the combination of these two.
+ * rte_event_dequeue_burst() / rte_event_enqueue_burst() or the combination of 
these two.
  * If the flag is not set then eventdev is centralized and thus needs a
 * dedicated service core that acts as a scheduling thread.
  *
- * @see rte_event_dequeue_burst()
+ * @see rte_event_dev_service_id_get
  */
 #define RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES (1ULL << 3)
 /**< Event device is capable of enqueuing events of any type to any queue.
- * If this capability is not set, the queue only supports events of the
- *  *RTE_SCHED_TYPE_* type that it was created with.
  *
- * @see RTE_SCHED_TYPE_* values
+ * If this capability is not set, each queue only supports events of the
+ * *RTE_SCHED_TYPE_* type that it was created with.
+ * The behaviour when events of other scheduling types are sent to the queue is
+ * currently undefined.
+ *
+ * @see rte_event_enqueue_burst
+ * @see RTE_SCHED_TYPE_ATOMIC RTE_SCHED_TYPE_ORDERED RTE_SCHED_TYPE_PARALLEL
  */
 #define RTE_EVENT_DEV_CAP_BURST_MODE  (1ULL << 4)
 /**< Event device is capable of operating in burst mode for enqueue(forward,
- * release) and dequeue operation. If this capability is not set, application
- * still uses the rte_event_dequeue_burst() and rte_event_enqueue_burst() but
- * PMD accepts only one event at a time.
+ * release) and dequeue operation.
+ *
+ * If this capability is not set, application
+ * can still use the rte_event_dequeue_burst() and rte_event_enqueue_burst() 
but
+ * PMD accepts or returns only one event at a time.
  *
  * @see rte_event_dequeue_burst() rte_event_enqueue_burst()
  */
 #define RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE(1ULL << 5)
 /**< Event device ports support disabling the implicit release feature, in
  * which the port will release all unreleased events in its dequeue operation.
+ *
  * If this capability is set and the port is configured with implicit release
  * disabled, the application is responsible for explicitly releasing events
- * using either the RTE_EVENT_OP_FORWARD or the RTE_EVENT_OP_RELEASE event
+ * using either the @ref RTE_EVENT_OP_FORWARD or the @ref RTE_EVENT_OP_RELEASE 
event
  * enqueue operations.
  *
  * @see rte_event_dequeue_burst() rte_event_enqueue_burst()
  */
 
 #define RTE_EVENT_DEV_CAP_NONSEQ_MODE 

[PATCH v3 04/11] eventdev: cleanup doxygen comments on info structure

2024-02-02 Thread Bruce Richardson
Some small rewording changes to the doxygen comments on struct
rte_event_dev_info.

Signed-off-by: Bruce Richardson 

---
V3: reworked following feedback
- added closing "." on comments
- added more cross-reference links
- reworded priority level comments
---
 lib/eventdev/rte_eventdev.h | 85 +
 1 file changed, 58 insertions(+), 27 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index a33024479d..da3f72d89e 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -487,57 +487,88 @@ rte_event_dev_socket_id(uint8_t dev_id);
  * Event device information
  */
 struct rte_event_dev_info {
-   const char *driver_name;/**< Event driver name */
-   struct rte_device *dev; /**< Device information */
+   const char *driver_name;/**< Event driver name. */
+   struct rte_device *dev; /**< Device information. */
uint32_t min_dequeue_timeout_ns;
-   /**< Minimum supported global dequeue timeout(ns) by this device */
+   /**< Minimum global dequeue timeout(ns) supported by this device. */
uint32_t max_dequeue_timeout_ns;
-   /**< Maximum supported global dequeue timeout(ns) by this device */
+   /**< Maximum global dequeue timeout(ns) supported by this device. */
uint32_t dequeue_timeout_ns;
-   /**< Configured global dequeue timeout(ns) for this device */
+   /**< Configured global dequeue timeout(ns) for this device. */
uint8_t max_event_queues;
-   /**< Maximum event_queues supported by this device */
+   /**< Maximum event queues supported by this device.
+*
+* This count excludes any queues covered by @ref 
max_single_link_event_port_queue_pairs.
+*/
uint32_t max_event_queue_flows;
-   /**< Maximum supported flows in an event queue by this device*/
+   /**< Maximum number of flows within an event queue supported by this 
device. */
uint8_t max_event_queue_priority_levels;
-   /**< Maximum number of event queue priority levels by this device.
-* Valid when the device has RTE_EVENT_DEV_CAP_QUEUE_QOS capability
+   /**< Maximum number of event queue priority levels supported by this 
device.
+*
+* Valid when the device has @ref RTE_EVENT_DEV_CAP_QUEUE_QOS 
capability.
+*
+* The implementation shall normalize priority values specified between
+* @ref RTE_EVENT_DEV_PRIORITY_HIGHEST and @ref 
RTE_EVENT_DEV_PRIORITY_LOWEST
+* to map them internally to this range of priorities.
+* [For devices supporting a power-of-2 number of priority levels, this
+* normalization will be done via a right-shift operation, so only the 
top
+* log2(max_levels) bits will be used by the event device.]
+*
+* @see rte_event_queue_conf.priority
 */
uint8_t max_event_priority_levels;
/**< Maximum number of event priority levels by this device.
-* Valid when the device has RTE_EVENT_DEV_CAP_EVENT_QOS capability
+*
+* Valid when the device has @ref RTE_EVENT_DEV_CAP_EVENT_QOS 
capability.
+*
+* The implementation shall normalize priority values specified between
+* @ref RTE_EVENT_DEV_PRIORITY_HIGHEST and @ref 
RTE_EVENT_DEV_PRIORITY_LOWEST
+* to map them internally to this range of priorities.
+* [For devices supporting a power-of-2 number of priority levels, this
+* normalization will be done via a right-shift operation, so only the 
top
+* log2(max_levels) bits will be used by the event device.]
+*
+* @see rte_event.priority
 */
uint8_t max_event_ports;
-   /**< Maximum number of event ports supported by this device */
+   /**< Maximum number of event ports supported by this device.
+*
+* This count excludes any ports covered by @ref 
max_single_link_event_port_queue_pairs.
+*/
uint8_t max_event_port_dequeue_depth;
-   /**< Maximum number of events can be dequeued at a time from an
-* event port by this device.
-* A device that does not support bulk dequeue will set this as 1.
+   /**< Maximum number of events that can be dequeued at a time from an 
event port
+* on this device.
+*
+* A device that does not support burst dequeue
+* (@ref RTE_EVENT_DEV_CAP_BURST_MODE) will set this to 1.
 */
uint32_t max_event_port_enqueue_depth;
-   /**< Maximum number of events can be enqueued at a time from an
-* event port by this device.
-* A device that does not support bulk enqueue will set this as 1.
+   /**< Maximum number of events that can be enqueued at a time to an 
event port
+* on this device.
+*
+* A device that does not support burst enqueue
+* (@ref RTE_EVENT_DEV_CAP_BURST_MODE) will set this to 1.
 */
   

[PATCH v3 05/11] eventdev: improve function documentation for query fns

2024-02-02 Thread Bruce Richardson
General improvements to the doxygen docs for eventdev functions for
querying basic information:
* number of devices
* id for a particular device
* socket id of device
* capability information for a device

Signed-off-by: Bruce Richardson 

---
V3: minor changes following review
---
 lib/eventdev/rte_eventdev.h | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index da3f72d89e..3cba13e2c4 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -448,8 +448,7 @@ struct rte_event;
  */
 
 /**
- * Get the total number of event devices that have been successfully
- * initialised.
+ * Get the total number of event devices.
  *
  * @return
  *   The total number of usable event devices.
@@ -464,8 +463,10 @@ rte_event_dev_count(void);
  *   Event device name to select the event device identifier.
  *
  * @return
- *   Returns event device identifier on success.
- *   - <0: Failure to find named event device.
+ *   Event device identifier (dev_id >= 0) on success.
+ *   Negative error code on failure:
+ *   - -EINVAL - input name parameter is invalid.
+ *   - -ENODEV - no event device found with that name.
  */
 int
 rte_event_dev_get_dev_id(const char *name);
@@ -478,7 +479,8 @@ rte_event_dev_get_dev_id(const char *name);
  * @return
  *   The NUMA socket id to which the device is connected or
  *   a default of zero if the socket could not be determined.
- *   -(-EINVAL)  dev_id value is out of range.
+ *   -EINVAL on error, where the given dev_id value does not
+ *   correspond to any event device.
  */
 int
 rte_event_dev_socket_id(uint8_t dev_id);
@@ -574,18 +576,20 @@ struct rte_event_dev_info {
 };
 
 /**
- * Retrieve the contextual information of an event device.
+ * Retrieve details of an event device's capabilities and configuration limits.
  *
  * @param dev_id
  *   The identifier of the device.
  *
  * @param[out] dev_info
  *   A pointer to a structure of type *rte_event_dev_info* to be filled with 
the
- *   contextual information of the device.
+ *   information about the device's capabilities.
  *
  * @return
- *   - 0: Success, driver updates the contextual information of the event 
device
- *   - <0: Error code returned by the driver info get function.
+ *   - 0: Success, information about the event device is present in dev_info.
+ *   - <0: Failure, error code returned by the function.
+ * - -EINVAL - invalid input parameters, e.g. incorrect device id.
+ * - -ENOTSUP - device does not support returning capabilities information.
  */
 int
 rte_event_dev_info_get(uint8_t dev_id, struct rte_event_dev_info *dev_info);
-- 
2.40.1



[PATCH v3 06/11] eventdev: improve doxygen comments on configure struct

2024-02-02 Thread Bruce Richardson
General rewording and cleanup on the rte_event_dev_config structure.
Improved the wording of some sentences and created linked
cross-references out of the existing references to the dev_info
structure.

As part of the rework, fix an issue with how single-link port-queue pairs
were counted in the rte_event_dev_config structure. This did not match
the actual implementation and, if following the documentation, certain
valid port/queue configurations would have been impossible to configure.
Fix this by changing the documentation to match the implementation.

Bugzilla ID: 1368
Fixes: 75d113136f38 ("eventdev: express DLB/DLB2 PMD constraints")

Signed-off-by: Bruce Richardson 

---
V3:
- minor tweaks following review
- merged in doc fix for bugzilla 1368 into this patch, since it fit with
  other clarifications to the config struct.
---
 lib/eventdev/rte_eventdev.h | 61 ++---
 1 file changed, 37 insertions(+), 24 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 3cba13e2c4..027f5936fb 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -634,9 +634,9 @@ rte_event_dev_attr_get(uint8_t dev_id, uint32_t attr_id,
 struct rte_event_dev_config {
uint32_t dequeue_timeout_ns;
/**< rte_event_dequeue_burst() timeout on this device.
-* This value should be in the range of *min_dequeue_timeout_ns* and
-* *max_dequeue_timeout_ns* which previously provided in
-* rte_event_dev_info_get()
+* This value should be in the range of @ref 
rte_event_dev_info.min_dequeue_timeout_ns and
+* @ref rte_event_dev_info.max_dequeue_timeout_ns returned by
+* @ref rte_event_dev_info_get()
 * The value 0 is allowed, in which case, default dequeue timeout used.
 * @see RTE_EVENT_DEV_CFG_PER_DEQUEUE_TIMEOUT
 */
@@ -644,40 +644,53 @@ struct rte_event_dev_config {
/**< In a *closed system* this field is the limit on maximum number of
 * events that can be inflight in the eventdev at a given time. The
 * limit is required to ensure that the finite space in a closed system
-* is not overwhelmed. The value cannot exceed the *max_num_events*
-* as provided by rte_event_dev_info_get().
-* This value should be set to -1 for *open system*.
+* is not exhausted.
+* The value cannot exceed @ref rte_event_dev_info.max_num_events
+* returned by rte_event_dev_info_get().
+*
+* This value should be set to -1 for *open systems*, that is,
+* those systems returning -1 in @ref rte_event_dev_info.max_num_events.
+*
+* @see rte_event_port_conf.new_event_threshold
 */
uint8_t nb_event_queues;
/**< Number of event queues to configure on this device.
-* This value cannot exceed the *max_event_queues* which previously
-* provided in rte_event_dev_info_get()
+* This value *includes* any single-link queue-port pairs to be used.
+* This value cannot exceed @ref rte_event_dev_info.max_event_queues +
+* @ref rte_event_dev_info.max_single_link_event_port_queue_pairs
+* returned by rte_event_dev_info_get().
+* The number of non-single-link queues i.e. this value less
+* *nb_single_link_event_port_queues* in this struct, cannot exceed
+* @ref rte_event_dev_info.max_event_queues
 */
uint8_t nb_event_ports;
/**< Number of event ports to configure on this device.
-* This value cannot exceed the *max_event_ports* which previously
-* provided in rte_event_dev_info_get()
+* This value *includes* any single-link queue-port pairs to be used.
+* This value cannot exceed @ref rte_event_dev_info.max_event_ports +
+* @ref rte_event_dev_info.max_single_link_event_port_queue_pairs
+* returned by rte_event_dev_info_get().
+* The number of non-single-link ports i.e. this value less
+* *nb_single_link_event_port_queues* in this struct, cannot exceed
+* @ref rte_event_dev_info.max_event_ports
 */
uint32_t nb_event_queue_flows;
-   /**< Number of flows for any event queue on this device.
-* This value cannot exceed the *max_event_queue_flows* which previously
-* provided in rte_event_dev_info_get()
+   /**< Max number of flows needed for a single event queue on this device.
+* This value cannot exceed @ref 
rte_event_dev_info.max_event_queue_flows
+* returned by rte_event_dev_info_get()
 */
uint32_t nb_event_port_dequeue_depth;
-   /**< Maximum number of events can be dequeued at a time from an
-* event port by this device.
-* This value cannot exceed the *max_event_port_dequeue_depth*
-* which previously provided in rte_event_dev_info_get().
+   /**< Max number of events that can be dequeued at a time from an event 
p

[PATCH v3 07/11] eventdev: improve doxygen comments on config fns

2024-02-02 Thread Bruce Richardson
Improve the documentation text for the configuration functions and
structures for configuring an eventdev, as well as ports and queues.
Clarify text where possible, and ensure references come through as links
in the html output.

Signed-off-by: Bruce Richardson 

---
V3: Update following review, mainly:
 - change ranges starting with 0, to just say "less than"
 - put in "." at end of sentences & bullet points
---
 lib/eventdev/rte_eventdev.h | 221 +++-
 1 file changed, 144 insertions(+), 77 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 027f5936fb..e2923a69fb 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -707,12 +707,14 @@ struct rte_event_dev_config {
 /**
  * Configure an event device.
  *
- * This function must be invoked first before any other function in the
- * API. This function can also be re-invoked when a device is in the
- * stopped state.
+ * This function must be invoked before any other configuration function in the
+ * API, when preparing an event device for application use.
+ * This function can also be re-invoked when a device is in the stopped state.
  *
- * The caller may use rte_event_dev_info_get() to get the capability of each
- * resources available for this event device.
+ * The caller should use rte_event_dev_info_get() to get the capabilities and
+ * resource limits for this event device before calling this API.
+ * Many values in the dev_conf input parameter are subject to limits given
+ * in the device information returned from rte_event_dev_info_get().
  *
  * @param dev_id
  *   The identifier of the device to configure.
@@ -722,6 +724,9 @@ struct rte_event_dev_config {
  * @return
  *   - 0: Success, device configured.
  *   - <0: Error code returned by the driver configuration function.
+ * - -ENOTSUP - device does not support configuration.
+ * - -EINVAL  - invalid input parameter.
+ * - -EBUSY   - device has already been started.
  */
 int
 rte_event_dev_configure(uint8_t dev_id,
@@ -731,14 +736,35 @@ rte_event_dev_configure(uint8_t dev_id,
 
 /* Event queue configuration bitmap flags */
 #define RTE_EVENT_QUEUE_CFG_ALL_TYPES  (1ULL << 0)
-/**< Allow ATOMIC,ORDERED,PARALLEL schedule type enqueue
+/**< Allow events with schedule types ATOMIC, ORDERED, and PARALLEL to be 
enqueued to this queue.
  *
+ * The scheduling type to be used is that specified in each individual event.
+ * This flag can only be set when configuring queues on devices reporting the
+ * @ref RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES capability.
+ *
+ * Without this flag, only events with the specific scheduling type configured 
at queue setup
+ * can be sent to the queue.
+ *
+ * @see RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES
  * @see RTE_SCHED_TYPE_ORDERED, RTE_SCHED_TYPE_ATOMIC, RTE_SCHED_TYPE_PARALLEL
  * @see rte_event_enqueue_burst()
  */
 #define RTE_EVENT_QUEUE_CFG_SINGLE_LINK(1ULL << 1)
 /**< This event queue links only to a single event port.
  *
+ * No load-balancing of events is performed, as all events
+ * sent to this queue end up at the same event port.
+ * The number of queues on which this flag is to be set must be
+ * configured at device configuration time, by setting
+ * @ref rte_event_dev_config.nb_single_link_event_port_queues
+ * parameter appropriately.
+ *
+ * This flag serves as a hint only, any devices without specific
+ * support for single-link queues can fall-back automatically to
+ * using regular queues with a single destination port.
+ *
+ *  @see rte_event_dev_info.max_single_link_event_port_queue_pairs
+ *  @see rte_event_dev_config.nb_single_link_event_port_queues
  *  @see rte_event_port_setup(), rte_event_port_link()
  */
 
@@ -746,56 +772,79 @@ rte_event_dev_configure(uint8_t dev_id,
 struct rte_event_queue_conf {
uint32_t nb_atomic_flows;
/**< The maximum number of active flows this queue can track at any
-* given time. If the queue is configured for atomic scheduling (by
-* applying the RTE_EVENT_QUEUE_CFG_ALL_TYPES flag to event_queue_cfg
-* or RTE_SCHED_TYPE_ATOMIC flag to schedule_type), then the
-* value must be in the range of [1, nb_event_queue_flows], which was
-* previously provided in rte_event_dev_configure().
+* given time.
+*
+* If the queue is configured for atomic scheduling (by
+* applying the @ref RTE_EVENT_QUEUE_CFG_ALL_TYPES flag to
+* @ref rte_event_queue_conf.event_queue_cfg
+* or @ref RTE_SCHED_TYPE_ATOMIC flag to @ref 
rte_event_queue_conf.schedule_type), then the
+* value must be in the range of [1, @ref 
rte_event_dev_config.nb_event_queue_flows],
+* which was previously provided in rte_event_dev_configure().
+*
+* If the queue is not configured for atomic scheduling this value is 
ignored.
 */
uint32_t nb_atomic_order_sequences;
/**< The maximum number of outstandi

[PATCH v3 08/11] eventdev: improve doxygen comments for control APIs

2024-02-02 Thread Bruce Richardson
The doxygen comments for the port attributes, start and stop (and
related functions) are improved.

Signed-off-by: Bruce Richardson 

---
V3: add missing "." on end of sentences/lines.
---
 lib/eventdev/rte_eventdev.h | 47 +++--
 1 file changed, 29 insertions(+), 18 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index e2923a69fb..a7d8c28015 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1151,19 +1151,21 @@ rte_event_port_quiesce(uint8_t dev_id, uint8_t port_id,
   rte_eventdev_port_flush_t release_cb, void *args);
 
 /**
- * The queue depth of the port on the enqueue side
+ * Port attribute id for the maximum size of a burst enqueue operation 
supported on a port.
  */
 #define RTE_EVENT_PORT_ATTR_ENQ_DEPTH 0
 /**
- * The queue depth of the port on the dequeue side
+ * Port attribute id for the maximum size of a dequeue burst which can be 
returned from a port.
  */
 #define RTE_EVENT_PORT_ATTR_DEQ_DEPTH 1
 /**
- * The new event threshold of the port
+ * Port attribute id for the new event threshold of the port.
+ * Once the number of events in the system exceeds this threshold, the enqueue 
of NEW-type
+ * events will fail.
  */
 #define RTE_EVENT_PORT_ATTR_NEW_EVENT_THRESHOLD 2
 /**
- * The implicit release disable attribute of the port
+ * Port attribute id for the implicit release disable attribute of the port.
  */
 #define RTE_EVENT_PORT_ATTR_IMPLICIT_RELEASE_DISABLE 3
 
@@ -1171,17 +1173,18 @@ rte_event_port_quiesce(uint8_t dev_id, uint8_t port_id,
  * Get an attribute from a port.
  *
  * @param dev_id
- *   Eventdev id
+ *   The identifier of the device.
  * @param port_id
- *   Eventdev port id
+ *   The index of the event port to query. The value must be less than
+ *   @ref rte_event_dev_config.nb_event_ports previously supplied to 
rte_event_dev_configure().
  * @param attr_id
- *   The attribute ID to retrieve
+ *   The attribute ID to retrieve (RTE_EVENT_PORT_ATTR_*)
  * @param[out] attr_value
  *   A pointer that will be filled in with the attribute value if successful
  *
  * @return
- *   - 0: Successfully returned value
- *   - (-EINVAL) Invalid device, port or attr_id, or attr_value was NULL
+ *   - 0: Successfully returned value.
+ *   - (-EINVAL) Invalid device, port or attr_id, or attr_value was NULL.
  */
 int
 rte_event_port_attr_get(uint8_t dev_id, uint8_t port_id, uint32_t attr_id,
@@ -1190,17 +1193,19 @@ rte_event_port_attr_get(uint8_t dev_id, uint8_t 
port_id, uint32_t attr_id,
 /**
  * Start an event device.
  *
- * The device start step is the last one and consists of setting the event
- * queues to start accepting the events and schedules to event ports.
+ * The device start step is the last one in device setup, and enables the event
+ * ports and queues to start accepting events and scheduling them to event 
ports.
  *
  * On success, all basic functions exported by the API (event enqueue,
  * event dequeue and so on) can be invoked.
  *
  * @param dev_id
- *   Event device identifier
+ *   Event device identifier.
  * @return
  *   - 0: Success, device started.
- *   - -ESTALE : Not all ports of the device are configured
+ *   - -EINVAL:  Invalid device id provided.
+ *   - -ENOTSUP: Device does not support this operation.
+ *   - -ESTALE : Not all ports of the device are configured.
  *   - -ENOLINK: Not all queues are linked, which could lead to deadlock.
  */
 int
@@ -1242,18 +1247,22 @@ typedef void (*rte_eventdev_stop_flush_t)(uint8_t 
dev_id,
  * callback function must be registered in every process that can call
  * rte_event_dev_stop().
  *
+ * Only one callback function may be registered. Each new call replaces
+ * the existing registered callback function with the new function passed in.
+ *
  * To unregister a callback, call this function with a NULL callback pointer.
  *
  * @param dev_id
  *   The identifier of the device.
  * @param callback
- *   Callback function invoked once per flushed event.
+ *   Callback function to be invoked once per flushed event.
+ *   Pass NULL to unset any previously-registered callback function.
  * @param userdata
  *   Argument supplied to callback.
  *
  * @return
  *  - 0 on success.
- *  - -EINVAL if *dev_id* is invalid
+ *  - -EINVAL if *dev_id* is invalid.
  *
  * @see rte_event_dev_stop()
  */
@@ -1264,12 +1273,14 @@ int rte_event_dev_stop_flush_callback_register(uint8_t 
dev_id,
  * Close an event device. The device cannot be restarted!
  *
  * @param dev_id
- *   Event device identifier
+ *   Event device identifier.
  *
  * @return
  *  - 0 on successfully closing device
- *  - <0 on failure to close device
- *  - (-EAGAIN) if device is busy
+ *  - <0 on failure to close device.
+ *- -EINVAL - invalid device id.
+ *- -ENOTSUP - operation not supported for this device.
+ *- -EAGAIN - device is busy.
  */
 int
 rte_event_dev_close(uint8_t dev_id);
-- 
2.40.1



[PATCH v3 09/11] eventdev: improve comments on scheduling types

2024-02-02 Thread Bruce Richardson
The description of ordered and atomic scheduling given in the eventdev
doxygen documentation was not always clear. Try to simplify this so
that it is clearer for the end-user of the application.

Signed-off-by: Bruce Richardson 

---
V3: extensive rework following feedback. Please re-review!
---
 lib/eventdev/rte_eventdev.h | 73 +++--
 1 file changed, 45 insertions(+), 28 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index a7d8c28015..8d72765ae7 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1347,25 +1347,35 @@ struct rte_event_vector {
 /**< Ordered scheduling
  *
  * Events from an ordered flow of an event queue can be scheduled to multiple
- * ports for concurrent processing while maintaining the original event order.
+ * ports for concurrent processing while maintaining the original event order,
+ * i.e. the order in which they were first enqueued to that queue.
  * This scheme enables the user to achieve high single flow throughput by
- * avoiding SW synchronization for ordering between ports which bound to cores.
- *
- * The source flow ordering from an event queue is maintained when events are
- * enqueued to their destination queue within the same ordered flow context.
- * An event port holds the context until application call
- * rte_event_dequeue_burst() from the same port, which implicitly releases
- * the context.
- * User may allow the scheduler to release the context earlier than that
- * by invoking rte_event_enqueue_burst() with RTE_EVENT_OP_RELEASE operation.
- *
- * Events from the source queue appear in their original order when dequeued
- * from a destination queue.
- * Event ordering is based on the received event(s), but also other
- * (newly allocated or stored) events are ordered when enqueued within the same
- * ordered context. Events not enqueued (e.g. released or stored) within the
- * context are  considered missing from reordering and are skipped at this time
- * (but can be ordered again within another context).
+ * avoiding SW synchronization for ordering between ports which are polled
+ * by different cores.
+ *
+ * After events are dequeued from a set of ports, as those events are 
re-enqueued
+ * to another queue (with the op field set to @ref RTE_EVENT_OP_FORWARD), the 
event
+ * device restores the original event order - including events returned from 
all
+ * ports in the set - before the events arrive on the destination queue.
+ *
+ * Any events not forwarded i.e. dropped explicitly via RELEASE or implicitly
+ * released by the next dequeue operation on a port, are skipped by the 
reordering
+ * stage and do not affect the reordering of other returned events.
+ *
+ * Any NEW events sent on a port are not ordered with respect to FORWARD 
events sent
+ * on the same port, since they have no original event order. They also are not
+ * ordered with respect to NEW events enqueued on other ports.
+ * However, NEW events to the same destination queue from the same port are 
guaranteed
+ * to be enqueued in the order they were submitted via 
rte_event_enqueue_burst().
+ *
+ * NOTE:
+ *   In restoring event order of forwarded events, the eventdev API guarantees 
that
+ *   all events from the same flow (i.e. same @ref rte_event.flow_id,
+ *   @ref rte_event.priority and @ref rte_event.queue_id) will be put in the 
original
+ *   order before being forwarded to the destination queue.
+ *   Some eventdevs may implement stricter ordering to achieve this aim,
+ *   for example, restoring the order across *all* flows dequeued from the 
same ORDERED
+ *   queue.
  *
  * @see rte_event_queue_setup(), rte_event_dequeue_burst(), 
RTE_EVENT_OP_RELEASE
  */
@@ -1373,18 +1383,25 @@ struct rte_event_vector {
 #define RTE_SCHED_TYPE_ATOMIC   1
 /**< Atomic scheduling
  *
- * Events from an atomic flow of an event queue can be scheduled only to a
+ * Events from an atomic flow, identified by a combination of @ref 
rte_event.flow_id,
+ * @ref rte_event.queue_id and @ref rte_event.priority, can be scheduled only 
to a
  * single port at a time. The port is guaranteed to have exclusive (atomic)
  * access to the associated flow context, which enables the user to avoid SW
- * synchronization. Atomic flows also help to maintain event ordering
- * since only one port at a time can process events from a flow of an
- * event queue.
- *
- * The atomic queue synchronization context is dedicated to the port until
- * application call rte_event_dequeue_burst() from the same port,
- * which implicitly releases the context. User may allow the scheduler to
- * release the context earlier than that by invoking rte_event_enqueue_burst()
- * with RTE_EVENT_OP_RELEASE operation.
+ * synchronization. Atomic flows also maintain event ordering
+ * since only one port at a time can process events from each flow of an
+ * event queue, and events within a flow are not reordered within the 
scheduler.
+ *
+ 

[PATCH v3 10/11] eventdev: clarify docs on event object fields and op types

2024-02-02 Thread Bruce Richardson
Clarify the meaning of the NEW, FORWARD and RELEASE event types.
For the fields in "rte_event" struct, enhance the comments on each to
clarify the field's use, and whether it is preserved between enqueue and
dequeue, and it's role, if any, in scheduling.

Signed-off-by: Bruce Richardson 
---
V3: updates following review
---
 lib/eventdev/rte_eventdev.h | 161 +---
 1 file changed, 111 insertions(+), 50 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 8d72765ae7..58219e027e 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1463,47 +1463,54 @@ struct rte_event_vector {
 
 /* Event enqueue operations */
 #define RTE_EVENT_OP_NEW0
-/**< The event producers use this operation to inject a new event to the
- * event device.
+/**< The @ref rte_event.op field must be set to this operation type to inject 
a new event,
+ * i.e. one not previously dequeued, into the event device, to be scheduled
+ * for processing.
  */
 #define RTE_EVENT_OP_FORWARD1
-/**< The CPU use this operation to forward the event to different event queue 
or
- * change to new application specific flow or schedule type to enable
- * pipelining.
+/**< The application must set the @ref rte_event.op field to this operation 
type to return a
+ * previously dequeued event to the event device to be scheduled for further 
processing.
  *
- * This operation must only be enqueued to the same port that the
+ * This event *must* be enqueued to the same port that the
  * event to be forwarded was dequeued from.
+ *
+ * The event's fields, including (but not limited to) flow_id, scheduling type,
+ * destination queue, and event payload e.g. mbuf pointer, may all be updated 
as
+ * desired by the application, but the @ref rte_event.impl_opaque field must
+ * be kept to the same value as was present when the event was dequeued.
  */
 #define RTE_EVENT_OP_RELEASE2
 /**< Release the flow context associated with the schedule type.
  *
- * If current flow's scheduler type method is *RTE_SCHED_TYPE_ATOMIC*
- * then this function hints the scheduler that the user has completed critical
- * section processing in the current atomic context.
- * The scheduler is now allowed to schedule events from the same flow from
- * an event queue to another port. However, the context may be still held
- * until the next rte_event_dequeue_burst() call, this call allows but does not
- * force the scheduler to release the context early.
- *
- * Early atomic context release may increase parallelism and thus system
+ * If current flow's scheduler type method is @ref RTE_SCHED_TYPE_ATOMIC
+ * then this operation type hints the scheduler that the user has completed 
critical
+ * section processing for this event in the current atomic context, and that 
the
+ * scheduler may unlock any atomic locks held for this event.
+ * If this is the last event from an atomic flow, i.e. all flow locks are 
released,
+ * the scheduler is now allowed to schedule events from that flow to another port.
+ * However, the atomic locks may still be held until the next rte_event_dequeue_burst()
+ * call; enqueuing an event with op type @ref RTE_EVENT_OP_RELEASE allows,
+ * but does not force, the scheduler to release the atomic locks early.
+ *
+ * Early atomic lock release may increase parallelism and thus system
  * performance, but the user needs to design carefully the split into critical
  * vs non-critical sections.
  *
- * If current flow's scheduler type method is *RTE_SCHED_TYPE_ORDERED*
- * then this function hints the scheduler that the user has done all that need
- * to maintain event order in the current ordered context.
- * The scheduler is allowed to release the ordered context of this port and
- * avoid reordering any following enqueues.
- *
- * Early ordered context release may increase parallelism and thus system
- * performance.
+ * If current flow's scheduler type method is @ref RTE_SCHED_TYPE_ORDERED
+ * then this operation type informs the scheduler that the current event has
+ * completed processing and will not be returned to the scheduler, i.e.
+ * it has been dropped, and so the reordering context for that event
+ * should be considered filled.
  *
- * If current flow's scheduler type method is *RTE_SCHED_TYPE_PARALLEL*
- * or no scheduling context is held then this function may be an NOOP,
- * depending on the implementation.
+ * Events with this operation type must only be enqueued to the same port that 
the
+ * event to be released was dequeued from. The @ref rte_event.impl_opaque
+ * field in the release event must have the same value as that in the original 
dequeued event.
  *
- * This operation must only be enqueued to the same port that the
- * event to be released was dequeued from.
+ * If a dequeued event is re-enqueued with operation type of @ref 
RTE_EVENT_OP_RELEASE,
+ * then any subsequent enqueue of that event - or a copy 

[PATCH v3 11/11] eventdev: drop comment for anon union from doxygen

2024-02-02 Thread Bruce Richardson
Make the comments on the unnamed unions in the rte_event structure
regular comments rather than doxygen ones. The comments do not add
anything meaningful to the doxygen output.

Signed-off-by: Bruce Richardson 
---
 lib/eventdev/rte_eventdev.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
index 58219e027e..e31c927905 100644
--- a/lib/eventdev/rte_eventdev.h
+++ b/lib/eventdev/rte_eventdev.h
@@ -1518,7 +1518,7 @@ struct rte_event_vector {
  * for dequeue and enqueue operation
  */
 struct rte_event {
-   /** WORD0 */
+   /* WORD0 */
union {
uint64_t event;
/** Event attributes for dequeue or enqueue operation */
@@ -1631,7 +1631,7 @@ struct rte_event {
 */
};
};
-   /** WORD1 */
+   /* WORD1 */
union {
uint64_t u64;
/**< Opaque 64-bit value */
-- 
2.40.1



RE: [RFC v3] eal: add bitset type

2024-02-02 Thread Morten Brørup

> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> Sent: Friday, 2 February 2024 11.19
> 
> On 2024-02-01 09:04, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> >> Sent: Wednesday, 31 January 2024 19.46
> >>
> >> On 2024-01-31 17:06, Stephen Hemminger wrote:
> >>> On Wed, 31 Jan 2024 14:13:01 +0100
> >>> Mattias Rönnblom  wrote:
> >
> > [...]
> >
> >>> FYI - the linux kernel has a similar but more complete set of operations.
> >>> It might be more efficient to use unsigned long rather than requiring
> >>> the elements to be uint64_t. Thinking of the few 32 bit platforms.
> >>>
> >>
> >> Keeping it 64-bit avoids a popcount-related #ifdef. DPDK doesn't have
> >> an equivalent to __builtin_popcountl().
> >>
> >> How much do we need to care about 32-bit ISA performance?
> >
> > At the 2023 DPDK Summit I talked to someone at a very well known
> > network equipment vendor using 32 bit CPUs in some of their products;
> > some sort of CPE, IIRC. 32 bit CPUs are still out there, and 32-bit CPU
> > support has not been deprecated in DPDK.
> >
> > For the bitset parameter to functions, you could either use "unsigned
> > long*" (as suggested by Stephen), or "void*" (followed by type casting
> > inside the functions).
> >
> > If only using this library for the command line argument parser and
> > similar, performance is irrelevant. If we foresee using it in the fast
> > path, e.g. with the htimer library, we shouldn't tie its API tightly to
> > 64 bit.
> >
> 
> I'm not even sure performance will be that much worse. Sure, two
> popcount instead of one. What is probably worse is older ISAs (32- or
> 64-bit, e.g. original x86_64) that lack machine instructions for
> counting set bits of *any* word size.

I'm sorry about being unclear. I didn't mean to suggest supporting *any* word 
size; I was thinking about one word size, either 32 or 64 bit, automatically 
selected at build time depending on CPU architecture.

> 
> That said, the only real concern I have about going "unsigned long" ->
> "uint64_t" is that I might feel I need to go fix <rte_bitops.h> first.

I see.
Otherwise you'll end up with a bunch of #if RTE_ARCH_32 rte_bit_32() #else 
rte_bit_64() #endif in the implementation.
Perhaps a string concatenation macro could replace that with something like
rte_bit_##RTE_ARCH_BITS(), or RTE_POSTFIX_ARCH_BITS(rte_bit_, (params)).
Just thinking out loud.
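
A minimal sketch of that build-time selection (all names below are
made up for illustration, not existing DPDK macros; the popcount
builtins are the GCC/clang ones):

#include <stdint.h>

#ifdef RTE_ARCH_32
typedef uint32_t rte_bitset_word_t;
#define RTE_BITSET_WORD_BITS 32
#define rte_bitset_word_popcount(w) __builtin_popcount(w)
#else
typedef uint64_t rte_bitset_word_t;
#define RTE_BITSET_WORD_BITS 64
#define rte_bitset_word_popcount(w) __builtin_popcountll(w)
#endif

/* number of words needed to hold a bitset of n bits */
#define RTE_BITSET_NUM_WORDS(n) \
	(((n) + RTE_BITSET_WORD_BITS - 1) / RTE_BITSET_WORD_BITS)

With a single #ifdef at the type level, the rest of the implementation
can be written once against rte_bitset_word_t instead of sprinkling
RTE_ARCH_32 checks around each call site.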

> 
> >>
> >> I'll go through the below API and some other APIs to see if there's
> >> something obvious missing.
> >>
> >> When I originally wrote this code there were a few potential features
> >> where I wasn't sure to what extent they were useful. One example was the
> >> shift operation. Any input is appreciated.
> >
> > Start off with what you already have. If we need more operations,
> > they can always be added later.
> >
> >>
> >>> Also, what if any thread safety guarantees? or atomic.
> >>>
> >>
> >> Currently, it's all MT unsafe.
> >>
> >> An atomic set and get/test would make sense, and maybe other operations
> >> would as well.
> >>
> >> Bringing in atomicity into the design makes it much less obvious:
> >>
> >> Would the atomic operations imply some memory ordering, or be
> >> "relaxed".
> >> I would lean toward relaxed, but then shouldn't bit-level atomics be
> >> consistent with the core DPDK atomics API? With that in mind, memory
> >> ordering should be user-configurable.
> >>
> >> If the code needs to be pure C11 atomics-wise, the words that make up
> >> the bitset must be _Atomic uint64_t. Then you need to be careful or end
> >> up with "lock"-prefixed instructions if you manipulate the bitset words.
> >> Just a pure words[N] = 0; gives you a mov+mfence on x86, for example,
> >> plus all the fun memory_order_seq_cst in terms of preventing
> >> compiler-level optimizations. So you definitely can't have the bitset
> >> always using _Atomic uint64_t, since that would risk non-shared use cases.
> >> You could have a variant I guess. Just duplicate the whole thing, or
> >> something with macros.
> >
> > It seems like MT unsafe suffices for the near term use cases.
> >
> > We can add an atomic variant of the library later, if the need should
> > arise.
> >
> 
> Agreed. The only concern I have here is that you end up wanting to
> change the original design, to better be able to fit atomic bit
> operations.

In a perfect world, the design should have a roadmap leading towards atomic bit 
operations.
In a fast moving world, we could mark the lib experimental (and mean it!) - it 
is still an improvement over copy-pasting something similar all over the code.

If a potential roadmap towards atomic operations is not obvious after thinking 
a few moments about it, we have a clear conscience to simply deem the library 
unsafe for multithreading and proceed with it "as is".
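
If it helps the discussion, a relaxed atomic test/set over plain
(non-_Atomic) words could look roughly like the sketch below, using
GCC-style __atomic builtins; the function names are made up, and the
memory ordering is fixed at relaxed rather than user-configurable:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static inline void
bitset_set_atomic(uint64_t *words, size_t bit)
{
	/* relaxed RMW: atomicity only, no ordering guarantees */
	__atomic_fetch_or(&words[bit / 64], UINT64_C(1) << (bit % 64),
	    __ATOMIC_RELAXED);
}

static inline bool
bitset_test_atomic(const uint64_t *words, size_t bit)
{
	uint64_t word = __atomic_load_n(&words[bit / 64], __ATOMIC_RELAXED);

	return (word >> (bit % 64)) & 1;
}

Keeping the words as plain uint64_t avoids the implicit seq_cst
accesses (e.g. the mov+mfence on x86 mentioned above) in the
non-atomic variant of the API.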



[PATCH v3 1/2] crypto/ipsec_mb: bump minimum IPsec Multi-buffer version

2024-02-02 Thread Sivaramakrishnan Venkat
The SW PMDs bump the minimum supported IPsec Multi-buffer version to 1.4.
A minimum IPsec Multi-buffer version of 1.4 or greater is now required.

Signed-off-by: Sivaramakrishnan Venkat 
Acked-by: Ciara Power 
---
  v2:
 - Removed unused macro in ipsec_mb_ops.c
 - set_gcm_job() modified correctly to keep multi_sgl_job line
 - Updated SW PMDs documentation for minimum IPSec Multi-buffer version
 - Updated commit message, and patch title.
---
 doc/guides/cryptodevs/aesni_gcm.rst |   3 +-
 doc/guides/cryptodevs/aesni_mb.rst  |   3 +-
 doc/guides/cryptodevs/chacha20_poly1305.rst |   3 +-
 doc/guides/cryptodevs/kasumi.rst|   3 +-
 doc/guides/cryptodevs/snow3g.rst|   3 +-
 doc/guides/cryptodevs/zuc.rst   |   3 +-
 drivers/crypto/ipsec_mb/ipsec_mb_ops.c  |  23 ---
 drivers/crypto/ipsec_mb/meson.build |   2 +-
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c  | 165 
 drivers/crypto/ipsec_mb/pmd_aesni_mb_priv.h |   9 --
 10 files changed, 13 insertions(+), 204 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_gcm.rst 
b/doc/guides/cryptodevs/aesni_gcm.rst
index f5773426ee..dc665e536c 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -85,7 +85,8 @@ and the external crypto libraries supported by them:
18.05 - 19.02  Multi-buffer library 0.49 - 0.52
19.05 - 20.08  Multi-buffer library 0.52 - 0.55
20.11 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11+ Multi-buffer library 1.0  - 1.5*
+   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*
+   24.03+ Multi-buffer library 1.4  - 1.5*
=  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/aesni_mb.rst 
b/doc/guides/cryptodevs/aesni_mb.rst
index b2e74ba417..5d670ee237 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -146,7 +146,8 @@ and the Multi-Buffer library version supported by them:
19.05 - 19.08   0.52
19.11 - 20.08   0.52 - 0.55
20.11 - 21.08   0.53 - 1.3*
-   21.11+  1.0  - 1.5*
+   21.11 - 23.11   1.0  - 1.5*
+   24.03+  1.4  - 1.5*
==  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/chacha20_poly1305.rst 
b/doc/guides/cryptodevs/chacha20_poly1305.rst
index 9d4bf86cf1..c32866b301 100644
--- a/doc/guides/cryptodevs/chacha20_poly1305.rst
+++ b/doc/guides/cryptodevs/chacha20_poly1305.rst
@@ -72,7 +72,8 @@ and the external crypto libraries supported by them:
=  
DPDK version   Crypto library version
=  
-   21.11+ Multi-buffer library 1.0-1.5*
+   21.11 - 23.11  Multi-buffer library 1.0-1.5*
+   24.03+ Multi-buffer library 1.4-1.5*
=  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/kasumi.rst b/doc/guides/cryptodevs/kasumi.rst
index 0989054875..a8f4e6b204 100644
--- a/doc/guides/cryptodevs/kasumi.rst
+++ b/doc/guides/cryptodevs/kasumi.rst
@@ -87,7 +87,8 @@ and the external crypto libraries supported by them:
=  
16.11 - 19.11  LibSSO KASUMI
20.02 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11+ Multi-buffer library 1.0  - 1.5*
+   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*
+   24.03+ Multi-buffer library 1.4  - 1.5*
=  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/snow3g.rst b/doc/guides/cryptodevs/snow3g.rst
index 3392932653..46863462e5 100644
--- a/doc/guides/cryptodevs/snow3g.rst
+++ b/doc/guides/cryptodevs/snow3g.rst
@@ -96,7 +96,8 @@ and the external crypto libraries supported by them:
=  
16.04 - 19.11  LibSSO SNOW3G
20.02 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11+ Multi-buffer library 1.0  - 1.5*
+   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*
+   24.03+ Multi-buffer library 1.4  - 1.5*
=  
 
 \* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
diff --git a/doc/guides/cryptodevs/zuc.rst b/doc/guides/cryptodevs/zuc.rst
index a414b5ad2c..51867e1a16 100644
--- a/doc/guides/cryptodevs/zuc.rst
+++ b/doc/guides/cryptodevs/zuc.rst
@@ -95,7 +95,8 @@ and the external crypto libraries supported by them:
=  
16.11 - 19.11  LibSSO ZUC
20.02 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11+ Multi-buffer library 1.0  - 1.5*
+   21.11 - 23.11  Multi-

[PATCH v3 2/2] doc: remove outdated version details

2024-02-02 Thread Sivaramakrishnan Venkat
SW PMDs documentation is updated to remove details of unsupported IPsec
Multi-buffer versions. DPDK releases older than 20.11 are end of life, so
those DPDK versions are removed from the Crypto library version table.

Signed-off-by: Sivaramakrishnan Venkat 
---
 v3:
- added second patch for outdated documentation updates.
---
 doc/guides/cryptodevs/aesni_gcm.rst | 19 +++---
 doc/guides/cryptodevs/aesni_mb.rst  | 22 +++--
 doc/guides/cryptodevs/chacha20_poly1305.rst | 12 ++-
 doc/guides/cryptodevs/kasumi.rst| 14 +++--
 doc/guides/cryptodevs/snow3g.rst| 15 +++---
 doc/guides/cryptodevs/zuc.rst   | 15 +++---
 6 files changed, 17 insertions(+), 80 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_gcm.rst 
b/doc/guides/cryptodevs/aesni_gcm.rst
index dc665e536c..e38a03b78f 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -62,12 +62,6 @@ Once it is downloaded, extract it and follow these steps:
 make
 make install
 
-.. note::
-
-   Compilation of the Multi-Buffer library is broken when GCC < 5.0, if 
library <= v0.53.
-   If a lower GCC version than 5.0, the workaround proposed by the following 
link
-   should be used: ``_.
-
 
 As a reference, the following table shows a mapping between the past DPDK 
versions
 and the external crypto libraries supported by them:
@@ -79,18 +73,11 @@ and the external crypto libraries supported by them:
=  
DPDK version   Crypto library version
=  
-   16.04 - 16.11  Multi-buffer library 0.43 - 0.44
-   17.02 - 17.05  ISA-L Crypto v2.18
-   17.08 - 18.02  Multi-buffer library 0.46 - 0.48
-   18.05 - 19.02  Multi-buffer library 0.49 - 0.52
-   19.05 - 20.08  Multi-buffer library 0.52 - 0.55
-   20.11 - 21.08  Multi-buffer library 0.53 - 1.3*
-   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*
-   24.03+ Multi-buffer library 1.4  - 1.5*
+   20.11 - 21.08  Multi-buffer library 0.53 - 1.3
+   21.11 - 23.11  Multi-buffer library 1.0  - 1.5
+   24.03+ Multi-buffer library 1.4  - 1.5
=  
 
-\* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
-
 Initialization
 --
 
diff --git a/doc/guides/cryptodevs/aesni_mb.rst 
b/doc/guides/cryptodevs/aesni_mb.rst
index 5d670ee237..bd7c8de07f 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -121,12 +121,6 @@ Once it is downloaded, extract it and follow these steps:
 make
 make install
 
-.. note::
-
-   Compilation of the Multi-Buffer library is broken when GCC < 5.0, if 
library <= v0.53.
-   If a lower GCC version than 5.0, the workaround proposed by the following 
link
-   should be used: ``_.
-
 As a reference, the following table shows a mapping between the past DPDK 
versions
 and the Multi-Buffer library version supported by them:
 
@@ -137,21 +131,11 @@ and the Multi-Buffer library version supported by them:
==  
DPDK versionMulti-buffer library version
==  
-   2.2 - 16.11 0.43 - 0.44
-   17.02   0.44
-   17.05 - 17.08   0.45 - 0.48
-   17.11   0.47 - 0.48
-   18.02   0.48
-   18.05 - 19.02   0.49 - 0.52
-   19.05 - 19.08   0.52
-   19.11 - 20.08   0.52 - 0.55
-   20.11 - 21.08   0.53 - 1.3*
-   21.11 - 23.11   1.0  - 1.5*
-   24.03+  1.4  - 1.5*
+   20.11 - 21.08   0.53 - 1.3
+   21.11 - 23.11   1.0  - 1.5
+   24.03+  1.4  - 1.5
==  
 
-\* Multi-buffer library 1.0 or newer only works for Meson but not Make build 
system.
-
 Initialization
 --
 
diff --git a/doc/guides/cryptodevs/chacha20_poly1305.rst 
b/doc/guides/cryptodevs/chacha20_poly1305.rst
index c32866b301..8e0ee4f835 100644
--- a/doc/guides/cryptodevs/chacha20_poly1305.rst
+++ b/doc/guides/cryptodevs/chacha20_poly1305.rst
@@ -56,12 +56,6 @@ Once it is downloaded, extract it and follow these steps:
 make
 make install
 
-.. note::
-
-   Compilation of the Multi-Buffer library is broken when GCC < 5.0, if 
library <= v0.53.
-   If a lower GCC version than 5.0, the workaround proposed by the following 
link
-   should be used: ``_.
-
 As a reference, the following table shows a mapping between the past DPDK 
versions
 and the external crypto libraries supported by them:
 
@@ -72,12 +66,10 @@ and the external crypto libraries supported by them:
=  
DPDK version   Crypto library version
=  
-   21.11 - 23.11  Multi-buffer library 1.0  - 1.5*

Re: [PATCH] event/cnxk: remove unused files

2024-02-02 Thread Jerin Jacob
On Fri, Feb 2, 2024 at 4:09 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> Remove unused template files.
>
> Signed-off-by: Pavan Nikhilesh 

Applied to dpdk-next-eventdev/for-main. Thanks


Re: [PATCH] bus/vdev: fix devargs memory leak

2024-02-02 Thread Burakov, Anatoly

On 9/1/2023 9:24 AM, Mingjin Ye wrote:

When a device is created by a secondary process, an empty devargs is
temporarily generated and bound to it. This causes the device to not
be associated with the correct devargs, and the empty devargs are not
released when the resource is freed.

This patch fixes the issue by matching the devargs when inserting a
device in a secondary process.

Fixes: dda987315ca2 ("vdev: make virtual bus use its device struct")
Fixes: a16040453968 ("eal: extract vdev infra")
Cc: sta...@dpdk.org

Signed-off-by: Mingjin Ye 
---

Acked-by: Anatoly Burakov 

--
Thanks,
Anatoly



[PATCH v5 1/2] net/octeon_ep: improve Rx performance

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Use the mempool API instead of pktmbuf alloc to avoid the mbuf reset,
as that is done by the rearm on receive.
Reorder the refill to avoid unnecessary write commits on mbuf data.

Signed-off-by: Pavan Nikhilesh 
---
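 Illustrative note (not part of the patch): a minimal sketch of why the
 raw mempool get is safe here. rte_pktmbuf_alloc_bulk() resets every
 mbuf it hands out, but this Rx path rewrites the same header fields
 from the queue's precomputed 64-bit rearm word on receive, so the
 reset is redundant work. The helper name, "n" and "rearm_word" are
 assumptions; "rearm_data" mirrors the driver code below.

    #include <rte_mbuf.h>
    #include <rte_mempool.h>

    static int
    refill_sketch(struct rte_mempool *mp, struct rte_mbuf **bufs,
                  unsigned int n, uint64_t rearm_word)
    {
            unsigned int i;

            /* Raw get: no per-mbuf reset, unlike rte_pktmbuf_alloc_bulk(). */
            if (rte_mempool_get_bulk(mp, (void **)bufs, n) != 0)
                    return -1; /* caller counts the allocation failure */

            for (i = 0; i < n; i++)
                    /* On receive, one 64-bit store restores data_off,
                     * refcnt, nb_segs and port -- the fields that the
                     * skipped reset would have initialized. */
                    *(uint64_t *)&bufs[i]->rearm_data = rearm_word;

            return 0;
    }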
 v2 Changes:
 - Fix compilation with distro gcc.
 v3 Changes:
 - Fix aarch32 compilation.
 v4 Changes:
 - Fix checkpatch.
 v5 Changes:
 - Update release notes.

 doc/guides/rel_notes/release_24_03.rst |  2 ++
 drivers/net/octeon_ep/cnxk_ep_rx.c |  4 +--
 drivers/net/octeon_ep/cnxk_ep_rx.h | 13 ++---
 drivers/net/octeon_ep/cnxk_ep_rx_avx.c | 20 +++---
 drivers/net/octeon_ep/cnxk_ep_rx_sse.c | 38 ++
 drivers/net/octeon_ep/otx_ep_rxtx.h|  2 +-
 6 files changed, 44 insertions(+), 35 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 282a3f9c8c..c8fcaaad6d 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -59,6 +59,8 @@ New Features

   * Optimize mbuf rearm sequence.
   * Updated Tx queue mbuf free thresholds from 128 to 256 for better 
performance.
+  * Updated Rx queue mbuf refill routine to use mempool alloc and reorder it
+to avoid mbuf write commits.
   * Added optimized SSE Rx routines.
   * Added optimized AVX2 Rx routines.

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.c 
b/drivers/net/octeon_ep/cnxk_ep_rx.c
index f3e4fb27d1..7465e0a017 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.c
@@ -76,12 +76,12 @@ cnxk_ep_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts, uint16_t nb_pkts)
uint16_t new_pkts;

new_pkts = cnxk_ep_rx_pkts_to_process(droq, nb_pkts);
-   cnxk_ep_process_pkts_scalar(rx_pkts, droq, new_pkts);
-
/* Refill RX buffers */
if (droq->refill_count >= DROQ_REFILL_THRESHOLD)
cnxk_ep_rx_refill(droq);

+   cnxk_ep_process_pkts_scalar(rx_pkts, droq, new_pkts);
+
return new_pkts;
 }

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.h 
b/drivers/net/octeon_ep/cnxk_ep_rx.h
index e71fc0de5c..61263e651e 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.h
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.h
@@ -21,13 +21,16 @@ cnxk_ep_rx_refill_mbuf(struct otx_ep_droq *droq, uint32_t 
count)
uint32_t i;
int rc;

-   rc = rte_pktmbuf_alloc_bulk(droq->mpool, &recv_buf_list[refill_idx], 
count);
+   rc = rte_mempool_get_bulk(droq->mpool, (void 
**)&recv_buf_list[refill_idx], count);
if (unlikely(rc)) {
droq->stats.rx_alloc_failure++;
return rc;
}

for (i = 0; i < count; i++) {
+   rte_prefetch_non_temporal(&desc_ring[(refill_idx + 1) & 3]);
+   if (i < count - 1)
+   rte_prefetch_non_temporal(recv_buf_list[refill_idx + 
1]);
buf = recv_buf_list[refill_idx];
desc_ring[refill_idx].buffer_ptr = 
rte_mbuf_data_iova_default(buf);
refill_idx++;
@@ -42,9 +45,9 @@ cnxk_ep_rx_refill_mbuf(struct otx_ep_droq *droq, uint32_t 
count)
 static inline void
 cnxk_ep_rx_refill(struct otx_ep_droq *droq)
 {
-   uint32_t desc_refilled = 0, count;
-   uint32_t nb_desc = droq->nb_desc;
+   const uint32_t nb_desc = droq->nb_desc;
uint32_t refill_idx = droq->refill_idx;
+   uint32_t desc_refilled = 0, count;
int rc;

if (unlikely(droq->read_idx == refill_idx))
@@ -128,6 +131,8 @@ cnxk_ep_rx_pkts_to_process(struct otx_ep_droq *droq, 
uint16_t nb_pkts)
return RTE_MIN(nb_pkts, droq->pkts_pending);
 }

+#define cnxk_pktmbuf_mtod(m, t) ((t)(void *)((char *)(m)->buf_addr + 
RTE_PKTMBUF_HEADROOM))
+
 static __rte_always_inline void
 cnxk_ep_process_pkts_scalar(struct rte_mbuf **rx_pkts, struct otx_ep_droq 
*droq, uint16_t new_pkts)
 {
@@ -147,7 +152,7 @@ cnxk_ep_process_pkts_scalar(struct rte_mbuf **rx_pkts, 
struct otx_ep_droq *droq,
  void *));

mbuf = recv_buf_list[read_idx];
-   info = rte_pktmbuf_mtod(mbuf, struct otx_ep_droq_info *);
+   info = cnxk_pktmbuf_mtod(mbuf, struct otx_ep_droq_info *);
read_idx = otx_ep_incr_index(read_idx, 1, nb_desc);
pkt_len = rte_bswap16(info->length >> 48);
mbuf->pkt_len = pkt_len;
diff --git a/drivers/net/octeon_ep/cnxk_ep_rx_avx.c 
b/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
index ae4615e6da..47eb1d2ef7 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
@@ -49,7 +49,7 @@ cnxk_ep_process_pkts_vec_avx(struct rte_mbuf **rx_pkts, 
struct otx_ep_droq *droq
/* Load rearm data and packet length for shuffle. */
for (i = 0; i < CNXK_EP_OQ_DESC_PER_LOOP_AVX; i++)
data[i] = _mm256_set_epi64x(0,
-   rte_pktmbuf_mtod(m[i], struct otx_ep_droq_info 
*)->length >> 16,
+   cnxk_pktmbuf_mtod(m[i], struct otx_ep_droq_info *)->length >> 16,

[PATCH v5 2/2] net/octeon_ep: add Rx NEON routine

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Add Rx ARM NEON SIMD routine.

Signed-off-by: Pavan Nikhilesh 
---
 doc/guides/rel_notes/release_24_03.rst  |   1 +
 drivers/net/octeon_ep/cnxk_ep_rx_neon.c | 148 
 drivers/net/octeon_ep/meson.build   |   6 +-
 drivers/net/octeon_ep/otx_ep_ethdev.c   |   5 +-
 drivers/net/octeon_ep/otx_ep_rxtx.h |   6 +
 5 files changed, 164 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/octeon_ep/cnxk_ep_rx_neon.c

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index c8fcaaad6d..7a83b545cc 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -63,6 +63,7 @@ New Features
 to avoid mbuf write commits.
   * Added optimized SSE Rx routines.
   * Added optimized AVX2 Rx routines.
+  * Added optimized NEON Rx routines.
 
 * **Updated Marvell cnxk net driver.**
 
diff --git a/drivers/net/octeon_ep/cnxk_ep_rx_neon.c 
b/drivers/net/octeon_ep/cnxk_ep_rx_neon.c
new file mode 100644
index 00..8abd8711e1
--- /dev/null
+++ b/drivers/net/octeon_ep/cnxk_ep_rx_neon.c
@@ -0,0 +1,148 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Marvell.
+ */
+
+#include "cnxk_ep_rx.h"
+
+static __rte_always_inline void
+cnxk_ep_process_pkts_vec_neon(struct rte_mbuf **rx_pkts, struct otx_ep_droq 
*droq,
+ uint16_t new_pkts)
+{
+   const uint8x16_t mask0 = {0, 1, 0xff, 0xff, 0, 1, 0xff, 0xff,
+ 4, 5, 0xff, 0xff, 4, 5, 0xff, 0xff};
+   const uint8x16_t mask1 = {8,  9,  0xff, 0xff, 8,  9,  0xff, 0xff,
+ 12, 13, 0xff, 0xff, 12, 13, 0xff, 0xff};
+   struct rte_mbuf **recv_buf_list = droq->recv_buf_list;
+   uint32_t pidx0, pidx1, pidx2, pidx3;
+   struct rte_mbuf *m0, *m1, *m2, *m3;
+   uint32_t read_idx = droq->read_idx;
+   uint16_t nb_desc = droq->nb_desc;
+   uint32_t idx0, idx1, idx2, idx3;
+   uint64x2_t s01, s23;
+   uint32x4_t bytes;
+   uint16_t pkts = 0;
+
+   idx0 = read_idx;
+   s01 = vdupq_n_u64(0);
+   bytes = vdupq_n_u32(0);
+   while (pkts < new_pkts) {
+
+   idx1 = otx_ep_incr_index(idx0, 1, nb_desc);
+   idx2 = otx_ep_incr_index(idx1, 1, nb_desc);
+   idx3 = otx_ep_incr_index(idx2, 1, nb_desc);
+
+   if (new_pkts - pkts > 4) {
+   pidx0 = otx_ep_incr_index(idx3, 1, nb_desc);
+   pidx1 = otx_ep_incr_index(pidx0, 1, nb_desc);
+   pidx2 = otx_ep_incr_index(pidx1, 1, nb_desc);
+   pidx3 = otx_ep_incr_index(pidx2, 1, nb_desc);
+
+   
rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx0], void *));
+   
rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx1], void *));
+   
rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx2], void *));
+   
rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx3], void *));
+   }
+
+   m0 = recv_buf_list[idx0];
+   m1 = recv_buf_list[idx1];
+   m2 = recv_buf_list[idx2];
+   m3 = recv_buf_list[idx3];
+
+   /* Load packet size big-endian. */
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m0, struct 
otx_ep_droq_info *)->length >> 48,
+s01, 0);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m1, struct 
otx_ep_droq_info *)->length >> 48,
+s01, 1);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m2, struct 
otx_ep_droq_info *)->length >> 48,
+s01, 2);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m3, struct 
otx_ep_droq_info *)->length >> 48,
+s01, 3);
+   /* Convert to little-endian. */
+   s01 = vrev16q_u8(s01);
+
+   /* Vertical add, consolidate outside the loop. */
+   bytes += vaddq_u32(bytes, s01);
+   /* Segregate to packet length and data length. */
+   s23 = vqtbl1q_u8(s01, mask1);
+   s01 = vqtbl1q_u8(s01, mask0);
+
+   /* Store packet length and data length to mbuf. */
+   *(uint64_t *)&m0->pkt_len = vgetq_lane_u64(s01, 0);
+   *(uint64_t *)&m1->pkt_len = vgetq_lane_u64(s01, 1);
+   *(uint64_t *)&m2->pkt_len = vgetq_lane_u64(s23, 0);
+   *(uint64_t *)&m3->pkt_len = vgetq_lane_u64(s23, 1);
+
+   /* Reset rearm data. */
+   *(uint64_t *)&m0->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m1->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m2->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m3->rearm_data = droq->rearm_data;
+
+   rx_pkts[pkts++] = m0;
+   rx_pkts[pkts++] = m1;

[PATCH 1/2] net/mlx5/hws: definer, update pattern validations

2024-02-02 Thread Gregory Etelson
The patch updates the HWS code for the upcoming extended PMD pattern
template verification:
Support the VOID flow item type.
Return the E2BIG error code when the pattern is too large for the definer.

Signed-off-by: Gregory Etelson 
---
 drivers/net/mlx5/hws/mlx5dr_definer.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c 
b/drivers/net/mlx5/hws/mlx5dr_definer.c
index 0b60479406..05b53e622a 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.c
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
@@ -2537,6 +2537,8 @@ mlx5dr_definer_conv_items_to_hl(struct mlx5dr_context 
*ctx,
ret = mlx5dr_definer_conv_item_ptype(&cd, items, i);
item_flags |= MLX5_FLOW_ITEM_PTYPE;
break;
+   case RTE_FLOW_ITEM_TYPE_VOID:
+   break;
default:
DR_LOG(ERR, "Unsupported item type %d", items->type);
rte_errno = ENOTSUP;
@@ -2843,7 +2845,7 @@ mlx5dr_definer_find_best_match_fit(struct mlx5dr_context 
*ctx,
}
 
DR_LOG(ERR, "Unable to find supporting match/jumbo definer 
combination");
-   rte_errno = ENOTSUP;
+   rte_errno = E2BIG;
return rte_errno;
 }
 
-- 
2.39.2



[PATCH 0/2] net/mlx5: update pattern validations

2024-02-02 Thread Gregory Etelson
Gregory Etelson (2):
  net/mlx5/hws: definer, update pattern validations
  net/mlx5: improve pattern template validation

 drivers/net/mlx5/hws/mlx5dr_definer.c |   4 +-
 drivers/net/mlx5/mlx5.h   |   1 +
 drivers/net/mlx5/mlx5_flow_hw.c   | 121 --
 3 files changed, 120 insertions(+), 6 deletions(-)

-- 
2.39.2



[PATCH 2/2] net/mlx5: improve pattern template validation

2024-02-02 Thread Gregory Etelson
The current PMD implementation accepts pattern templates that will
always be rejected later, during table template creation.

The patch adds basic HWS verifications to pattern template validation
to ensure that the pattern can be used in a table template.

The PMD updates `rte_errno` if pattern template validation fails:

E2BIG - pattern too big for PMD
ENOTSUP - pattern not supported by PMD
ENOMEM - PMD allocation failure

Signed-off-by: Gregory Etelson 
---
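Illustrative note (not part of the patch): how the early rejection looks
from an application, under the assumption that the app checks rte_errno
after template creation. The helper below is hypothetical; the API call
is the standard rte_flow one.

    #include <rte_flow.h>

    static struct rte_flow_pattern_template *
    create_template_sketch(uint16_t port_id,
                           const struct rte_flow_pattern_template_attr *attr,
                           const struct rte_flow_item items[])
    {
            struct rte_flow_error err;
            struct rte_flow_pattern_template *pt;

            pt = rte_flow_pattern_template_create(port_id, attr, items, &err);
            if (pt == NULL) {
                    /* rte_errno now reports the failure up front:
                     * E2BIG   - pattern too big for PMD
                     * ENOTSUP - pattern not supported by PMD
                     * ENOMEM  - PMD allocation failure */
            }
            return pt;
    }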
 drivers/net/mlx5/mlx5.h |   1 +
 drivers/net/mlx5/mlx5_flow_hw.c | 121 ++--
 2 files changed, 117 insertions(+), 5 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index f2e2e04429..e98db91888 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1965,6 +1965,7 @@ struct mlx5_priv {
struct mlx5_aso_mtr_pool *hws_mpool; /* HW steering's Meter pool. */
struct mlx5_flow_hw_ctrl_rx *hw_ctrl_rx;
/**< HW steering templates used to create control flow rules. */
+   struct rte_flow_actions_template 
*action_template_drop[MLX5DR_TABLE_TYPE_MAX];
 #endif
struct rte_eth_dev *shared_host; /* Host device for HW steering. */
uint16_t shared_refcnt; /* HW steering host reference counter. */
diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index da873ae2e2..443aa5fcf0 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -6840,6 +6840,46 @@ flow_hw_pattern_has_sq_match(const struct rte_flow_item 
*items)
return false;
 }
 
+static int
+pattern_template_validate(struct rte_eth_dev *dev,
+ struct rte_flow_pattern_template *pt[], uint32_t 
pt_num)
+{
+   uint32_t group = 0;
+   struct rte_flow_error error;
+   struct rte_flow_template_table_attr tbl_attr = {
+   .nb_flows = 64,
+   .insertion_type = RTE_FLOW_TABLE_INSERTION_TYPE_PATTERN,
+   .hash_func = RTE_FLOW_TABLE_HASH_FUNC_DEFAULT,
+   .flow_attr = {
+   .ingress = pt[0]->attr.ingress,
+   .egress = pt[0]->attr.egress,
+   .transfer = pt[0]->attr.transfer
+   }
+   };
+   struct mlx5_priv *priv = dev->data->dev_private;
+   struct rte_flow_actions_template *action_template;
+
+   if (pt[0]->attr.ingress)
+   action_template = 
priv->action_template_drop[MLX5DR_TABLE_TYPE_NIC_RX];
+   else if (pt[0]->attr.egress)
+   action_template = 
priv->action_template_drop[MLX5DR_TABLE_TYPE_NIC_TX];
+   else if (pt[0]->attr.transfer)
+   action_template = 
priv->action_template_drop[MLX5DR_TABLE_TYPE_FDB];
+   else
+   return EINVAL;
+   do {
+   struct rte_flow_template_table *tmpl_tbl;
+
+   tbl_attr.flow_attr.group = group;
+   tmpl_tbl = flow_hw_template_table_create(dev, &tbl_attr, pt, 
pt_num,
+&action_template, 1, 
NULL);
+   if (!tmpl_tbl)
+   return rte_errno;
+   flow_hw_table_destroy(dev, tmpl_tbl, &error);
+   } while (++group <= 1);
+   return 0;
+}
+
 /**
  * Create flow item template.
  *
@@ -6975,8 +7015,19 @@ flow_hw_pattern_template_create(struct rte_eth_dev *dev,
}
}
__atomic_fetch_add(&it->refcnt, 1, __ATOMIC_RELAXED);
+   rte_errno = pattern_template_validate(dev, &it, 1);
+   if (rte_errno)
+   goto error;
LIST_INSERT_HEAD(&priv->flow_hw_itt, it, next);
return it;
+error:
+   flow_hw_flex_item_release(dev, &it->flex_item);
+   claim_zero(mlx5dr_match_template_destroy(it->mt));
+   mlx5_free(it);
+   rte_flow_error_set(error, rte_errno, RTE_FLOW_ERROR_TYPE_UNSPECIFIED, 
NULL,
+  "Failed to create pattern template");
+   return NULL;
+
 }
 
 /**
@@ -9184,6 +9235,66 @@ flow_hw_compare_config(const struct mlx5_flow_hw_attr 
*hw_attr,
return true;
 }
 
+/*
+ * No need to explicitly release drop action templates on port stop.
+ * Drop action templates release with other action templates during
+ * mlx5_dev_close -> flow_hw_resource_release -> 
flow_hw_actions_template_destroy
+ */
+static void
+action_template_drop_release(struct rte_eth_dev *dev)
+{
+   int i;
+   struct mlx5_priv *priv = dev->data->dev_private;
+
+   for (i = 0; i < MLX5DR_TABLE_TYPE_MAX; i++) {
+   if (!priv->action_template_drop[i])
+   continue;
+   flow_hw_actions_template_destroy(dev,
+priv->action_template_drop[i],
+NULL);
+   }
+}
+
+static int
+action_template_drop_init(struct rte_eth_dev *dev,
+ struct rte_flow_error *error)
+{
+   const struct rte_flow_action drop[2] = {
+   [0] 

[PATCH v6 1/2] net/octeon_ep: improve Rx performance

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Use the mempool API instead of pktmbuf alloc to avoid the mbuf reset,
as that is done by the rearm on receive.
Reorder the refill to avoid unnecessary write commits on mbuf data.

Signed-off-by: Pavan Nikhilesh 
---
 v2 Changes:
 - Fix compilation with distro gcc.
 v3 Changes:
 - Fix aarch32 compilation.
 v4 Changes:
 - Fix checkpatch.
 v5 Changes:
 - Update release notes.
 v6 Changes:
 - Fix checkpatch again.

 doc/guides/rel_notes/release_24_03.rst |  2 ++
 drivers/net/octeon_ep/cnxk_ep_rx.c |  4 +--
 drivers/net/octeon_ep/cnxk_ep_rx.h | 13 ++---
 drivers/net/octeon_ep/cnxk_ep_rx_avx.c | 20 +++---
 drivers/net/octeon_ep/cnxk_ep_rx_sse.c | 38 ++
 drivers/net/octeon_ep/otx_ep_rxtx.h|  2 +-
 6 files changed, 44 insertions(+), 35 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index 282a3f9c8c..c8fcaaad6d 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -59,6 +59,8 @@ New Features

   * Optimize mbuf rearm sequence.
   * Updated Tx queue mbuf free thresholds from 128 to 256 for better 
performance.
+  * Updated Rx queue mbuf refill routine to use mempool alloc and reorder it
+to avoid mbuf write commits.
   * Added optimized SSE Rx routines.
   * Added optimized AVX2 Rx routines.

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.c 
b/drivers/net/octeon_ep/cnxk_ep_rx.c
index f3e4fb27d1..7465e0a017 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.c
@@ -76,12 +76,12 @@ cnxk_ep_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts, uint16_t nb_pkts)
uint16_t new_pkts;

new_pkts = cnxk_ep_rx_pkts_to_process(droq, nb_pkts);
-   cnxk_ep_process_pkts_scalar(rx_pkts, droq, new_pkts);
-
/* Refill RX buffers */
if (droq->refill_count >= DROQ_REFILL_THRESHOLD)
cnxk_ep_rx_refill(droq);

+   cnxk_ep_process_pkts_scalar(rx_pkts, droq, new_pkts);
+
return new_pkts;
 }

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.h 
b/drivers/net/octeon_ep/cnxk_ep_rx.h
index e71fc0de5c..61263e651e 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.h
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.h
@@ -21,13 +21,16 @@ cnxk_ep_rx_refill_mbuf(struct otx_ep_droq *droq, uint32_t 
count)
uint32_t i;
int rc;

-   rc = rte_pktmbuf_alloc_bulk(droq->mpool, &recv_buf_list[refill_idx], 
count);
+   rc = rte_mempool_get_bulk(droq->mpool, (void 
**)&recv_buf_list[refill_idx], count);
if (unlikely(rc)) {
droq->stats.rx_alloc_failure++;
return rc;
}

for (i = 0; i < count; i++) {
+   rte_prefetch_non_temporal(&desc_ring[(refill_idx + 1) & 3]);
+   if (i < count - 1)
+   rte_prefetch_non_temporal(recv_buf_list[refill_idx + 
1]);
buf = recv_buf_list[refill_idx];
desc_ring[refill_idx].buffer_ptr = 
rte_mbuf_data_iova_default(buf);
refill_idx++;
@@ -42,9 +45,9 @@ cnxk_ep_rx_refill_mbuf(struct otx_ep_droq *droq, uint32_t 
count)
 static inline void
 cnxk_ep_rx_refill(struct otx_ep_droq *droq)
 {
-   uint32_t desc_refilled = 0, count;
-   uint32_t nb_desc = droq->nb_desc;
+   const uint32_t nb_desc = droq->nb_desc;
uint32_t refill_idx = droq->refill_idx;
+   uint32_t desc_refilled = 0, count;
int rc;

if (unlikely(droq->read_idx == refill_idx))
@@ -128,6 +131,8 @@ cnxk_ep_rx_pkts_to_process(struct otx_ep_droq *droq, 
uint16_t nb_pkts)
return RTE_MIN(nb_pkts, droq->pkts_pending);
 }

+#define cnxk_pktmbuf_mtod(m, t) ((t)(void *)((char *)(m)->buf_addr + 
RTE_PKTMBUF_HEADROOM))
+
 static __rte_always_inline void
 cnxk_ep_process_pkts_scalar(struct rte_mbuf **rx_pkts, struct otx_ep_droq 
*droq, uint16_t new_pkts)
 {
@@ -147,7 +152,7 @@ cnxk_ep_process_pkts_scalar(struct rte_mbuf **rx_pkts, 
struct otx_ep_droq *droq,
  void *));

mbuf = recv_buf_list[read_idx];
-   info = rte_pktmbuf_mtod(mbuf, struct otx_ep_droq_info *);
+   info = cnxk_pktmbuf_mtod(mbuf, struct otx_ep_droq_info *);
read_idx = otx_ep_incr_index(read_idx, 1, nb_desc);
pkt_len = rte_bswap16(info->length >> 48);
mbuf->pkt_len = pkt_len;
diff --git a/drivers/net/octeon_ep/cnxk_ep_rx_avx.c 
b/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
index ae4615e6da..47eb1d2ef7 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
@@ -49,7 +49,7 @@ cnxk_ep_process_pkts_vec_avx(struct rte_mbuf **rx_pkts, 
struct otx_ep_droq *droq
/* Load rearm data and packet length for shuffle. */
for (i = 0; i < CNXK_EP_OQ_DESC_PER_LOOP_AVX; i++)
data[i] = _mm256_set_epi64x(0,
-   rte_pktmbuf_mtod(m[i], struct otx_ep_droq_info 
*)->length >> 16,
+   cnxk_pktmbuf_mtod(m[i], struct otx_ep_droq_info *)->length >> 16,

[PATCH v6 2/2] net/octeon_ep: add Rx NEON routine

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Add Rx ARM NEON SIMD routine.

Signed-off-by: Pavan Nikhilesh 
---
 doc/guides/rel_notes/release_24_03.rst  |   1 +
 drivers/net/octeon_ep/cnxk_ep_rx_neon.c | 147 
 drivers/net/octeon_ep/meson.build   |   6 +-
 drivers/net/octeon_ep/otx_ep_ethdev.c   |   5 +-
 drivers/net/octeon_ep/otx_ep_rxtx.h |   6 +
 5 files changed, 163 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/octeon_ep/cnxk_ep_rx_neon.c

diff --git a/doc/guides/rel_notes/release_24_03.rst 
b/doc/guides/rel_notes/release_24_03.rst
index c8fcaaad6d..7a83b545cc 100644
--- a/doc/guides/rel_notes/release_24_03.rst
+++ b/doc/guides/rel_notes/release_24_03.rst
@@ -63,6 +63,7 @@ New Features
 to avoid mbuf write commits.
   * Added optimized SSE Rx routines.
   * Added optimized AVX2 Rx routines.
+  * Added optimized NEON Rx routines.
 
 * **Updated Marvell cnxk net driver.**
 
diff --git a/drivers/net/octeon_ep/cnxk_ep_rx_neon.c 
b/drivers/net/octeon_ep/cnxk_ep_rx_neon.c
new file mode 100644
index 00..4c46a7ea08
--- /dev/null
+++ b/drivers/net/octeon_ep/cnxk_ep_rx_neon.c
@@ -0,0 +1,147 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Marvell.
+ */
+
+#include "cnxk_ep_rx.h"
+
+static __rte_always_inline void
+cnxk_ep_process_pkts_vec_neon(struct rte_mbuf **rx_pkts, struct otx_ep_droq 
*droq,
+ uint16_t new_pkts)
+{
+   const uint8x16_t mask0 = {0, 1, 0xff, 0xff, 0, 1, 0xff, 0xff,
+ 4, 5, 0xff, 0xff, 4, 5, 0xff, 0xff};
+   const uint8x16_t mask1 = {8,  9,  0xff, 0xff, 8,  9,  0xff, 0xff,
+ 12, 13, 0xff, 0xff, 12, 13, 0xff, 0xff};
+   struct rte_mbuf **recv_buf_list = droq->recv_buf_list;
+   uint32_t pidx0, pidx1, pidx2, pidx3;
+   struct rte_mbuf *m0, *m1, *m2, *m3;
+   uint32_t read_idx = droq->read_idx;
+   uint16_t nb_desc = droq->nb_desc;
+   uint32_t idx0, idx1, idx2, idx3;
+   uint64x2_t s01, s23;
+   uint32x4_t bytes;
+   uint16_t pkts = 0;
+
+   idx0 = read_idx;
+   s01 = vdupq_n_u64(0);
+   bytes = vdupq_n_u32(0);
+   while (pkts < new_pkts) {
+   idx1 = otx_ep_incr_index(idx0, 1, nb_desc);
+   idx2 = otx_ep_incr_index(idx1, 1, nb_desc);
+   idx3 = otx_ep_incr_index(idx2, 1, nb_desc);
+
+   if (new_pkts - pkts > 4) {
+   pidx0 = otx_ep_incr_index(idx3, 1, nb_desc);
+   pidx1 = otx_ep_incr_index(pidx0, 1, nb_desc);
+   pidx2 = otx_ep_incr_index(pidx1, 1, nb_desc);
+   pidx3 = otx_ep_incr_index(pidx2, 1, nb_desc);
+
+   
rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx0], void *));
+   
rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx1], void *));
+   
rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx2], void *));
+   
rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx3], void *));
+   }
+
+   m0 = recv_buf_list[idx0];
+   m1 = recv_buf_list[idx1];
+   m2 = recv_buf_list[idx2];
+   m3 = recv_buf_list[idx3];
+
+   /* Load packet size big-endian. */
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m0, struct 
otx_ep_droq_info *)->length >> 48,
+s01, 0);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m1, struct 
otx_ep_droq_info *)->length >> 48,
+s01, 1);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m2, struct 
otx_ep_droq_info *)->length >> 48,
+s01, 2);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m3, struct 
otx_ep_droq_info *)->length >> 48,
+s01, 3);
+   /* Convert to little-endian. */
+   s01 = vrev16q_u8(s01);
+
+   /* Vertical add, consolidate outside the loop. */
+   bytes += vaddq_u32(bytes, s01);
+   /* Segregate to packet length and data length. */
+   s23 = vqtbl1q_u8(s01, mask1);
+   s01 = vqtbl1q_u8(s01, mask0);
+
+   /* Store packet length and data length to mbuf. */
+   *(uint64_t *)&m0->pkt_len = vgetq_lane_u64(s01, 0);
+   *(uint64_t *)&m1->pkt_len = vgetq_lane_u64(s01, 1);
+   *(uint64_t *)&m2->pkt_len = vgetq_lane_u64(s23, 0);
+   *(uint64_t *)&m3->pkt_len = vgetq_lane_u64(s23, 1);
+
+   /* Reset rearm data. */
+   *(uint64_t *)&m0->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m1->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m2->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m3->rearm_data = droq->rearm_data;
+
+   rx_pkts[pkts++] = m0;
+   rx_pkts[pkts++] = m1;

RE: [PATCH v3 1/2] crypto/ipsec_mb: bump minimum IPsec Multi-buffer version

2024-02-02 Thread De Lara Guarch, Pablo



> -Original Message-
> From: Sivaramakrishnan, VenkatX 
> Sent: Friday, February 2, 2024 3:04 PM
> To: Ji, Kai ; De Lara Guarch, Pablo
> 
> Cc: dev@dpdk.org; Richardson, Bruce ; Power,
> Ciara ; Sivaramakrishnan, VenkatX
> 
> Subject: [PATCH v3 1/2] crypto/ipsec_mb: bump minimum IPsec Multi-buffer
> version
> 
> SW PMDs bump the minimum IPsec Multi-buffer version to 1.4.
> A minimum IPsec Multi-buffer version of 1.4 or greater is now required.
> 
> Signed-off-by: Sivaramakrishnan Venkat
> 
> Acked-by: Ciara Power 

Acked-by: Pablo de Lara 


RE: [PATCH v3 2/2] doc: remove outdated version details

2024-02-02 Thread De Lara Guarch, Pablo



> -Original Message-
> From: Sivaramakrishnan, VenkatX 
> Sent: Friday, February 2, 2024 3:04 PM
> To: Ji, Kai ; De Lara Guarch, Pablo
> 
> Cc: dev@dpdk.org; Richardson, Bruce ; Power,
> Ciara ; Sivaramakrishnan, VenkatX
> 
> Subject: [PATCH v3 2/2] doc: remove outdated version details
> 
> SW PMD documentation is updated to remove details of unsupported IPsec
> Multi-buffer versions. DPDK releases older than 20.11 have reached end of
> life, so those versions are removed from the crypto library version table.
> 
> Signed-off-by: Sivaramakrishnan Venkat
> 

Acked-by: Pablo de Lara 


[PATCH] drivers: Convert uses of RTE_LOG_DP to use RTE_LOG_DP_LINE

2024-02-02 Thread Stephen Hemminger
Make use of RTE_LOG_DP_LINE, which appends a newline, matching the
rte_log()-based macros that are common in drivers.

Signed-off-by: Stephen Hemminger 
---
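Illustrative note (not part of the patch): what the conversion looks
like at a driver's log-macro definition. "FOO" is a hypothetical
component (its RTE_LOGTYPE_FOO is assumed to be registered); the CPT
hunk below shows the real change.

    #include <rte_log.h>

    /* Before: the wrapper appends "\n" itself. */
    #define FOO_LOG_DP(level, fmt, args...) \
            RTE_LOG_DP(level, FOO, fmt "\n", ## args)

    #undef FOO_LOG_DP

    /* After: RTE_LOG_DP_LINE supplies the trailing newline, so format
     * strings at call sites no longer embed "\n". */
    #define FOO_LOG_DP(level, fmt, args...) \
            RTE_LOG_DP_LINE(level, FOO, fmt, ## args)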
 .../baseband/la12xx/bbdev_la12xx_pmd_logs.h   |   2 +-
 drivers/common/cpt/cpt_pmd_logs.h |   2 +-
 drivers/common/cpt/cpt_ucode.h|   4 +-
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   | 126 ++
 drivers/crypto/dpaa2_sec/dpaa2_sec_logs.h |   2 +-
 drivers/crypto/dpaa2_sec/dpaa2_sec_raw_dp.c   |  27 ++--
 drivers/crypto/dpaa_sec/dpaa_sec.c|  15 +--
 drivers/crypto/dpaa_sec/dpaa_sec_log.h|   2 +-
 drivers/crypto/dpaa_sec/dpaa_sec_raw_dp.c |   7 +-
 .../crypto/octeontx/otx_cryptodev_hw_access.h |   6 +-
 drivers/dma/dpaa/dpaa_qdma.c  |   2 +-
 drivers/dma/dpaa/dpaa_qdma_logs.h |   2 +-
 drivers/dma/dpaa2/dpaa2_qdma.c|   9 +-
 drivers/dma/dpaa2/dpaa2_qdma_logs.h   |   2 +-
 drivers/event/dsw/dsw_evdev.h |   2 +-
 drivers/event/dsw/dsw_event.c |  40 +++---
 drivers/mempool/dpaa/dpaa_mempool.c   |   2 +-
 drivers/mempool/dpaa/dpaa_mempool.h   |   3 +-
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c  |   6 +-
 drivers/mempool/dpaa2/dpaa2_hw_mempool_logs.h |   2 +-
 drivers/net/atlantic/atl_logs.h   |   4 +-
 drivers/net/dpaa/dpaa_ethdev.h|   2 +-
 drivers/net/dpaa2/dpaa2_pmd_logs.h|   2 +-
 drivers/net/dpaa2/dpaa2_rxtx.c|  44 +++---
 drivers/net/enetc/enetc_logs.h|   2 +-
 drivers/net/enetc/enetc_rxtx.c|   2 +-
 drivers/net/enetfec/enet_pmd_logs.h   |   2 +-
 drivers/net/mana/mana.h   |   2 +-
 drivers/net/pfe/pfe_logs.h|   2 +-
 drivers/raw/dpaa2_cmdif/dpaa2_cmdif.c |  12 +-
 drivers/raw/dpaa2_cmdif/dpaa2_cmdif_logs.h|   2 +-
 31 files changed, 142 insertions(+), 197 deletions(-)

diff --git a/drivers/baseband/la12xx/bbdev_la12xx_pmd_logs.h 
b/drivers/baseband/la12xx/bbdev_la12xx_pmd_logs.h
index 49c8d35d104d..bc2be612632e 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx_pmd_logs.h
+++ b/drivers/baseband/la12xx/bbdev_la12xx_pmd_logs.h
@@ -24,6 +24,6 @@ extern int bbdev_la12xx_logtype;
 
 /* DP Logs, toggled out at compile time if level lower than current level */
 #define rte_bbdev_dp_log(level, fmt, args...) \
-   RTE_LOG_DP(level, BBDEV_LA12XX, fmt, ## args)
+   RTE_LOG_DP_LINE(level, BBDEV_LA12XX, fmt, ## args)
 
 #endif /* _BBDEV_LA12XX_PMD_LOGS_H_ */
diff --git a/drivers/common/cpt/cpt_pmd_logs.h 
b/drivers/common/cpt/cpt_pmd_logs.h
index 3c109c1983ca..07c26821e277 100644
--- a/drivers/common/cpt/cpt_pmd_logs.h
+++ b/drivers/common/cpt/cpt_pmd_logs.h
@@ -34,7 +34,7 @@
  * DP logs, toggled out at compile time if level lower than current level.
  */
 #define CPT_LOG_DP(level, fmt, args...) \
-   RTE_LOG_DP(level, CPT, fmt "\n", ## args)
+   RTE_LOG_DP_LINE(level, CPT, fmt, ## args)
 
 #define CPT_LOG_DP_DEBUG(fmt, args...) \
CPT_LOG_DP(DEBUG, fmt, ## args)
diff --git a/drivers/common/cpt/cpt_ucode.h b/drivers/common/cpt/cpt_ucode.h
index 87a3ac80b9da..636f93604ec4 100644
--- a/drivers/common/cpt/cpt_ucode.h
+++ b/drivers/common/cpt/cpt_ucode.h
@@ -2589,7 +2589,7 @@ fill_sess_aead(struct rte_crypto_sym_xform *xform,
sess->cpt_op |= CPT_OP_CIPHER_DECRYPT;
sess->cpt_op |= CPT_OP_AUTH_VERIFY;
} else {
-   CPT_LOG_DP_ERR("Unknown aead operation\n");
+   CPT_LOG_DP_ERR("Unknown aead operation");
return -1;
}
switch (aead_form->algo) {
@@ -2658,7 +2658,7 @@ fill_sess_cipher(struct rte_crypto_sym_xform *xform,
ctx->dec_auth = 1;
}
} else {
-   CPT_LOG_DP_ERR("Unknown cipher operation\n");
+   CPT_LOG_DP_ERR("Unknown cipher operation");
return -1;
}
 
diff --git a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c 
b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
index bb5a2c629e53..275cecb1124e 100644
--- a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
+++ b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
@@ -348,13 +348,9 @@ build_authenc_gcm_sg_fd(dpaa2_sec_session *sess,
DPAA2_SET_FD_COMPOUND_FMT(fd);
DPAA2_SET_FD_FLC(fd, DPAA2_VADDR_TO_IOVA(flc));
 
-   DPAA2_SEC_DP_DEBUG("GCM SG: auth_off: 0x%x/length %d, digest-len=%d\n"
-  "iv-len=%d data_off: 0x%x\n",
-  sym_op->aead.data.offset,
-  sym_op->aead.data.length,
-  sess->digest_length,
-  sess->iv.length,
-  sym_op->m_src->data_off);
+   DPAA2_SEC_DP_DEBUG("GCM SG: auth_off: 0x%x/length %d, digest-len=%d",
+  sym_op->aead.data.offset, sym_op->aead.data.length, 
sess->digest_length);
+   DPAA2_SEC_DP_DEBUG("iv-len=%d data_off: 0x%x", sess->iv.length, 
sym_op->m_src->data_off);

[PATCH 00/13] net/ionic: miscellaneous fixes and improvements

2024-02-02 Thread Andrew Boyer
This patchset provides miscellaneous fixes and improvements for
the net/ionic driver used by AMD Pensando devices.

Akshay Dorwat (1):
  net/ionic: fix RSS query routine

Andrew Boyer (8):
  net/ionic: add stat for completion queue entries processed
  net/ionic: increase max supported MTU to 9750 bytes
  net/ionic: don't auto-enable Rx scatter-gather a second time
  net/ionic: replace non-standard type in structure definition
  net/ionic: fix device close sequence to avoid crash
  net/ionic: optimize device close operation
  net/ionic: optimize device stop operation
  net/ionic: optimize device start operation

Brad Larson (1):
  net/ionic: add flexible firmware xstat counters

Neel Patel (2):
  net/ionic: fix missing volatile type for cqe pointers
  net/ionic: memcpy descriptors when using Q-in-CMB

Vamsi Krishna Atluri (1):
  net/ionic: report 1G and 200G link speeds when applicable

 drivers/net/ionic/ionic.h |   3 +
 drivers/net/ionic/ionic_dev.c |   9 +-
 drivers/net/ionic/ionic_dev.h |   8 +-
 drivers/net/ionic/ionic_dev_pci.c |   2 +-
 drivers/net/ionic/ionic_ethdev.c  |  81 ++--
 drivers/net/ionic/ionic_if.h  |  70 +++
 drivers/net/ionic/ionic_lif.c | 288 +-
 drivers/net/ionic/ionic_lif.h |  19 +-
 drivers/net/ionic/ionic_main.c|  17 +-
 drivers/net/ionic/ionic_rxtx.c| 160 ++
 drivers/net/ionic/ionic_rxtx.h|  80 ++-
 drivers/net/ionic/ionic_rxtx_sg.c |  28 +--
 drivers/net/ionic/ionic_rxtx_simple.c |  28 +--
 13 files changed, 550 insertions(+), 243 deletions(-)

-- 
2.17.1



[PATCH 01/13] net/ionic: add stat for completion queue entries processed

2024-02-02 Thread Andrew Boyer
When completion coalescing is turned on in the FW, there will be
fewer CQEs than Tx packets. Expose the stat through debug logging.

Signed-off-by: Andrew Boyer 
---
 drivers/net/ionic/ionic_lif.h | 1 +
 drivers/net/ionic/ionic_rxtx.c| 3 +++
 drivers/net/ionic/ionic_rxtx_sg.c | 2 ++
 drivers/net/ionic/ionic_rxtx_simple.c | 2 ++
 4 files changed, 8 insertions(+)

diff --git a/drivers/net/ionic/ionic_lif.h b/drivers/net/ionic/ionic_lif.h
index 36b3bcc5a9..cac7a4583b 100644
--- a/drivers/net/ionic/ionic_lif.h
+++ b/drivers/net/ionic/ionic_lif.h
@@ -32,6 +32,7 @@
 struct ionic_tx_stats {
uint64_t packets;
uint64_t bytes;
+   uint64_t comps;
uint64_t drop;
uint64_t stop;
uint64_t no_csum;
diff --git a/drivers/net/ionic/ionic_rxtx.c b/drivers/net/ionic/ionic_rxtx.c
index b9e73b4871..d92b231f8f 100644
--- a/drivers/net/ionic/ionic_rxtx.c
+++ b/drivers/net/ionic/ionic_rxtx.c
@@ -117,6 +117,9 @@ ionic_dev_tx_queue_stop(struct rte_eth_dev *eth_dev, 
uint16_t tx_queue_id)
stats = &txq->stats;
IONIC_PRINT(DEBUG, "TX queue %u pkts %ju tso %ju",
txq->qcq.q.index, stats->packets, stats->tso);
+   IONIC_PRINT(DEBUG, "TX queue %u comps %ju (%ju per)",
+   txq->qcq.q.index, stats->comps,
+   stats->comps ? stats->packets / stats->comps : 0);
 
return 0;
 }
diff --git a/drivers/net/ionic/ionic_rxtx_sg.c 
b/drivers/net/ionic/ionic_rxtx_sg.c
index ab8e56e91c..6c028a698c 100644
--- a/drivers/net/ionic/ionic_rxtx_sg.c
+++ b/drivers/net/ionic/ionic_rxtx_sg.c
@@ -26,6 +26,7 @@ ionic_tx_flush_sg(struct ionic_tx_qcq *txq)
 {
struct ionic_cq *cq = &txq->qcq.cq;
struct ionic_queue *q = &txq->qcq.q;
+   struct ionic_tx_stats *stats = &txq->stats;
struct rte_mbuf *txm;
struct ionic_txq_comp *cq_desc, *cq_desc_base = cq->base;
void **info;
@@ -72,6 +73,7 @@ ionic_tx_flush_sg(struct ionic_tx_qcq *txq)
}
 
cq_desc = &cq_desc_base[cq->tail_idx];
+   stats->comps++;
}
 }
 
diff --git a/drivers/net/ionic/ionic_rxtx_simple.c 
b/drivers/net/ionic/ionic_rxtx_simple.c
index 5f81856256..5969287b66 100644
--- a/drivers/net/ionic/ionic_rxtx_simple.c
+++ b/drivers/net/ionic/ionic_rxtx_simple.c
@@ -26,6 +26,7 @@ ionic_tx_flush(struct ionic_tx_qcq *txq)
 {
struct ionic_cq *cq = &txq->qcq.cq;
struct ionic_queue *q = &txq->qcq.q;
+   struct ionic_tx_stats *stats = &txq->stats;
struct rte_mbuf *txm;
struct ionic_txq_comp *cq_desc, *cq_desc_base = cq->base;
void **info;
@@ -67,6 +68,7 @@ ionic_tx_flush(struct ionic_tx_qcq *txq)
}
 
cq_desc = &cq_desc_base[cq->tail_idx];
+   stats->comps++;
}
 }
 
-- 
2.17.1



[PATCH 02/13] net/ionic: increase max supported MTU to 9750 bytes

2024-02-02 Thread Andrew Boyer
Some configurations want to use values this high internally.
Allow them to do so without modifying the code.

Signed-off-by: Andrew Boyer 
Signed-off-by: Bhuvan Mital 
---
 drivers/net/ionic/ionic_dev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ionic/ionic_dev.h b/drivers/net/ionic/ionic_dev.h
index b1e74fbd89..971c261b27 100644
--- a/drivers/net/ionic/ionic_dev.h
+++ b/drivers/net/ionic/ionic_dev.h
@@ -14,7 +14,7 @@
 #define VLAN_TAG_SIZE  4
 
 #define IONIC_MIN_MTU  RTE_ETHER_MIN_MTU
-#define IONIC_MAX_MTU  9378
+#define IONIC_MAX_MTU  9750
 #define IONIC_ETH_OVERHEAD (RTE_ETHER_HDR_LEN + VLAN_TAG_SIZE)
 
 #define IONIC_MAX_RING_DESC32768
-- 
2.17.1



[PATCH 03/13] net/ionic: don't auto-enable Rx scatter-gather a second time

2024-02-02 Thread Andrew Boyer
The receive side will enable scatter-gather if required based on the
mbuf size. If the client already enabled it in the config, it does
not need to be enabled again. This reduces log output.

Signed-off-by: Andrew Boyer 
---
 drivers/net/ionic/ionic_lif.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ionic/ionic_lif.c b/drivers/net/ionic/ionic_lif.c
index 25b490deb6..fe2112c057 100644
--- a/drivers/net/ionic/ionic_lif.c
+++ b/drivers/net/ionic/ionic_lif.c
@@ -768,7 +768,8 @@ ionic_rx_qcq_alloc(struct ionic_lif *lif, uint32_t 
socket_id, uint32_t index,
max_mtu = rte_le_to_cpu_32(lif->adapter->ident.lif.eth.max_mtu);
 
/* If mbufs are too small to hold received packets, enable SG */
-   if (max_mtu > hdr_seg_size) {
+   if (max_mtu > hdr_seg_size &&
+   !(lif->features & IONIC_ETH_HW_RX_SG)) {
IONIC_PRINT(NOTICE, "Enabling RX_OFFLOAD_SCATTER");
lif->eth_dev->data->dev_conf.rxmode.offloads |=
RTE_ETH_RX_OFFLOAD_SCATTER;
-- 
2.17.1



[PATCH 05/13] net/ionic: replace non-standard type in structure definition

2024-02-02 Thread Andrew Boyer
Use uint8_t instead of u_char. This simplifies the code.

Signed-off-by: Andrew Boyer 
---
 drivers/net/ionic/ionic_dev_pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ionic/ionic_dev_pci.c 
b/drivers/net/ionic/ionic_dev_pci.c
index 5e74a6da71..cbaac2c5bc 100644
--- a/drivers/net/ionic/ionic_dev_pci.c
+++ b/drivers/net/ionic/ionic_dev_pci.c
@@ -38,7 +38,7 @@ ionic_pci_setup(struct ionic_adapter *adapter)
struct ionic_dev *idev = &adapter->idev;
struct rte_pci_device *bus_dev = adapter->bus_dev;
uint32_t sig;
-   u_char *bar0_base;
+   uint8_t *bar0_base;
unsigned int i;
 
/* BAR0: dev_cmd and interrupts */
-- 
2.17.1



[PATCH 04/13] net/ionic: fix missing volatile type for cqe pointers

2024-02-02 Thread Andrew Boyer
From: Neel Patel 

This memory may be changed by the hardware, so the volatile
keyword is required for correctness.

Fixes: e86a6fcc7cf3 ("net/ionic: add optimized non-scattered Rx/Tx")
cc: sta...@dpdk.org

Signed-off-by: Andrew Boyer 
Signed-off-by: Neel Patel 
---
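Illustrative note (not part of the patch): a minimal sketch of the
hazard being fixed, with simplified types. Without volatile, the
compiler may assume the completion-ring memory never changes, hoist
the load out of the loop, and spin on a stale color bit; volatile
forces a fresh read of the DMA-written entry on every pass.

    #include <stdbool.h>
    #include <stdint.h>
    #include <rte_pause.h>

    struct comp {
            uint8_t color; /* toggled by the NIC via DMA */
    };

    static void
    poll_completion(volatile struct comp *cq_desc, bool done_color)
    {
            /* Each iteration re-reads device-written memory. */
            while ((cq_desc->color & 0x1) != done_color)
                    rte_pause();
    }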
 drivers/net/ionic/ionic_rxtx.c| 4 ++--
 drivers/net/ionic/ionic_rxtx_sg.c | 8 +---
 drivers/net/ionic/ionic_rxtx_simple.c | 8 +---
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ionic/ionic_rxtx.c b/drivers/net/ionic/ionic_rxtx.c
index d92b231f8f..d92fa1cca7 100644
--- a/drivers/net/ionic/ionic_rxtx.c
+++ b/drivers/net/ionic/ionic_rxtx.c
@@ -755,7 +755,7 @@ ionic_dev_rx_descriptor_status(void *rx_queue, uint16_t 
offset)
 {
struct ionic_rx_qcq *rxq = rx_queue;
struct ionic_qcq *qcq = &rxq->qcq;
-   struct ionic_rxq_comp *cq_desc;
+   volatile struct ionic_rxq_comp *cq_desc;
uint16_t mask, head, tail, pos;
bool done_color;
 
@@ -794,7 +794,7 @@ ionic_dev_tx_descriptor_status(void *tx_queue, uint16_t 
offset)
 {
struct ionic_tx_qcq *txq = tx_queue;
struct ionic_qcq *qcq = &txq->qcq;
-   struct ionic_txq_comp *cq_desc;
+   volatile struct ionic_txq_comp *cq_desc;
uint16_t mask, head, tail, pos, cq_pos;
bool done_color;
 
diff --git a/drivers/net/ionic/ionic_rxtx_sg.c 
b/drivers/net/ionic/ionic_rxtx_sg.c
index 6c028a698c..1392342463 100644
--- a/drivers/net/ionic/ionic_rxtx_sg.c
+++ b/drivers/net/ionic/ionic_rxtx_sg.c
@@ -28,7 +28,8 @@ ionic_tx_flush_sg(struct ionic_tx_qcq *txq)
struct ionic_queue *q = &txq->qcq.q;
struct ionic_tx_stats *stats = &txq->stats;
struct rte_mbuf *txm;
-   struct ionic_txq_comp *cq_desc, *cq_desc_base = cq->base;
+   struct ionic_txq_comp *cq_desc_base = cq->base;
+   volatile struct ionic_txq_comp *cq_desc;
void **info;
uint32_t i;
 
@@ -254,7 +255,7 @@ ionic_xmit_pkts_sg(void *tx_queue, struct rte_mbuf 
**tx_pkts,
  */
 static __rte_always_inline void
 ionic_rx_clean_one_sg(struct ionic_rx_qcq *rxq,
-   struct ionic_rxq_comp *cq_desc,
+   volatile struct ionic_rxq_comp *cq_desc,
struct ionic_rx_service *rx_svc)
 {
struct ionic_queue *q = &rxq->qcq.q;
@@ -440,7 +441,8 @@ ionic_rxq_service_sg(struct ionic_rx_qcq *rxq, uint32_t 
work_to_do,
struct ionic_cq *cq = &rxq->qcq.cq;
struct ionic_queue *q = &rxq->qcq.q;
struct ionic_rxq_desc *q_desc_base = q->base;
-   struct ionic_rxq_comp *cq_desc, *cq_desc_base = cq->base;
+   struct ionic_rxq_comp *cq_desc_base = cq->base;
+   volatile struct ionic_rxq_comp *cq_desc;
uint32_t work_done = 0;
uint64_t then, now, hz, delta;
 
diff --git a/drivers/net/ionic/ionic_rxtx_simple.c 
b/drivers/net/ionic/ionic_rxtx_simple.c
index 5969287b66..00152c885a 100644
--- a/drivers/net/ionic/ionic_rxtx_simple.c
+++ b/drivers/net/ionic/ionic_rxtx_simple.c
@@ -28,7 +28,8 @@ ionic_tx_flush(struct ionic_tx_qcq *txq)
struct ionic_queue *q = &txq->qcq.q;
struct ionic_tx_stats *stats = &txq->stats;
struct rte_mbuf *txm;
-   struct ionic_txq_comp *cq_desc, *cq_desc_base = cq->base;
+   struct ionic_txq_comp *cq_desc_base = cq->base;
+   volatile struct ionic_txq_comp *cq_desc;
void **info;
 
cq_desc = &cq_desc_base[cq->tail_idx];
@@ -227,7 +228,7 @@ ionic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
  */
 static __rte_always_inline void
 ionic_rx_clean_one(struct ionic_rx_qcq *rxq,
-   struct ionic_rxq_comp *cq_desc,
+   volatile struct ionic_rxq_comp *cq_desc,
struct ionic_rx_service *rx_svc)
 {
struct ionic_queue *q = &rxq->qcq.q;
@@ -361,7 +362,8 @@ ionic_rxq_service(struct ionic_rx_qcq *rxq, uint32_t 
work_to_do,
struct ionic_cq *cq = &rxq->qcq.cq;
struct ionic_queue *q = &rxq->qcq.q;
struct ionic_rxq_desc *q_desc_base = q->base;
-   struct ionic_rxq_comp *cq_desc, *cq_desc_base = cq->base;
+   struct ionic_rxq_comp *cq_desc_base = cq->base;
+   volatile struct ionic_rxq_comp *cq_desc;
uint32_t work_done = 0;
uint64_t then, now, hz, delta;
 
-- 
2.17.1



[PATCH 08/13] net/ionic: report 1G and 200G link speeds when applicable

2024-02-02 Thread Andrew Boyer
From: Vamsi Krishna Atluri 

The hardware supports these speeds, so we should report them
correctly.

Signed-off-by: Andrew Boyer 
Signed-off-by: Vamsi Krishna Atluri 
---
 drivers/net/ionic/ionic_ethdev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ionic/ionic_ethdev.c b/drivers/net/ionic/ionic_ethdev.c
index 008e50e0b9..327f6b9de5 100644
--- a/drivers/net/ionic/ionic_ethdev.c
+++ b/drivers/net/ionic/ionic_ethdev.c
@@ -285,6 +285,9 @@ ionic_dev_link_update(struct rte_eth_dev *eth_dev,
link.link_status = RTE_ETH_LINK_UP;
link.link_duplex = RTE_ETH_LINK_FULL_DUPLEX;
switch (adapter->link_speed) {
+   case  1000:
+   link.link_speed = RTE_ETH_SPEED_NUM_1G;
+   break;
case  1:
link.link_speed = RTE_ETH_SPEED_NUM_10G;
break;
@@ -300,6 +303,9 @@ ionic_dev_link_update(struct rte_eth_dev *eth_dev,
case 10:
link.link_speed = RTE_ETH_SPEED_NUM_100G;
break;
+   case 20:
+   link.link_speed = RTE_ETH_SPEED_NUM_200G;
+   break;
default:
link.link_speed = RTE_ETH_SPEED_NUM_NONE;
break;
-- 
2.17.1



[PATCH 07/13] net/ionic: fix RSS query routine

2024-02-02 Thread Andrew Boyer
From: Akshay Dorwat 

The routine that copies out the RSS config can't use memcpy() because
'reta_conf->reta' is an array of uint16_t while 'lif->rss_ind_tbl' is
an array of uint8_t. Instead, copy the values individually.

Fixes: 22e7171bc63b ("net/ionic: support RSS")
Cc: cardigli...@ntop.org
Cc: sta...@dpdk.org

Signed-off-by: Akshay Dorwat 
Signed-off-by: Andrew Boyer 
---
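Illustrative note (not part of the patch): a minimal demo of the
type-width bug, with hypothetical arrays. memcpy() copies bytes, so
packing 8-bit indirection-table entries into a 16-bit reta[] array
jams two entries into each slot instead of widening them.

    #include <stdint.h>
    #include <string.h>

    static void
    rss_reta_copy_demo(void)
    {
            uint8_t  ind_tbl[4] = {1, 2, 3, 4}; /* 8-bit entries */
            uint16_t reta[4];
            int j;

            /* Wrong: reta[0] becomes 0x0201 on little-endian and
             * reta[2..3] are never written. */
            memcpy(reta, ind_tbl, sizeof(ind_tbl));

            /* Right: widen one entry at a time, as the fix does. */
            for (j = 0; j < 4; j++)
                    reta[j] = ind_tbl[j];
    }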
 drivers/net/ionic/ionic_ethdev.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ionic/ionic_ethdev.c b/drivers/net/ionic/ionic_ethdev.c
index 340fd0cd59..008e50e0b9 100644
--- a/drivers/net/ionic/ionic_ethdev.c
+++ b/drivers/net/ionic/ionic_ethdev.c
@@ -561,7 +561,7 @@ ionic_dev_rss_reta_query(struct rte_eth_dev *eth_dev,
struct ionic_lif *lif = IONIC_ETH_DEV_TO_LIF(eth_dev);
struct ionic_adapter *adapter = lif->adapter;
struct ionic_identity *ident = &adapter->ident;
-   int i, num;
+   int i, j, num;
uint16_t tbl_sz = rte_le_to_cpu_16(ident->lif.eth.rss_ind_tbl_sz);
 
IONIC_PRINT_CALL();
@@ -582,9 +582,10 @@ ionic_dev_rss_reta_query(struct rte_eth_dev *eth_dev,
num = reta_size / RTE_ETH_RETA_GROUP_SIZE;
 
for (i = 0; i < num; i++) {
-   memcpy(reta_conf->reta,
-   &lif->rss_ind_tbl[i * RTE_ETH_RETA_GROUP_SIZE],
-   RTE_ETH_RETA_GROUP_SIZE);
+   for (j = 0; j < RTE_ETH_RETA_GROUP_SIZE; j++) {
+   reta_conf->reta[j] =
+   lif->rss_ind_tbl[(i * RTE_ETH_RETA_GROUP_SIZE) 
+ j];
+   }
reta_conf++;
}
 
-- 
2.17.1



[PATCH 09/13] net/ionic: add flexible firmware xstat counters

2024-02-02 Thread Andrew Boyer
From: Brad Larson 

Assign 32 counters for flexible firmware events. These can be used as
per-port or per-queue counters in certain firmware configurations.
They are displayed as fw_flex_eventX in xstats.

Signed-off-by: Andrew Boyer 
Signed-off-by: Brad Larson 
---
 drivers/net/ionic/ionic_ethdev.c | 33 +++
 drivers/net/ionic/ionic_if.h | 70 
 2 files changed, 68 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ionic/ionic_ethdev.c b/drivers/net/ionic/ionic_ethdev.c
index 327f6b9de5..7c55a26956 100644
--- a/drivers/net/ionic/ionic_ethdev.c
+++ b/drivers/net/ionic/ionic_ethdev.c
@@ -196,6 +196,39 @@ static const struct rte_ionic_xstats_name_off 
rte_ionic_xstats_strings[] = {
tx_desc_fetch_error)},
{"tx_desc_data_error", offsetof(struct ionic_lif_stats,
tx_desc_data_error)},
+   /* Flexible firmware events */
+   {"fw_flex_event1", offsetof(struct ionic_lif_stats, flex1)},
+   {"fw_flex_event2", offsetof(struct ionic_lif_stats, flex2)},
+   {"fw_flex_event3", offsetof(struct ionic_lif_stats, flex3)},
+   {"fw_flex_event4", offsetof(struct ionic_lif_stats, flex4)},
+   {"fw_flex_event5", offsetof(struct ionic_lif_stats, flex5)},
+   {"fw_flex_event6", offsetof(struct ionic_lif_stats, flex6)},
+   {"fw_flex_event7", offsetof(struct ionic_lif_stats, flex7)},
+   {"fw_flex_event8", offsetof(struct ionic_lif_stats, flex8)},
+   {"fw_flex_event9", offsetof(struct ionic_lif_stats, flex9)},
+   {"fw_flex_event10", offsetof(struct ionic_lif_stats, flex10)},
+   {"fw_flex_event11", offsetof(struct ionic_lif_stats, flex11)},
+   {"fw_flex_event12", offsetof(struct ionic_lif_stats, flex12)},
+   {"fw_flex_event13", offsetof(struct ionic_lif_stats, flex13)},
+   {"fw_flex_event14", offsetof(struct ionic_lif_stats, flex14)},
+   {"fw_flex_event15", offsetof(struct ionic_lif_stats, flex15)},
+   {"fw_flex_event16", offsetof(struct ionic_lif_stats, flex16)},
+   {"fw_flex_event17", offsetof(struct ionic_lif_stats, flex17)},
+   {"fw_flex_event18", offsetof(struct ionic_lif_stats, flex18)},
+   {"fw_flex_event19", offsetof(struct ionic_lif_stats, flex19)},
+   {"fw_flex_event20", offsetof(struct ionic_lif_stats, flex20)},
+   {"fw_flex_event21", offsetof(struct ionic_lif_stats, flex21)},
+   {"fw_flex_event22", offsetof(struct ionic_lif_stats, flex22)},
+   {"fw_flex_event23", offsetof(struct ionic_lif_stats, flex23)},
+   {"fw_flex_event24", offsetof(struct ionic_lif_stats, flex24)},
+   {"fw_flex_event25", offsetof(struct ionic_lif_stats, flex25)},
+   {"fw_flex_event26", offsetof(struct ionic_lif_stats, flex26)},
+   {"fw_flex_event27", offsetof(struct ionic_lif_stats, flex27)},
+   {"fw_flex_event28", offsetof(struct ionic_lif_stats, flex28)},
+   {"fw_flex_event29", offsetof(struct ionic_lif_stats, flex29)},
+   {"fw_flex_event30", offsetof(struct ionic_lif_stats, flex30)},
+   {"fw_flex_event31", offsetof(struct ionic_lif_stats, flex31)},
+   {"fw_flex_event32", offsetof(struct ionic_lif_stats, flex32)},
 };
 
 #define IONIC_NB_HW_STATS RTE_DIM(rte_ionic_xstats_strings)
diff --git a/drivers/net/ionic/ionic_if.h b/drivers/net/ionic/ionic_if.h
index 79aa196345..7ca604a7bb 100644
--- a/drivers/net/ionic/ionic_if.h
+++ b/drivers/net/ionic/ionic_if.h
@@ -2592,41 +2592,41 @@ struct ionic_lif_stats {
__le64 rsvd16;
__le64 rsvd17;
 
-   __le64 rsvd18;
-   __le64 rsvd19;
-   __le64 rsvd20;
-   __le64 rsvd21;
-   __le64 rsvd22;
-   __le64 rsvd23;
-   __le64 rsvd24;
-   __le64 rsvd25;
-
-   __le64 rsvd26;
-   __le64 rsvd27;
-   __le64 rsvd28;
-   __le64 rsvd29;
-   __le64 rsvd30;
-   __le64 rsvd31;
-   __le64 rsvd32;
-   __le64 rsvd33;
-
-   __le64 rsvd34;
-   __le64 rsvd35;
-   __le64 rsvd36;
-   __le64 rsvd37;
-   __le64 rsvd38;
-   __le64 rsvd39;
-   __le64 rsvd40;
-   __le64 rsvd41;
-
-   __le64 rsvd42;
-   __le64 rsvd43;
-   __le64 rsvd44;
-   __le64 rsvd45;
-   __le64 rsvd46;
-   __le64 rsvd47;
-   __le64 rsvd48;
-   __le64 rsvd49;
+   __le64 flex1;
+   __le64 flex2;
+   __le64 flex3;
+   __le64 flex4;
+   __le64 flex5;
+   __le64 flex6;
+   __le64 flex7;
+   __le64 flex8;
+
+   __le64 flex9;
+   __le64 flex10;
+   __le64 flex11;
+   __le64 flex12;
+   __le64 flex13;
+   __le64 flex14;
+   __le64 flex15;
+   __le64 flex16;
+
+   __le64 flex17;
+   __le64 flex18;
+   __le64 flex19;
+   __le64 flex20;
+   __le64 flex21;
+   __le64 flex22;
+   __le64 flex23;
+   __le64 flex24;
+
+   __le64 flex25;
+   __le64 flex26;
+   __le64 flex27;
+   __le64 flex28;
+   __le64 flex29;
+   __le64 flex30;
+   __le64 flex31;
+   __le64 flex32;

[PATCH 06/13] net/ionic: memcpy descriptors when using Q-in-CMB

2024-02-02 Thread Andrew Boyer
From: Neel Patel 

They can be batched together this way, reducing the number
of PCIe transactions. This improves transmit PPS by up to 50% in
some configurations.

Signed-off-by: Andrew Boyer 
Signed-off-by: Neel Patel 
---
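Illustrative note (not part of the patch): the batching idea, with a
simplified descriptor type and helper. Descriptors are staged in host
memory and pushed to the controller memory bar (CMB) with one memcpy()
per contiguous run instead of one posted PCIe write per descriptor;
the ring's cmb_head_idx tracks the CMB-side slot.

    #include <stdint.h>
    #include <string.h>

    struct txq_desc {
            uint64_t cmd;
            uint64_t addr;
    };

    static void
    push_descs_to_cmb(struct txq_desc *cmb_base, uint16_t cmb_head,
                      const struct txq_desc *host_base, uint16_t head,
                      uint16_t count)
    {
            /* One bulk copy covers 'count' descriptors' worth of
             * PCIe traffic. A ring wrap-around would split this
             * into two contiguous runs. */
            memcpy(&cmb_base[cmb_head], &host_base[head],
                   (size_t)count * sizeof(struct txq_desc));
    }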
 drivers/net/ionic/ionic_dev.c |  9 +++--
 drivers/net/ionic/ionic_dev.h |  6 ++-
 drivers/net/ionic/ionic_lif.c | 26 +
 drivers/net/ionic/ionic_rxtx.h| 56 +++
 drivers/net/ionic/ionic_rxtx_sg.c | 18 -
 drivers/net/ionic/ionic_rxtx_simple.c | 18 -
 6 files changed, 101 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ionic/ionic_dev.c b/drivers/net/ionic/ionic_dev.c
index 70c14882ed..7f15914f74 100644
--- a/drivers/net/ionic/ionic_dev.c
+++ b/drivers/net/ionic/ionic_dev.c
@@ -369,17 +369,19 @@ ionic_q_init(struct ionic_queue *q, uint32_t index, 
uint16_t num_descs)
q->index = index;
q->num_descs = num_descs;
q->size_mask = num_descs - 1;
-   q->head_idx = 0;
-   q->tail_idx = 0;
+   ionic_q_reset(q);
 
return 0;
 }
 
 void
-ionic_q_map(struct ionic_queue *q, void *base, rte_iova_t base_pa)
+ionic_q_map(struct ionic_queue *q, void *base, rte_iova_t base_pa,
+   void *cmb_base, rte_iova_t cmb_base_pa)
 {
q->base = base;
q->base_pa = base_pa;
+   q->cmb_base = cmb_base;
+   q->cmb_base_pa = cmb_base_pa;
 }
 
 void
@@ -393,5 +395,6 @@ void
 ionic_q_reset(struct ionic_queue *q)
 {
q->head_idx = 0;
+   q->cmb_head_idx = 0;
q->tail_idx = 0;
 }
diff --git a/drivers/net/ionic/ionic_dev.h b/drivers/net/ionic/ionic_dev.h
index 971c261b27..3a366247f1 100644
--- a/drivers/net/ionic/ionic_dev.h
+++ b/drivers/net/ionic/ionic_dev.h
@@ -145,11 +145,13 @@ struct ionic_queue {
uint16_t num_descs;
uint16_t num_segs;
uint16_t head_idx;
+   uint16_t cmb_head_idx;
uint16_t tail_idx;
uint16_t size_mask;
uint8_t type;
uint8_t hw_type;
void *base;
+   void *cmb_base;
void *sg_base;
struct ionic_doorbell __iomem *db;
void **info;
@@ -158,6 +160,7 @@ struct ionic_queue {
uint32_t hw_index;
rte_iova_t base_pa;
rte_iova_t sg_base_pa;
+   rte_iova_t cmb_base_pa;
 };
 
 #define IONIC_INTR_NONE(-1)
@@ -244,7 +247,8 @@ uint32_t ionic_cq_service(struct ionic_cq *cq, uint32_t 
work_to_do,
 
 int ionic_q_init(struct ionic_queue *q, uint32_t index, uint16_t num_descs);
 void ionic_q_reset(struct ionic_queue *q);
-void ionic_q_map(struct ionic_queue *q, void *base, rte_iova_t base_pa);
+void ionic_q_map(struct ionic_queue *q, void *base, rte_iova_t base_pa,
+void *cmb_base, rte_iova_t cmb_base_pa);
 void ionic_q_sg_map(struct ionic_queue *q, void *base, rte_iova_t base_pa);
 
 static inline uint16_t
diff --git a/drivers/net/ionic/ionic_lif.c b/drivers/net/ionic/ionic_lif.c
index fe2112c057..2713f8aa24 100644
--- a/drivers/net/ionic/ionic_lif.c
+++ b/drivers/net/ionic/ionic_lif.c
@@ -572,10 +572,11 @@ ionic_qcq_alloc(struct ionic_lif *lif,
 {
struct ionic_qcq *new;
uint32_t q_size, cq_size, sg_size, total_size;
-   void *q_base, *cq_base, *sg_base;
+   void *q_base, *cmb_q_base, *cq_base, *sg_base;
rte_iova_t q_base_pa = 0;
rte_iova_t cq_base_pa = 0;
rte_iova_t sg_base_pa = 0;
+   rte_iova_t cmb_q_base_pa = 0;
size_t page_size = rte_mem_page_size();
int err;
 
@@ -666,19 +667,22 @@ ionic_qcq_alloc(struct ionic_lif *lif,
IONIC_PRINT(ERR, "Cannot reserve queue from NIC mem");
return -ENOMEM;
}
-   q_base = (void *)
+   cmb_q_base = (void *)
((uintptr_t)lif->adapter->bars.bar[2].vaddr +
 (uintptr_t)lif->adapter->cmb_offset);
/* CMB PA is a relative address */
-   q_base_pa = lif->adapter->cmb_offset;
+   cmb_q_base_pa = lif->adapter->cmb_offset;
lif->adapter->cmb_offset += q_size;
+   } else {
+   cmb_q_base = NULL;
+   cmb_q_base_pa = 0;
}
 
IONIC_PRINT(DEBUG, "Q-Base-PA = %#jx CQ-Base-PA = %#jx "
"SG-base-PA = %#jx",
q_base_pa, cq_base_pa, sg_base_pa);
 
-   ionic_q_map(&new->q, q_base, q_base_pa);
+   ionic_q_map(&new->q, q_base, q_base_pa, cmb_q_base, cmb_q_base_pa);
ionic_cq_map(&new->cq, cq_base, cq_base_pa);
 
*qcq = new;
@@ -1583,7 +1587,6 @@ ionic_lif_txq_init(struct ionic_tx_qcq *txq)
.flags = rte_cpu_to_le_16(IONIC_QINIT_F_ENA),
.intr_index = rte_cpu_to_le_16(IONIC_INTR_NONE),
.ring_size = rte_log2_u32(q->num_descs),
-   .ring_base = rte_cpu_to_le_64(q->base_pa),
.cq_ring_base = rte_cpu_to_le_64(cq->base_pa),

[PATCH 10/13] net/ionic: fix device close sequence to avoid crash

2024-02-02 Thread Andrew Boyer
The close routine should release all resources, but not
call rte_eth_dev_destroy(). As written, this code will call
rte_eth_dev_release_port() twice and segfault.

Instead, move rte_eth_dev_destroy() to the remove routine.
eth_ionic_dev_uninit() will call close if necessary.

Fixes: 175e4e7ed760 ("net/ionic: complete release on close")
Cc: sta...@dpdk.org

Signed-off-by: Andrew Boyer 
---
 drivers/net/ionic/ionic_ethdev.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ionic/ionic_ethdev.c b/drivers/net/ionic/ionic_ethdev.c
index 7c55a26956..bedcf958e2 100644
--- a/drivers/net/ionic/ionic_ethdev.c
+++ b/drivers/net/ionic/ionic_ethdev.c
@@ -1009,19 +1009,21 @@ ionic_dev_close(struct rte_eth_dev *eth_dev)
 
ionic_lif_stop(lif);
 
-   ionic_lif_free_queues(lif);
-
IONIC_PRINT(NOTICE, "Removing device %s", eth_dev->device->name);
if (adapter->intf->unconfigure_intr)
(*adapter->intf->unconfigure_intr)(adapter);
 
-   rte_eth_dev_destroy(eth_dev, eth_ionic_dev_uninit);
-
ionic_port_reset(adapter);
ionic_reset(adapter);
+
+   ionic_lif_free_queues(lif);
+   ionic_lif_deinit(lif);
+   ionic_lif_free(lif); /* Does not free LIF object */
+
if (adapter->intf->unmap_bars)
(*adapter->intf->unmap_bars)(adapter);
 
+   lif->adapter = NULL;
rte_free(adapter);
 
return 0;
@@ -1098,21 +1100,18 @@ eth_ionic_dev_init(struct rte_eth_dev *eth_dev, void 
*init_params)
 static int
 eth_ionic_dev_uninit(struct rte_eth_dev *eth_dev)
 {
-   struct ionic_lif *lif = IONIC_ETH_DEV_TO_LIF(eth_dev);
-   struct ionic_adapter *adapter = lif->adapter;
-
IONIC_PRINT_CALL();
 
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
return 0;
 
-   adapter->lif = NULL;
-
-   ionic_lif_deinit(lif);
-   ionic_lif_free(lif);
+   if (eth_dev->state != RTE_ETH_DEV_UNUSED)
+   ionic_dev_close(eth_dev);
 
-   if (!(lif->state & IONIC_LIF_F_FW_RESET))
-   ionic_lif_reset(lif);
+   eth_dev->dev_ops = NULL;
+   eth_dev->rx_pkt_burst = NULL;
+   eth_dev->tx_pkt_burst = NULL;
+   eth_dev->tx_pkt_prepare = NULL;
 
return 0;
 }
@@ -1267,17 +1266,18 @@ eth_ionic_dev_remove(struct rte_device *rte_dev)
 {
char name[RTE_ETH_NAME_MAX_LEN];
struct rte_eth_dev *eth_dev;
+   int ret = 0;
 
/* Adapter lookup is using the eth_dev name */
snprintf(name, sizeof(name), "%s_lif", rte_dev->name);
 
eth_dev = rte_eth_dev_allocated(name);
if (eth_dev)
-   ionic_dev_close(eth_dev);
+   ret = rte_eth_dev_destroy(eth_dev, eth_ionic_dev_uninit);
else
IONIC_PRINT(DEBUG, "Cannot find device %s", rte_dev->name);
 
-   return 0;
+   return ret;
 }
 
 RTE_LOG_REGISTER_DEFAULT(ionic_logtype, NOTICE);
-- 
2.17.1
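
The fix hinges on release ordering: rte_eth_dev_destroy() must run exactly
once, from the remove path, and the uninit callback closes the port only if
it is still in use. A minimal standalone sketch of that single-release
pattern follows; the names are illustrative stand-ins, not the driver's
real API:

#include <stdbool.h>
#include <stdio.h>

struct port { bool used; };

static void port_close(struct port *p)
{
	p->used = false;	/* release all resources, exactly once */
	puts("closed");
}

static void port_uninit(struct port *p)
{
	if (p->used)		/* mirrors the RTE_ETH_DEV_UNUSED check */
		port_close(p);
}

static void port_remove(struct port *p)
{
	port_uninit(p);		/* stands in for rte_eth_dev_destroy() */
}

int main(void)
{
	struct port p = { .used = true };

	port_close(&p);		/* the app may close first... */
	port_remove(&p);	/* ...and remove then skips the second close */
	return 0;
}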



[PATCH 11/13] net/ionic: optimize device close operation

2024-02-02 Thread Andrew Boyer
Use a single device reset command to speed up dev_close(). The LIF stop
and port reset commands are not needed.
This reduces the outage window when restarting the process by about 2ms
plus another 1ms per queue.

Signed-off-by: Andrew Boyer 
---
 drivers/net/ionic/ionic_ethdev.c | 3 ---
 drivers/net/ionic/ionic_lif.c| 8 +---
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/net/ionic/ionic_ethdev.c b/drivers/net/ionic/ionic_ethdev.c
index bedcf958e2..7e80751846 100644
--- a/drivers/net/ionic/ionic_ethdev.c
+++ b/drivers/net/ionic/ionic_ethdev.c
@@ -1007,13 +1007,10 @@ ionic_dev_close(struct rte_eth_dev *eth_dev)
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
return 0;
 
-   ionic_lif_stop(lif);
-
IONIC_PRINT(NOTICE, "Removing device %s", eth_dev->device->name);
if (adapter->intf->unconfigure_intr)
(*adapter->intf->unconfigure_intr)(adapter);
 
-   ionic_port_reset(adapter);
ionic_reset(adapter);
 
ionic_lif_free_queues(lif);
diff --git a/drivers/net/ionic/ionic_lif.c b/drivers/net/ionic/ionic_lif.c
index 2713f8aa24..90efcc8cbb 100644
--- a/drivers/net/ionic/ionic_lif.c
+++ b/drivers/net/ionic/ionic_lif.c
@@ -1231,13 +1231,7 @@ ionic_lif_rss_setup(struct ionic_lif *lif)
 static void
 ionic_lif_rss_teardown(struct ionic_lif *lif)
 {
-   if (!lif->rss_ind_tbl)
-   return;
-
-   if (lif->rss_ind_tbl_z) {
-   /* Disable RSS on the NIC */
-   ionic_lif_rss_config(lif, 0x0, NULL, NULL);
-
+   if (lif->rss_ind_tbl) {
lif->rss_ind_tbl = NULL;
lif->rss_ind_tbl_pa = 0;
rte_memzone_free(lif->rss_ind_tbl_z);
-- 
2.17.1



[PATCH 12/13] net/ionic: optimize device stop operation

2024-02-02 Thread Andrew Boyer
Split the queue_stop operation into first-half and second-half helpers.
Move the command context from the stack into each Rx/Tx queue struct.
Expose some needed adminq interfaces.

This allows us to batch up the queue commands during dev_stop(), reducing
the outage window when restarting the process by about 1ms per queue.

Signed-off-by: Andrew Boyer 
---
 drivers/net/ionic/ionic.h  |  3 ++
 drivers/net/ionic/ionic_lif.c  | 81 --
 drivers/net/ionic/ionic_lif.h  | 12 +++--
 drivers/net/ionic/ionic_main.c | 17 ++-
 drivers/net/ionic/ionic_rxtx.c | 78 +++-
 drivers/net/ionic/ionic_rxtx.h | 14 +-
 6 files changed, 143 insertions(+), 62 deletions(-)

diff --git a/drivers/net/ionic/ionic.h b/drivers/net/ionic/ionic.h
index c479eaba74..cb4ea450a9 100644
--- a/drivers/net/ionic/ionic.h
+++ b/drivers/net/ionic/ionic.h
@@ -83,7 +83,10 @@ struct ionic_admin_ctx {
union ionic_adminq_comp comp;
 };
 
+int ionic_adminq_post(struct ionic_lif *lif, struct ionic_admin_ctx *ctx);
 int ionic_adminq_post_wait(struct ionic_lif *lif, struct ionic_admin_ctx *ctx);
+int ionic_adminq_wait(struct ionic_lif *lif, struct ionic_admin_ctx *ctx);
+uint16_t ionic_adminq_space_avail(struct ionic_lif *lif);
 
 int ionic_dev_cmd_wait_check(struct ionic_dev *idev, unsigned long max_wait);
 int ionic_setup(struct ionic_adapter *adapter);
diff --git a/drivers/net/ionic/ionic_lif.c b/drivers/net/ionic/ionic_lif.c
index 90efcc8cbb..8ffdbc4df7 100644
--- a/drivers/net/ionic/ionic_lif.c
+++ b/drivers/net/ionic/ionic_lif.c
@@ -31,42 +31,54 @@ static int ionic_lif_addr_add(struct ionic_lif *lif, const 
uint8_t *addr);
 static int ionic_lif_addr_del(struct ionic_lif *lif, const uint8_t *addr);
 
 static int
-ionic_qcq_disable(struct ionic_qcq *qcq)
+ionic_qcq_disable_nowait(struct ionic_qcq *qcq,
+   struct ionic_admin_ctx *ctx)
 {
+   int err;
+
struct ionic_queue *q = &qcq->q;
struct ionic_lif *lif = qcq->lif;
-   struct ionic_admin_ctx ctx = {
-   .pending_work = true,
-   .cmd.q_control = {
-   .opcode = IONIC_CMD_Q_CONTROL,
-   .type = q->type,
-   .index = rte_cpu_to_le_32(q->index),
-   .oper = IONIC_Q_DISABLE,
-   },
-   };
 
-   return ionic_adminq_post_wait(lif, &ctx);
+   memset(ctx, 0, sizeof(*ctx));
+   ctx->pending_work = true;
+   ctx->cmd.q_control.opcode = IONIC_CMD_Q_CONTROL;
+   ctx->cmd.q_control.type = q->type;
+   ctx->cmd.q_control.index = rte_cpu_to_le_32(q->index);
+   ctx->cmd.q_control.oper = IONIC_Q_DISABLE;
+
+   /* Does not wait for command completion */
+   err = ionic_adminq_post(lif, ctx);
+   if (err)
+   ctx->pending_work = false;
+   return err;
 }
 
 void
 ionic_lif_stop(struct ionic_lif *lif)
 {
-   uint32_t i;
+   struct rte_eth_dev *dev = lif->eth_dev;
+   uint32_t i, j, chunk;
 
IONIC_PRINT_CALL();
 
lif->state &= ~IONIC_LIF_F_UP;
 
-   for (i = 0; i < lif->nrxqcqs; i++) {
-   struct ionic_rx_qcq *rxq = lif->rxqcqs[i];
-   if (rxq->flags & IONIC_QCQ_F_INITED)
-   (void)ionic_dev_rx_queue_stop(lif->eth_dev, i);
+   chunk = ionic_adminq_space_avail(lif);
+
+   for (i = 0; i < lif->nrxqcqs; i += chunk) {
+   for (j = 0; j < chunk && i + j < lif->nrxqcqs; j++)
+   ionic_dev_rx_queue_stop_firsthalf(dev, i + j);
+
+   for (j = 0; j < chunk && i + j < lif->nrxqcqs; j++)
+   ionic_dev_rx_queue_stop_secondhalf(dev, i + j);
}
 
-   for (i = 0; i < lif->ntxqcqs; i++) {
-   struct ionic_tx_qcq *txq = lif->txqcqs[i];
-   if (txq->flags & IONIC_QCQ_F_INITED)
-   (void)ionic_dev_tx_queue_stop(lif->eth_dev, i);
+   for (i = 0; i < lif->ntxqcqs; i += chunk) {
+   for (j = 0; j < chunk && i + j < lif->ntxqcqs; j++)
+   ionic_dev_tx_queue_stop_firsthalf(dev, i + j);
+
+   for (j = 0; j < chunk && i + j < lif->ntxqcqs; j++)
+   ionic_dev_tx_queue_stop_secondhalf(dev, i + j);
}
 }
 
@@ -1240,21 +1252,42 @@ ionic_lif_rss_teardown(struct ionic_lif *lif)
 }
 
 void
-ionic_lif_txq_deinit(struct ionic_tx_qcq *txq)
+ionic_lif_txq_deinit_nowait(struct ionic_tx_qcq *txq)
 {
-   ionic_qcq_disable(&txq->qcq);
+   ionic_qcq_disable_nowait(&txq->qcq, &txq->admin_ctx);
 
txq->flags &= ~IONIC_QCQ_F_INITED;
 }
 
 void
-ionic_lif_rxq_deinit(struct ionic_rx_qcq *rxq)
+ionic_lif_txq_stats(struct ionic_tx_qcq *txq)
 {
-   ionic_qcq_disable(&rxq->qcq);
+   struct ionic_tx_stats *stats = &txq->stats;
+
+   IONIC_PRINT(DEBUG, "TX queue %u pkts %ju tso %ju",
+   txq->qcq.q.index, stats->packets, stats->tso);
+   IONIC_PRINT(DEBUG, "TX queue %u comp
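
The speedup comes from pipelining: up to ionic_adminq_space_avail() commands
are posted without blocking, and their completions are reaped in a second
pass. A self-contained sketch of that post-then-wait batching, with stub
commands standing in for the real adminq calls:

#include <stdio.h>

struct cmd_ctx { int pending; };

/* Illustrative stand-ins for ionic_adminq_post()/ionic_adminq_wait() */
static void post_cmd(struct cmd_ctx *ctx) { ctx->pending = 1; }
static void wait_cmd(struct cmd_ctx *ctx) { ctx->pending = 0; }

static void
stop_all(struct cmd_ctx *ctxs, unsigned int nq, unsigned int chunk)
{
	unsigned int i, j;

	for (i = 0; i < nq; i += chunk) {
		/* first half: enqueue a chunk of non-blocking commands */
		for (j = 0; j < chunk && i + j < nq; j++)
			post_cmd(&ctxs[i + j]);
		/* second half: wait for that chunk's completions */
		for (j = 0; j < chunk && i + j < nq; j++)
			wait_cmd(&ctxs[i + j]);
	}
}

int main(void)
{
	struct cmd_ctx ctxs[8] = { {0} };

	stop_all(ctxs, 8, 4);	/* two batches of four commands each */
	printf("all queues stopped\n");
	return 0;
}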

[PATCH 13/13] net/ionic: optimize device start operation

2024-02-02 Thread Andrew Boyer
Split the queue_start operation into first-half and second-half helpers.

This allows us to batch up the queue commands during dev_start(), reducing
the outage window when restarting the process by about 1ms per queue.

Signed-off-by: Andrew Boyer 
---
 drivers/net/ionic/ionic_lif.c  | 178 +
 drivers/net/ionic/ionic_lif.h  |   6 +-
 drivers/net/ionic/ionic_rxtx.c |  81 ---
 drivers/net/ionic/ionic_rxtx.h |  10 ++
 4 files changed, 194 insertions(+), 81 deletions(-)

diff --git a/drivers/net/ionic/ionic_lif.c b/drivers/net/ionic/ionic_lif.c
index 8ffdbc4df7..1937e48d9b 100644
--- a/drivers/net/ionic/ionic_lif.c
+++ b/drivers/net/ionic/ionic_lif.c
@@ -1598,52 +1598,61 @@ ionic_lif_set_features(struct ionic_lif *lif)
 }
 
 int
-ionic_lif_txq_init(struct ionic_tx_qcq *txq)
+ionic_lif_txq_init_nowait(struct ionic_tx_qcq *txq)
 {
struct ionic_qcq *qcq = &txq->qcq;
struct ionic_queue *q = &qcq->q;
struct ionic_lif *lif = qcq->lif;
struct ionic_cq *cq = &qcq->cq;
-   struct ionic_admin_ctx ctx = {
-   .pending_work = true,
-   .cmd.q_init = {
-   .opcode = IONIC_CMD_Q_INIT,
-   .type = q->type,
-   .ver = lif->qtype_info[q->type].version,
-   .index = rte_cpu_to_le_32(q->index),
-   .flags = rte_cpu_to_le_16(IONIC_QINIT_F_ENA),
-   .intr_index = rte_cpu_to_le_16(IONIC_INTR_NONE),
-   .ring_size = rte_log2_u32(q->num_descs),
-   .cq_ring_base = rte_cpu_to_le_64(cq->base_pa),
-   .sg_ring_base = rte_cpu_to_le_64(q->sg_base_pa),
-   },
-   };
+   struct ionic_admin_ctx *ctx = &txq->admin_ctx;
int err;
 
+   memset(ctx, 0, sizeof(*ctx));
+   ctx->pending_work = true;
+   ctx->cmd.q_init.opcode = IONIC_CMD_Q_INIT;
+   ctx->cmd.q_init.type = q->type;
+   ctx->cmd.q_init.ver = lif->qtype_info[q->type].version;
+   ctx->cmd.q_init.index = rte_cpu_to_le_32(q->index);
+   ctx->cmd.q_init.flags = rte_cpu_to_le_16(IONIC_QINIT_F_ENA);
+   ctx->cmd.q_init.intr_index = rte_cpu_to_le_16(IONIC_INTR_NONE);
+   ctx->cmd.q_init.ring_size = rte_log2_u32(q->num_descs);
+   ctx->cmd.q_init.cq_ring_base = rte_cpu_to_le_64(cq->base_pa);
+   ctx->cmd.q_init.sg_ring_base = rte_cpu_to_le_64(q->sg_base_pa);
+
if (txq->flags & IONIC_QCQ_F_SG)
-   ctx.cmd.q_init.flags |= rte_cpu_to_le_16(IONIC_QINIT_F_SG);
+   ctx->cmd.q_init.flags |= rte_cpu_to_le_16(IONIC_QINIT_F_SG);
if (txq->flags & IONIC_QCQ_F_CMB) {
-   ctx.cmd.q_init.flags |= rte_cpu_to_le_16(IONIC_QINIT_F_CMB);
-   ctx.cmd.q_init.ring_base = rte_cpu_to_le_64(q->cmb_base_pa);
+   ctx->cmd.q_init.flags |= rte_cpu_to_le_16(IONIC_QINIT_F_CMB);
+   ctx->cmd.q_init.ring_base = rte_cpu_to_le_64(q->cmb_base_pa);
} else {
-   ctx.cmd.q_init.ring_base = rte_cpu_to_le_64(q->base_pa);
+   ctx->cmd.q_init.ring_base = rte_cpu_to_le_64(q->base_pa);
}
 
IONIC_PRINT(DEBUG, "txq_init.index %d", q->index);
IONIC_PRINT(DEBUG, "txq_init.ring_base 0x%" PRIx64 "", q->base_pa);
IONIC_PRINT(DEBUG, "txq_init.ring_size %d",
-   ctx.cmd.q_init.ring_size);
-   IONIC_PRINT(DEBUG, "txq_init.ver %u", ctx.cmd.q_init.ver);
+   ctx->cmd.q_init.ring_size);
+   IONIC_PRINT(DEBUG, "txq_init.ver %u", ctx->cmd.q_init.ver);
 
ionic_q_reset(q);
ionic_cq_reset(cq);
 
-   err = ionic_adminq_post_wait(lif, &ctx);
+   /* Caller responsible for calling ionic_lif_txq_init_done() */
+   err = ionic_adminq_post(lif, ctx);
if (err)
-   return err;
+   ctx->pending_work = false;
+   return err;
+}
 
-   q->hw_type = ctx.comp.q_init.hw_type;
-   q->hw_index = rte_le_to_cpu_32(ctx.comp.q_init.hw_index);
+void
+ionic_lif_txq_init_done(struct ionic_tx_qcq *txq)
+{
+   struct ionic_lif *lif = txq->qcq.lif;
+   struct ionic_queue *q = &txq->qcq.q;
+   struct ionic_admin_ctx *ctx = &txq->admin_ctx;
+
+   q->hw_type = ctx->comp.q_init.hw_type;
+   q->hw_index = rte_le_to_cpu_32(ctx->comp.q_init.hw_index);
q->db = ionic_db_map(lif, q);
 
IONIC_PRINT(DEBUG, "txq->hw_type %d", q->hw_type);
@@ -1651,57 +1660,64 @@ ionic_lif_txq_init(struct ionic_tx_qcq *txq)
IONIC_PRINT(DEBUG, "txq->db %p", q->db);
 
txq->flags |= IONIC_QCQ_F_INITED;
-
-   return 0;
 }
 
 int
-ionic_lif_rxq_init(struct ionic_rx_qcq *rxq)
+ionic_lif_rxq_init_nowait(struct ionic_rx_qcq *rxq)
 {
struct ionic_qcq *qcq = &rxq->qcq;
struct ionic_queue *q = &qcq->q;
struct ionic_lif *lif = qcq->lif;
struct ionic_cq *cq = &qcq->cq;
-   struct ionic_admin_ctx ctx = {
-   .pending_work = tr

Re: [PATCH 1/1] eal: add C++ include guard in generic/rte_vect.h

2024-02-02 Thread Ashish Sadanandan
On Fri, Feb 2, 2024 at 2:41 AM Bruce Richardson 
wrote:

> On Fri, Feb 02, 2024 at 10:18:23AM +0100, Thomas Monjalon wrote:
> > 02/02/2024 06:13, Ashish Sadanandan:
> > > The header was missing the extern "C" directive which causes name
> > > mangling of functions by C++ compilers, leading to linker errors
> > > complaining of undefined references to these functions.
> > >
> > > Fixes: 86c743cf9140 ("eal: define generic vector types")
> > > Cc: nelio.laranje...@6wind.com
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Ashish Sadanandan 
> >
> > Thank you for improving C++ compatibility.
> >
> > I'm not sure what is best to fix it.
> > You are adding extern "C" in a file which is not directly included
> > by the user app. The same was done for rte_rwlock.h.
> > The other way is to make sure this include is in an extern "C" block
> > in lib/eal/*/include/rte_vect.h (instead of being before the block).
> >
> > I would like we use the same approach for all files.
> > Opinions?
> >
> I think just having the extern "C" guard in all files is the safest choice,
> because it's immediately obvious in each and every file that it is correct.
> Taking the other option, to check any indirect include file you need to go
> finding what other files include it and check there that a) they have
> include guards and b) the include for the indirect header is contained
> within it.
>
> Adopting the policy of putting the guard in each and every header is also a
> lot easier to do basic automated sanity checks on. If the file ends in .h,
> we just use grep to quickly verify it's not missing the guards. [Naturally,
> we can do more complete checks than that if we want, but 99% percent of
> misses can be picked up by a grep for the 'extern "C"' bit]
>
> /Bruce
>

100% agree with Bruce. It's a valid ideological argument that private
headers
don't need such safeguards, but it's difficult to enforce and easy to break
during refactoring.

- Ashish
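
For reference, the pattern under discussion is small and mechanical, which
is also what makes it easy to check for. A minimal sketch of a header
carrying its own guard (the header and function names are illustrative):

/* rte_example.h */
#ifndef RTE_EXAMPLE_H
#define RTE_EXAMPLE_H

#ifdef __cplusplus
extern "C" {
#endif

/* declarations keep C linkage when included from C++ */
int rte_example_do_thing(int value);

#ifdef __cplusplus
}
#endif

#endif /* RTE_EXAMPLE_H */

With the guard in the header itself, a simple grep over installed headers
for the 'extern "C"' string is enough to flag most misses, per Bruce's
point above.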


[PATCH v7 00/19] Replace uses of RTE_LOGTYPE_PMD

2024-02-02 Thread Stephen Hemminger
Many of the uses of PMD logtype have already been fixed.
But there are still some leftovers, mostly places where
drivers had a logtype but did not use it.

Note: this is not an ABI break, but could break out-of-tree
  drivers that never updated to use a dynamic logtype.
  DPDK never guaranteed that that would not happen.

v7 - drop changes to newlines
 drop changes related to RTE_LOG_DP
 rebase now that other stuff has changed

Stephen Hemminger (19):
  common/sfc_efx: remove use of PMD logtype
  mempool/dpaa2: use driver logtype not PMD
  net/dpaa: use dedicated logtype not PMD
  net/dpaa2: used dedicated logtype not PMD
  net/mrvl: do not use PMD logtype
  net/mvpp2: use dedicated logtype
  net/nfb: use dynamic logtype
  net/vmxnet3: used dedicated logtype not PMD
  raw/cnxk: replace PMD logtype with dynamic type
  crypto/scheduler: replace use of logtype PMD
  crypto/armv8: do not use PMD logtype
  crypto/caam_jr: use dedicated logtype
  crypto/ccp: do not use PMD logtype
  crypto/dpaa_sec, crypto/dpaa2_sec: use dedicated logtype
  event/dpaa, event/dpaa2: use dedicated logtype
  event/dlb2: use dedicated logtype
  event/skeleton: replace logtype PMD with dynamic type
  examples/fips_validation: replace use of PMD logtype
  log: remove PMD log type

 drivers/common/cnxk/roc_platform.h| 16 ---
 drivers/common/sfc_efx/sfc_efx.c  | 11 +
 drivers/common/sfc_efx/sfc_efx_log.h  |  2 +-
 drivers/crypto/armv8/rte_armv8_pmd.c  |  4 +-
 drivers/crypto/caam_jr/caam_jr.c  |  5 +--
 drivers/crypto/ccp/rte_ccp_pmd.c  | 11 +++--
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   |  6 +--
 drivers/crypto/dpaa_sec/dpaa_sec.c| 30 ++---
 drivers/crypto/scheduler/scheduler_pmd.c  |  4 +-
 drivers/event/dlb2/dlb2.c |  5 +--
 drivers/event/dpaa/dpaa_eventdev.c|  2 +-
 drivers/event/dpaa2/dpaa2_eventdev.c  |  4 +-
 drivers/event/dpaa2/dpaa2_eventdev_selftest.c |  6 +--
 drivers/event/skeleton/skeleton_eventdev.c|  4 +-
 drivers/event/skeleton/skeleton_eventdev.h|  8 +++-
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c  |  4 +-
 drivers/net/dpaa/dpaa_ethdev.c|  6 +--
 drivers/net/dpaa2/dpaa2_ethdev.c  |  2 +-
 drivers/net/dpaa2/dpaa2_sparser.c |  4 +-
 drivers/net/mvpp2/mrvl_ethdev.c   |  7 ++-
 drivers/net/nfb/nfb.h |  5 +++
 drivers/net/nfb/nfb_ethdev.c  | 20 -
 drivers/net/nfb/nfb_rx.c  | 10 ++---
 drivers/net/nfb/nfb_rx.h  |  2 +-
 drivers/net/nfb/nfb_tx.c  | 10 ++---
 drivers/net/nfb/nfb_tx.h  |  2 +-
 drivers/net/vmxnet3/vmxnet3_ethdev.c  |  2 +-
 drivers/raw/cnxk_bphy/cnxk_bphy.c |  3 +-
 drivers/raw/cnxk_bphy/cnxk_bphy_cgx.c |  2 +-
 drivers/raw/cnxk_bphy/cnxk_bphy_cgx_test.c| 31 +++--
 drivers/raw/cnxk_bphy/rte_pmd_bphy.h  |  6 +++
 drivers/raw/cnxk_gpio/cnxk_gpio.c | 21 -
 drivers/raw/cnxk_gpio/cnxk_gpio.h |  5 +++
 drivers/raw/cnxk_gpio/cnxk_gpio_selftest.c| 17 ---
 examples/fips_validation/fips_dev_self_test.c | 44 +--
 lib/log/log.c |  2 +-
 lib/log/rte_log.h |  2 +-
 37 files changed, 166 insertions(+), 159 deletions(-)

-- 
2.43.0
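
The conversion repeated across these patches follows one recipe. A hedged
sketch of what each driver ends up with, using illustrative "mydrv" names:

/* in the driver's log header */
extern int mydrv_logtype;
#define MYDRV_LOG(level, fmt, args...) \
	rte_log(RTE_LOG_ ## level, mydrv_logtype, \
		"mydrv: " fmt "\n", ## args)

/* in exactly one source file: allocates the logtype at startup and
 * registers it under the component's default name at NOTICE level */
RTE_LOG_REGISTER_DEFAULT(mydrv_logtype, NOTICE);

Call sites then change from RTE_LOG(INFO, PMD, ...) to MYDRV_LOG(INFO, ...),
and the log level becomes tunable per driver through the EAL --log-level
option instead of being shared across every PMD.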



[PATCH v7 01/19] common/sfc_efx: remove use of PMD logtype

2024-02-02 Thread Stephen Hemminger
This code was implemented in a slightly different manner
than all the other logging code (for no good reason).
Make it the same and handle errors in the same way as
the other drivers.

Signed-off-by: Stephen Hemminger 
---
 drivers/common/sfc_efx/sfc_efx.c | 11 ++-
 drivers/common/sfc_efx/sfc_efx_log.h |  2 +-
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/common/sfc_efx/sfc_efx.c b/drivers/common/sfc_efx/sfc_efx.c
index 2dc5545760b8..5eeffb065b0d 100644
--- a/drivers/common/sfc_efx/sfc_efx.c
+++ b/drivers/common/sfc_efx/sfc_efx.c
@@ -15,7 +15,7 @@
 #include "sfc_efx_log.h"
 #include "sfc_efx.h"
 
-uint32_t sfc_efx_logtype;
+int sfc_efx_logtype;
 
 static int
 sfc_efx_kvarg_dev_class_handler(__rte_unused const char *key,
@@ -117,11 +117,4 @@ sfc_efx_family(struct rte_pci_device *pci_dev,
return rc;
 }
 
-RTE_INIT(sfc_efx_register_logtype)
-{
-   int ret;
-
-   ret = rte_log_register_type_and_pick_level("pmd.common.sfc_efx",
-  RTE_LOG_NOTICE);
-   sfc_efx_logtype = (ret < 0) ? RTE_LOGTYPE_PMD : ret;
-}
+RTE_LOG_REGISTER_DEFAULT(sfc_efx_logtype, NOTICE);
diff --git a/drivers/common/sfc_efx/sfc_efx_log.h 
b/drivers/common/sfc_efx/sfc_efx_log.h
index 694455c1b14e..1519ebdc175f 100644
--- a/drivers/common/sfc_efx/sfc_efx_log.h
+++ b/drivers/common/sfc_efx/sfc_efx_log.h
@@ -11,7 +11,7 @@
 #define _SFC_EFX_LOG_H_
 
 /** Generic driver log type */
-extern uint32_t sfc_efx_logtype;
+extern int sfc_efx_logtype;
 
 /** Log message, add a prefix and a line break */
 #define SFC_EFX_LOG(level, ...) \
-- 
2.43.0



[PATCH v7 02/19] mempool/dpaa2: use driver logtype not PMD

2024-02-02 Thread Stephen Hemminger
The driver already has macros for logging; use them.
Fixes: 7ed359909556 ("mempool/dpaa2: add functions for CMDIF")

Signed-off-by: Stephen Hemminger 
---
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c 
b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c
index 84371d5d1abb..4c9245cb814c 100644
--- a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c
+++ b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c
@@ -293,7 +293,7 @@ rte_dpaa2_mbuf_pool_bpid(struct rte_mempool *mp)
 
bp_info = mempool_to_bpinfo(mp);
if (!(bp_info->bp_list)) {
-   RTE_LOG(ERR, PMD, "DPAA2 buffer pool not configured\n");
+   DPAA2_MEMPOOL_ERR("DPAA2 buffer pool not configured");
return -ENOMEM;
}
 
@@ -307,7 +307,7 @@ rte_dpaa2_mbuf_from_buf_addr(struct rte_mempool *mp, void 
*buf_addr)
 
bp_info = mempool_to_bpinfo(mp);
if (!(bp_info->bp_list)) {
-   RTE_LOG(ERR, PMD, "DPAA2 buffer pool not configured\n");
+   DPAA2_MEMPOOL_ERR("DPAA2 buffer pool not configured");
return NULL;
}
 
-- 
2.43.0



[PATCH v7 03/19] net/dpaa: use dedicated logtype not PMD

2024-02-02 Thread Stephen Hemminger
The driver already has a logtype, but it was not used in a
couple of places.

Fixes: 6b10d1f7bdea ("net/dpaa: update process specific device info")
Fixes: c2c4f87b1259 ("net: add macro for MAC address print")

Signed-off-by: Stephen Hemminger 
---
 drivers/net/dpaa/dpaa_ethdev.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dpaa/dpaa_ethdev.c b/drivers/net/dpaa/dpaa_ethdev.c
index ef4c06db6a4d..bb2de5de801c 100644
--- a/drivers/net/dpaa/dpaa_ethdev.c
+++ b/drivers/net/dpaa/dpaa_ethdev.c
@@ -2096,8 +2096,8 @@ dpaa_dev_init(struct rte_eth_dev *eth_dev)
/* copy the primary mac address */
rte_ether_addr_copy(&fman_intf->mac_addr, ð_dev->data->mac_addrs[0]);
 
-   RTE_LOG(INFO, PMD, "net: dpaa: %s: " RTE_ETHER_ADDR_PRT_FMT "\n",
-   dpaa_device->name, RTE_ETHER_ADDR_BYTES(&fman_intf->mac_addr));
+   DPAA_PMD_INFO("net: dpaa: %s: " RTE_ETHER_ADDR_PRT_FMT,
+ dpaa_device->name, 
RTE_ETHER_ADDR_BYTES(&fman_intf->mac_addr));
 
if (!fman_intf->is_shared_mac) {
/* Configure error packet handling */
@@ -2166,7 +2166,7 @@ rte_dpaa_probe(struct rte_dpaa_driver *dpaa_drv,
 
ret = dpaa_dev_init_secondary(eth_dev);
if (ret != 0) {
-   RTE_LOG(ERR, PMD, "secondary dev init failed\n");
+   DPAA_PMD_ERR("secondary dev init failed");
return ret;
}
 
-- 
2.43.0



[PATCH v7 04/19] net/dpaa2: used dedicated logtype not PMD

2024-02-02 Thread Stephen Hemminger
The driver has a logtype, but it was not being used in one place.

Fixes: f023d059769f ("net/dpaa2: support recycle loopback port")
Fixes: 72ec7a678e70 ("net/dpaa2: add soft parser driver")

Signed-off-by: Stephen Hemminger 
---
 drivers/net/dpaa2/dpaa2_ethdev.c  | 2 +-
 drivers/net/dpaa2/dpaa2_sparser.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dpaa2/dpaa2_ethdev.c b/drivers/net/dpaa2/dpaa2_ethdev.c
index 8e610b6bba30..91846fcd2f23 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.c
+++ b/drivers/net/dpaa2/dpaa2_ethdev.c
@@ -2851,7 +2851,7 @@ dpaa2_dev_init(struct rte_eth_dev *eth_dev)
return ret;
}
}
-   RTE_LOG(INFO, PMD, "%s: netdev created, connected to %s\n",
+   DPAA2_PMD_INFO("%s: netdev created, connected to %s",
eth_dev->data->name, dpaa2_dev->ep_name);
 
return 0;
diff --git a/drivers/net/dpaa2/dpaa2_sparser.c 
b/drivers/net/dpaa2/dpaa2_sparser.c
index 63463c4fbfd6..36a14526a5c5 100644
--- a/drivers/net/dpaa2/dpaa2_sparser.c
+++ b/drivers/net/dpaa2/dpaa2_sparser.c
@@ -181,7 +181,7 @@ int dpaa2_eth_load_wriop_soft_parser(struct dpaa2_dev_priv 
*priv,
 
priv->ss_iova = (uint64_t)(DPAA2_VADDR_TO_IOVA(addr));
priv->ss_offset += sp_param.size;
-   RTE_LOG(INFO, PMD, "Soft parser loaded for dpni@%d\n", priv->hw_id);
+   DPAA2_PMD_INFO("Soft parser loaded for dpni@%d", priv->hw_id);
 
rte_free(addr);
return 0;
@@ -234,6 +234,6 @@ int dpaa2_eth_enable_wriop_soft_parser(struct 
dpaa2_dev_priv *priv,
}
 
rte_free(param_addr);
-   RTE_LOG(INFO, PMD, "Soft parser enabled for dpni@%d\n", priv->hw_id);
+   DPAA2_PMD_INFO("Soft parser enabled for dpni@%d", priv->hw_id);
return 0;
 }
-- 
2.43.0



[PATCH v7 05/19] net/mrvl: do not use PMD logtype

2024-02-02 Thread Stephen Hemminger
Use the same logtype as the rest of the driver.

Fixes: 9e79d810911d ("net/mvpp2: support Tx scatter/gather")
Signed-off-by: Stephen Hemminger 
---
 drivers/net/mvpp2/mrvl_ethdev.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/mvpp2/mrvl_ethdev.c b/drivers/net/mvpp2/mrvl_ethdev.c
index c12364941d62..1ca13e8b89d7 100644
--- a/drivers/net/mvpp2/mrvl_ethdev.c
+++ b/drivers/net/mvpp2/mrvl_ethdev.c
@@ -2976,8 +2976,7 @@ mrvl_tx_sg_pkt_burst(void *txq, struct rte_mbuf **tx_pkts,
 */
if (nb_segs > PP2_PPIO_DESC_NUM_FRAGS) {
total_descs -= nb_segs;
-   RTE_LOG(ERR, PMD,
-   "Too many segments. Packet won't be sent.\n");
+   MRVL_LOG(ERR, "Too many segments. Packet won't be 
sent.");
break;
}
 
-- 
2.43.0



[PATCH v7 07/19] net/nfb: use dynamic logtype

2024-02-02 Thread Stephen Hemminger
All drivers should be using a dynamic logtype.

Fixes: 6435f9a0ac22 ("net/nfb: add new netcope driver")
Signed-off-by: Stephen Hemminger 
---
 drivers/net/nfb/nfb.h|  5 +
 drivers/net/nfb/nfb_ethdev.c | 20 +---
 drivers/net/nfb/nfb_rx.c | 10 +-
 drivers/net/nfb/nfb_rx.h |  2 +-
 drivers/net/nfb/nfb_tx.c | 10 +-
 drivers/net/nfb/nfb_tx.h |  2 +-
 6 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/drivers/net/nfb/nfb.h b/drivers/net/nfb/nfb.h
index 7dc5bd29e44c..45226ee3d938 100644
--- a/drivers/net/nfb/nfb.h
+++ b/drivers/net/nfb/nfb.h
@@ -12,6 +12,11 @@
 #include 
 #include 
 
+extern int nfb_logtype;
+#define NFB_LOG(level, fmt, args...) \
+   rte_log(RTE_LOG_ ## level, nfb_logtype, "%s(): " fmt "\n", \
+   __func__, ## args)
+
 #include "nfb_rx.h"
 #include "nfb_tx.h"
 
diff --git a/drivers/net/nfb/nfb_ethdev.c b/drivers/net/nfb/nfb_ethdev.c
index defd118bd0ee..da9e4167ea69 100644
--- a/drivers/net/nfb/nfb_ethdev.c
+++ b/drivers/net/nfb/nfb_ethdev.c
@@ -192,8 +192,7 @@ nfb_eth_dev_configure(struct rte_eth_dev *dev __rte_unused)
(&nfb_timestamp_dynfield_offset,
&nfb_timestamp_rx_dynflag);
if (ret != 0) {
-   RTE_LOG(ERR, PMD, "Cannot register Rx timestamp"
-   " field/flag %d\n", ret);
+   NFB_LOG(ERR, "Cannot register Rx timestamp field/flag 
%d", ret);
nfb_close(internals->nfb);
return -rte_errno;
}
@@ -520,7 +519,7 @@ nfb_eth_dev_init(struct rte_eth_dev *dev)
struct rte_ether_addr eth_addr_init;
struct rte_kvargs *kvlist;
 
-   RTE_LOG(INFO, PMD, "Initializing NFB device (" PCI_PRI_FMT ")\n",
+   NFB_LOG(INFO, "Initializing NFB device (" PCI_PRI_FMT ")",
pci_addr->domain, pci_addr->bus, pci_addr->devid,
pci_addr->function);
 
@@ -536,7 +535,7 @@ nfb_eth_dev_init(struct rte_eth_dev *dev)
kvlist = rte_kvargs_parse(dev->device->devargs->args,
VALID_KEYS);
if (kvlist == NULL) {
-   RTE_LOG(ERR, PMD, "Failed to parse device arguments %s",
+   NFB_LOG(ERR, "Failed to parse device arguments %s",
dev->device->devargs->args);
rte_kvargs_free(kvlist);
return -EINVAL;
@@ -551,14 +550,14 @@ nfb_eth_dev_init(struct rte_eth_dev *dev)
 */
internals->nfb = nfb_open(internals->nfb_dev);
if (internals->nfb == NULL) {
-   RTE_LOG(ERR, PMD, "nfb_open(): failed to open %s",
+   NFB_LOG(ERR, "nfb_open(): failed to open %s",
internals->nfb_dev);
return -EINVAL;
}
data->nb_rx_queues = ndp_get_rx_queue_available_count(internals->nfb);
data->nb_tx_queues = ndp_get_tx_queue_available_count(internals->nfb);
 
-   RTE_LOG(INFO, PMD, "Available NDP queues RX: %u TX: %u\n",
+   NFB_LOG(INFO, "Available NDP queues RX: %u TX: %u",
data->nb_rx_queues, data->nb_tx_queues);
 
nfb_nc_rxmac_init(internals->nfb,
@@ -583,7 +582,7 @@ nfb_eth_dev_init(struct rte_eth_dev *dev)
data->mac_addrs = rte_zmalloc(data->name,
sizeof(struct rte_ether_addr) * mac_count, RTE_CACHE_LINE_SIZE);
if (data->mac_addrs == NULL) {
-   RTE_LOG(ERR, PMD, "Could not alloc space for MAC address!\n");
+   NFB_LOG(ERR, "Could not alloc space for MAC address");
nfb_close(internals->nfb);
return -EINVAL;
}
@@ -601,8 +600,7 @@ nfb_eth_dev_init(struct rte_eth_dev *dev)
 
dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;
 
-   RTE_LOG(INFO, PMD, "NFB device ("
-   PCI_PRI_FMT ") successfully initialized\n",
+   NFB_LOG(INFO, "NFB device (" PCI_PRI_FMT ") successfully initialized",
pci_addr->domain, pci_addr->bus, pci_addr->devid,
pci_addr->function);
 
@@ -626,8 +624,7 @@ nfb_eth_dev_uninit(struct rte_eth_dev *dev)
 
nfb_eth_dev_close(dev);
 
-   RTE_LOG(INFO, PMD, "NFB device ("
-   PCI_PRI_FMT ") successfully uninitialized\n",
+   NFB_LOG(INFO, "NFB device (" PCI_PRI_FMT ") successfully uninitialized",
pci_addr->domain, pci_addr->bus, pci_addr->devid,
pci_addr->function);
 
@@ -690,3 +687,4 @@ static struct rte_pci_driver nfb_eth_driver = {
 RTE_PMD_REGISTER_PCI(RTE_NFB_DRIVER_NAME, nfb_eth_driver);
 RTE_PMD_REGISTER_PCI_TABLE(RTE_NFB_DRIVER_NAME, nfb_pci_id_table);
 RTE_PMD_REGISTER_KMOD_DEP(RTE_NFB_DRIVER_NAME, "* nfb");
+RTE_LOG_REGISTER_DEFAULT(nfb_logtype, NOTICE);
diff --git a/drivers/net/nfb/nfb_rx.c b/drivers/net/nfb/nfb_rx.c
index 8a9b232305f2..f72

[PATCH v7 06/19] net/mvpp2: use dedicated logtype

2024-02-02 Thread Stephen Hemminger
Always use the dedicated logtype, not PMD.

Fixes: 9e79d810911d ("net/mvpp2: support Tx scatter/gather")
Signed-off-by: Stephen Hemminger 
---
 drivers/net/mvpp2/mrvl_ethdev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mvpp2/mrvl_ethdev.c b/drivers/net/mvpp2/mrvl_ethdev.c
index 1ca13e8b89d7..a91509d92afb 100644
--- a/drivers/net/mvpp2/mrvl_ethdev.c
+++ b/drivers/net/mvpp2/mrvl_ethdev.c
@@ -415,10 +415,10 @@ mrvl_set_tx_function(struct rte_eth_dev *dev)
 
/* Use a simple Tx queue (no offloads, no multi segs) if possible */
if (priv->multiseg) {
-   RTE_LOG(INFO, PMD, "Using multi-segment tx callback\n");
+   MRVL_LOG(INFO, "Using multi-segment tx callback");
dev->tx_pkt_burst = mrvl_tx_sg_pkt_burst;
} else {
-   RTE_LOG(INFO, PMD, "Using single-segment tx callback\n");
+   MRVL_LOG(INFO, "Using single-segment tx callback");
dev->tx_pkt_burst = mrvl_tx_pkt_burst;
}
 }
-- 
2.43.0



[PATCH v7 08/19] net/vmxnet3: used dedicated logtype not PMD

2024-02-02 Thread Stephen Hemminger
The driver has log macros; they were just not used in one place.

Fixes: 046f11619567 ("net/vmxnet3: support MSI-X interrupt")
Signed-off-by: Stephen Hemminger 
---
 drivers/net/vmxnet3/vmxnet3_ethdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vmxnet3/vmxnet3_ethdev.c 
b/drivers/net/vmxnet3/vmxnet3_ethdev.c
index e49191718aea..4fd704045fc4 100644
--- a/drivers/net/vmxnet3/vmxnet3_ethdev.c
+++ b/drivers/net/vmxnet3/vmxnet3_ethdev.c
@@ -1919,7 +1919,7 @@ vmxnet3_interrupt_handler(void *param)
if (events == 0)
goto done;
 
-   RTE_LOG(DEBUG, PMD, "Reading events: 0x%X", events);
+   PMD_DRV_LOG(DEBUG, "Reading events: 0x%X", events);
vmxnet3_process_events(dev);
 done:
vmxnet3_enable_intr(hw, *eventIntrIdx);
-- 
2.43.0



[PATCH v7 09/19] raw/cnxk: replace PMD logtype with dynamic type

2024-02-02 Thread Stephen Hemminger
Drivers should not be using the PMD logtype; they should have their
own logtype.

Signed-off-by: Stephen Hemminger 
---
 drivers/common/cnxk/roc_platform.h | 16 ++-
 drivers/raw/cnxk_bphy/cnxk_bphy.c  |  3 ++-
 drivers/raw/cnxk_bphy/cnxk_bphy_cgx.c  |  2 +-
 drivers/raw/cnxk_bphy/cnxk_bphy_cgx_test.c | 31 +++---
 drivers/raw/cnxk_bphy/rte_pmd_bphy.h   |  6 +
 drivers/raw/cnxk_gpio/cnxk_gpio.c  | 21 ---
 drivers/raw/cnxk_gpio/cnxk_gpio.h  |  5 
 drivers/raw/cnxk_gpio/cnxk_gpio_selftest.c | 17 ++--
 8 files changed, 57 insertions(+), 44 deletions(-)

diff --git a/drivers/common/cnxk/roc_platform.h 
b/drivers/common/cnxk/roc_platform.h
index ba23b2e0d79e..9d2ea8f00965 100644
--- a/drivers/common/cnxk/roc_platform.h
+++ b/drivers/common/cnxk/roc_platform.h
@@ -265,11 +265,13 @@ extern int cnxk_logtype_tm;
 extern int cnxk_logtype_ree;
 extern int cnxk_logtype_dpi;
 
+#define RTE_LOGTYPE_CNXK cnxk_logtype_base
+
 #define plt_err(fmt, args...)  
\
-   RTE_LOG(ERR, PMD, "%s():%u " fmt "\n", __func__, __LINE__, ##args)
-#define plt_info(fmt, args...) RTE_LOG(INFO, PMD, fmt "\n", ##args)
-#define plt_warn(fmt, args...) RTE_LOG(WARNING, PMD, fmt "\n", ##args)
-#define plt_print(fmt, args...) RTE_LOG(INFO, PMD, fmt "\n", ##args)
+   RTE_LOG(ERR, CNXK, "%s():%u " fmt "\n", __func__, __LINE__, ##args)
+#define plt_info(fmt, args...) RTE_LOG(INFO, CNXK, fmt "\n", ##args)
+#define plt_warn(fmt, args...) RTE_LOG(WARNING, CNXK, fmt "\n", ##args)
+#define plt_print(fmt, args...) RTE_LOG(INFO, CNXK, fmt "\n", ##args)
 #define plt_dump(fmt, ...)  fprintf(stderr, fmt "\n", ##__VA_ARGS__)
 #define plt_dump_no_nl(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)
 
@@ -296,11 +298,11 @@ extern int cnxk_logtype_dpi;
 
 /* Datapath logs */
 #define plt_dp_err(fmt, args...)   
\
-   RTE_LOG_DP(ERR, PMD, "%s():%u " fmt "\n", __func__, __LINE__, ##args)
+   RTE_LOG_DP(ERR, CNXK, "%s():%u " fmt "\n", __func__, __LINE__, ##args)
 #define plt_dp_info(fmt, args...)  
\
-   RTE_LOG_DP(INFO, PMD, "%s():%u " fmt "\n", __func__, __LINE__, ##args)
+   RTE_LOG_DP(INFO, CNXK, "%s():%u " fmt "\n", __func__, __LINE__, ##args)
 #define plt_dp_dbg(fmt, args...)  \
-   RTE_LOG_DP(DEBUG, PMD, "%s():%u " fmt "\n", __func__, __LINE__, ##args)
+   RTE_LOG_DP(DEBUG, CNXK, "%s():%u " fmt "\n", __func__, __LINE__, ##args)
 
 #ifdef __cplusplus
 #define CNXK_PCI_ID(subsystem_dev, dev)
\
diff --git a/drivers/raw/cnxk_bphy/cnxk_bphy.c 
b/drivers/raw/cnxk_bphy/cnxk_bphy.c
index 15dbc4c1a637..1dbab6fb3e12 100644
--- a/drivers/raw/cnxk_bphy/cnxk_bphy.c
+++ b/drivers/raw/cnxk_bphy/cnxk_bphy.c
@@ -251,7 +251,7 @@ cnxk_bphy_irq_enqueue_bufs(struct rte_rawdev *dev,
 
/* get rid of last response if any */
if (qp->rsp) {
-   RTE_LOG(WARNING, PMD, "Previous response got overwritten\n");
+   CNXK_BPHY_LOG(WARNING, "Previous response got overwritten");
rte_free(qp->rsp);
}
qp->rsp = rsp;
@@ -410,3 +410,4 @@ static struct rte_pci_driver cnxk_bphy_rawdev_pmd = {
 RTE_PMD_REGISTER_PCI(bphy_rawdev_pci_driver, cnxk_bphy_rawdev_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(bphy_rawdev_pci_driver, pci_bphy_map);
 RTE_PMD_REGISTER_KMOD_DEP(bphy_rawdev_pci_driver, "vfio-pci");
+RTE_LOG_REGISTER_SUFFIX(cnxk_logtype_bphy, bphy, INFO);
diff --git a/drivers/raw/cnxk_bphy/cnxk_bphy_cgx.c 
b/drivers/raw/cnxk_bphy/cnxk_bphy_cgx.c
index 2d8466ef918b..4358aeecc3e5 100644
--- a/drivers/raw/cnxk_bphy/cnxk_bphy_cgx.c
+++ b/drivers/raw/cnxk_bphy/cnxk_bphy_cgx.c
@@ -189,7 +189,7 @@ cnxk_bphy_cgx_process_buf(struct cnxk_bphy_cgx *cgx, 
unsigned int queue,
 
/* get rid of last response if any */
if (qp->rsp) {
-   RTE_LOG(WARNING, PMD, "Previous response got overwritten\n");
+   CNXK_BPHY_LOG(WARNING, "Previous response got overwritten");
rte_free(qp->rsp);
}
qp->rsp = rsp;
diff --git a/drivers/raw/cnxk_bphy/cnxk_bphy_cgx_test.c 
b/drivers/raw/cnxk_bphy/cnxk_bphy_cgx_test.c
index a3021b4bb7db..f01d958661ad 100644
--- a/drivers/raw/cnxk_bphy/cnxk_bphy_cgx_test.c
+++ b/drivers/raw/cnxk_bphy/cnxk_bphy_cgx_test.c
@@ -57,62 +57,61 @@ cnxk_bphy_cgx_dev_selftest(uint16_t dev_id)
if (ret)
break;
if (descs != 1) {
-   RTE_LOG(ERR, PMD, "Wrong number of descs reported\n");
+   CNXK_BPHY_LOG(ERR, "Wrong number of descs reported");
ret = -ENODEV;
break;
}
 
-   RTE_LOG(INFO, PMD, "Testing queue %d\n", i);
+   CNXK_BPHY_LOG(INFO, "Testing queue %d", i);
 
   

[PATCH v7 10/19] crypto/scheduler: replace use of logtype PMD

2024-02-02 Thread Stephen Hemminger
The driver has a logging macro, but it is not used everywhere.

Fixes: 6760463c9f26 ("crypto/scheduler: add mode-specific threshold parameter")
Signed-off-by: Stephen Hemminger 
---
 drivers/crypto/scheduler/scheduler_pmd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/scheduler/scheduler_pmd.c 
b/drivers/crypto/scheduler/scheduler_pmd.c
index 589d092d7457..95ce893f0540 100644
--- a/drivers/crypto/scheduler/scheduler_pmd.c
+++ b/drivers/crypto/scheduler/scheduler_pmd.c
@@ -197,8 +197,8 @@ cryptodev_scheduler_create(const char *name,
return -EINVAL;
}
 
-   RTE_LOG(INFO, PMD, "  Sched mode param (%s = %s)\n",
-   param_name, param_val);
+   CR_SCHED_LOG(INFO, "  Sched mode param (%s = %s)",
+param_name, param_val);
}
}
 
-- 
2.43.0



[PATCH v7 11/19] crypto/armv8: do not use PMD logtype

2024-02-02 Thread Stephen Hemminger
The driver already has logging macros; they were just not used in one place.

Fixes: 169ca3db550c ("crypto/armv8: add PMD optimized for ARMv8 processors")
Signed-off-by: Stephen Hemminger 
---
 drivers/crypto/armv8/rte_armv8_pmd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/armv8/rte_armv8_pmd.c 
b/drivers/crypto/armv8/rte_armv8_pmd.c
index 824a2cc7352a..026cdf5105dd 100644
--- a/drivers/crypto/armv8/rte_armv8_pmd.c
+++ b/drivers/crypto/armv8/rte_armv8_pmd.c
@@ -833,8 +833,8 @@ cryptodev_armv8_crypto_uninit(struct rte_vdev_device *vdev)
if (name == NULL)
return -EINVAL;
 
-   RTE_LOG(INFO, PMD,
-   "Closing ARMv8 crypto device %s on numa socket %u\n",
+   ARVM8_CRYPTO_LOG_INFO(
+   "Closing ARMv8 crypto device %s on numa socket %u",
name, rte_socket_id());
 
cryptodev = rte_cryptodev_pmd_get_named_dev(name);
-- 
2.43.0



[PATCH v7 12/19] crypto/caam_jr: use dedicated logtype

2024-02-02 Thread Stephen Hemminger
The driver has a logging macro and logtype, but they are not used in a
couple of places.

Fixes: af7c9b5e9ce7 ("crypto/caam_jr: introduce basic driver")
Signed-off-by: Stephen Hemminger 
---
 drivers/crypto/caam_jr/caam_jr.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/crypto/caam_jr/caam_jr.c b/drivers/crypto/caam_jr/caam_jr.c
index 9c96fd21a48d..a6c0cdc93135 100644
--- a/drivers/crypto/caam_jr/caam_jr.c
+++ b/drivers/crypto/caam_jr/caam_jr.c
@@ -2346,7 +2346,7 @@ caam_jr_dev_init(const char *name,
 
rte_cryptodev_pmd_probing_finish(dev);
 
-   RTE_LOG(INFO, PMD, "%s cryptodev init\n", dev->data->name);
+   CAAM_JR_LOG(INFO, "%s cryptodev init", dev->data->name);
 
return 0;
 
@@ -2386,8 +2386,7 @@ cryptodev_caam_jr_probe(struct rte_vdev_device *vdev)
 
ret = of_init();
if (ret) {
-   RTE_LOG(ERR, PMD,
-   "of_init failed\n");
+   CAAM_JR_LOG(ERR, "of_init failed");
return -EINVAL;
}
/* if sec device version is not configured */
-- 
2.43.0



[PATCH v7 13/19] crypto/ccp: do not use PMD logtype

2024-02-02 Thread Stephen Hemminger
This driver has logging macros, but they are not used consistently.

Fixes: ef4b04f87fa6 ("crypto/ccp: support device init")
Signed-off-by: Stephen Hemminger 
---
 drivers/crypto/ccp/rte_ccp_pmd.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/crypto/ccp/rte_ccp_pmd.c b/drivers/crypto/ccp/rte_ccp_pmd.c
index a5271d72273d..869399597ef1 100644
--- a/drivers/crypto/ccp/rte_ccp_pmd.c
+++ b/drivers/crypto/ccp/rte_ccp_pmd.c
@@ -194,8 +194,7 @@ cryptodev_ccp_remove(struct rte_pci_device *pci_dev)
 
ccp_pmd_init_done = 0;
 
-   RTE_LOG(INFO, PMD, "Closing ccp device %s on numa socket %u\n",
-   name, rte_socket_id());
+   CCP_LOG_INFO("Closing ccp device %s on numa socket %u", name, 
rte_socket_id());
 
return rte_cryptodev_pmd_destroy(dev);
 }
@@ -279,7 +278,7 @@ cryptodev_ccp_probe(struct rte_pci_driver *pci_drv 
__rte_unused,
};
 
if (ccp_pmd_init_done) {
-   RTE_LOG(INFO, PMD, "CCP PMD already initialized\n");
+   CCP_LOG_INFO("CCP PMD already initialized");
return -EFAULT;
}
rte_pci_device_name(&pci_dev->addr, name, sizeof(name));
@@ -288,11 +287,11 @@ cryptodev_ccp_probe(struct rte_pci_driver *pci_drv 
__rte_unused,
 
init_params.def_p.max_nb_queue_pairs = CCP_PMD_MAX_QUEUE_PAIRS;
 
-   RTE_LOG(INFO, PMD, "Initialising %s on NUMA node %d\n", name,
+   CCP_LOG_INFO("Initialising %s on NUMA node %d", name,
init_params.def_p.socket_id);
-   RTE_LOG(INFO, PMD, "Max number of queue pairs = %d\n",
+   CCP_LOG_INFO("Max number of queue pairs = %d",
init_params.def_p.max_nb_queue_pairs);
-   RTE_LOG(INFO, PMD, "Authentication offload to %s\n",
+   CCP_LOG_INFO("Authentication offload to %s",
((init_params.auth_opt == 0) ? "CCP" : "CPU"));
 
rte_pci_device_name(&pci_dev->addr, name, sizeof(name));
-- 
2.43.0



[PATCH v7 14/19] crypto/dpaa_sec, crypto/dpaa2_sec: use dedicated logtype

2024-02-02 Thread Stephen Hemminger
A couple of messages were using RTE_LOGTYPE_PMD when a dedicated
logtype was already available.

Fixes: fe3688ba7950 ("crypto/dpaa_sec: support event crypto adapter")
Fixes: bffc7d561c81 ("crypto/dpaa2_sec: support event crypto adapter")
Signed-off-by: Stephen Hemminger 
---
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c |  6 ++---
 drivers/crypto/dpaa_sec/dpaa_sec.c  | 30 ++---
 2 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c 
b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
index bb5a2c629e53..1cae6c45059e 100644
--- a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
+++ b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
@@ -4153,7 +4153,7 @@ dpaa2_sec_eventq_attach(const struct rte_cryptodev *dev,
ret = dpseci_set_opr(dpseci, CMD_PRI_LOW, priv->token,
   qp_id, OPR_OPT_CREATE, &ocfg);
if (ret) {
-   RTE_LOG(ERR, PMD, "Error setting opr: ret: %d\n", ret);
+   DPAA2_SEC_ERR("Error setting opr: ret: %d", ret);
return ret;
}
qp->tx_vq.cb_eqresp_free = dpaa2_sec_free_eqresp_buf;
@@ -4163,7 +4163,7 @@ dpaa2_sec_eventq_attach(const struct rte_cryptodev *dev,
ret = dpseci_set_rx_queue(dpseci, CMD_PRI_LOW, priv->token,
  qp_id, &cfg);
if (ret) {
-   RTE_LOG(ERR, PMD, "Error in dpseci_set_queue: ret: %d\n", ret);
+   DPAA2_SEC_ERR("Error in dpseci_set_queue: ret: %d", ret);
return ret;
}
 
@@ -4188,7 +4188,7 @@ dpaa2_sec_eventq_detach(const struct rte_cryptodev *dev,
ret = dpseci_set_rx_queue(dpseci, CMD_PRI_LOW, priv->token,
  qp_id, &cfg);
if (ret)
-   RTE_LOG(ERR, PMD, "Error in dpseci_set_queue: ret: %d\n", ret);
+   DPAA2_SEC_ERR("Error in dpseci_set_queue: ret: %d", ret);
 
return ret;
 }
diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c 
b/drivers/crypto/dpaa_sec/dpaa_sec.c
index a301e8edb2a4..e7ebcbe2af22 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -102,7 +102,7 @@ ern_sec_fq_handler(struct qman_portal *qm __rte_unused,
   struct qman_fq *fq,
   const struct qm_mr_entry *msg)
 {
-   DPAA_SEC_DP_ERR("sec fq %d error, RC = %x, seqnum = %x\n",
+   DPAA_SEC_DP_ERR("sec fq %d error, RC = %x, seqnum = %x",
fq->fqid, msg->ern.rc, msg->ern.seqnum);
 }
 
@@ -849,7 +849,7 @@ dpaa_sec_deq(struct dpaa_sec_qp *qp, struct rte_crypto_op 
**ops, int nb_ops)
op->status = RTE_CRYPTO_OP_STATUS_SUCCESS;
} else {
if (dpaa_sec_dp_dump > DPAA_SEC_DP_NO_DUMP) {
-   DPAA_SEC_DP_WARN("SEC return err:0x%x\n",
+   DPAA_SEC_DP_WARN("SEC return err:0x%x",
  ctx->fd_status);
if (dpaa_sec_dp_dump > DPAA_SEC_DP_ERR_DUMP)
dpaa_sec_dump(ctx, qp);
@@ -1943,8 +1943,7 @@ dpaa_sec_enqueue_burst(void *qp, struct rte_crypto_op 
**ops,
}
} else if (unlikely(ses->qp[rte_lcore_id() %
MAX_DPAA_CORES] != qp)) {
-   DPAA_SEC_DP_ERR("Old:sess->qp = %p"
-   " New qp = %p\n",
+   DPAA_SEC_DP_ERR("Old: sess->qp = %p New qp = 
%p",
ses->qp[rte_lcore_id() %
MAX_DPAA_CORES], qp);
frames_to_send = loop;
@@ -2054,7 +2053,7 @@ dpaa_sec_enqueue_burst(void *qp, struct rte_crypto_op 
**ops,
fd->cmd = 0x8000 |
*((uint32_t *)((uint8_t *)op +
ses->pdcp.hfn_ovd_offset));
-   DPAA_SEC_DP_DEBUG("Per packet HFN: %x, 
ovd:%u\n",
+   DPAA_SEC_DP_DEBUG("Per packet HFN: %x, ovd:%u",
*((uint32_t *)((uint8_t *)op +
ses->pdcp.hfn_ovd_offset)),
ses->pdcp.hfn_ovd);
@@ -2095,7 +2094,7 @@ dpaa_sec_dequeue_burst(void *qp, struct rte_crypto_op 
**ops,
dpaa_qp->rx_pkts += num_rx;
dpaa_qp->rx_errs += nb_ops - num_rx;
 
-   DPAA_SEC_DP_DEBUG("SEC Received %d Packets\n", num_rx);
+   DPAA_SEC_DP_DEBUG("SEC Received %d Packets", num_rx);
 
return num_rx;
 }
@@ -2158,7 +2157,7 @@ dpaa_sec_queue_pair_setup(struct rte_cryptodev *dev, 
uint16_t qp_id,
NULL, NULL, NULL, NULL,
   

[PATCH v7 15/19] event/dpaa, event/dpaa2: use dedicated logtype

2024-02-02 Thread Stephen Hemminger
Do not use RTE_LOGTYPE_PMD.

Fixes: b0f66a68ca74 ("event/dpaa: support crypto adapter")
Fixes: 4ab57b042e7c ("event/dpaa2: affine portal at runtime during I/O")
Signed-off-by: Stephen Hemminger 
---
 drivers/event/dpaa/dpaa_eventdev.c| 2 +-
 drivers/event/dpaa2/dpaa2_eventdev.c  | 4 ++--
 drivers/event/dpaa2/dpaa2_eventdev_selftest.c | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/event/dpaa/dpaa_eventdev.c 
b/drivers/event/dpaa/dpaa_eventdev.c
index 46a9b88c73ae..a8e5c3421df1 100644
--- a/drivers/event/dpaa/dpaa_eventdev.c
+++ b/drivers/event/dpaa/dpaa_eventdev.c
@@ -1025,7 +1025,7 @@ dpaa_event_dev_create(const char *name, const char 
*params, struct rte_vdev_devi
eventdev->txa_enqueue = dpaa_eventdev_txa_enqueue;
eventdev->txa_enqueue_same_dest = dpaa_eventdev_txa_enqueue_same_dest;
 
-   RTE_LOG(INFO, PMD, "%s eventdev added", name);
+   DPAA_EVENTDEV_INFO("%s eventdev added", name);
 
/* For secondary processes, the primary has done all the work */
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/drivers/event/dpaa2/dpaa2_eventdev.c 
b/drivers/event/dpaa2/dpaa2_eventdev.c
index dd4e64395fe5..85c2dbd998dd 100644
--- a/drivers/event/dpaa2/dpaa2_eventdev.c
+++ b/drivers/event/dpaa2/dpaa2_eventdev.c
@@ -1141,7 +1141,7 @@ dpaa2_eventdev_create(const char *name, struct 
rte_vdev_device *vdev)
priv->max_event_queues++;
} while (dpcon_dev && dpci_dev);
 
-   RTE_LOG(INFO, PMD, "%s eventdev created\n", name);
+   DPAA2_EVENTDEV_INFO("%s eventdev created", name);
 
 done:
event_dev_probing_finish(eventdev);
@@ -1178,7 +1178,7 @@ dpaa2_eventdev_destroy(const char *name)
}
priv->max_event_queues = 0;
 
-   RTE_LOG(INFO, PMD, "%s eventdev cleaned\n", name);
+   DPAA2_EVENTDEV_INFO("%s eventdev cleaned", name);
return 0;
 }
 
diff --git a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c 
b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
index 427aff4b..9d4938efe6aa 100644
--- a/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
+++ b/drivers/event/dpaa2/dpaa2_eventdev_selftest.c
@@ -774,15 +774,15 @@ static void dpaa2_test_run(int (*setup)(void), void 
(*tdown)(void),
int (*test)(void), const char *name)
 {
if (setup() < 0) {
-   RTE_LOG(INFO, PMD, "Error setting up test %s", name);
+   DPAA2_EVENTDEV_INFO("Error setting up test %s", name);
unsupported++;
} else {
if (test() < 0) {
failed++;
-   RTE_LOG(INFO, PMD, "%s Failed\n", name);
+   DPAA2_EVENTDEV_INFO("%s Failed", name);
} else {
passed++;
-   RTE_LOG(INFO, PMD, "%s Passed", name);
+   DPAA2_EVENTDEV_INFO("%s Passed", name);
}
}
 
-- 
2.43.0



[PATCH v7 16/19] event/dlb2: use dedicated logtype

2024-02-02 Thread Stephen Hemminger
The driver was using RTE_LOGTYPE_PMD when it had its own logtype.
Fixes: 5433956d5185 ("event/dlb2: add eventdev probe")

Signed-off-by: Stephen Hemminger 
---
 drivers/event/dlb2/dlb2.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index 050ace0904b4..c26f2219d40c 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -4741,9 +4741,8 @@ dlb2_parse_params(const char *params,
struct rte_kvargs *kvlist = rte_kvargs_parse(params, args);
 
if (kvlist == NULL) {
-   RTE_LOG(INFO, PMD,
-   "Ignoring unsupported parameters when creating 
device '%s'\n",
-   name);
+   DLB2_LOG_INFO("Ignoring unsupported parameters when 
creating device '%s'",
+ name);
} else {
int ret = rte_kvargs_process(kvlist, NUMA_NODE_ARG,
 set_numa_node,
-- 
2.43.0



[PATCH v7 17/19] event/skeleton: replace logtype PMD with dynamic type

2024-02-02 Thread Stephen Hemminger
The skeleton is supposed to match current best practices.
Change it to use a dynamic logtype.

Signed-off-by: Stephen Hemminger 
---
 drivers/event/skeleton/skeleton_eventdev.c | 4 ++--
 drivers/event/skeleton/skeleton_eventdev.h | 8 ++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/event/skeleton/skeleton_eventdev.c 
b/drivers/event/skeleton/skeleton_eventdev.c
index 7df032b7daa1..848b3be82c40 100644
--- a/drivers/event/skeleton/skeleton_eventdev.c
+++ b/drivers/event/skeleton/skeleton_eventdev.c
@@ -457,8 +457,7 @@ skeleton_eventdev_probe(struct rte_vdev_device *vdev)
const char *name;
 
name = rte_vdev_device_name(vdev);
-   RTE_LOG(INFO, PMD, "Initializing %s on NUMA node %d\n", name,
-   rte_socket_id());
+   PMD_DRV_LOG(INFO, "Initializing %s on NUMA node %d", name, 
rte_socket_id());
return skeleton_eventdev_create(name, rte_socket_id(), vdev);
 }
 
@@ -479,3 +478,4 @@ static struct rte_vdev_driver vdev_eventdev_skeleton_pmd = {
 };
 
 RTE_PMD_REGISTER_VDEV(EVENTDEV_NAME_SKELETON_PMD, vdev_eventdev_skeleton_pmd);
+RTE_LOG_REGISTER_DEFAULT(skeleton_eventdev_logtype, INFO);
diff --git a/drivers/event/skeleton/skeleton_eventdev.h 
b/drivers/event/skeleton/skeleton_eventdev.h
index 9193f45f4782..9c1ed4ec5755 100644
--- a/drivers/event/skeleton/skeleton_eventdev.h
+++ b/drivers/event/skeleton/skeleton_eventdev.h
@@ -8,9 +8,12 @@
 #include 
 #include 
 
+extern int skeleton_eventdev_logtype;
+
 #ifdef RTE_LIBRTE_PMD_SKELETON_EVENTDEV_DEBUG
 #define PMD_DRV_LOG(level, fmt, args...) \
-   RTE_LOG(level, PMD, "%s(): " fmt "\n", __func__, ## args)
+   rte_log(RTE_LOG_ ## level, skeleton_eventdev_logtype, \
+   "%s(): " fmt "\n", __func__, ## args)
 #define PMD_DRV_FUNC_TRACE() PMD_DRV_LOG(DEBUG, ">>")
 #else
 #define PMD_DRV_LOG(level, fmt, args...) do { } while (0)
@@ -18,7 +21,8 @@
 #endif
 
 #define PMD_DRV_ERR(fmt, args...) \
-   RTE_LOG(ERR, PMD, "%s(): " fmt "\n", __func__, ## args)
+   rte_log(RTE_LOG_ERR, skeleton_eventdev_logtype, \
+   "%s(): " fmt "\n", __func__, ## args)
 
 struct skeleton_eventdev {
uintptr_t reg_base;
-- 
2.43.0



[PATCH v7 18/19] examples/fips_validation: replace use of PMD logtype

2024-02-02 Thread Stephen Hemminger
Replace PMD with USER1, since that logtype is already used in main.

Fixes: 41d561cbdd24 ("examples/fips_validation: add power on self test")
Signed-off-by: Stephen Hemminger 
---
 examples/fips_validation/fips_dev_self_test.c | 44 +--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/examples/fips_validation/fips_dev_self_test.c 
b/examples/fips_validation/fips_dev_self_test.c
index b17f664a5eda..667f5da4e7d9 100644
--- a/examples/fips_validation/fips_dev_self_test.c
+++ b/examples/fips_validation/fips_dev_self_test.c
@@ -1032,7 +1032,7 @@ prepare_cipher_xform(uint8_t dev_id,
 
cap = rte_cryptodev_sym_capability_get(dev_id, &cap_idx);
if (!cap) {
-   RTE_LOG(ERR, PMD, "Failed to get capability for cdev %u\n",
+   RTE_LOG(ERR, USER1, "Failed to get capability for cdev %u\n",
dev_id);
return -EACCES;
}
@@ -1040,7 +1040,7 @@ prepare_cipher_xform(uint8_t dev_id,
if (rte_cryptodev_sym_capability_check_cipher(cap,
cipher_xform->key.length,
cipher_xform->iv.length) != 0) {
-   RTE_LOG(ERR, PMD, "PMD %s key length %u IV length %u\n",
+   RTE_LOG(ERR, USER1, "PMD %s key length %u IV length %u\n",
rte_cryptodev_name_get(dev_id),
cipher_xform->key.length,
cipher_xform->iv.length);
@@ -1088,7 +1088,7 @@ prepare_auth_xform(uint8_t dev_id,
 
cap = rte_cryptodev_sym_capability_get(dev_id, &cap_idx);
if (!cap) {
-   RTE_LOG(ERR, PMD, "Failed to get capability for cdev %u\n",
+   RTE_LOG(ERR, USER1, "Failed to get capability for cdev %u\n",
dev_id);
return -EACCES;
}
@@ -1096,7 +1096,7 @@ prepare_auth_xform(uint8_t dev_id,
if (rte_cryptodev_sym_capability_check_auth(cap,
auth_xform->key.length,
auth_xform->digest_length, 0) != 0) {
-   RTE_LOG(ERR, PMD, "PMD %s key length %u Digest length %u\n",
+   RTE_LOG(ERR, USER1, "PMD %s key length %u Digest length %u\n",
rte_cryptodev_name_get(dev_id),
auth_xform->key.length,
auth_xform->digest_length);
@@ -1147,7 +1147,7 @@ prepare_aead_xform(uint8_t dev_id,
 
cap = rte_cryptodev_sym_capability_get(dev_id, &cap_idx);
if (!cap) {
-   RTE_LOG(ERR, PMD, "Failed to get capability for cdev %u\n",
+   RTE_LOG(ERR, USER1, "Failed to get capability for cdev %u\n",
dev_id);
return -EACCES;
}
@@ -1156,7 +1156,7 @@ prepare_aead_xform(uint8_t dev_id,
aead_xform->key.length,
aead_xform->digest_length, aead_xform->aad_length,
aead_xform->iv.length) != 0) {
-   RTE_LOG(ERR, PMD,
+   RTE_LOG(ERR, USER1,
"PMD %s key_len %u tag_len %u aad_len %u iv_len %u\n",
rte_cryptodev_name_get(dev_id),
aead_xform->key.length,
@@ -1195,7 +1195,7 @@ prepare_cipher_op(struct rte_crypto_op *op,
 
dst = (uint8_t *)rte_pktmbuf_append(mbuf, len);
if (!dst) {
-   RTE_LOG(ERR, PMD, "Error %i: MBUF too small\n", -ENOMEM);
+   RTE_LOG(ERR, USER1, "Error %i: MBUF too small\n", -ENOMEM);
return -ENOMEM;
}
 
@@ -1219,7 +1219,7 @@ prepare_auth_op(struct rte_crypto_op *op,
uint8_t *dst;
 
if (vec->input.len + vec->digest.len > RTE_MBUF_MAX_NB_SEGS) {
-   RTE_LOG(ERR, PMD, "Error %i: Test data too long (%u).\n",
+   RTE_LOG(ERR, USER1, "Error %i: Test data too long (%u).\n",
-ENOMEM, vec->input.len + vec->digest.len);
return -ENOMEM;
}
@@ -1229,7 +1229,7 @@ prepare_auth_op(struct rte_crypto_op *op,
dst = (uint8_t *)rte_pktmbuf_append(mbuf, vec->input.len +
vec->digest.len);
if (!dst) {
-   RTE_LOG(ERR, PMD, "Error %i: MBUF too small\n", -ENOMEM);
+   RTE_LOG(ERR, USER1, "Error %i: MBUF too small\n", -ENOMEM);
return -ENOMEM;
}
 
@@ -1274,7 +1274,7 @@ prepare_aead_op(struct rte_crypto_op *op,
memcpy(iv, vec->iv.data, vec->iv.len);
 
if (len + vec->digest.len > RTE_MBUF_MAX_NB_SEGS) {
-   RTE_LOG(ERR, PMD, "Error %i: Test data too long (%u).\n",
+   RTE_LOG(ERR, USER1, "Error %i: Test data too long (%u).\n",
-ENOMEM, len + vec->digest.len);
return -ENOMEM;
}
@@ -1282,7 +1282,7 @@ prepare_aead_op(struct rte_crypto_op *op,
dst = (uint8_t *)rte_pktmb

[PATCH v7 19/19] log: remove PMD log type

2024-02-02 Thread Stephen Hemminger
All uses of PMD logtype in core DPDK have been replaced
by dynamic types.

Signed-off-by: Stephen Hemminger 
---
 lib/log/log.c | 2 +-
 lib/log/rte_log.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/log/log.c b/lib/log/log.c
index 4cb07121b915..255f757d94cc 100644
--- a/lib/log/log.c
+++ b/lib/log/log.c
@@ -352,7 +352,7 @@ struct logtype {
 
 static const struct logtype logtype_strings[] = {
{RTE_LOGTYPE_EAL,"lib.eal"},
-   {RTE_LOGTYPE_PMD,"pmd"},
+
{RTE_LOGTYPE_USER1,  "user1"},
{RTE_LOGTYPE_USER2,  "user2"},
{RTE_LOGTYPE_USER3,  "user3"},
diff --git a/lib/log/rte_log.h b/lib/log/rte_log.h
index 47ab63635e0d..fbc0df74ca6a 100644
--- a/lib/log/rte_log.h
+++ b/lib/log/rte_log.h
@@ -32,7 +32,7 @@ extern "C" {
 /* was RTE_LOGTYPE_RING */
 /* was RTE_LOGTYPE_MEMPOOL */
 /* was RTE_LOGTYPE_TIMER */
-#define RTE_LOGTYPE_PMD5 /**< Log related to poll mode driver. */
+/* was RTE_LOGTYPE_PMD */
 /* was RTE_LOGTYPE_HASH */
 /* was RTE_LOGTYPE_LPM */
 /* was RTE_LOGTYPE_KNI */
-- 
2.43.0



Re: [PATCH 01/13] net/ionic: add stat for completion queue entries processed

2024-02-02 Thread Stephen Hemminger
On Fri, 2 Feb 2024 11:32:26 -0800
Andrew Boyer  wrote:

> When completion coalescing is turned on in the FW, there will be
> fewer CQE than Tx packets. Expose the stat through debug logging.
> 
> Signed-off-by: Andrew Boyer 

If you care about the stat it should be in xstats.
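For reference, a minimal sketch of what exposing such a counter through
the ethdev xstats callbacks could look like. The stats struct and the
"tx_q0_cqe_processed" name are hypothetical, not taken from the ionic
driver:

#include <stdio.h>
#include <rte_common.h>
#include <rte_ethdev.h>

/* Hypothetical per-queue counter; not the actual ionic stats layout. */
struct txq_stats {
	uint64_t cqe_processed;
};

static const char * const xstat_names[] = { "tx_q0_cqe_processed" };

static int
dev_xstats_get_names(struct rte_eth_dev *dev __rte_unused,
		     struct rte_eth_xstat_name *names, unsigned int size)
{
	unsigned int i;

	if (names == NULL || size < RTE_DIM(xstat_names))
		return RTE_DIM(xstat_names);
	for (i = 0; i < RTE_DIM(xstat_names); i++)
		snprintf(names[i].name, sizeof(names[i].name), "%s",
			 xstat_names[i]);
	return RTE_DIM(xstat_names);
}

static int
dev_xstats_get(struct rte_eth_dev *dev, struct rte_eth_xstat *xstats,
	       unsigned int n)
{
	/* Simplified: a real driver would walk all queues. */
	struct txq_stats *stats = dev->data->tx_queues[0];

	if (xstats == NULL || n < 1)
		return 1;
	xstats[0].id = 0;
	xstats[0].value = stats->cqe_processed;
	return 1;
}

These two callbacks are then wired up via the driver's eth_dev_ops
(xstats_get_names / xstats_get), making the counter visible to
rte_eth_xstats_get() without debug logging.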


Re: [PATCH 13/13] net/ionic: optimize device start operation

2024-02-02 Thread Stephen Hemminger
On Fri, 2 Feb 2024 11:32:38 -0800
Andrew Boyer  wrote:

> + memset(ctx, 0, sizeof(*ctx));
> + ctx->pending_work = true;
> + ctx->cmd.q_init.opcode = IONIC_CMD_Q_INIT;
> + ctx->cmd.q_init.type = q->type;
> + ctx->cmd.q_init.ver = lif->qtype_info[q->type].version;
> + ctx->cmd.q_init.index = rte_cpu_to_le_32(q->index);
> + ctx->cmd.q_init.flags = rte_cpu_to_le_16(IONIC_QINIT_F_ENA);
> + ctx->cmd.q_init.intr_index = rte_cpu_to_le_16(IONIC_INTR_NONE);
> + ctx->cmd.q_init.ring_size = rte_log2_u32(q->num_descs);
> + ctx->cmd.q_init.cq_ring_base = rte_cpu_to_le_64(cq->base_pa);
> + ctx->cmd.q_init.sg_ring_base = rte_cpu_to_le_64(q->sg_base_pa);
> +

memset followed by assignment is technically slower than structure
initialization because it requires two writes to the data.
But the optimizer may in some cases figure that out.
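A generic illustration of the two approaches (types and values here are
made up, not ionic's):

#include <string.h>

struct q_init_cmd {
	unsigned char opcode;
	unsigned char type;
	unsigned int  index;
};

/* Variant 1: zero the whole struct, then overwrite some fields, so the
 * initialized bytes are written twice. */
static void init_memset(struct q_init_cmd *cmd)
{
	memset(cmd, 0, sizeof(*cmd));
	cmd->opcode = 1;
	cmd->index = 42;
}

/* Variant 2: compound-literal assignment; members not named in the
 * initializer are zero-initialized, so the compiler can emit a single
 * store per field. */
static void init_literal(struct q_init_cmd *cmd)
{
	*cmd = (struct q_init_cmd){ .opcode = 1, .index = 42 };
}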


Re: [PATCH v2 2/2] net/octeon_ep: add Rx NEON routine

2024-02-02 Thread Jerin Jacob
On Fri, Feb 2, 2024 at 7:29 AM  wrote:
>
> From: Pavan Nikhilesh 
>
> Add Rx ARM NEON SIMD routine.
>
> Signed-off-by: Pavan Nikhilesh 

Please fix https://mails.dpdk.org/archives/test-report/2024-February/568395.html


Re: [PATCH 01/11] eventdev: introduce ML event adapter library

2024-02-02 Thread Jerin Jacob
On Sun, Jan 7, 2024 at 9:05 PM Srikanth Yalavarthi
 wrote:
>
> Introduce event ML adapter APIs. This patch provides
> information on adapter modes and usage. Application
> can use this event adapter interface to transfer
> packets between ML device and event device.
>
> Signed-off-by: Srikanth Yalavarthi 

1) Please use --thread --in-reply-to when replying, so the series is
threaded with the RFC patch
2) Please add a change log
3) In order to merge this series, we need to have unit tests. They can
be based on the SW adapter.
4) Added other event adapter maintainers for review, in case anyone is
interested.


> ---
>  MAINTAINERS   |6 +
>  config/rte_config.h   |1 +
>  doc/api/doxy-api-index.md |1 +
>  doc/guides/prog_guide/event_ml_adapter.rst|  268 
>  doc/guides/prog_guide/eventdev.rst|   10 +-
>  .../img/event_ml_adapter_op_forward.svg   | 1086 +
>  .../img/event_ml_adapter_op_new.svg   | 1079 
>  doc/guides/prog_guide/index.rst   |1 +
>  lib/eventdev/meson.build  |4 +-
>  lib/eventdev/rte_event_ml_adapter.c   |6 +
>  lib/eventdev/rte_event_ml_adapter.h   |  594 +
>  lib/eventdev/rte_eventdev.h   |   45 +
>  lib/meson.build   |2 +-
>  lib/mldev/rte_mldev.h |6 +

I prefer this change to come via the main tree. Is it possible to move
that change into a separate patch?

> diff --git a/MAINTAINERS b/MAINTAINERS
> index 0d1c8126e3e..a1125e93621 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -554,6 +554,12 @@ F: drivers/raw/skeleton/
>  F: app/test/test_rawdev.c
>  F: doc/guides/prog_guide/rawdev.rst
>
> +Eventdev ML Adapter API


Add after "Eventdev DMA Adapter API" section

> +M: Srikanth Yalavarthi 
> +T: git://dpdk.org/next/dpdk-next-eventdev
> +F: lib/eventdev/*ml_adapter*
> +F: doc/guides/prog_guide/event_ml_adapter.rst
> +
>

> index 000..71f6c4b5974
> --- /dev/null
> +++ b/doc/guides/prog_guide/event_ml_adapter.rst
> @@ -0,0 +1,268 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +Copyright (c) 2024 Marvell.
> +
> +Event ML Adapter Library
> +
> +
> +DPDK :doc:`Eventdev library ` provides event driven programming 
> model with features
> +to schedule events. :doc:`ML Device library ` provides an interface 
> to ML poll mode
> +drivers that support Machine Learning inference operations. Event ML Adapter 
> is intended to
> +bridge between the event device and the ML device.
> +
> +Packet flow from ML device to the event device can be accomplished using 
> software and hardware
> +based transfer mechanisms. The adapter queries an eventdev PMD to determine 
> which mechanism to
> +be used. The adapter uses an EAL service core function for software based 
> packet transfer and
> +uses the eventdev PMD functions to configure hardware based packet transfer 
> between ML device
> +and the event device. ML adapter uses a new event type called 
> ``RTE_EVENT_TYPE_MLDEV`` to
> +indicate the source of event.
> +
> +Application can choose to submit an ML operation directly to an ML device or 
> send it to an ML
> +adapter via eventdev based on RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD 
> capability. The

Use ``...`` everywhere across this document when using symbols
like RTE_EVENT_ML_ADAPTER_CAP_INTERNAL_PORT_OP_FWD

> +first mode is known as the event new (RTE_EVENT_ML_ADAPTER_OP_NEW) mode and 
> the second as the

Same as above. Please follow the same for the rest of the file.

> +event forward (RTE_EVENT_ML_ADAPTER_OP_FORWARD) mode. Choice of mode can be 
> specified while
> +creating the adapter. In the former mode, it is the application's 
> responsibility to enable
> +ingress packet ordering. In the latter mode, it is the adapter's 
> responsibility to enable
> +ingress packet ordering.
> +
> +
> +Adapter Modes
> +-
> +
> +RTE_EVENT_ML_ADAPTER_OP_NEW mode
> +
> +
> +In the RTE_EVENT_ML_ADAPTER_OP_NEW mode, application submits ML operations 
> directly to an ML
> +device. The adapter then dequeues ML completions from the ML device and 
> enqueues them as events
> +to the event device. This mode does not ensure ingress ordering as the 
> application directly
> +enqueues to the mldev without going through ML/atomic stage. In this mode, 
> events dequeued

Is this ML/atomic or just atomic?

> +from the adapter are treated as new events. The application has to specify 
> event information
> +(response information) which is needed to enqueue an event after the ML 
> operation is completed.
> +

> +
> +
> +API Overview
> +
> +
> +This section has a brief introduction to the event ML adapter APIs. The 
> application is expected
> +to create an adapter which is associated with a single eventdev, then add 
> mldev and queue pair
> +to the adapter instance.
> +
> +
> +Create an adapter in

Re: [PATCH v2 03/11] eventdev: update documentation on device capability flags

2024-02-02 Thread Mattias Rönnblom

On 2024-01-31 15:09, Bruce Richardson wrote:

On Tue, Jan 23, 2024 at 10:18:53AM +0100, Mattias Rönnblom wrote:

On 2024-01-19 18:43, Bruce Richardson wrote:

Update the device capability docs, to:

* include more cross-references
* split longer text into paragraphs, in most cases with each flag having
a single-line summary at the start of the doc block
* general comment rewording and clarification as appropriate

Signed-off-by: Bruce Richardson 
---
   lib/eventdev/rte_eventdev.h | 130 ++--
   1 file changed, 93 insertions(+), 37 deletions(-)




* If this capability is not set, the queue only supports events of the
- *  *RTE_SCHED_TYPE_* type that it was created with.
+ * *RTE_SCHED_TYPE_* type that it was created with.
+ * Any events of other types scheduled to the queue will be handled in an
+ * implementation-dependent manner. They may be dropped by the
+ * event device, or enqueued with the scheduling type adjusted to the
+ * correct/supported value.


Having the application set sched_type when it was already set at the
level of the queue never made sense to me.

I can't see any reasons why this field shouldn't be ignored by the event
device on non-RTE_EVENT_QUEUE_CFG_ALL_TYPES queues.

If the behavior is indeed undefined, I think it's better to just say
"undefined" rather than the above speculation.



Updating in v3 to just say it's undefined.


*
- * @see RTE_SCHED_TYPE_* values



   #define RTE_EVENT_DEV_CAP_RUNTIME_QUEUE_ATTR (1ULL << 11)
   /**< Event device is capable of changing the queue attributes at runtime i.e
- * after rte_event_queue_setup() or rte_event_start() call sequence. If this
- * flag is not set, eventdev queue attributes can only be configured during
+ * after rte_event_queue_setup() or rte_event_dev_start() call sequence.
+ *
+ * If this flag is not set, eventdev queue attributes can only be configured 
during
* rte_event_queue_setup().


"event queue" or just "queue".


Ack.


+ *
+ * @see rte_event_queue_setup
*/
   #define RTE_EVENT_DEV_CAP_PROFILE_LINK (1ULL << 12)
-/**< Event device is capable of supporting multiple link profiles per event 
port
- * i.e., the value of `rte_event_dev_info::max_profiles_per_port` is greater
- * than one.
+/**< Event device is capable of supporting multiple link profiles per event 
port.
+ *
+ *
+ * When set, the value of `rte_event_dev_info::max_profiles_per_port` is 
greater
+ * than one, and multiple profiles may be configured and then switched at 
runtime.
+ * If not set, only a single profile may be configured, which may itself be
+ * runtime adjustable (if @ref RTE_EVENT_DEV_CAP_RUNTIME_PORT_LINK is set).
+ *
+ * @see rte_event_port_profile_links_set rte_event_port_profile_links_get
+ * @see rte_event_port_profile_switch
+ * @see RTE_EVENT_DEV_CAP_RUNTIME_PORT_LINK
*/
   /* Event device priority levels */
   #define RTE_EVENT_DEV_PRIORITY_HIGHEST   0
-/**< Highest priority expressed across eventdev subsystem
+/**< Highest priority expressed across eventdev subsystem.


"The highest priority an event device may support."
or
"The highest priority any event device may support."

Maybe this is a further improvement, beyond punctuation? "across eventdev
subsystem" sounds awkward.



Still not very clear. Talking about device support implies that it's
possible some devices may not support it. How about:
"highest priority level for events and queues".



Sounds good. I guess it's totally, 100% obvious highest means most urgent?

Otherwise, "highest (i.e., most urgent) priority level for events queues"


Re: [PATCH 4/4] ml/cnxk: add adapter dequeue function

2024-02-02 Thread Jerin Jacob
On Sun, Jan 7, 2024 at 11:39 PM Srikanth Yalavarthi
 wrote:
>
> Implemented ML adapter dequeue function.
>
> Signed-off-by: Srikanth Yalavarthi 

Update the release notes for this new feature in the PMD section.


[PATCH] dmadev: standardize alignment and allocation

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Align fp_objects to the cache line size, and allocate
the device and fp_object memory on hugepages.

Signed-off-by: Pavan Nikhilesh 
---
 lib/dmadev/rte_dmadev.c  | 6 ++
 lib/dmadev/rte_dmadev_core.h | 2 +-
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/lib/dmadev/rte_dmadev.c b/lib/dmadev/rte_dmadev.c
index 67434c805f43..1fe1434019f0 100644
--- a/lib/dmadev/rte_dmadev.c
+++ b/lib/dmadev/rte_dmadev.c
@@ -143,10 +143,9 @@ dma_fp_data_prepare(void)
 */
size = dma_devices_max * sizeof(struct rte_dma_fp_object) +
RTE_CACHE_LINE_SIZE;
-   ptr = malloc(size);
+   ptr = rte_zmalloc("", size, RTE_CACHE_LINE_SIZE);
if (ptr == NULL)
return -ENOMEM;
-   memset(ptr, 0, size);
 
rte_dma_fp_objs = RTE_PTR_ALIGN(ptr, RTE_CACHE_LINE_SIZE);
for (i = 0; i < dma_devices_max; i++)
@@ -164,10 +163,9 @@ dma_dev_data_prepare(void)
return 0;
 
size = dma_devices_max * sizeof(struct rte_dma_dev);
-   rte_dma_devices = malloc(size);
+   rte_dma_devices = rte_zmalloc("", size, RTE_CACHE_LINE_SIZE);
if (rte_dma_devices == NULL)
return -ENOMEM;
-   memset(rte_dma_devices, 0, size);
 
return 0;
 }
diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
index 064785686f7f..e8239c2d22b6 100644
--- a/lib/dmadev/rte_dmadev_core.h
+++ b/lib/dmadev/rte_dmadev_core.h
@@ -73,7 +73,7 @@ struct rte_dma_fp_object {
rte_dma_completed_tcompleted;
rte_dma_completed_status_t completed_status;
rte_dma_burst_capacity_t   burst_capacity;
-} __rte_aligned(128);
+} __rte_cache_aligned;
 
 extern struct rte_dma_fp_object *rte_dma_fp_objs;
 
-- 
2.43.0
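For context, a small illustration of what the __rte_aligned(128) to
__rte_cache_aligned change means (demo struct only, not the dmadev
layout): __rte_cache_aligned expands to __rte_aligned(RTE_CACHE_LINE_SIZE),
so the alignment follows the build-time cache line size rather than a
hard-coded 128 bytes.

#include <stdalign.h>
#include <stdint.h>
#include <rte_common.h>

/* Demo struct: alignment tracks RTE_CACHE_LINE_SIZE (64 B on most
 * targets, 128 B on some Arm SoCs). */
struct demo_fp_object {
	uint64_t head;
	uint64_t tail;
} __rte_cache_aligned;

_Static_assert(alignof(struct demo_fp_object) == RTE_CACHE_LINE_SIZE,
	       "alignment follows the configured cache line size");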



[PATCH v3 3/3] config/arm: allow WFE to be enabled config time

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Allow RTE_ARM_USE_WFE to be enabled at meson configuration
time by passing it via c_args instead of modifying
`config/arm/meson.build`.

Example usage:
 meson build -Dc_args='-DRTE_ARM_USE_WFE' \
--cross-file config/arm/arm64_cn10k_linux_gcc

Signed-off-by: Pavan Nikhilesh 
Acked-by: Chengwen Feng 
Acked-by: Ruifeng Wang 
---
 config/arm/meson.build | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 4e44d1850bae..01870a23328a 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -17,7 +17,9 @@ flags_common = [
 #['RTE_ARM64_MEMCPY_ALIGN_MASK', 0xF],
 #['RTE_ARM64_MEMCPY_STRICT_ALIGN', false],
 
-['RTE_ARM_USE_WFE', false],
+# Enable use of ARM wait for event instruction.
+# ['RTE_ARM_USE_WFE', false],
+
 ['RTE_ARCH_ARM64', true],
 ['RTE_CACHE_LINE_SIZE', 128]
 ]
-- 
2.43.0
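For context, a sketch of the kind of code path RTE_ARM_USE_WFE affects
(the helper below is hypothetical): with the flag defined, on Arm,
rte_wait_until_equal_32() waits on the WFE instruction instead of
busy-polling.

#include <stdint.h>
#include <rte_pause.h>

/* Hypothetical wait helper: blocks until *flag becomes 1. With
 * RTE_ARM_USE_WFE the core parks on WFE between checks, saving power;
 * without it, this degenerates to a pause/spin loop. */
static inline void
wait_for_flag(volatile uint32_t *flag)
{
	rte_wait_until_equal_32(flag, 1, __ATOMIC_ACQUIRE);
}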



[PATCH v3 1/3] config/arm: avoid mcpu and march conflicts

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

The compiler options march and mtune are a subset
of mcpu, and will lead to conflicts if an improper march
is chosen for a given mcpu.
To avoid conflicts, force the part-number march when
an mcpu is available and supported by the compiler.

Example:
march = armv9-a
mcpu = neoverse-n2

mcpu supported, march supported
machine_args = ['-mcpu=neoverse-n2', '-march=armv9-a']

mcpu supported, march not supported
machine_args = ['-mcpu=neoverse-n2']

mcpu not supported, march supported
machine_args = ['-march=armv9-a']

mcpu not supported, march not supported
machine_args = ['-march=armv8.6-a']

Signed-off-by: Pavan Nikhilesh 
---
v2 Changes:
- Cleanup march inconsistencies. (Juraj Linkes)
- Unify fallback march selection. (Juraj Linkes)
- Tag along ARM WFE patch.
v3 Changes:
- Fix missing 'fallback_march' key check.

 config/arm/meson.build | 108 +
 1 file changed, 66 insertions(+), 42 deletions(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 36f21d22599a..ba859bd060b5 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -58,18 +58,18 @@ implementer_generic = {
 }

 part_number_config_arm = {
-'0xd03': {'compiler_options':  ['-mcpu=cortex-a53']},
-'0xd04': {'compiler_options':  ['-mcpu=cortex-a35']},
-'0xd05': {'compiler_options':  ['-mcpu=cortex-a55']},
-'0xd07': {'compiler_options':  ['-mcpu=cortex-a57']},
-'0xd08': {'compiler_options':  ['-mcpu=cortex-a72']},
-'0xd09': {'compiler_options':  ['-mcpu=cortex-a73']},
-'0xd0a': {'compiler_options':  ['-mcpu=cortex-a75']},
-'0xd0b': {'compiler_options':  ['-mcpu=cortex-a76']},
+'0xd03': {'mcpu': 'cortex-a53'},
+'0xd04': {'mcpu': 'cortex-a35'},
+'0xd05': {'mcpu': 'cortex-a55'},
+'0xd07': {'mcpu': 'cortex-a57'},
+'0xd08': {'mcpu': 'cortex-a72'},
+'0xd09': {'mcpu': 'cortex-a73'},
+'0xd0a': {'mcpu': 'cortex-a75'},
+'0xd0b': {'mcpu': 'cortex-a76'},
 '0xd0c': {
 'march': 'armv8.2-a',
 'march_features': ['crypto', 'rcpc'],
-'compiler_options':  ['-mcpu=neoverse-n1'],
+'mcpu': 'neoverse-n1',
 'flags': [
 ['RTE_MACHINE', '"neoverse-n1"'],
 ['RTE_ARM_FEATURE_ATOMICS', true],
@@ -81,7 +81,7 @@ part_number_config_arm = {
 '0xd40': {
 'march': 'armv8.4-a',
 'march_features': ['sve'],
-'compiler_options':  ['-mcpu=neoverse-v1'],
+'mcpu': 'neoverse-v1',
 'flags': [
 ['RTE_MACHINE', '"neoverse-v1"'],
 ['RTE_ARM_FEATURE_ATOMICS', true],
@@ -92,8 +92,9 @@ part_number_config_arm = {
 'march': 'armv8.4-a',
 },
 '0xd49': {
+'march': 'armv9-a',
 'march_features': ['sve2'],
-'compiler_options': ['-mcpu=neoverse-n2'],
+'mcpu': 'neoverse-n2',
 'flags': [
 ['RTE_MACHINE', '"neoverse-n2"'],
 ['RTE_ARM_FEATURE_ATOMICS', true],
@@ -127,21 +128,23 @@ implementer_cavium = {
 ],
 'part_number_config': {
 '0xa1': {
-'compiler_options': ['-mcpu=thunderxt88'],
+'mcpu': 'thunderxt88',
 'flags': flags_part_number_thunderx
 },
 '0xa2': {
-'compiler_options': ['-mcpu=thunderxt81'],
+'mcpu': 'thunderxt81',
 'flags': flags_part_number_thunderx
 },
 '0xa3': {
-'compiler_options': ['-march=armv8-a+crc', '-mcpu=thunderxt83'],
+'march': 'armv8-a',
+'march_features': ['crc'],
+'mcpu': 'thunderxt83',
 'flags': flags_part_number_thunderx
 },
 '0xaf': {
 'march': 'armv8.1-a',
 'march_features': ['crc', 'crypto'],
-'compiler_options': ['-mcpu=thunderx2t99'],
+'mcpu': 'thunderx2t99',
 'flags': [
 ['RTE_MACHINE', '"thunderx2"'],
 ['RTE_ARM_FEATURE_ATOMICS', true],
@@ -153,7 +156,7 @@ implementer_cavium = {
 '0xb2': {
 'march': 'armv8.2-a',
 'march_features': ['crc', 'crypto', 'lse'],
-'compiler_options': ['-mcpu=octeontx2'],
+'mcpu': 'octeontx2',
 'flags': [
 ['RTE_MACHINE', '"cn9k"'],
 ['RTE_ARM_FEATURE_ATOMICS', true],
@@ -176,7 +179,7 @@ implementer_ampere = {
 '0x0': {
 'march': 'armv8-a',
 'march_features': ['crc', 'crypto'],
-'compiler_options':  ['-mtune=emag'],
+'mcpu': 'emag',
 'flags': [
 ['RTE_MACHINE', '"eMAG"'],
 ['RTE_MAX_LCORE', 32],
@@ -186,7 +189,7 @@ implementer_ampere = {
 '0xac3': {
 'march': 'armv8.6-a',
 'march_features': ['crc', 'crypto'],
-'compiler_options':  ['-mcpu=ampere1'],
+'mcpu': 'ampere1',
 'f

[PATCH v3 2/2] net/octeon_ep: add Rx NEON routine

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Add Rx ARM NEON SIMD routine.

Signed-off-by: Pavan Nikhilesh 
---
 drivers/net/octeon_ep/cnxk_ep_rx_neon.c | 148 
 drivers/net/octeon_ep/meson.build   |   6 +-
 drivers/net/octeon_ep/otx_ep_ethdev.c   |   5 +-
 drivers/net/octeon_ep/otx_ep_rxtx.h |   6 +
 4 files changed, 163 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/octeon_ep/cnxk_ep_rx_neon.c

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx_neon.c b/drivers/net/octeon_ep/cnxk_ep_rx_neon.c
new file mode 100644
index 00..8abd8711e1
--- /dev/null
+++ b/drivers/net/octeon_ep/cnxk_ep_rx_neon.c
@@ -0,0 +1,148 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2023 Marvell.
+ */
+
+#include "cnxk_ep_rx.h"
+
+static __rte_always_inline void
+cnxk_ep_process_pkts_vec_neon(struct rte_mbuf **rx_pkts, struct otx_ep_droq *droq,
+ uint16_t new_pkts)
+{
+   const uint8x16_t mask0 = {0, 1, 0xff, 0xff, 0, 1, 0xff, 0xff,
+ 4, 5, 0xff, 0xff, 4, 5, 0xff, 0xff};
+   const uint8x16_t mask1 = {8,  9,  0xff, 0xff, 8,  9,  0xff, 0xff,
+ 12, 13, 0xff, 0xff, 12, 13, 0xff, 0xff};
+   struct rte_mbuf **recv_buf_list = droq->recv_buf_list;
+   uint32_t pidx0, pidx1, pidx2, pidx3;
+   struct rte_mbuf *m0, *m1, *m2, *m3;
+   uint32_t read_idx = droq->read_idx;
+   uint16_t nb_desc = droq->nb_desc;
+   uint32_t idx0, idx1, idx2, idx3;
+   uint64x2_t s01, s23;
+   uint32x4_t bytes;
+   uint16_t pkts = 0;
+
+   idx0 = read_idx;
+   s01 = vdupq_n_u64(0);
+   bytes = vdupq_n_u32(0);
+   while (pkts < new_pkts) {
+
+   idx1 = otx_ep_incr_index(idx0, 1, nb_desc);
+   idx2 = otx_ep_incr_index(idx1, 1, nb_desc);
+   idx3 = otx_ep_incr_index(idx2, 1, nb_desc);
+
+   if (new_pkts - pkts > 4) {
+   pidx0 = otx_ep_incr_index(idx3, 1, nb_desc);
+   pidx1 = otx_ep_incr_index(pidx0, 1, nb_desc);
+   pidx2 = otx_ep_incr_index(pidx1, 1, nb_desc);
+   pidx3 = otx_ep_incr_index(pidx2, 1, nb_desc);
+
+   rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx0], void *));
+   rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx1], void *));
+   rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx2], void *));
+   rte_prefetch_non_temporal(cnxk_pktmbuf_mtod(recv_buf_list[pidx3], void *));
+   }
+
+   m0 = recv_buf_list[idx0];
+   m1 = recv_buf_list[idx1];
+   m2 = recv_buf_list[idx2];
+   m3 = recv_buf_list[idx3];
+
+   /* Load packet size big-endian. */
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m0, struct otx_ep_droq_info *)->length >> 48,
+s01, 0);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m1, struct otx_ep_droq_info *)->length >> 48,
+s01, 1);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m2, struct otx_ep_droq_info *)->length >> 48,
+s01, 2);
+   s01 = vsetq_lane_u32(cnxk_pktmbuf_mtod(m3, struct otx_ep_droq_info *)->length >> 48,
+s01, 3);
+   /* Convert to little-endian. */
+   s01 = vrev16q_u8(s01);
+
+   /* Vertical add, consolidate outside the loop. */
+   bytes += vaddq_u32(bytes, s01);
+   /* Segregate to packet length and data length. */
+   s23 = vqtbl1q_u8(s01, mask1);
+   s01 = vqtbl1q_u8(s01, mask0);
+
+   /* Store packet length and data length to mbuf. */
+   *(uint64_t *)&m0->pkt_len = vgetq_lane_u64(s01, 0);
+   *(uint64_t *)&m1->pkt_len = vgetq_lane_u64(s01, 1);
+   *(uint64_t *)&m2->pkt_len = vgetq_lane_u64(s23, 0);
+   *(uint64_t *)&m3->pkt_len = vgetq_lane_u64(s23, 1);
+
+   /* Reset rearm data. */
+   *(uint64_t *)&m0->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m1->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m2->rearm_data = droq->rearm_data;
+   *(uint64_t *)&m3->rearm_data = droq->rearm_data;
+
+   rx_pkts[pkts++] = m0;
+   rx_pkts[pkts++] = m1;
+   rx_pkts[pkts++] = m2;
+   rx_pkts[pkts++] = m3;
+   idx0 = otx_ep_incr_index(idx3, 1, nb_desc);
+   }
+   droq->read_idx = idx0;
+
+   droq->refill_count += new_pkts;
+   droq->pkts_pending -= new_pkts;
+   /* Stats */
+   droq->stats.pkts_received += new_pkts;
+#if defined(RTE_ARCH_32)
+   droq->stats.bytes_received += vgetq_lane_u32(bytes, 0);
+   droq->stats.bytes_received += vgetq_lane_u32(byt

[PATCH v3 1/2] net/octeon_ep: improve Rx performance

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Use mempool API instead of pktmbuf alloc to avoid mbuf reset
as it will be done by rearm on receive.
Reorder refill to avoid unnecessary write commits on mbuf data.

Signed-off-by: Pavan Nikhilesh 
---
 v2 Changes:
 - Fix compilation with distro gcc.
 v3 Changes:
 - Fix aarch32 compilation.

 drivers/net/octeon_ep/cnxk_ep_rx.c |  4 +--
 drivers/net/octeon_ep/cnxk_ep_rx.h | 13 ++---
 drivers/net/octeon_ep/cnxk_ep_rx_avx.c | 20 +++---
 drivers/net/octeon_ep/cnxk_ep_rx_sse.c | 38 ++
 drivers/net/octeon_ep/otx_ep_rxtx.h|  2 +-
 5 files changed, 42 insertions(+), 35 deletions(-)

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.c b/drivers/net/octeon_ep/cnxk_ep_rx.c
index f3e4fb27d1..7465e0a017 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.c
@@ -76,12 +76,12 @@ cnxk_ep_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
uint16_t new_pkts;

new_pkts = cnxk_ep_rx_pkts_to_process(droq, nb_pkts);
-   cnxk_ep_process_pkts_scalar(rx_pkts, droq, new_pkts);
-
/* Refill RX buffers */
if (droq->refill_count >= DROQ_REFILL_THRESHOLD)
cnxk_ep_rx_refill(droq);

+   cnxk_ep_process_pkts_scalar(rx_pkts, droq, new_pkts);
+
return new_pkts;
 }

diff --git a/drivers/net/octeon_ep/cnxk_ep_rx.h b/drivers/net/octeon_ep/cnxk_ep_rx.h
index e71fc0de5c..61263e651e 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx.h
+++ b/drivers/net/octeon_ep/cnxk_ep_rx.h
@@ -21,13 +21,16 @@ cnxk_ep_rx_refill_mbuf(struct otx_ep_droq *droq, uint32_t count)
uint32_t i;
int rc;

-   rc = rte_pktmbuf_alloc_bulk(droq->mpool, &recv_buf_list[refill_idx], count);
+   rc = rte_mempool_get_bulk(droq->mpool, (void **)&recv_buf_list[refill_idx], count);
if (unlikely(rc)) {
droq->stats.rx_alloc_failure++;
return rc;
}

for (i = 0; i < count; i++) {
+   rte_prefetch_non_temporal(&desc_ring[(refill_idx + 1) & 3]);
+   if (i < count - 1)
+   rte_prefetch_non_temporal(recv_buf_list[refill_idx + 1]);
buf = recv_buf_list[refill_idx];
desc_ring[refill_idx].buffer_ptr = rte_mbuf_data_iova_default(buf);
refill_idx++;
@@ -42,9 +42,9 @@ cnxk_ep_rx_refill_mbuf(struct otx_ep_droq *droq, uint32_t count)
 static inline void
 cnxk_ep_rx_refill(struct otx_ep_droq *droq)
 {
-   uint32_t desc_refilled = 0, count;
-   uint32_t nb_desc = droq->nb_desc;
+   const uint32_t nb_desc = droq->nb_desc;
uint32_t refill_idx = droq->refill_idx;
+   uint32_t desc_refilled = 0, count;
int rc;

if (unlikely(droq->read_idx == refill_idx))
@@ -128,6 +131,8 @@ cnxk_ep_rx_pkts_to_process(struct otx_ep_droq *droq, uint16_t nb_pkts)
return RTE_MIN(nb_pkts, droq->pkts_pending);
 }

+#define cnxk_pktmbuf_mtod(m, t) ((t)(void *)((char *)(m)->buf_addr + RTE_PKTMBUF_HEADROOM))
+
 static __rte_always_inline void
 cnxk_ep_process_pkts_scalar(struct rte_mbuf **rx_pkts, struct otx_ep_droq *droq, uint16_t new_pkts)
 {
@@ -147,7 +152,7 @@ cnxk_ep_process_pkts_scalar(struct rte_mbuf **rx_pkts, struct otx_ep_droq *droq,
  void *));

mbuf = recv_buf_list[read_idx];
-   info = rte_pktmbuf_mtod(mbuf, struct otx_ep_droq_info *);
+   info = cnxk_pktmbuf_mtod(mbuf, struct otx_ep_droq_info *);
read_idx = otx_ep_incr_index(read_idx, 1, nb_desc);
pkt_len = rte_bswap16(info->length >> 48);
mbuf->pkt_len = pkt_len;
diff --git a/drivers/net/octeon_ep/cnxk_ep_rx_avx.c b/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
index ae4615e6da..47eb1d2ef7 100644
--- a/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
+++ b/drivers/net/octeon_ep/cnxk_ep_rx_avx.c
@@ -49,7 +49,7 @@ cnxk_ep_process_pkts_vec_avx(struct rte_mbuf **rx_pkts, struct otx_ep_droq *droq
/* Load rearm data and packet length for shuffle. */
for (i = 0; i < CNXK_EP_OQ_DESC_PER_LOOP_AVX; i++)
data[i] = _mm256_set_epi64x(0,
-   rte_pktmbuf_mtod(m[i], struct otx_ep_droq_info *)->length >> 16,
+   cnxk_pktmbuf_mtod(m[i], struct otx_ep_droq_info *)->length >> 16,
0, rearm_data);

/* Shuffle data to its place and sum the packet length. */
@@ -81,15 +81,15 @@ cnxk_ep_recv_pkts_avx(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
struct otx_ep_droq *droq = (struct otx_ep_droq *)rx_queue;
uint16_t new_pkts, vpkts;

+   /* Refill RX buffers */
+   if (droq->refill_count >= DROQ_REFILL_THRESHOLD)
+   cnxk_ep_rx_refill(droq);
+
new_pkts = cnxk_ep_rx_pkts_to_process(droq, nb_pkts);
vpkts = RTE_ALIGN_FLOOR(new_pkts, CNXK_EP_OQ_DESC_PER_LOOP_AVX);
   

[PATCH v3 2/3] config/arm: add support for fallback march

2024-02-02 Thread pbhagavatula
From: Pavan Nikhilesh 

Some ARM CPUs have specific march requirements and
are not compatible with the supported march list.
Add fallback march in case the mcpu and the march
advertised in the part_number_config are not supported
by the compiler.

Example
mcpu = neoverse-n2
march = armv9-a
fallback_march = armv8.5-a

mcpu, march not supported
machine_args = ['-march=armv8.5-a']

mcpu, march, fallback_march not supported
least march supported = armv8-a

machine_args = ['-march=armv8-a']

Signed-off-by: Pavan Nikhilesh 
---
 config/arm/meson.build | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index ba859bd060b5..4e44d1850bae 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -94,6 +94,7 @@ part_number_config_arm = {
 '0xd49': {
 'march': 'armv9-a',
 'march_features': ['sve2'],
+'fallback_march': 'armv8.5-a',
 'mcpu': 'neoverse-n2',
 'flags': [
 ['RTE_MACHINE', '"neoverse-n2"'],
@@ -708,6 +709,7 @@ if update_flags
 
 # probe supported archs and their features
 candidate_march = ''
+fallback_march = ''
 if part_number_config.has_key('march')
 if part_number_config.get('force_march', false) or candidate_mcpu != ''
 if cc.has_argument('-march=' +  part_number_config['march'])
@@ -728,10 +730,18 @@ if update_flags
 # highest supported march version found
 break
 endif
+if (part_number_config.has_key('fallback_march') and
+supported_march == part_number_config['fallback_march'] and
+cc.has_argument('-march=' + supported_march))
+fallback_march = supported_march
+endif
 endforeach
 endif
 
 if candidate_march != part_number_config['march']
+if fallback_march != ''
+candidate_march = fallback_march
+endif
 warning('Configuration march version is @0@, not supported.'
 .format(part_number_config['march']))
 if candidate_march != ''
-- 
2.43.0



Re: [PATCH v2 11/11] eventdev: RFC clarify docs on event object fields

2024-02-02 Thread Bruce Richardson
On Thu, Feb 01, 2024 at 05:02:44PM +, Bruce Richardson wrote:
> On Wed, Jan 24, 2024 at 12:34:50PM +0100, Mattias Rönnblom wrote:
> > On 2024-01-19 18:43, Bruce Richardson wrote:
> > > Clarify the meaning of the NEW, FORWARD and RELEASE event types.
> > > For the fields in "rte_event" struct, enhance the comments on each to
> > > clarify the field's use, and whether it is preserved between enqueue and
> > > dequeue, and it's role, if any, in scheduling.
> > > 
> > > Signed-off-by: Bruce Richardson 
> > > ---
> > > 
> > > As with the previous patch, please review this patch to ensure that the
> > > expected semantics of the various event types and event fields have not
> > > changed in an unexpected way.
> > > ---
> > >   lib/eventdev/rte_eventdev.h | 105 ++--
> > >   1 file changed, 77 insertions(+), 28 deletions(-)
> > > 
> > > diff --git a/lib/eventdev/rte_eventdev.h b/lib/eventdev/rte_eventdev.h
> > > index cb13602ffb..4eff1c4958 100644
> > > --- a/lib/eventdev/rte_eventdev.h
> > > +++ b/lib/eventdev/rte_eventdev.h
> 
> 
> > >   /**
> > > @@ -1473,53 +1475,100 @@ struct rte_event {
> > >   /**< Targeted flow identifier for the enqueue and
> > >* dequeue operation.
> > >* The value must be in the range of
> > > -  * [0, nb_event_queue_flows - 1] which
> > > +  * [0, @ref rte_event_dev_config.nb_event_queue_flows - 1] which
> > 
> > The same comment as I had before about ranges for unsigned types.
> > 
> Actually, is this correct, does a range actually apply here? 
> 
> I thought that the number of queue flows supported was a guide as to how
> internal HW resources were to be allocated, and that the flow_id was always
> a 20-bit value, where it was up to the scheduler to work out how to map
> that to internal atomic locks (when combined with queue ids etc.). It
> should not be up to the app to have to do the range limiting itself!
> 
Looking at the RX adapter in eventdev, I don't see any obvious clamping of
the flow ids to the range of 0-nb_event_queue_flows, though I'm not that
familiar with that code, so I may have missed something. If I'm right,
it looks like this doc line may indeed be a mistake.

@Jerin, can you comment again here. Is flow_id really meant to be limited
to the specified range, or is it a full 20-bit value supplied in all cases?
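For illustration, if the full 20-bit value is always accepted, a
scheduler-side mapping could be as simple as the sketch below (made-up
helper, not from any driver; a real implementation would likely hash
the ID first to spread adjacent flows):

#include <stdint.h>

#define FLOW_ID_BITS 20
#define FLOW_ID_MASK ((UINT32_C(1) << FLOW_ID_BITS) - 1)

/* Fold an arbitrary 20-bit flow_id into the pool of internal atomic
 * locks sized by nb_event_queue_flows. */
static inline uint32_t
flow_to_lock_idx(uint32_t flow_id, uint32_t nb_event_queue_flows)
{
	return (flow_id & FLOW_ID_MASK) % nb_event_queue_flows;
}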

Thanks,
/Bruce

