Re: [PATCH] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread David Marchand
Hello Bruce,

On Fri, Aug 25, 2023 at 5:29 PM Bruce Richardson
 wrote:
>
> When doing a build for a system with WAITPKG support and a modern
> compiler, we get build errors for the "_umonitor" intrinsic, due to the
> casting away of the "volatile" on the parameter.
>
> ../lib/eal/x86/rte_power_intrinsics.c: In function 'rte_power_monitor':
> ../lib/eal/x86/rte_power_intrinsics.c:113:22: error: passing argument 1
> of '_umonitor' discards 'volatile' qualifier from pointer target type
> [-Werror=discarded-qualifiers]
>   113 | _umonitor(pmc->addr);
> |   ~~~^~
>
> We can avoid this issue by using RTE_PTR_ADD(..., 0) to cast the pointer
> through "uintptr_t" and thereby remove the volatile without warning.
> We also ensure comments are correct for each leg of the
> ifdef..else..endif block.
>
> Fixes: 60943c04f3bc ("eal/x86: use intrinsics for power management")
> Cc: roret...@linux.microsoft.com
>
> Signed-off-by: Bruce Richardson 

I'm looking for a system with WAITPKG in the RH lab.. so far, no luck.
Do you have a way to force-reproduce this issue? Like some compiler
options forcing support?


-- 
David Marchand



Re: [RFC] cache guard

2023-08-28 Thread Bruce Richardson
On Sun, Aug 27, 2023 at 05:40:33PM +0200, Morten Brørup wrote:
> > From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> > Sent: Sunday, 27 August 2023 15.55
> > 
> > On 2023-08-27 10:34, Morten Brørup wrote:
> > > +CC Honnappa and Konstantin, Ring lib maintainers
> > > +CC Mattias, PRNG lib maintainer
> > >
> > >> From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> > >> Sent: Friday, 25 August 2023 11.24
> > >>
> > >> On Fri, Aug 25, 2023 at 11:06:01AM +0200, Morten Brørup wrote:
> > >>> +CC mempool maintainers
> > >>>
> >  From: Bruce Richardson [mailto:bruce.richard...@intel.com]
> >  Sent: Friday, 25 August 2023 10.23
> > 
> >  On Fri, Aug 25, 2023 at 08:45:12AM +0200, Morten Brørup wrote:
> > > Bruce,
> > >
> > > With this patch [1], it is noted that the ring producer and
> > >> consumer data
> >  should not be on adjacent cache lines, for performance reasons.
> > >
> > > [1]:
> > 
> > >>
> > https://git.dpdk.org/dpdk/commit/lib/librte_ring/rte_ring.h?id=d9f0d3a1ffd4b66e75485cc8b63b9aedfbdfe8b0
> > >
> > > (It's obvious that they cannot share the same cache line, because
> > >> they are
> >  accessed by two different threads.)
> > >
> > > Intuitively, I would think that having them on different cache
> > >> lines would
> >  suffice. Why does having an empty cache line between them make a
> > >> difference?
> > >
> > > And does it need to be an empty cache line? Or does it suffice
> > >> having the
> >  second structure start at two cache lines after the start of the
> > >> first
> >  structure (e.g. if the size of the first structure is two cache
> > >> lines)?
> > >
> > > I'm asking because the same principle might apply to other code
> > >> too.
> > >
> >  Hi Morten,
> > 
> >  this was something we discovered when working on the distributor
> > >> library.
> >  If we have cachelines per core where there is heavy access, having
> > >> some
> >  cachelines as a gap between the content cachelines can help
> > >> performance. We
> >  believe this helps due to avoiding issues with the HW prefetchers
> > >> (e.g.
> >  adjacent cacheline prefetcher) bringing in the second cacheline
> >  speculatively when an operation is done on the first line.
> > >>>
> > >>> I guessed that it had something to do with speculative prefetching,
> > >> but wasn't sure. Good to get confirmation, and that it has a
> > measurable
> > >> effect somewhere. Very interesting!
> > >>>
> > >>> NB: More comments in the ring lib about stuff like this would be
> > nice.
> > >>>
> > >>> So, for the mempool lib, what do you think about applying the same
> > >> technique to the rte_mempool_debug_stats structure (which is an array
> > >> indexed per lcore)... Two adjacent lcores heavily accessing their
> > local
> > >> mempool caches seems likely to me. But how heavy does the access need
> > to
> > >> be for this technique to be relevant?
> > >>>
> > >>
> > >> No idea how heavy the accesses need to be for this to have a
> > noticeable
> > >> effect. For things like debug stats, I wonder how worthwhile making
> > such
> > >> a
> > >> change would be, but then again, any change would have very low
> > impact
> > >> too
> > >> in that case.
> > >
> > > I just tried adding padding to some of the hot structures in our own
> > application, and observed a significant performance improvement for
> > those.
> > >
> > > So I think this technique should have higher visibility in DPDK by
> > adding a new cache macro to rte_common.h:
> > >
> > > /**
> > >   * Empty cache line, to guard against speculative prefetching.
> > >   *
> > 
> > "to guard against false sharing-like effects on systems with a
> > next-N-lines hardware prefetcher"
> > 
> > >   * Use as spacing between data accessed by different lcores,
> > >   * to prevent cache thrashing on CPUs with speculative prefetching.
> > >   */
> > > #define RTE_CACHE_GUARD(name) char
> > cache_guard_##name[RTE_CACHE_LINE_SIZE] __rte_cache_aligned;
> > >
> > 
> > You could have a macro which specified how much guarding there needs to
> > be, ideally defined on a per-CPU basis. (These things have nothing to do
> > with the ISA, but everything to do with the implementation.)
> > 
> > I'm not sure N is always 1.
> > 
> > So the guard padding should be RTE_CACHE_LINE_SIZE *
> > RTE_CACHE_GUARD_LINES bytes, and wrap the whole thing in
> > #if RTE_CACHE_GUARD_LINES > 0
> > #endif
> > 
> > ...so you can disable this (cute!) hack (on custom DPDK builds) in case
> > you have disabled hardware prefetching, which seems generally to be a
> > good idea for packet processing type applications.
> > 
> > ...which leads me to another suggestion: add a note on disabling
> > hardware prefetching in the optimization guide.
> > 
> > Seems like a very good idea to have this in <rte_common.h>, and
> > otherwise make this issue visible and known.
> 
> Good points, Mattias!
> 
> I also prefer the name-less macro you suggested below.

Re: [PATCH] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread David Marchand
On Mon, Aug 28, 2023 at 9:08 AM David Marchand
 wrote:
>
> Hello Bruce,
>
> On Fri, Aug 25, 2023 at 5:29 PM Bruce Richardson
>  wrote:
> >
> > When doing a build for a system with WAITPKG support and a modern
> > compiler, we get build errors for the "_umonitor" intrinsic, due to the
> > casting away of the "volatile" on the parameter.
> >
> > ../lib/eal/x86/rte_power_intrinsics.c: In function 'rte_power_monitor':
> > ../lib/eal/x86/rte_power_intrinsics.c:113:22: error: passing argument 1
> > of '_umonitor' discards 'volatile' qualifier from pointer target type
> > [-Werror=discarded-qualifiers]
> >   113 | _umonitor(pmc->addr);
> > |   ~~~^~
> >
> > We can avoid this issue by using RTE_PTR_ADD(..., 0) to cast the pointer
> > through "uintptr_t" and thereby remove the volatile without warning.
> > We also ensure comments are correct for each leg of the
> > ifdef..else..endif block.
> >
> > Fixes: 60943c04f3bc ("eal/x86: use intrinsics for power management")
> > Cc: roret...@linux.microsoft.com
> >
> > Signed-off-by: Bruce Richardson 
>
> I'm looking for a system with WAITPKG in the RH lab.. so far, no luck.
> Do you have a way to force-reproduce this issue? Like some compiler
> options forcing support?

Ah.. reproduced with -Dcpu_instruction_set=sapphirerapids

[52/241] Compiling C object lib/librte_eal.a.p/eal_x86_rte_power_intrinsics.c.o
../lib/eal/x86/rte_power_intrinsics.c: In function ‘rte_power_monitor’:
../lib/eal/x86/rte_power_intrinsics.c:113:22: warning: passing
argument 1 of ‘_umonitor’ discards ‘volatile’ qualifier from pointer
target type [-Wdiscarded-qualifiers]
  113 | _umonitor(pmc->addr);
  |   ~~~^~
In file included from
/usr/lib/gcc/x86_64-redhat-linux/11/include/x86gprintrin.h:89,
 from
/usr/lib/gcc/x86_64-redhat-linux/11/include/immintrin.h:27,
 from ../lib/eal/x86/include/rte_rtm.h:8,
 from ../lib/eal/x86/rte_power_intrinsics.c:7:
/usr/lib/gcc/x86_64-redhat-linux/11/include/waitpkgintrin.h:39:18:
note: expected ‘void *’ but argument is of type ‘volatile void *’
   39 | _umonitor (void *__A)
  |~~^~~
[241/241] Linking target lib/librte_ethdev.so.24.0


-- 
David Marchand



Re: [RFC] cache guard

2023-08-28 Thread Mattias Rönnblom

On 2023-08-28 08:32, Morten Brørup wrote:

From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
Sent: Monday, 28 August 2023 00.31

On 2023-08-27 17:40, Morten Brørup wrote:

From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
Sent: Sunday, 27 August 2023 15.55

On 2023-08-27 10:34, Morten Brørup wrote:

+CC Honnappa and Konstantin, Ring lib maintainers
+CC Mattias, PRNG lib maintainer


From: Bruce Richardson [mailto:bruce.richard...@intel.com]
Sent: Friday, 25 August 2023 11.24

On Fri, Aug 25, 2023 at 11:06:01AM +0200, Morten Brørup wrote:

+CC mempool maintainers


From: Bruce Richardson [mailto:bruce.richard...@intel.com]
Sent: Friday, 25 August 2023 10.23

On Fri, Aug 25, 2023 at 08:45:12AM +0200, Morten Brørup wrote:

Bruce,

With this patch [1], it is noted that the ring producer and

consumer data

should not be on adjacent cache lines, for performance reasons.


[1]:
https://git.dpdk.org/dpdk/commit/lib/librte_ring/rte_ring.h?id=d9f0d3a1ffd4b66e75485cc8b63b9aedfbdfe8b0


(It's obvious that they cannot share the same cache line, because

they are

accessed by two different threads.)


Intuitively, I would think that having them on different cache

lines would

suffice. Why does having an empty cache line between them make a

difference?


And does it need to be an empty cache line? Or does it suffice

having the

second structure start at two cache lines after the start of the

first

structure (e.g. if the size of the first structure is two cache

lines)?


I'm asking because the same principle might apply to other code

too.



Hi Morten,

this was something we discovered when working on the distributor

library.

If we have cachelines per core where there is heavy access, having

some

cachelines as a gap between the content cachelines can help

performance. We

believe this helps due to avoiding issues with the HW prefetchers

(e.g.

adjacent cacheline prefetcher) bringing in the second cacheline
speculatively when an operation is done on the first line.


I guessed that it had something to do with speculative prefetching,

but wasn't sure. Good to get confirmation, and that it has a

measurable

effect somewhere. Very interesting!


NB: More comments in the ring lib about stuff like this would be

nice.


So, for the mempool lib, what do you think about applying the same

technique to the rte_mempool_debug_stats structure (which is an array
indexed per lcore)... Two adjacent lcores heavily accessing their

local

mempool caches seems likely to me. But how heavy does the access need

to

be for this technique to be relevant?




No idea how heavy the accesses need to be for this to have a

noticeable

effect. For things like debug stats, I wonder how worthwhile making

such

a
change would be, but then again, any change would have very low

impact

too
in that case.


I just tried adding padding to some of the hot structures in our own

application, and observed a significant performance improvement for
those.


So I think this technique should have higher visibility in DPDK by

adding a new cache macro to rte_common.h:


/**
* Empty cache line, to guard against speculative prefetching.
*


"to guard against false sharing-like effects on systems with a
next-N-lines hardware prefetcher"


* Use as spacing between data accessed by different lcores,
* to prevent cache thrashing on CPUs with speculative prefetching.
*/
#define RTE_CACHE_GUARD(name) char

cache_guard_##name[RTE_CACHE_LINE_SIZE] __rte_cache_aligned;




You could have a macro which specified how much guarding there needs to
be, ideally defined on a per-CPU basis. (These things have nothing to do
with the ISA, but everything to do with the implementation.)

I'm not sure N is always 1.

So the guard padding should be RTE_CACHE_LINE_SIZE *
RTE_CACHE_GUARD_LINES bytes, and wrap the whole thing in
#if RTE_CACHE_GUARD_LINES > 0
#endif

...so you can disable this (cute!) hack (on custom DPDK builds) in case
you have disabled hardware prefetching, which seems generally to be a
good idea for packet processing type applications.

...which leads me to another suggestion: add a note on disabling
hardware prefetching in the optimization guide.

Seems like a very good idea to have this in <rte_common.h>, and
otherwise make this issue visible and known.


Good points, Mattias!

I also prefer the name-less macro you suggested below.

So, this gets added to rte_common.h:

/**
   * Empty cache lines, to guard against false sharing-like effects
   * on systems with a next-N-lines hardware prefetcher.
   *
   * Use as spacing between data accessed by different lcores,
   * to prevent cache thrashing on hardware with speculative prefetching.
   */
#if RTE_CACHE_GUARD_LINES > 0
#define _RTE_CACHE_GUARD_HELPER2(unique) \
  char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES] \
  __rte_cache_aligned;
#define _RTE_CACHE_GUARD_HELPER1(unique) _RTE_CACHE_GUARD_HELPER2(unique)
#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)

Re: [PATCH] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread David Marchand
On Fri, Aug 25, 2023 at 5:29 PM Bruce Richardson
 wrote:
>
> When doing a build for a system with WAITPKG support and a modern
> compiler, we get build errors for the "_umonitor" intrinsic, due to the
> casting away of the "volatile" on the parameter.
>
> ../lib/eal/x86/rte_power_intrinsics.c: In function 'rte_power_monitor':
> ../lib/eal/x86/rte_power_intrinsics.c:113:22: error: passing argument 1
> of '_umonitor' discards 'volatile' qualifier from pointer target type
> [-Werror=discarded-qualifiers]
>   113 | _umonitor(pmc->addr);
> |   ~~~^~
>
> We can avoid this issue by using RTE_PTR_ADD(..., 0) to cast the pointer
> through "uintptr_t" and thereby remove the volatile without warning.

As Morten, I prefer an explicit cast (keeping your comments) as it
seems we are exploiting an implementation detail of RTE_PTR_ADD.


> We also ensure comments are correct for each leg of the
> ifdef..else..endif block.

Thanks.. I had fixed other places but I have missed this one.


>
> Fixes: 60943c04f3bc ("eal/x86: use intrinsics for power management")
> Cc: roret...@linux.microsoft.com
>
> Signed-off-by: Bruce Richardson 
> ---
>  lib/eal/x86/rte_power_intrinsics.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/lib/eal/x86/rte_power_intrinsics.c 
> b/lib/eal/x86/rte_power_intrinsics.c
> index 4066d1392e..4f0404bfb8 100644
> --- a/lib/eal/x86/rte_power_intrinsics.c
> +++ b/lib/eal/x86/rte_power_intrinsics.c
> @@ -103,15 +103,15 @@ rte_power_monitor(const struct rte_power_monitor_cond 
> *pmc,
> rte_spinlock_lock(&s->lock);
> s->monitor_addr = pmc->addr;
>
> -   /*
> -* we're using raw byte codes for now as only the newest compiler
> -* versions support this instruction natively.
> -*/
> -
> /* set address for UMONITOR */
>  #if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> -   _umonitor(pmc->addr);
> +   /* use RTE_PTR_ADD to cast away "volatile" when using the intrinsic */
> +   _umonitor(RTE_PTR_ADD(pmc->addr, 0));
>  #else
> +   /*
> +* we're using raw byte codes for compiler versions which
> +* don't support this instruction natively.
> +*/
> asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> :
> : "D"(pmc->addr));

Tested-by: David Marchand 

An additional question: would Intel CI catch such an issue?
Or was it caught only because you are blessed with bleeding edge hw? :-)


-- 
David Marchand



Re: [PATCH] build: add pdcp to optional libs

2023-08-28 Thread David Marchand
On Fri, Aug 25, 2023 at 4:19 PM Bruce Richardson
 wrote:
>
> The pdcp library is disabled when its dependent library "reorder" is not
> being built.
>
> ../lib/meson.build:179: WARNING: Cannot disable mandatory library "pdcp"
> Message: Disabling pdcp [lib/pdcp]: missing internal dependency "reorder"
>
> As such, it is not a mandatory library, and can be marked as safe to
> disable.
>
> Signed-off-by: Bruce Richardson 

Acked-by: David Marchand 


-- 
David Marchand



Re: [PATCH] build: make crypto libraries optional

2023-08-28 Thread David Marchand
On Fri, Aug 25, 2023 at 5:10 PM Bruce Richardson
 wrote:
>
> Cryptodev and the libraries which depend on it can be made optional,
> as they can be disabled without breaking the build.
>
> Signed-off-by: Bruce Richardson 

Acked-by: David Marchand 


-- 
David Marchand



RE: [RFC] cache guard

2023-08-28 Thread Morten Brørup
> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> Sent: Monday, 28 August 2023 10.46
> 
> On 2023-08-28 08:32, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> >> Sent: Monday, 28 August 2023 00.31
> >>
> >> On 2023-08-27 17:40, Morten Brørup wrote:
>  From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
>  Sent: Sunday, 27 August 2023 15.55

[...]

> >>> So, this gets added to rte_common.h:
> >>>
> >>> /**
> >>>* Empty cache lines, to guard against false sharing-like effects
> >>>* on systems with a next-N-lines hardware prefetcher.
> >>>*
> >>>* Use as spacing between data accessed by different lcores,
> >>>* to prevent cache thrashing on hardware with speculative
> prefetching.
> >>>*/
> >>> #if RTE_CACHE_GUARD_LINES > 0
> >>> #define _RTE_CACHE_GUARD_HELPER2(unique) \
> >>>   char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE *
> >> RTE_CACHE_GUARD_LINES] \
> >>>   __rte_cache_aligned;
> >>> #define _RTE_CACHE_GUARD_HELPER1(unique)
> _RTE_CACHE_GUARD_HELPER2(unique)
> >>> #define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)
> >>> #else
> >>> #define RTE_CACHE_GUARD
> >>> #endif
> >>>
> >>
> >> Seems like a good solution. I thought as far as using __LINE__ to
> build
> >> a unique name, but __COUNTER__ is much cleaner, provided it's
> available
> >> in relevant compilers. (It's not in C11.)
> >
> > I considered __LINE__ too, but came to the same conclusion...
> __COUNTER__ is cleaner for this purpose.
> >
> > And since __COUNTER__ is being used elsewhere in DPDK, I assume it is
> available for use here too.
> >
> > If it turns out causing problems, we can easily switch to __LINE__
> instead.
> >
> >>
> >> Should the semicolon be included or not in HELPER2? If left out, a
> >> lonely ";" will be left for RTE_CACHE_GUARD_LINES == 0, but I don't
> >> think that is a problem.
> >
> > I tested it on Godbolt, and the lonely ";" in a struct didn't seem to
> be a problem.
> >
> > With the semicolon in HELPER2, there will be a lonely ";" in the
> struct in both cases, i.e. with and without cache guards enabled.
> >
> >>
> >> I don't see why __rte_cache_aligned is needed here. The adjacent
> struct
> >> must be cache-line aligned. Maybe it makes it more readable, having
> the
> >> explicit guard padding starting at the start of the actual guard
> cache
> >> lines, rather than potentially at some earlier point before, and
> having
> >> non-guard padding at the end of the struct (from __rte_cache_aligned
> on
> >> the struct level).
> >
> > Having both __rte_cache_aligned and the char array with full cache
> lines ensures that the guard field itself is on its own separate cache
> line, regardless of the organization of adjacent fields in the struct.
> E.g. this will also work:
> >
> > struct test {
> >  char x;
> >  RTE_CACHE_GUARD;
> >  char y;
> > };
> >
> 
> That struct declaration is broken, since it will create false sharing
> between x and y, in case RTE_CACHE_GUARD_LINES is defined to 0.
> 
> Maybe the most intuitive function (semantics) of the RTE_CACHE_GUARD
> macro would be have it deal exclusively with the issue resulting from
> next-N-line (and similar) hardware prefetching, and leave
> __rte_cache_aligned to deal with "classic" (same-cache line) false
> sharing.

Excellent review feedback!

I only thought of the cache guard as a means to provide spacing between 
elements where the developer already prevented (same-cache line) false sharing 
by some other means. I didn't even consider the alternative interpretation of 
its purpose.

Your feedback leaves no doubt that we should extend the cache guard's purpose 
to also enforce cache alignment (under all circumstances, also when 
RTE_CACHE_GUARD_LINES is 0).

> 
> Otherwise you would have to have something like
> 
> struct test
> {
>   char x;
>   RTE_CACHE_GUARD(char, y);
> };
> 
> ...so that 'y' can be made __rte_cache_aligned by the macro.

There's an easier solution...

We can copy the concept from the RTE_MARKER type, which uses a zero-length 
array. By simply omitting the #if RTE_CACHE_GUARD_LINES > 0, the macro will 
serve both purposes:

#define _RTE_CACHE_GUARD_HELPER2(unique) \
char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES] \
__rte_cache_aligned;
#define _RTE_CACHE_GUARD_HELPER1(unique) _RTE_CACHE_GUARD_HELPER2(unique)
#define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)

I have verified on Godbolt that this works. The memif driver also uses 
RTE_MARKER this way [1].

[1]: 
https://elixir.bootlin.com/dpdk/latest/source/drivers/net/memif/memif.h#L173

> 
> RTE_HW_PREFETCH_GUARD could be an alternative name, but I think I like
> RTE_CACHE_GUARD better.
> 

When the macro serves both purposes (regardless of the value of 
RTE_CACHE_GUARD_LINES), I think we can stick with the RTE_CACHE_GUARD name.




Re: [PATCH] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread Bruce Richardson
On Mon, Aug 28, 2023 at 11:29:05AM +0200, David Marchand wrote:
> On Fri, Aug 25, 2023 at 5:29 PM Bruce Richardson
>  wrote:
> >
> > When doing a build for a system with WAITPKG support and a modern
> > compiler, we get build errors for the "_umonitor" intrinsic, due to the
> > casting away of the "volatile" on the parameter.
> >
> > ../lib/eal/x86/rte_power_intrinsics.c: In function 'rte_power_monitor':
> > ../lib/eal/x86/rte_power_intrinsics.c:113:22: error: passing argument 1
> > of '_umonitor' discards 'volatile' qualifier from pointer target type
> > [-Werror=discarded-qualifiers]
> >   113 | _umonitor(pmc->addr);
> > |   ~~~^~
> >
> > We can avoid this issue by using RTE_PTR_ADD(..., 0) to cast the pointer
> > through "uintptr_t" and thereby remove the volatile without warning.
> 
> As Morten, I prefer an explicit cast (keeping your comments) as it
> seems we are exploiting an implementation detail of RTE_PTR_ADD.
> 

Ok, I'll do a respin with explicit cast.

> 
> > We also ensure comments are correct for each leg of the
> > ifdef..else..endif block.
> 
> Thanks.. I had fixed other places but I have missed this one.
> 
> 
> >
> > Fixes: 60943c04f3bc ("eal/x86: use intrinsics for power management")
> > Cc: roret...@linux.microsoft.com
> >
> > Signed-off-by: Bruce Richardson 
> > ---
> >  lib/eal/x86/rte_power_intrinsics.c | 12 ++--
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/eal/x86/rte_power_intrinsics.c 
> > b/lib/eal/x86/rte_power_intrinsics.c
> > index 4066d1392e..4f0404bfb8 100644
> > --- a/lib/eal/x86/rte_power_intrinsics.c
> > +++ b/lib/eal/x86/rte_power_intrinsics.c
> > @@ -103,15 +103,15 @@ rte_power_monitor(const struct rte_power_monitor_cond 
> > *pmc,
> > rte_spinlock_lock(&s->lock);
> > s->monitor_addr = pmc->addr;
> >
> > -   /*
> > -* we're using raw byte codes for now as only the newest compiler
> > -* versions support this instruction natively.
> > -*/
> > -
> > /* set address for UMONITOR */
> >  #if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> > -   _umonitor(pmc->addr);
> > +   /* use RTE_PTR_ADD to cast away "volatile" when using the intrinsic 
> > */
> > +   _umonitor(RTE_PTR_ADD(pmc->addr, 0));
> >  #else
> > +   /*
> > +* we're using raw byte codes for compiler versions which
> > +* don't support this instruction natively.
> > +*/
> > asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > :
> > : "D"(pmc->addr));
> 
> Tested-by: David Marchand 
> 
> An additional question: would Intel CI catch such an issue?
> Or was it caught only because you are blessed with bleeding edge hw? :-)
> 
Not sure. I would hope so, though.

/Bruce


[PATCH v2] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread Bruce Richardson
When doing a build for a system with WAITPKG support and a modern
compiler, we get build errors for the "_umonitor" intrinsic, due to the
casting away of the "volatile" on the parameter.

../lib/eal/x86/rte_power_intrinsics.c: In function 'rte_power_monitor':
../lib/eal/x86/rte_power_intrinsics.c:113:22: error: passing argument 1
of '_umonitor' discards 'volatile' qualifier from pointer target type
[-Werror=discarded-qualifiers]
  113 | _umonitor(pmc->addr);
|   ~~~^~

We can avoid this issue by casting through "uintptr_t" and thereby
remove the volatile without warning.  We also ensure comments are
correct for each leg of the ifdef..else..endif block.

Fixes: 60943c04f3bc ("eal/x86: use intrinsics for power management")
Cc: roret...@linux.microsoft.com

Signed-off-by: Bruce Richardson 
Acked-by: Morten Brørup 
Tested-by: David Marchand 
---
 lib/eal/x86/rte_power_intrinsics.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c 
b/lib/eal/x86/rte_power_intrinsics.c
index 4066d1392e..97202e42fc 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -103,15 +103,15 @@ rte_power_monitor(const struct rte_power_monitor_cond 
*pmc,
rte_spinlock_lock(&s->lock);
s->monitor_addr = pmc->addr;
 
-   /*
-* we're using raw byte codes for now as only the newest compiler
-* versions support this instruction natively.
-*/
-
/* set address for UMONITOR */
 #if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
-   _umonitor(pmc->addr);
+   /* cast away "volatile" when using the intrinsic */
+   _umonitor((void *)(uintptr_t)pmc->addr);
 #else
+   /*
+* we're using raw byte codes for compiler versions which
+* don't support this instruction natively.
+*/
asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
:
: "D"(pmc->addr));
-- 
2.39.2



Re: [RFC] cache guard

2023-08-28 Thread Stephen Hemminger
A quick hack might be to just increase the cache line size as an experiment.

On Mon, Aug 28, 2023, 11:54 AM Morten Brørup 
wrote:

> > From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> > Sent: Monday, 28 August 2023 10.46
> >
> > On 2023-08-28 08:32, Morten Brørup wrote:
> > >> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> > >> Sent: Monday, 28 August 2023 00.31
> > >>
> > >> On 2023-08-27 17:40, Morten Brørup wrote:
> >  From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> >  Sent: Sunday, 27 August 2023 15.55
>
> [...]
>
> > >>> So, this gets added to rte_common.h:
> > >>>
> > >>> /**
> > >>>* Empty cache lines, to guard against false sharing-like effects
> > >>>* on systems with a next-N-lines hardware prefetcher.
> > >>>*
> > >>>* Use as spacing between data accessed by different lcores,
> > >>>* to prevent cache thrashing on hardware with speculative
> > prefetching.
> > >>>*/
> > >>> #if RTE_CACHE_GUARD_LINES > 0
> > >>> #define _RTE_CACHE_GUARD_HELPER2(unique) \
> > >>>   char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE *
> > >> RTE_CACHE_GUARD_LINES] \
> > >>>   __rte_cache_aligned;
> > >>> #define _RTE_CACHE_GUARD_HELPER1(unique)
> > _RTE_CACHE_GUARD_HELPER2(unique)
> > >>> #define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)
> > >>> #else
> > >>> #define RTE_CACHE_GUARD
> > >>> #endif
> > >>>
> > >>
> > >> Seems like a good solution. I thought as far as using __LINE__ to
> > build
> > >> a unique name, but __COUNTER__ is much cleaner, provided it's
> > available
> > >> in relevant compilers. (It's not in C11.)
> > >
> > > I considered __LINE__ too, but came to the same conclusion...
> > __COUNTER__ is cleaner for this purpose.
> > >
> > > And since __COUNTER__ is being used elsewhere in DPDK, I assume it is
> > available for use here too.
> > >
> > > If it turns out causing problems, we can easily switch to __LINE__
> > instead.
> > >
> > >>
> > >> Should the semicolon be included or not in HELPER2? If left out, a
> > >> lonely ";" will be left for RTE_CACHE_GUARD_LINES == 0, but I don't
> > >> think that is a problem.
> > >
> > > I tested it on Godbolt, and the lonely ";" in a struct didn't seem to
> > be a problem.
> > >
> > > With the semicolon in HELPER2, there will be a lonely ";" in the
> > struct in both cases, i.e. with and without cache guards enabled.
> > >
> > >>
> > >> I don't see why __rte_cache_aligned is needed here. The adjacent
> > struct
> > >> must be cache-line aligned. Maybe it makes it more readable, having
> > the
> > >> explicit guard padding starting at the start of the actual guard
> > cache
> > >> lines, rather than potentially at some earlier point before, and
> > having
> > >> non-guard padding at the end of the struct (from __rte_cache_aligned
> > on
> > >> the struct level).
> > >
> > > Having both __rte_cache_aligned and the char array with full cache
> > lines ensures that the guard field itself is on its own separate cache
> > line, regardless of the organization of adjacent fields in the struct.
> > E.g. this will also work:
> > >
> > > struct test {
> > >  char x;
> > >  RTE_CACHE_GUARD;
> > >  char y;
> > > };
> > >
> >
> > That struct declaration is broken, since it will create false sharing
> > between x and y, in case RTE_CACHE_GUARD_LINES is defined to 0.
> >
> > Maybe the most intuitive function (semantics) of the RTE_CACHE_GUARD
> > macro would be have it deal exclusively with the issue resulting from
> > next-N-line (and similar) hardware prefetching, and leave
> > __rte_cache_aligned to deal with "classic" (same-cache line) false
> > sharing.
>
> Excellent review feedback!
>
> I only thought of the cache guard as a means to provide spacing between
> elements where the developer already prevented (same-cache line) false
> sharing by some other means. I didn't even consider the alternative
> interpretation of its purpose.
>
> Your feedback leaves no doubt that we should extend the cache guard's
> purpose to also enforce cache alignment (under all circumstances, also when
> RTE_CACHE_GUARD_LINES is 0).
>
> >
> > Otherwise you would have to have something like
> >
> > struct test
> > {
> >   char x;
> >   RTE_CACHE_GUARD(char, y);
> > };
> >
> > ...so that 'y' can be made __rte_cache_aligned by the macro.
>
> There's an easier solution...
>
> We can copy the concept from the RTE_MARKER type, which uses a zero-length
> array. By simply omitting the #if RTE_CACHE_GUARD_LINES > 0, the macro will
> serve both purposes:
>
> #define _RTE_CACHE_GUARD_HELPER2(unique) \
> char cache_guard_ ## unique[RTE_CACHE_LINE_SIZE * RTE_CACHE_GUARD_LINES] \
> __rte_cache_aligned;
> #define _RTE_CACHE_GUARD_HELPER1(unique) _RTE_CACHE_GUARD_HELPER2(unique)
> #define RTE_CACHE_GUARD _RTE_CACHE_GUARD_HELPER1(__COUNTER__)
>
> I have verified on Godbolt that this works. The memif driver also uses
> RTE_MARKER this way [1].
>
> [1]:
> https://elixir.bootlin.com/dpdk/latest/source/drivers/net/memif/memif.h#L173

Re: [PATCH] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread Stephen Hemminger
For humor
#define RTE_CASTAWAY(x) ((void *)(uintptr_t)(x))

On Mon, Aug 28, 2023, 12:29 PM Bruce Richardson 
wrote:

> On Mon, Aug 28, 2023 at 11:29:05AM +0200, David Marchand wrote:
> > On Fri, Aug 25, 2023 at 5:29 PM Bruce Richardson
> >  wrote:
> > >
> > > When doing a build for a system with WAITPKG support and a modern
> > > compiler, we get build errors for the "_umonitor" intrinsic, due to the
> > > casting away of the "volatile" on the parameter.
> > >
> > > ../lib/eal/x86/rte_power_intrinsics.c: In function 'rte_power_monitor':
> > > ../lib/eal/x86/rte_power_intrinsics.c:113:22: error: passing argument 1
> > > of '_umonitor' discards 'volatile' qualifier from pointer target type
> > > [-Werror=discarded-qualifiers]
> > >   113 | _umonitor(pmc->addr);
> > > |   ~~~^~
> > >
> > > We can avoid this issue by using RTE_PTR_ADD(..., 0) to cast the
> pointer
> > > through "uintptr_t" and thereby remove the volatile without warning.
> >
> > As Morten, I prefer an explicit cast (keeping your comments) as it
> > seems we are exploiting an implementation detail of RTE_PTR_ADD.
> >
>
> Ok, I'll do a respin with explicit cast.
>
> >
> > > We also ensure comments are correct for each leg of the
> > > ifdef..else..endif block.
> >
> > Thanks.. I had fixed other places but I have missed this one.
> >
> >
> > >
> > > Fixes: 60943c04f3bc ("eal/x86: use intrinsics for power management")
> > > Cc: roret...@linux.microsoft.com
> > >
> > > Signed-off-by: Bruce Richardson 
> > > ---
> > >  lib/eal/x86/rte_power_intrinsics.c | 12 ++--
> > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/lib/eal/x86/rte_power_intrinsics.c
> b/lib/eal/x86/rte_power_intrinsics.c
> > > index 4066d1392e..4f0404bfb8 100644
> > > --- a/lib/eal/x86/rte_power_intrinsics.c
> > > +++ b/lib/eal/x86/rte_power_intrinsics.c
> > > @@ -103,15 +103,15 @@ rte_power_monitor(const struct
> rte_power_monitor_cond *pmc,
> > > rte_spinlock_lock(&s->lock);
> > > s->monitor_addr = pmc->addr;
> > >
> > > -   /*
> > > -* we're using raw byte codes for now as only the newest
> compiler
> > > -* versions support this instruction natively.
> > > -*/
> > > -
> > > /* set address for UMONITOR */
> > >  #if defined(RTE_TOOLCHAIN_MSVC) || defined(__WAITPKG__)
> > > -   _umonitor(pmc->addr);
> > > +   /* use RTE_PTR_ADD to cast away "volatile" when using the
> intrinsic */
> > > +   _umonitor(RTE_PTR_ADD(pmc->addr, 0));
> > >  #else
> > > +   /*
> > > +* we're using raw byte codes for compiler versions which
> > > +* don't support this instruction natively.
> > > +*/
> > > asm volatile(".byte 0xf3, 0x0f, 0xae, 0xf7;"
> > > :
> > > : "D"(pmc->addr));
> >
> > Tested-by: David Marchand 
> >
> > An additional question, would Intel CI catch such issue?
> > Or was it caught only because you are blessed with bleeding edge hw? :-)
> >
> Not sure. I would hope so, though.
>
> /Bruce
>


Re: [PATCH] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread Bruce Richardson
On Mon, Aug 28, 2023 at 12:42:38PM +0200, Stephen Hemminger wrote:
>For humor
>#define RTE_CASTAWAY(x) ((void *)(uintptr_t)(x))

Yes, actually thought about that. Was also wondering about making it an
inline function rather than macro, to ensure it's only used on pointers, and
to make clear what is being cast away:

static inline void *
rte_cast_no_volatile(volatile void *x)
{
return (void *)(uintptr_t)(x);
}

and similarly we could do a rte_cast_no_const(const void *x).

WDYT?

/Bruce


[PATCH v1 1/1] eal/random: fix random state initialization for non-eal threads

2023-08-28 Thread Anatoly Burakov
Currently, the rte_rand() state is initialized with seed, and each
rand state is initialized up until RTE_MAX_LCORE'th rand state. However,
rand state also has one extra rand state reserved for non-EAL threads,
which is not initialized. Fix it by initializing this extra state.

Fixes: 3f002f069612 ("eal: replace libc-based random generation with LFSR")
Cc: mattias.ronnb...@ericsson.com
Cc: sta...@dpdk.org

Signed-off-by: Anatoly Burakov 
---
 lib/eal/common/rte_random.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 565f2401ce..e5691813a4 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -83,7 +83,7 @@ rte_srand(uint64_t seed)
unsigned int lcore_id;
 
/* add lcore_id to seed to avoid having the same sequence */
-   for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
+   for (lcore_id = 0; lcore_id < RTE_MAX_LCORE + 1; lcore_id++)
__rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
 }
 
-- 
2.37.2



Re: [PATCH v1 1/1] eal/random: fix random state initialization for non-eal threads

2023-08-28 Thread Mattias Rönnblom

On 2023-08-28 14:06, Anatoly Burakov wrote:

Currently, the rte_rand() state is initialized with seed, and each
rand state is initialized up until RTE_MAX_LCORE'th rand state. However,
rand state also has one extra rand state reserved for non-EAL threads,
which is not initialized. Fix it by initializing this extra state.

Fixes: 3f002f069612 ("eal: replace libc-based random generation with LFSR")
Cc: mattias.ronnb...@ericsson.com
Cc: sta...@dpdk.org

Signed-off-by: Anatoly Burakov 
---
  lib/eal/common/rte_random.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 565f2401ce..e5691813a4 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -83,7 +83,7 @@ rte_srand(uint64_t seed)
unsigned int lcore_id;
  
  	/* add lcore_id to seed to avoid having the same sequence */

-   for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++)
+   for (lcore_id = 0; lcore_id < RTE_MAX_LCORE + 1; lcore_id++)
__rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
  }
  


Acked-by: Mattias Rönnblom 


RE: [PATCH v1 1/1] eal/random: fix random state initialization for non-eal threads

2023-08-28 Thread Morten Brørup
> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> Sent: Monday, 28 August 2023 14.23
> 
> On 2023-08-28 14:06, Anatoly Burakov wrote:
> > Currently, the rte_rand() state is initialized with seed, and each
> > rand state is initialized up until RTE_MAX_LCORE'th rand state.
> However,
> > rand state also has one extra rand state reserved for non-EAL threads,
> > which is not initialized. Fix it by initializing this extra state.
> >
> > Fixes: 3f002f069612 ("eal: replace libc-based random generation with
> LFSR")
> > Cc: mattias.ronnb...@ericsson.com
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Anatoly Burakov 
> > ---

Acked-by: Morten Brørup 



Re: [PATCH v2 1/1] net/cnxk: support MACsec PN threshold events on multiple ports

2023-08-28 Thread Jerin Jacob
On Fri, Aug 25, 2023 at 4:15 PM Ankur Dwivedi  wrote:
>
> Adds sa to port mapping in roc mcs. The sa to port map is updated when the
> sa is created. A portid field is also added to macsec event callback
> function. The above changes helps to propagate the tx and rx pn threshold
> events to the correct ethernet device.
>
> Signed-off-by: Ankur Dwivedi 

Updated the git commit as follows and applied to
dpdk-next-net-mrvl/for-next-net. Thanks

commit 3cadd086a2322a366f880e6c0b4a995313cc803e (HEAD -> for-next-net,
origin/for-next-net)
Author: Ankur Dwivedi 
Date:   Fri Aug 25 16:06:35 2023 +0530

net/cnxk: support multi port MACsec PN threshold events

Adds SA to port mapping in roc mcs. The SA to port map is updated when the
SA is created. A portid field is also added to macsec event callback
function. The above changes help to propagate the Tx and Rx PN threshold
events to the correct ethernet device.

Signed-off-by: Ankur Dwivedi 


Re: [PATCH] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread Ferruh Yigit
On 8/28/2023 12:03 PM, Bruce Richardson wrote:
> On Mon, Aug 28, 2023 at 12:42:38PM +0200, Stephen Hemminger wrote:
>>For humor
>>#define RTE_CASTAWAY(x) ((void *)(uinptr_t)(x))
> 
> Yes, actually thought about that. Was also wondering about making it an
> inline function rather than macro, to ensure its only used on pointers, and
> to make clear what is being cast away:
> 
> static inline void *
> rte_cast_no_volatile(volatile void *x)
> {
>   return (void *)(uintptr_t)(x);
> }
> 

Not as good, without 'castaway' in the API/macro name :)

> and similarly we could do a rte_cast_no_const(const void *x).
> 
> WDYT?
> 
> /Bruce



Re: [PATCH v2] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread David Marchand
On Mon, Aug 28, 2023 at 12:40 PM Bruce Richardson
 wrote:
>
> When doing a build for a system with WAITPKG support and a modern
> compiler, we get build errors for the "_umonitor" intrinsic, due to the
> casting away of the "volatile" on the parameter.
>
> ../lib/eal/x86/rte_power_intrinsics.c: In function 'rte_power_monitor':
> ../lib/eal/x86/rte_power_intrinsics.c:113:22: error: passing argument 1
> of '_umonitor' discards 'volatile' qualifier from pointer target type
> [-Werror=discarded-qualifiers]
>   113 | _umonitor(pmc->addr);
> |   ~~~^~
>
> We can avoid this issue by casting through "uintptr_t" and thereby
> remove the volatile without warning.  We also ensure comments are
> correct for each leg of the ifdef..else..endif block.
>
> Fixes: 60943c04f3bc ("eal/x86: use intrinsics for power management")
> Cc: roret...@linux.microsoft.com
>
> Signed-off-by: Bruce Richardson 
> Acked-by: Morten Brørup 
> Tested-by: David Marchand 

Applied to fix build on the main branch, thanks.

We can look at the casting helper as a followup.


-- 
David Marchand



Re: [PATCH v1] eventdev/eth_rx: fix null pointer dereference

2023-08-28 Thread Jerin Jacob
On Fri, Aug 25, 2023 at 7:17 PM Naga Harish K, S V
 wrote:
>
>
>
> > -Original Message-
> > From: Kundapura, Ganapati 
> > Sent: Thursday, August 24, 2023 1:54 PM
> > To: jer...@marvell.com; Naga Harish K, S V ;
> > dev@dpdk.org
> > Cc: Jayatheerthan, Jay ; Gujjar, Abhinandan S
> > 
> > Subject: [PATCH v1] eventdev/eth_rx: fix null pointer dereference
> >
> > On passing NULL as a last parameter to xxx_create_ext_with_params(),
> > rxa_config_params_validate() uses default values and dereferences NULL
> > pointer leading to segmentation fault.
> >
> > Fixed by returning after using default values without dereferencing NULL
> > pointer.
> >
> > Fixes: 8be6c94d6d90 ("eventdev/eth_rx: add new adapter create API")

Squashed this patch into "8be6c94d6d90 ("eventdev/eth_rx: add new
adapter create API")" and rebased the tree to latest main.


Re: [PATCH] eal/x86: fix build on systems with WAITPKG support

2023-08-28 Thread Tyler Retzlaff
On Mon, Aug 28, 2023 at 12:03:40PM +0100, Bruce Richardson wrote:
> On Mon, Aug 28, 2023 at 12:42:38PM +0200, Stephen Hemminger wrote:
> >For humor
> >#define RTE_CASTAWAY(x) ((void *)(uintptr_t)(x))
> 
> Yes, actually thought about that. Was also wondering about making it an
> inline function rather than macro, to ensure it's only used on pointers, and
> to make clear what is being cast away:
> 
> static inline void *
> rte_cast_no_volatile(volatile void *x)
> {
>   return (void *)(uintptr_t)(x);
> }
> 
> and similarly we could do a rte_cast_no_const(const void *x).
> 
> WDYT?

since we're introducing humor! now announcing dpdk requires C23 compliant
compiler for typeof_unqual! https://en.cppreference.com/w/c/language/typeof

i like the idea, and i like it being an inline function. you could use the
name 'unqual' if you wanted to mimic a name similar to standard C typeof.

it could be 1 function that strips all qualifiers or N that strip
specific qualifiers (const, volatile and _Atomic); not sure which is
best.

ty

> 
> /Bruce


RE: DPDK community: RTE_FLOW support for P4-programmable devices

2023-08-28 Thread Dumitrescu, Cristian
> > We just set up a community call for next week to discuss in more details the
> > proposal for RTE_FLOW extensions to support P4-programmable devices
> > https://mails.dpdk.org/archives/dev/2023-August/273703.html and look for
> > ways to converge and make progress.
> >
> > All the people from To: and CC: are already invited. To avoid cluttering
> > people's
> > calendars, I did not add dev@dpdk.org, so if anybody else wants to attend,
> > please send me a private email and I will be happy to forward the invite.
> >
> > Thanks,
> > Cristian

Attendees: Morten Brorup, Jerin Jacob, Anoob Joseph, Vipin Varghese, Qi Zhang,
Cristian Dumitrescu

1. Ori (RTE_FLOW maintainer) and others were not present, probably on vacation;
hopefully they will be able to attend the next call on this topic. Ferruh had a
last-minute conflict that he could not avoid.

2. Cristian presented a few slides (attached) with the problem statement,
current RTE_FLOW gaps for P4-programmable devices and the list of current
solution proposals.

3. Everybody on the call agreed that the P4-programmable devices from Intel,
AMD and others need to be fully supported by DPDK and that there are some
gaps in RTE_FLOW to be fixed for supporting these devices.

4. Ori suggested in a previous email to potentially introduce a registration API
in RTE_FLOW for user-defined flow items and actions. Cristian replied with a
proposal on the email list and is currently waiting for Ori's reply (see also
proposal #2 on slide 5).

5. Will set up a follow-up call in early September.

Regards,
Cristian


rte_flow_extensions_for_p4_devices.pptx
Description: rte_flow_extensions_for_p4_devices.pptx


Two weeks out - DPDK Summit - Register Now!

2023-08-28 Thread Nathan Southern
Dear DPDK Community,

I hope everyone is well, and that you each had a great weekend.

We're two weeks out at this point from the DPDK Summit in Dublin - which
runs from Tuesday Sep. 12th through Wed. Sep. 13th. Once again, this will
be held at the Gibson Hotel at Point Square in Dublin. Once
again, registration and attendance for this event are free. Virtual and
on-site attendance options will be available.

The schedule is posted here, and we have an outstanding two-day line-up of
speakers:

https://events.linuxfoundation.org/dpdk-summit/program/schedule/

If you have not already registered for the event, live or virtually, please
do so now. *You may do so here:*

https://events.linuxfoundation.org/dpdk-userspace-summit/

Any questions please let me know.

Thanks,

Nathan


[Bug 1280] rte_mempool_create returning error "EAL: eal_memalloc_alloc_seg_bulk(): couldn't find suitable memseg_list"

2023-08-28 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1280

Siva (pingtos...@gmail.com) changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|UNCONFIRMED |RESOLVED

--- Comment #3 from Siva (pingtos...@gmail.com) ---
After changing the RTE_MAX_MEM_MB_PER_TYPE macro in rte_config.h file, I am
able to consume all the huge pages.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[PATCH v3] mbuf: add ESP packet type

2023-08-28 Thread Alexander Kozyrev
Support the IP Encapsulating Security Payload (ESP) in transport mode.

Signed-off-by: Alexander Kozyrev 
Acked-by: Morten Brørup 
---
 lib/mbuf/rte_mbuf_ptype.h | 36 ++--
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/lib/mbuf/rte_mbuf_ptype.h b/lib/mbuf/rte_mbuf_ptype.h
index 17a2dd3576..cdd6fd460e 100644
--- a/lib/mbuf/rte_mbuf_ptype.h
+++ b/lib/mbuf/rte_mbuf_ptype.h
@@ -247,7 +247,7 @@ extern "C" {
  * It refers to those packets of any IP types, which can be recognized as
  * fragmented. A fragmented packet cannot be recognized as any other L4 types
  * (RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP, RTE_PTYPE_L4_SCTP, RTE_PTYPE_L4_ICMP,
- * RTE_PTYPE_L4_NONFRAG).
+ * RTE_PTYPE_L4_NONFRAG, RTE_PTYPE_L4_IGMP, RTE_PTYPE_L4_ESP).
  *
  * Packet format:
  * <'ether type'=0x0800
@@ -290,14 +290,15 @@ extern "C" {
  *
  * It refers to those packets of any IP types, while cannot be recognized as
  * any of above L4 types (RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP,
- * RTE_PTYPE_L4_FRAG, RTE_PTYPE_L4_SCTP, RTE_PTYPE_L4_ICMP).
+ * RTE_PTYPE_L4_FRAG (for IPv6), RTE_PTYPE_L4_SCTP, RTE_PTYPE_L4_ICMP,
+ * RTE_PTYPE_L4_IGMP (for IPv4), RTE_PTYPE_L4_ESP).
  *
  * Packet format:
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0, 'frag_offset'=0>
+ * | 'version'=4, 'protocol'!=[1|2|6|17|50|132], 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
- * | 'version'=6, 'next header'!=[6|17|44|132|1]>
+ * | 'version'=6, 'next header'!=[1|6|17|44|50|132]>
  */
 #define RTE_PTYPE_L4_NONFRAG0x0600
 /**
@@ -308,6 +309,17 @@ extern "C" {
  * | 'version'=4, 'protocol'=2, 'MF'=0, 'frag_offset'=0>
  */
 #define RTE_PTYPE_L4_IGMP   0x0700
+/**
+ * ESP (IP Encapsulating Security Payload) transport packet type.
+ *
+ * Packet format:
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=50, 'MF'=0, 'frag_offset'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=50>
+ */
+#define RTE_PTYPE_L4_ESP0x0800
 /**
  * Mask of layer 4 packet types.
  * It is used for outer packet for tunneling cases.
@@ -652,12 +664,24 @@ extern "C" {
  *
  * Packet format (inner only):
  * <'ether type'=0x0800
- * | 'version'=4, 'protocol'!=[6|17|132|1], 'MF'=0, 'frag_offset'=0>
+ * | 'version'=4, 'protocol'!=[1|6|17|50|132], 'MF'=0, 'frag_offset'=0>
  * or,
  * <'ether type'=0x86DD
- * | 'version'=6, 'next header'!=[6|17|44|132|1]>
+ * | 'version'=6, 'next header'!=[1|6|17|44|50|132]>
  */
 #define RTE_PTYPE_INNER_L4_NONFRAG  0x0600
+/**
+ * ESP (IP Encapsulating Security Payload) transport packet type.
+ * It is used for inner packet only.
+ *
+ * Packet format (inner only):
+ * <'ether type'=0x0800
+ * | 'version'=4, 'protocol'=50, 'MF'=0, 'frag_offset'=0>
+ * or,
+ * <'ether type'=0x86DD
+ * | 'version'=6, 'next header'=50>
+ */
+#define RTE_PTYPE_INNER_L4_ESP  0x0800
 /**
  * Mask of inner layer 4 packet types.
  */
-- 
2.18.2



Re: [PATCH] drivers: add dependencies for some classes

2023-08-28 Thread Maxime Coquelin




On 8/25/23 19:02, David Marchand wrote:

A few classes meson.build were not expressing dependencies to the
associated device library. Define std_deps for baseband, gpu and regex
drivers.

Signed-off-by: David Marchand 
---
  drivers/baseband/acc/meson.build   | 2 +-
  drivers/baseband/fpga_5gnr_fec/meson.build | 2 +-
  drivers/baseband/fpga_lte_fec/meson.build  | 2 +-
  drivers/baseband/la12xx/meson.build| 2 +-
  drivers/baseband/meson.build   | 2 ++
  drivers/baseband/null/meson.build  | 2 +-
  drivers/baseband/turbo_sw/meson.build  | 2 +-
  drivers/gpu/cuda/meson.build   | 2 +-
  drivers/gpu/meson.build| 2 ++
  drivers/regex/cn9k/meson.build | 2 +-
  drivers/regex/meson.build  | 2 +-
  drivers/regex/mlx5/meson.build | 2 +-
  12 files changed, 14 insertions(+), 10 deletions(-)



Acked-by: Maxime Coquelin 



[PATCH] net/ice: fix TM configuration not being cleared

2023-08-28 Thread Kaiwen Deng
When the device is stopped, DPDK resets the commit flag so that
we can update the hierarchy configuration. The commit flag is also
used to determine if the hierarchy configuration needs to be cleared.
When DPDK exits, it always stops the device first, which also resets
the commit flag, so the hierarchy configuration is never cleared.

This patch adds a new flag "need_clear" to determine if the
hierarchy configuration needs to be cleared.

Fixes: 3a6bfc37eaf4 ("net/ice: support QoS config VF bandwidth in DCF")
Cc: sta...@dpdk.org

Signed-off-by: Kaiwen Deng 
---
 drivers/net/ice/ice_dcf.c   | 2 +-
 drivers/net/ice/ice_dcf.h   | 1 +
 drivers/net/ice/ice_dcf_sched.c | 3 +++
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ice/ice_dcf.c b/drivers/net/ice/ice_dcf.c
index 7f8f5163ac..45d44ab73c 100644
--- a/drivers/net/ice/ice_dcf.c
+++ b/drivers/net/ice/ice_dcf.c
@@ -877,7 +877,7 @@ ice_dcf_uninit_hw(struct rte_eth_dev *eth_dev, struct ice_dcf_hw *hw)
struct rte_intr_handle *intr_handle = pci_dev->intr_handle;
 
if (hw->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_QOS)
-   if (hw->tm_conf.committed) {
+   if (hw->tm_conf.need_clear) {
ice_dcf_clear_bw(hw);
ice_dcf_tm_conf_uninit(eth_dev);
}
diff --git a/drivers/net/ice/ice_dcf.h b/drivers/net/ice/ice_dcf.h
index aa2a723f2a..af23b569f5 100644
--- a/drivers/net/ice/ice_dcf.h
+++ b/drivers/net/ice/ice_dcf.h
@@ -78,6 +78,7 @@ struct ice_dcf_tm_conf {
uint32_t nb_tc_node;
uint32_t nb_vsi_node;
bool committed;
+   bool need_clear;
 };
 
 struct ice_dcf_eth_stats {
diff --git a/drivers/net/ice/ice_dcf_sched.c b/drivers/net/ice/ice_dcf_sched.c
index a231c1e60b..5437fabb58 100644
--- a/drivers/net/ice/ice_dcf_sched.c
+++ b/drivers/net/ice/ice_dcf_sched.c
@@ -51,6 +51,7 @@ ice_dcf_tm_conf_init(struct rte_eth_dev *dev)
hw->tm_conf.nb_tc_node = 0;
hw->tm_conf.nb_vsi_node = 0;
hw->tm_conf.committed = false;
+   hw->tm_conf.need_clear = false;
 }
 
 void
@@ -870,6 +871,8 @@ static int ice_dcf_hierarchy_commit(struct rte_eth_dev *dev,
   ICE_NONDMA_TO_NONDMA);
 
hw->tm_conf.committed = true;
+   hw->tm_conf.need_clear = true;
+
return ret_val;
 
 fail_clear:
-- 
2.25.1