Re: [PATCH v3] eal: add seqlock

2022-04-03 Thread Ola Liljedahl
+__rte_experimental +static inline void +rte_seqlock_write_lock(rte_seqlock_t *seqlock) { + uint32_t sn; + + /* to synchronize with other writers */ + rte_spinlock_lock(&seqlock->lock); + + sn = seqlock->sn + 1; >>> The load of seqlock->sn could use
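
One way to make the load of `sn` explicit is a relaxed atomic load. The following is a minimal sketch of the writer side only, assuming the GCC/Clang `__atomic` builtins; `demo_seqlock_t` and the `demo_` functions are hypothetical names for illustration, not the `rte_seqlock_t` API from the patch, and the spinlock is reduced to a single flag.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration; not the rte_seqlock_t from the patch. */
typedef struct {
	uint32_t sn;  /* even: idle, odd: write in progress */
	int locked;   /* trivial writer-writer spinlock (assumption) */
} demo_seqlock_t;

static inline void
demo_seqlock_write_lock(demo_seqlock_t *sl)
{
	uint32_t sn;

	/* Serialize against other writers. */
	while (__atomic_exchange_n(&sl->locked, 1, __ATOMIC_ACQUIRE))
		;

	/* Only the lock holder writes sn, so a relaxed load is enough. */
	sn = __atomic_load_n(&sl->sn, __ATOMIC_RELAXED) + 1;
	__atomic_store_n(&sl->sn, sn, __ATOMIC_RELAXED);

	/* Keep the protected-data stores after the odd sn store. */
	__atomic_thread_fence(__ATOMIC_RELEASE);
}

static inline void
demo_seqlock_write_unlock(demo_seqlock_t *sl)
{
	uint32_t sn = __atomic_load_n(&sl->sn, __ATOMIC_RELAXED) + 1;

	/* Release: data stores become visible before the even sn. */
	__atomic_store_n(&sl->sn, sn, __ATOMIC_RELEASE);
	__atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE);
}
```

The relaxed load is sufficient here only because the lock acquisition already orders this writer after the previous one.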

Re: [PATCH v3] eal: add seqlock

2022-04-03 Thread Ola Liljedahl
(Now using macOS Mail program in plain text mode, hope this works) > On 2 Apr 2022, at 21:31, Honnappa Nagarahalli > wrote: > > > >>> +__rte_experimental >>> +static inline bool >>> +rte_seqlock_read_tryunlock(const rte_seqlock_t *seqlock, uint32_t >>> +*begin_sn) { >>> + uint32_t end_sn; >

Re: [PATCH v3] eal: add seqlock

2022-04-02 Thread Ola Liljedahl
On 4/1/22 17:07, Mattias Rönnblom wrote: + +/** + * End a read-side critical section. + * + * A call to this function marks the end of a read-side critical + * section, for @p seqlock. The application must supply the sequence + * number produced by the corresponding rte_seqlock_read_lock() (or, +
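
The read-side contract described in the quoted comment (the reader keeps the sequence number from the start of the critical section and hands it back at the end) can be sketched as follows. This is an illustration under assumptions: `demo_seq_t`, `demo_read_begin()`, `demo_read_retry()` and `demo_read_value()` are hypothetical names, the protected data is a single word, and the GCC/Clang `__atomic` builtins are used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the API from the patch. */
typedef struct {
	uint32_t sn;     /* even: idle, odd: write in progress */
	uint64_t value;  /* protected data (a single word here) */
} demo_seq_t;

static inline uint32_t
demo_read_begin(const demo_seq_t *s)
{
	/* Acquire: data loads cannot move before this sn load. */
	return __atomic_load_n(&s->sn, __ATOMIC_ACQUIRE);
}

static inline bool
demo_read_retry(const demo_seq_t *s, uint32_t begin_sn)
{
	uint32_t end_sn;

	/* Keep the data loads before the second sn load. */
	__atomic_thread_fence(__ATOMIC_ACQUIRE);
	end_sn = __atomic_load_n(&s->sn, __ATOMIC_RELAXED);

	/* Retry if a write was in progress or completed meanwhile. */
	return (begin_sn & 1) || begin_sn != end_sn;
}

static inline uint64_t
demo_read_value(const demo_seq_t *s)
{
	uint64_t v;
	uint32_t sn;

	do {
		sn = demo_read_begin(s);
		v = __atomic_load_n(&s->value, __ATOMIC_RELAXED);
	} while (demo_read_retry(s, sn));

	return v;
}
```

The reader never writes to shared state, which is why the thread questions whether lock/unlock are the right names for these operations.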

Re: [PATCH v2] eal: add seqlock

2022-04-02 Thread Ola Liljedahl
On 4/2/22 02:50, Stephen Hemminger wrote: On Wed, 30 Mar 2022 16:26:02 +0200 Mattias Rönnblom wrote: + __atomic_store_n(&seqlock->sn, sn, __ATOMIC_RELAXED); + + /* __ATOMIC_RELEASE to prevent stores after (in program order) +* from happening before the sn store. +

Re: [PATCH v2] eal: add seqlock

2022-04-02 Thread Ola Liljedahl
On 4/2/22 12:25, Morten Brørup wrote: From: Stephen Hemminger [mailto:step...@networkplumber.org] Sent: Saturday, 2 April 2022 02.54 Semantics and naming should be the same as Linux kernel or you risk having to reeducate too many people. Although I do see significant value in that point, I do

Re: [PATCH v2] eal: add seqlock

2022-03-31 Thread Ola Liljedahl
(Thunderbird suddenly refuses to edit in plain text mode, hope the mail gets sent as text anyway) On 3/31/22 15:38, Mattias Rönnblom wrote: On 2022-03-31 11:04, Ola Liljedahl wrote: On 3/31/22 09:46, Mattias Rönnblom wrote: On 2022-03-30 16:26, Mattias Rönnblom wrote: Should the

Re: [PATCH v2] eal: add seqlock

2022-03-31 Thread Ola Liljedahl
I think lock()/unlock() should be avoided in the read operation names, because no lock is taken during read. I like the critical region begin()/end() names. I was following the naming convention of rte_rwlock. Isn't the seqlock just a more scalable implementation of a reader/writer lock? I se

Re: [PATCH v2] eal: add seqlock

2022-03-31 Thread Ola Liljedahl
On 3/31/22 11:25, Morten Brørup wrote: From: Ola Liljedahl [mailto:ola.liljed...@arm.com] Sent: Thursday, 31 March 2022 11.05 On 3/31/22 09:46, Mattias Rönnblom wrote: On 2022-03-30 16:26, Mattias Rönnblom wrote: A sequence lock (seqlock) is a synchronization primitive which allows for data

Re: [PATCH v2] eal: add seqlock

2022-03-31 Thread Ola Liljedahl
On 3/31/22 09:46, Mattias Rönnblom wrote: On 2022-03-30 16:26, Mattias Rönnblom wrote: A sequence lock (seqlock) is a synchronization primitive which allows for data-race free, low-overhead, high-frequency reads, especially for data structures shared across many cores and which are updated with

Re: [RFC] eal: add seqlock

2022-03-29 Thread Ola Liljedahl
On 3/28/22 12:53, Ananyev, Konstantin wrote: diff --git a/lib/eal/include/meson.build b/lib/eal/include/meson.build index 9700494816..48df5f1a21 100644 --- a/lib/eal/include/meson.build +++ b/lib/eal/include/meson.build @@ -36,6 +36,7 @@ headers += files( 'rte_per_lcore.h',

Re: [dpdk-dev] [PATCH v2 2/3] config: add arm neoverse N1 SDP configuration

2019-10-17 Thread Ola Liljedahl
; +# overrides the one defined in arch. > +# - can define CPU_LDFLAGS variable (overridden by cmdline value) that > +# overrides the one defined in arch. > +# - can define CPU_ASFLAGS variable (overridden by cmdline value) that > +# overrides the one defined in arch. > +# - may override any previously defined variable > +# > + > +# ARCH = > +# CROSS = > +# MACHINE_CFLAGS = > +# MACHINE_LDFLAGS = > +# MACHINE_ASFLAGS = > +# CPU_CFLAGS = > +# CPU_LDFLAGS = > +# CPU_ASFLAGS = > + > +include $(RTE_SDK)/mk/rte.helper.mk > + > +MACHINE_CFLAGS += $(call rte_cc_has_argument, -march=armv8.2-a+crc+crypto) > +MACHINE_CFLAGS += $(call rte_cc_has_argument, -mcpu=neoverse-n1) > > -- Ola Liljedahl, Networking System Architect, Arm Phone +46706866373, Skype ola.liljedahl

Re: [dpdk-dev] [RFC,v2] lfring: lock-free ring buffer

2019-06-15 Thread Ola Liljedahl
using ``rte_lfring_enqueue()`` or ``rte_lfring_enqueue_bulk()`` > + * is "single-producer". Otherwise, it is "multi-producers". > + *- LFRING_F_SC_DEQ: If this flag is set, the default behavior when > + * using ``rte_lfring_dequeue()`` or ``rte_lfring_dequeue_bulk()`` > + * is "single-consumer". Otherwise, it is "multi-consumers". > + * @return > + * On success, the pointer to the new allocated lfring. NULL on error with > + *rte_errno set appropriately. Possible errno values include: > + *- E_RTE_NO_CONFIG - function could not get pointer to rte_config > structure > + *- E_RTE_SECONDARY - function was called from a secondary process > instance > > E_RTE_SECONDARY and E_RTE_NO_CONFIG are not possible error cases > > > > > +#define ENQ_RETRY_LIMIT 32 > > Per the coding guidelines ( > https://doc.dpdk.org/guides/contributing/coding_style.html#macros), macros > should be prefixed with RTE_. > > > > > +/** > + * Return the number of elements which can be stored in the lfring. > + * > + * @param r > + * A pointer to the lfring structure. > + * @return > + * The usable size of the lfring. > + */ > +static inline unsigned int > +rte_lfring_get_capacity(const struct rte_lfring *r) { > + return r->size; > > I believe this should return r->mask, to account for the one unusable ring > entry. I think this is a mistake, all ring entries should be usable. > > > > > diff --git a/lib/librte_lfring/rte_lfring_version.map > b/lib/librte_lfring/rte_lfring_version.map > new file mode 100644 > index 000..d935efd > --- /dev/null > +++ b/lib/librte_lfring/rte_lfring_version.map > @@ -0,0 +1,19 @@ > +DPDK_2.0 { > + global: > + > + rte_ring_create; > + rte_ring_dump; > + rte_ring_get_memsize; > + rte_ring_init; > + rte_ring_list_dump; > + rte_ring_lookup; > > Need to fix function names and DPDK version number > > Thanks, > Gage > > -- Ola Liljedahl, System Architect, Arm
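
The size-versus-mask question can be illustrated with a ring that uses free-running indices. This is a single-threaded sketch under assumptions: `demo_ring` and its functions are hypothetical and do not reflect the `rte_lfring` layout. Because the indices are never masked when stored, full (`tail - head == size`) and empty (`tail - head == 0`) are distinguishable without giving up a slot, so the usable capacity equals the size.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SIZE 4u /* must be a power of two */

/* Hypothetical single-threaded ring; not the rte_lfring layout. */
struct demo_ring {
	uint32_t head;  /* free-running consumer index */
	uint32_t tail;  /* free-running producer index */
	uint32_t mask;  /* size - 1, used only to pick a slot */
	int slots[DEMO_RING_SIZE];
};

static inline uint32_t
demo_ring_count(const struct demo_ring *r)
{
	/* Unsigned wrap-around keeps this correct across overflow. */
	return r->tail - r->head;
}

static inline int
demo_ring_enqueue(struct demo_ring *r, int v)
{
	if (demo_ring_count(r) == r->mask + 1)
		return -1; /* full: all size entries are in use */
	r->slots[r->tail & r->mask] = v;
	r->tail++;
	return 0;
}

static inline int
demo_ring_dequeue(struct demo_ring *r, int *v)
{
	if (demo_ring_count(r) == 0)
		return -1; /* empty */
	*v = r->slots[r->head & r->mask];
	r->head++;
	return 0;
}
```

A design that stores masked indices instead would need to keep one entry unused to tell full from empty, which is the case where returning the mask would be appropriate.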

Re: [dpdk-dev] [PATCH v4] doc: announce ring API change

2019-05-10 Thread Ola Liljedahl
-free design should be the same as in Gage's patch. rte_lfring could of course be part of the rte_ring library. -- Ola Liljedahl, Networking System Architect, Arm Phone +46706866373, Skype ola.liljedahl

Re: [dpdk-dev] [PATCH v7 0/6] Add lock-free ring and mempool handler

2019-04-02 Thread Ola Liljedahl
so much worse to have separate but structurally equivalent APIs? Yes, blocking vs non-blocking can no longer be selected at runtime (startup time); I think this is the biggest limitation. -- Ola (Unfortunately without hard numbers on the cost or benefit of such a change, these a

Re: [dpdk-dev] [PATCH 1/1] eal: add 128-bit cmpset (x86-64 only)

2019-02-01 Thread Ola Liljedahl
On Fri, 2019-02-01 at 21:05 +, Eads, Gage wrote: > > > > > -Original Message----- > > From: Ola Liljedahl [mailto:ola.liljed...@arm.com] > > Sent: Friday, February 1, 2019 1:44 PM > > To: Eads, Gage ; dev@dpdk.org > > Cc: jer...@marv

Re: [dpdk-dev] [PATCH 1/1] eal: add 128-bit cmpset (x86-64 only)

2019-02-01 Thread Ola Liljedahl
On Fri, 2019-02-01 at 19:28 +, Eads, Gage wrote: > > > > > -Original Message----- > > From: Ola Liljedahl [mailto:ola.liljed...@arm.com] > > Sent: Friday, February 1, 2019 1:02 PM > > To: Eads, Gage ; dev@dpdk.org > > Cc: jer...@marv

Re: [dpdk-dev] [PATCH 1/1] eal: add 128-bit cmpset (x86-64 only)

2019-02-01 Thread Ola Liljedahl
On Fri, 2019-02-01 at 17:06 +, Eads, Gage wrote: > > > > > -Original Message----- > > From: Ola Liljedahl [mailto:ola.liljed...@arm.com] > > Sent: Monday, January 28, 2019 5:02 PM > > To: Eads, Gage ; dev@dpdk.org > > Cc: arybche...@solar

Re: [dpdk-dev] [RFC] lfring: lock-free ring buffer

2019-02-01 Thread Ola Liljedahl
On Fri, 2019-02-01 at 15:40 +, Eads, Gage wrote: > > > > > -Original Message----- > > From: Ola Liljedahl [mailto:ola.liljed...@arm.com] > > Sent: Wednesday, January 30, 2019 5:36 AM > > To: Honnappa Nagarahalli ; Richardson, > > Bruce ; E

[dpdk-dev] [RFC,v2] lfring: lock-free ring buffer

2019-01-30 Thread Ola Liljedahl
red data. This avoids ordering stores to private data. Signed-off-by: Ola Liljedahl --- config/common_base | 5 + lib/Makefile | 2 + lib/librte_lfring/Makefile | 22 ++ lib/librte_lfring/lockfree16.h | 143 +

Re: [dpdk-dev] [RFC] lfring: lock-free ring buffer

2019-01-30 Thread Ola Liljedahl
> ring entries (for dequeue) or unused ring entries (for enqueue). This > guarantees that the enq/deq operation will eventually complete, regardless of > the behavior of other threads. This is why the enqueue loop doesn't check if > space is available and the dequeue loop doesn'

Re: [dpdk-dev] [PATCH v4 1/5] ring: add 64-bit headtail structure

2019-01-29 Thread Ola Liljedahl
uint32_t *entries)
> +{
> +        unsigned int max = n;
> +        int success;
> +
> +        do {
> +                /* Restore n as it may change every loop */
> +                n = max;
> +
> +                *old_head = r->cons_64.head;
> +
> +                /* add rmb barrier to avoid load/load reorder in weak
> +                 * memory model. It is noop on x86
> +                 */
> +                rte_smp_rmb();
> +
> +                /* The subtraction is done between two unsigned 64bits value
> +                 * (the result is always modulo 64 bits even if we have
> +                 * cons_head > prod_tail). So 'entries' is always between 0
> +                 * and size(ring)-1.
> +                 */
> +                *entries = (r->prod_64.tail - *old_head);
> +
> +                /* Set the actual entries for dequeue */
> +                if (n > *entries)
> +                        n = (behavior == RTE_RING_QUEUE_FIXED) ? 0 : *entries;
> +
> +                if (unlikely(n == 0))
> +                        return 0;
> +
> +                *new_head = *old_head + n;
> +                if (is_sc)
> +                        r->cons_64.head = *new_head, success = 1;
> +                else
> +                        success = rte_atomic64_cmpset(&r->cons_64.head,
> +                                        *old_head, *new_head);
> +        } while (unlikely(success == 0));
> +        return n;
> +}
> +
>  #endif /* _RTE_RING_GENERIC_H_ */
--
Ola Liljedahl, Networking System Architect, Arm Phone +46706866373, Skype ola.liljedahl
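
For comparison, the multi-consumer branch of the quoted loop can be sketched with the GCC/Clang `__atomic` builtins instead of `rte_smp_rmb()` and `rte_atomic64_cmpset()`. This is an illustration under assumptions, not the code from the patch: `demo_ht64` and `demo_move_cons_head()` are hypothetical, and only the "dequeue as many as available" behavior is shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 64-bit head/tail fields in the patch. */
struct demo_ht64 {
	uint64_t prod_tail;
	uint64_t cons_head;
};

/* Reserve up to n entries for dequeue; returns the number reserved. */
static inline unsigned int
demo_move_cons_head(struct demo_ht64 *r, unsigned int n, uint64_t *old_head)
{
	const unsigned int max = n;
	uint64_t new_head, entries;

	*old_head = __atomic_load_n(&r->cons_head, __ATOMIC_RELAXED);
	do {
		/* Restore n, it may have been reduced in a previous pass. */
		n = max;

		/* Make sure the head is read before the tail. */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);

		/* Acquire pairs with the producer's release store of tail. */
		entries = __atomic_load_n(&r->prod_tail, __ATOMIC_ACQUIRE)
			  - *old_head;
		if (n > entries)
			n = (unsigned int)entries;
		if (n == 0)
			return 0;

		new_head = *old_head + n;
		/* On failure, *old_head is refreshed with the current head. */
	} while (!__atomic_compare_exchange_n(&r->cons_head, old_head,
					      new_head, 0, __ATOMIC_RELAXED,
					      __ATOMIC_RELAXED));
	return n;
}
```

Because a failed compare-exchange writes the observed head back into `*old_head`, the loop does not need to reload the head explicitly.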

Re: [dpdk-dev] [PATCH 1/1] eal: add 128-bit cmpset (x86-64 only)

2019-01-28 Thread Ola Liljedahl
the strong variant.
> + * @param success
> + *   If successful, the operation's memory behavior conforms to this (or a
> + *   stronger) model.
> + * @param failure
> + *   If unsuccessful, the operation's memory behavior conforms to this (or a
> + *   stronger) model. This argument cannot be RTE_ATOMIC_RELEASE,
> + *   RTE_ATOMIC_ACQ_REL, or a stronger model than success.
> + * @return
> + *   Non-zero on success; 0 on failure.
> + */
> +static inline int __rte_experimental
> +rte_atomic128_cmpset(volatile rte_int128_t *dst,
> +                     rte_int128_t *exp, rte_int128_t *src,
> +                     unsigned int weak,
> +                     enum rte_atomic_memmodel_t success,
> +                     enum rte_atomic_memmodel_t failure);
> +#endif
> +
>  #endif /* _RTE_ATOMIC_H_ */
--
Ola Liljedahl, Networking System Architect, Arm Phone +46706866373, Skype ola.liljedahl

Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation

2019-01-28 Thread Ola Liljedahl
On Mon, 2019-01-28 at 18:54 +, Eads, Gage wrote: > > > > > -Original Message----- > > From: Ola Liljedahl [mailto:ola.liljed...@arm.com] > > Sent: Monday, January 28, 2019 4:36 AM > > To: jer...@marvell.com; mcze...@marvell.com; Eads, Gage >

Re: [dpdk-dev] [RFC] lfring: lock-free ring buffer

2019-01-28 Thread Ola Liljedahl
On Mon, 2019-01-28 at 21:04 +, Eads, Gage wrote: > Hey Ola, > > > > > -Original Message----- > > From: Ola Liljedahl [mailto:ola.liljed...@arm.com] > > Sent: Monday, January 28, 2019 6:29 AM > > To: dev@dpdk.org; Eads, Gage ; Honnappa Nagarahalli >

Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation

2019-01-28 Thread Ola Liljedahl
On Mon, 2019-01-28 at 14:04 +, Jerin Jacob Kollanukkaran wrote: > > Does PPC (64-bit POWER?) have support for double-word (128-bit) CAS? > > I dont know, I was telling wrt in general C11 mem model for PPC. Sorry, I misunderstood. -- Ola Liljedahl, Networking System Architec

Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation

2019-01-28 Thread Ola Liljedahl
On Mon, 2019-01-28 at 13:34 +, Jerin Jacob Kollanukkaran wrote: > On Fri, 2019-01-25 at 17:21 +, Eads, Gage wrote: > > > > > > > > -Original Message- > > > From: Ola Liljedahl [mailto:ola.liljed...@arm.com] > > > Sent: Wednesday,

[dpdk-dev] [RFC] lfring: lock-free ring buffer

2019-01-28 Thread Ola Liljedahl
necessary, it is a successful compare and exchange operation which provides atomicity. Signed-off-by: Ola Liljedahl --- config/common_base | 5 + lib/Makefile | 2 + lib/librte_lfring/Makefile | 22 ++ lib/librte_lfring/lockfree16.h

Re: [dpdk-dev] [PATCH v3 0/5] Add non-blocking ring

2019-01-28 Thread Ola Liljedahl
.@solarflare.com; Richardson, Bruce > > ; Ananyev, Konstantin > > ; step...@networkplumber.org; nd > > ; tho...@monjalon.net; Ola Liljedahl > > ; Gavin Hu (Arm Technology China) > > ; Song Zhu (Arm Technology China) > > ; nd > > Subject: RE: [dpdk-dev] [PATCH v

Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation

2019-01-28 Thread Ola Liljedahl
On Fri, 2019-01-25 at 17:21 +, Eads, Gage wrote: > > > > > -Original Message----- > > From: Ola Liljedahl [mailto:ola.liljed...@arm.com] > > Sent: Wednesday, January 23, 2019 4:16 AM > > To: Eads, Gage ; dev@dpdk.org > > Cc: olivier.m...@6w

Re: [dpdk-dev] [PATCH v3 0/5] Add non-blocking ring

2019-01-23 Thread Ola Liljedahl
On Wed, 2019-01-23 at 16:02 +, Jerin Jacob Kollanukkaran wrote: > On Tue, 2019-01-22 at 09:27 +0000, Ola Liljedahl wrote: > > > > On Fri, 2019-01-18 at 09:23 -0600, Gage Eads wrote: > > > > > > v3: > > >  - Avoid the ABI break by putting 64-bit he

Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation

2019-01-23 Thread Ola Liljedahl
sp
> > > + *   Indicates whether to use single producer or multi-producer head update
> > > + * @param free_space
> > > + *   returns the amount of space after the enqueue operation has finished
> > > + * @return
> > > + *   Actual number of objects enqueued.
> > > + *   If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > + */
> > > +static __rte_always_inline unsigned int
> > > +__rte_ring_do_nb_enqueue(struct rte_ring *r, void * const *obj_table,
> > > +                         unsigned int n, enum rte_ring_queue_behavior behavior,
> > > +                         unsigned int is_sp, unsigned int *free_space) {
> > > +        if (is_sp)
> > > +                return __rte_ring_do_nb_enqueue_sp(r, obj_table, n,
> > > +                                                   behavior, free_space);
> > > +        else
> > > +                return __rte_ring_do_nb_enqueue_mp(r, obj_table, n,
> > > +                                                   behavior, free_space);
> > > +}
> > > +
> > > +/**
> > > + * @internal
> > > + *   Dequeue several objects from the non-blocking ring (single-consumer only)
> > > + *
> > > + * @param r
> > > + *   A pointer to the ring structure.
> > > + * @param obj_table
> > > + *   A pointer to a table of void * pointers (objects).
> > > + * @param n
> > > + *   The number of objects to pull from the ring.
> > > + * @param behavior
> > > + *   RTE_RING_QUEUE_FIXED:    Dequeue a fixed number of items from the ring
> > > + *   RTE_RING_QUEUE_VARIABLE: Dequeue as many items as possible from the ring
> > > + * @param available
> > > + *   returns the number of remaining ring entries after the dequeue has finished
> > > + * @return
> > > + *   - Actual number of objects dequeued.
> > > + *     If behavior == RTE_RING_QUEUE_FIXED, this will be 0 or n only.
> > > + */
> > > +static __rte_always_inline unsigned int
> > > +__rte_ring_do_nb_dequeue_sc(struct rte_ring *r, void **obj_table,
> > > +                            unsigned int n,
> > > +                            enum rte_ring_queue_behavior behavior,
> > > +                            unsigned int *available)
> > > +{
> > > +        size_t head, next;
> > > +        uint32_t entries;
> > > +
> > > +        n = __rte_ring_move_cons_head_64(r, 1, n, behavior,
> > > +                                         &head, &next, &entries);
> > > +        if (n == 0)
> > > +                goto end;
> > > +
> > > +        DEQUEUE_PTRS_NB(r, &r[1], head, obj_table, n);
> > > +
> > > +        r->cons_64.tail += n;
> > Memory ordering? Consumer synchronises with producer.
> >
> Agreed, that is missing here. Will fix.
>
> Thanks,
> Gage
>
> > --
> > Ola Liljedahl, Networking System Architect, Arm Phone +46706866373, Skype ola.liljedahl
--
Ola Liljedahl, Networking System Architect, Arm Phone +46706866373, Skype ola.liljedahl
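
The ordering concern raised about the plain `tail += n` can be sketched with a small single-producer/single-consumer ring. This is an illustration under assumptions, not the fix that went into the patch: `demo_spsc` and its functions are hypothetical, and the GCC/Clang `__atomic` builtins are used. The consumer publishes its tail with a release store, and the producer reads it with an acquire load before reusing a slot.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SLOTS 8u /* power of two */

/* Hypothetical single-producer/single-consumer ring for illustration. */
struct demo_spsc {
	uint64_t prod_tail;  /* written by producer, read by consumer */
	uint64_t cons_tail;  /* written by consumer, read by producer */
	void *slots[DEMO_SLOTS];
};

static inline int
demo_spsc_enqueue(struct demo_spsc *r, void *obj)
{
	uint64_t tail = __atomic_load_n(&r->prod_tail, __ATOMIC_RELAXED);
	/* Acquire pairs with the consumer's release store, so a slot is
	 * not overwritten before the consumer has finished reading it. */
	uint64_t cons = __atomic_load_n(&r->cons_tail, __ATOMIC_ACQUIRE);

	if (tail - cons == DEMO_SLOTS)
		return -1; /* full */
	r->slots[tail & (DEMO_SLOTS - 1)] = obj;
	/* Release: the slot write is visible before the new tail. */
	__atomic_store_n(&r->prod_tail, tail + 1, __ATOMIC_RELEASE);
	return 0;
}

static inline int
demo_spsc_dequeue(struct demo_spsc *r, void **obj)
{
	uint64_t head = __atomic_load_n(&r->cons_tail, __ATOMIC_RELAXED);
	uint64_t prod = __atomic_load_n(&r->prod_tail, __ATOMIC_ACQUIRE);

	if (prod == head)
		return -1; /* empty */
	*obj = r->slots[head & (DEMO_SLOTS - 1)];
	/* Release instead of a plain "tail += n": the slot read above
	 * is ordered before the producer can observe the freed slot. */
	__atomic_store_n(&r->cons_tail, head + 1, __ATOMIC_RELEASE);
	return 0;
}
```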

Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation

2019-01-22 Thread Ola Liljedahl
> + if (r->flags & RING_F_NB) > + return __rte_ring_do_nb_enqueue(r, obj_table, n, > + RTE_RING_QUEUE_VARIABLE, > + __IS_SP, free_space); > + else > + ret

Re: [dpdk-dev] [PATCH v3 0/5] Add non-blocking ring

2019-01-22 Thread Ola Liljedahl
Sorry about the confidential footer. I tried to remove it using some Exchange magic but it seems not to work with Evolution. I'll try some other way. -- Ola On Tue, 2019-01-22 at 09:27 +0000, Ola Liljedahl wrote: > On Fri, 2019-01-18 at 09:23 -0600, Gage Eads wrote: > > > > For

Re: [dpdk-dev] [PATCH v3 2/5] ring: add a non-blocking implementation

2019-01-22 Thread Ola Liljedahl
On Fri, 2019-01-18 at 09:23 -0600, Gage Eads wrote: > This commit adds support for non-blocking circular ring enqueue and > dequeue > functions. The ring uses a 128-bit compare-and-swap instruction, and > thus > is currently limited to x86_64. > > The algorithm is based on the original rte ring (de

Re: [dpdk-dev] [PATCH v3 0/5] Add non-blocking ring

2019-01-22 Thread Ola Liljedahl
On Fri, 2019-01-18 at 09:23 -0600, Gage Eads wrote: > For some users, the rte ring's "non-preemptive" constraint is not > acceptable; > for example, if the application uses a mixture of pinned high- > priority threads > and multiplexed low-priority threads that share a mempool. > > This patchset in

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 16:46, "Ola Liljedahl" wrote: On 08/10/2018, 16:44, "Bruce Richardson" wrote: On Mon, Oct 08, 2018 at 09:22:05AM +, Ola Liljedahl wrote: > "* multi-producer safe lock-free ring buffer enqueue"

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 16:44, "Bruce Richardson" wrote: On Mon, Oct 08, 2018 at 09:22:05AM +, Ola Liljedahl wrote: > "* multi-producer safe lock-free ring buffer enqueue" > The comment is also wrong. This design is not lock-free, how could it

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 14:21, "Jerin Jacob" wrote: -Original Message- > Date: Mon, 8 Oct 2018 17:35:25 +0530 > From: Jerin Jacob > To: Ola Liljedahl > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin"

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 13:50, "Jerin Jacob" wrote: I don't know how that creates more undefined behavior. So replied in the context of your reply that, according to your view even Linux is running with undefined behavior. As I explained, Linux does not use C11 atomics (nor GCC __

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 12:47, "Jerin Jacob" wrote: -Original Message- > Date: Mon, 8 Oct 2018 10:25:45 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin"

Re: [dpdk-dev] [PATCH] eal/armv7: add support for rte pause

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 10:42, "Jerin Jacob" wrote: -Original Message- > Date: Mon, 8 Oct 2018 08:25:28 + > From: Ola Liljedahl > To: Jerin Jacob > CC: Jan Viktorin , "Gavin Hu (Arm Technology > China)" , "dev

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
Test OK > -Original Message- > From: Ola Liljedahl > Sent: Monday, October 8, 2018 6:26 PM > To: Jerin Jacob > Cc: dev@dpdk.org; Honnappa Nagarahalli > ; Ananyev, Konstantin > ; Gavin Hu (Arm Technology China) > ; Steve Capper ; n

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 12:00, "Jerin Jacob" wrote: -Original Message- > Date: Mon, 8 Oct 2018 09:22:05 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin"

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
Or maybe performance gets worse but not because of that one additional instruction/cycle in ring buffer enqueue and dequeue but because function or loop alignment changed for one or more functions. When the benchmarking noise (possibly several % due to changes in code alignment) is bigger than

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 08:06, "Jerin Jacob" wrote: -Original Message- > Date: Sun, 7 Oct 2018 20:44:54 + > From: Ola Liljedahl > To: Jerin Jacob > CC: "dev@dpdk.org" , Honnappa Nagarahalli > , "Ananyev, Konstantin"

Re: [dpdk-dev] [PATCH] eal/armv7: add support for rte pause

2018-10-08 Thread Ola Liljedahl
On 08/10/2018, 08:27, "Jerin Jacob" wrote: -Original Message- > Date: Sun, 7 Oct 2018 21:09:25 + > From: Ola Liljedahl > To: Jerin Jacob , Jan Viktorin > , "Gavin Hu (Arm Technology China)" > > CC:

Re: [dpdk-dev] [PATCH] eal/armv7: add support for rte pause

2018-10-07 Thread Ola Liljedahl
On 07/10/2018, 08:32, "Jerin Jacob" wrote: Add support for rte_pause() implementation for armv7. Signed-off-by: Jerin Jacob --- The reference implementation for Linux's cpu_relax() for armv7 is at https://elixir.bootlin.com/linux/latest/source/arch/arm/include/asm/proce

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-07 Thread Ola Liljedahl
On 07/10/2018, 06:03, "Jerin Jacob" wrote: In arm64 case, it will have ATOMIC_RELAXED followed by asm volatile ("":::"memory") of rte_pause(). I wouldn't have any issue if the generated code is the same or better than the existing case, but that is not the case, right? The existing case

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-07 Thread Ola Liljedahl
On 07/10/2018, 06:03, "Jerin Jacob" wrote: How about fixing rte_pause() then? Meaning issuing power-saving instructions on missing archs. rte_pause() implemented as NOP or YIELD on ARM will likely not save any power. You should use WFE for that. I use this portable pattern: //W
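
For context, a plain spin-wait with a per-architecture relax hint might look like the sketch below. It is not the WFE-based pattern referred to above, which is cut off in this listing; `demo_cpu_relax()` and `demo_wait_nonzero()` are hypothetical helpers, not DPDK's `rte_pause()`. As the message points out, a YIELD or NOP hint of this kind is unlikely to save power on Arm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper for illustration; not DPDK's rte_pause(). */
static inline void
demo_cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory");
#else
	/* Fallback: compiler barrier only. */
	__asm__ volatile("" ::: "memory");
#endif
}

/* Spin until *flag becomes non-zero, then return its value. */
static inline uint32_t
demo_wait_nonzero(const uint32_t *flag)
{
	uint32_t v;

	while ((v = __atomic_load_n(flag, __ATOMIC_ACQUIRE)) == 0)
		demo_cpu_relax();
	return v;
}
```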

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-06 Thread Ola Liljedahl
Some blog posts about undefined behaviour in C/C++: https://blog.regehr.org/archives/213 http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html -- Ola

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-06 Thread Ola Liljedahl
On 06/10/2018, 09:42, "Jerin Jacob" wrote: -Original Message- > Date: Fri, 5 Oct 2018 20:34:15 + > From: Ola Liljedahl > To: Honnappa Nagarahalli , Jerin Jacob > > CC: "Ananyev, Konstantin" , "Gavin Hu (Arm

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
ase register with no offset. So any offset has to be added > before > > the actual "atomic" instruction, LDR in this case. > > > > > > -- Ola > > > > > > On 05/10/2018, 19:07, "

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
ctual "atomic" instruction, LDR in this case. > > > -- Ola > > > On 05/10/2018, 19:07, "Jerin Jacob" > wrote: > > -Original Message- > > Date: Fri, 5 Oct 2018 15:11:44 + > >

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
> Date: Fri, 5 Oct 2018 15:11:44 + > From: Honnappa Nagarahalli > To: "Ananyev, Konstantin" , Ola Liljedahl > , "Gavin Hu (Arm Technology China)" > , Jerin Jacob > CC: "dev@dpdk.org" , Steve Capper , nd > , "

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
On 05/10/2018, 15:45, "Ananyev, Konstantin" wrote: We all know that 32-bit loads/stores on the CPUs we support are atomic. Well, not necessarily true for unaligned loads and stores. But the "problem" here is that we are not directly generating 32-bit load and store instructions (that would re

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
" https://preshing.com/20130618/atomic-vs-non-atomic-operations/ So if ht->tail is written using e.g. __atomic_store_n(&ht->tail, val, mo), we need to also read it using e.g. __atomic_load_n(). -- Ola On 05/10/2018, 13:15, "Ola Liljedahl" wrote: On 05/10/201
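
The rule stated above, that an object written with `__atomic_store_n()` should also be read with `__atomic_load_n()`, can be sketched for a ring tail as follows. The struct and function names are hypothetical and the memory orders are one reasonable choice, not necessarily those used in the patch under review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical head/tail pair; "tail" follows the message above. */
struct demo_headtail {
	uint32_t head;
	uint32_t tail;
};

static inline void
demo_update_tail(struct demo_headtail *ht, uint32_t old_val, uint32_t new_val)
{
	/* Wait for preceding operations to finish. The load is atomic
	 * because the same object is written with __atomic_store_n(). */
	while (__atomic_load_n(&ht->tail, __ATOMIC_RELAXED) != old_val)
		;

	/* Release: slot accesses are ordered before the tail update. */
	__atomic_store_n(&ht->tail, new_val, __ATOMIC_RELEASE);
}

static inline uint32_t
demo_read_tail(const struct demo_headtail *ht)
{
	/* Acquire pairs with the release store in demo_update_tail(). */
	return __atomic_load_n(&ht->tail, __ATOMIC_ACQUIRE);
}
```

Routing every access through accessors like these keeps a plain `ht->tail` read from slipping back in.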

Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load

2018-10-05 Thread Ola Liljedahl
c: dev@dpdk.org; Honnappa Nagarahalli ; Steve Capper ; Ola Liljedahl > ; nd ; sta...@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 1/3] ring: read tail using atomic load > > Hi Jerin, > > Thanks for your review, inline comments from our i

Re: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys

2018-10-03 Thread Ola Liljedahl
> >Cc: De Lara Guarch, Pablo ; dev@dpdk.org; Gavin Hu (Arm Technology China) >; Steve Capper ; Ola Liljedahl ; nd ; Gobriel, >Sameh >Subject: RE: [dpdk-dev] [PATCH 3/4] hash: fix rw concurrency while moving keys > >> >-Orig

Re: [dpdk-dev] [PATCH 2/4] hash: add memory ordering to avoid race conditions

2018-10-01 Thread Ola Liljedahl
On 28/09/2018, 02:43, "Wang, Yipeng1" wrote: Some general comments for the various __atomic_store/load added, 1. Although it passes the compiler check, I just want to confirm whether we should use GCC/clang builtins, or if there are higher-level APIs in DPDK to do atomic

Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization

2018-08-29 Thread Ola Liljedahl
" wrote: -Original Message- > Date: Wed, 29 Aug 2018 07:34:34 + > From: Ola Liljedahl > To: "Kokkilagadda, Kiran" , Honnappa > Nagarahalli , Gavin Hu , > Ferruh Yigit , "Jacob, Jerin" > > CC: "dev@dpdk.org"

Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization

2018-08-29 Thread Ola Liljedahl
rin" Cc: "dev@dpdk.org" , nd , Ola Liljedahl , Steve Capper Subject: Re: [dpdk-dev] [PATCH v2] kni: fix kni Rx fifo producer synchronization Agreed. Please go ahead and make the changes. You need to make the same change on the kernel side as well. And please use the C11 ring (see rte_rin