<snip>
>
> The initialization me->locked=1 in lock() must happen before
> next->locked=0 in unlock(), otherwise a thread may hang forever,
> waiting for me->locked to become 0. On weak memory systems (such as
> ARMv8), the current implementation allows me->locked=1 to be reordered
> with announcing the node (pred->next=me) and, consequently, to be
> reordered with next->locked=0 in unlock().
>
> This fix adds a release barrier to pred->next=me, forcing
> me->locked=1 to happen before this operation.
>
> Signed-off-by: Diogo Behrens <diogo.behr...@huawei.com>
The change looks fine to me. I have tested this on a few x86 and Arm machines.

Acked-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> ---
>  lib/librte_eal/include/generic/rte_mcslock.h | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_eal/include/generic/rte_mcslock.h b/lib/librte_eal/include/generic/rte_mcslock.h
> index 2bef28351..ce553f547 100644
> --- a/lib/librte_eal/include/generic/rte_mcslock.h
> +++ b/lib/librte_eal/include/generic/rte_mcslock.h
> @@ -68,7 +68,14 @@ rte_mcslock_lock(rte_mcslock_t **msl, rte_mcslock_t *me)
>  		 */
>  		return;
>  	}
> -	__atomic_store_n(&prev->next, me, __ATOMIC_RELAXED);
> +	/* The store to me->next above should also complete before the node is
> +	 * visible to the predecessor thread releasing the lock. Hence, the
> +	 * store to prev->next also requires release semantics. Note that, for
> +	 * example, on ARM, the release semantics in the exchange operation is
> +	 * not as strong as a release fence and is not sufficient to enforce
> +	 * the desired order here.
> +	 */
> +	__atomic_store_n(&prev->next, me, __ATOMIC_RELEASE);
>
>  	/* The while-load of me->locked should not move above the previous
>  	 * store to prev->next. Otherwise it will cause a deadlock. Need a
> --
> 2.28.0
>
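[Editor's note: for context, below is a minimal, self-contained sketch of the MCS
lock hand-off that the patch addresses, written with C11 <stdatomic.h> rather than
the __atomic builtins DPDK uses. The names mcs_node, mcs_lock() and mcs_unlock()
are illustrative, not DPDK's rte_mcslock API; the sketch only shows why the store
that publishes the node to the predecessor needs release ordering.]

/*
 * Minimal MCS lock sketch using C11 atomics (illustrative only; these
 * names are not DPDK's API). The release store that publishes the node
 * to the predecessor is the line the patch fixes.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct mcs_node {
	_Atomic(struct mcs_node *) next;
	atomic_bool locked;
};

typedef _Atomic(struct mcs_node *) mcs_lock_t;

static void
mcs_lock(mcs_lock_t *lock, struct mcs_node *me)
{
	struct mcs_node *prev;

	atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
	atomic_store_explicit(&me->locked, true, memory_order_relaxed);

	/* Swap ourselves in as the tail of the waiter queue. */
	prev = atomic_exchange_explicit(lock, me, memory_order_acq_rel);
	if (prev == NULL)
		return;	/* Lock was free; we own it. */

	/*
	 * Publish our node to the predecessor. Release ordering ensures the
	 * stores to me->locked and me->next above are visible before the
	 * unlocker can read prev->next and clear me->locked. With a relaxed
	 * store here, me->locked=1 could become visible after prev->next=me
	 * and thus after the unlocker's me->locked=0, hanging this thread.
	 */
	atomic_store_explicit(&prev->next, me, memory_order_release);

	/* Spin until the predecessor hands the lock over. */
	while (atomic_load_explicit(&me->locked, memory_order_acquire))
		;
}

static void
mcs_unlock(mcs_lock_t *lock, struct mcs_node *me)
{
	struct mcs_node *next =
	    atomic_load_explicit(&me->next, memory_order_acquire);

	if (next == NULL) {
		/* No known successor: try to swing the tail back to empty. */
		struct mcs_node *expected = me;
		if (atomic_compare_exchange_strong_explicit(lock, &expected,
		    NULL, memory_order_release, memory_order_relaxed))
			return;
		/* A successor is mid-enqueue; wait for it to appear. */
		while ((next = atomic_load_explicit(&me->next,
		    memory_order_acquire)) == NULL)
			;
	}
	/* Hand the lock to the successor. */
	atomic_store_explicit(&next->locked, false, memory_order_release);
}

[With the release store in place, an unlocker that observes prev->next == me is
also guaranteed to observe me->locked == true, so its hand-off store
next->locked = false cannot be overtaken by a late me->locked = true, and the
spinning thread always wakes up.]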