On 3/7/23 16:54, Richard Henderson wrote:
Just to be safe, I tried modeling this with cppmem
(http://svr-pes20-cppmem.cl.cam.ac.uk/cppmem/); support for
compare-and-swap is very limited, therefore the test looks nothing
like the C code(*), but it should be ok:
You do realize that QSLIST_REMOVE_HEAD is not a compare-and-swap, right?
#define QSLIST_REMOVE_HEAD(head, field) do { \
    typeof((head)->slh_first) elm = (head)->slh_first; \
    (head)->slh_first = elm->field.sle_next; \
    elm->field.sle_next = NULL; \
} while (/*CONSTCOND*/0)
Yes, the compare-and-swap is just how I modeled the enqueuing thread's
fetch_or:
    cas_strong_explicit(&x, 0, 1, mo_acquire, mo_acquire);
    x.load(mo_relaxed).readsvalue(1); // fetch_or returned 0
    y.store(1, mo_release); // bh inserted
while QSLIST_REMOVE_HEAD in the dequeuing thread is not ordered at all:
    y.store(0, mo_relaxed); // QSLIST_REMOVE_HEAD
    x.store(0, mo_release); // fetch_and
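For reference, the same two sides can be written as a minimal C11 sketch. This is only an illustration of the model above, not the QEMU code: x stands in for the bh flags and y for the list state, and the function names enqueue() and dequeue() are hypothetical.

```c
#include <stdatomic.h>

static atomic_int x;    /* stands in for the bh flags */
static atomic_int y;    /* stands in for the list: 1 = bh inserted */

/* Enqueuing side: returns 1 if this call inserted the bh. */
static int enqueue(void)
{
    /* fetch_or is a seq_cst read-modify-write */
    int old = atomic_fetch_or(&x, 1);

    if (old != 0) {
        return 0;       /* flag was already set, nothing to insert */
    }
    atomic_store_explicit(&y, 1, memory_order_release);   /* bh inserted */
    return 1;
}

/* Dequeuing side: plain removal first, then clear the flag. */
static void dequeue(void)
{
    /* QSLIST_REMOVE_HEAD: no ordering of its own */
    atomic_store_explicit(&y, 0, memory_order_relaxed);

    /* fetch_and: seq_cst, so it also acts as a release for the store above */
    atomic_fetch_and(&x, ~1);
}
```

If enqueue() sees the flag already cleared by dequeue(), the store of 0 to y is ordered before the store of 1, so the insertion is not lost.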
As I read aio_bh_queue, this is exactly the situation you describe in
patch 1 justifying the introduction of the new barriers.
Only store-store reordering needs to be prevented between
QSLIST_REMOVE_HEAD and atomic_fetch_and(); and that one *is* blocked by
atomic_fetch_and(), since mo_seq_cst is a superset of mo_release. The new
barriers are needed to block store-load reordering.
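To make the distinction concrete, here is a small C11 sketch of the two cases. The variables a and b and both function names are hypothetical, and the explicit seq_cst fence is only a stand-in for a full barrier, not the actual QEMU primitive.

```c
#include <stdatomic.h>

static atomic_int a, b;     /* hypothetical shared variables */

/* Store-store: release semantics on the second store are enough to
 * keep the first store from being reordered after it. */
static void store_then_store(void)
{
    atomic_store_explicit(&a, 1, memory_order_relaxed);
    atomic_store_explicit(&b, 1, memory_order_release);
}

/* Store-load: release/acquire does not help here; a full fence between
 * the store and the load is needed to keep them in order. */
static int store_then_load(void)
{
    atomic_store_explicit(&a, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&b, memory_order_relaxed);
}
```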
Paolo