Module Name:    src
Committed By:   martin
Date:           Sun Jul 30 12:13:02 UTC 2023

Modified Files:
        src/share/man/man9 [netbsd-10]: pcq.9
        src/sys/kern [netbsd-10]: subr_pcq.c

Log Message:
Pull up following revision(s) (requested by riastradh in ticket #263):

        share/man/man9/pcq.9: revision 1.9
        sys/kern/subr_pcq.c: revision 1.14
        sys/kern/subr_pcq.c: revision 1.15
        sys/kern/subr_pcq.c: revision 1.16
        sys/kern/subr_pcq.c: revision 1.17
        sys/kern/subr_pcq.c: revision 1.18
        sys/kern/subr_pcq.c: revision 1.19

pcq(9): Make pcq_put a release operation, in memory ordering.

Otherwise, for example, the following assertion could fail:
        /* publisher */
        nusers = foo->nusers;
        pcq_put(pcq, foo);
        KASSERT(nusers == 0);

        /* user */
        foo = pcq_get(pcq);
        if (foo != NULL)
                atomic_inc_uint(&foo->nusers);
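
With pcq_put made a release operation, the publication store inside
pcq_put is the following (taken from the subr_pcq.c diff below), so the
publisher's load of foo->nusers can no longer be reordered past the
point where the item becomes visible to the consumer:

        /* pcq_put, after reserving slot op for this producer */
        atomic_store_release(&pcq->pcq_items[op], item);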

pcq(9): Fix consume operation in pcq_peek/get.

These use atomic_load_consume to match the atomic_store_release in
pcq_put for pcq->pcq_items[c].

Reading the snapshot of pcq->pcq_pc need not be ordered:
- The consumer side (pcq_peek/get) is serialized by the API contract
  (single-consumer, multi-producer), so no ordering is necessary.
- The producer side updates pcq->pcq_pc first; if the consumer side
  sees that before the producer side has stored pcq->pcq_items[c],
  there's no problem -- it's as if the consumer had happened just a
  moment earlier and the producer hadn't entered pcq_put yet.

However, it should be an atomic load, not a plain load.  So use
atomic_load_relaxed, if for no other reason than to pacify thread
sanitizers.
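
Concretely, the relaxed snapshot and the consume load in pcq_peek now
read (abridged from the diff below):

        const uint32_t v = atomic_load_relaxed(&pcq->pcq_pc);
        u_int p, c;

        pcq_split(v, &p, &c);
        return (p == c) ? NULL : atomic_load_consume(&pcq->pcq_items[c]);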

pcq(9): Explain why store need not be atomic in pcq_get.
No functional change intended.

pcq(9): Explain why membar_release isn't needed in pcq_get.
No functional change intended.

pcq(9): Document memory ordering guarantees.

pcq(9): Sketch correctness proof for some critical properties.
No functional change intended.

pcq(9): KASSERT(A && B) -> KASSERT(A); KASSERT(B)
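
For example, the length check in pcq_create is split so that a failing
assertion identifies which of the two conditions was violated:

        KASSERT(nitems > 0);
        KASSERT(nitems <= PCQ_MAXLEN);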


To generate a diff of this commit:
cvs rdiff -u -r1.8 -r1.8.14.1 src/share/man/man9/pcq.9
cvs rdiff -u -r1.13 -r1.13.18.1 src/sys/kern/subr_pcq.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/share/man/man9/pcq.9
diff -u src/share/man/man9/pcq.9:1.8 src/share/man/man9/pcq.9:1.8.14.1
--- src/share/man/man9/pcq.9:1.8	Thu Feb  8 09:03:23 2018
+++ src/share/man/man9/pcq.9	Sun Jul 30 12:13:02 2023
@@ -1,4 +1,4 @@
-.\"     $NetBSD: pcq.9,v 1.8 2018/02/08 09:03:23 dholland Exp $
+.\"     $NetBSD: pcq.9,v 1.8.14.1 2023/07/30 12:13:02 martin Exp $
 .\"
 .\" Copyright (c) 2010 The NetBSD Foundation, Inc.
 .\" All rights reserved.
@@ -118,6 +118,63 @@ otherwise, return
 The item must not have the value of
 .Dv NULL .
 .El
+.Ss Memory ordering
+Any memory operations sequenced before
+.Fn pcq_put
+of an item in one thread happen before all memory operations with data
+dependencies on the item returned by
+.Fn pcq_get
+or
+.Fn pcq_peek
+in another thread.
+For example:
+.Bd -literal -offset indent
+int mumble;
+
+/* producer */
+mumble = 42;			// A
+foo->x = 123;			// B
+refcnt = foo->refcnt;		// C
+pcq_put(pcq, foo);
+KASSERT(refcnt == 0);
+
+/* consumer */
+foo = pcq_get(pcq);
+if (foo == NULL)
+	return;
+atomic_inc_uint(&foo->refcnt);	// D
+x = foo->x;			// E
+if (x == 123)
+	KASSERT(mumble == 42);	// F
+.Ed
+.Pp
+In this example, memory operations B and C happen-before D and E.
+However, no ordering is guaranteed for A or F relative to any other
+memory operations, because the memory location of
+.Fa mumble
+is independent of the pointer
+.Fa foo
+returned by
+.Fn pcq_get .
+.Pp
+If you must guarantee A happens before F, then on the consumer side,
+after
+.Fn pcq_get
+or
+.Fn pcq_peek ,
+you can call
+.Fn membar_acquire
+to turn it into an acquire operation instead of a consume operation;
+.Fn pcq_put
+serves as the matching release operation.
+.Po
+This is a little dicey.
+Perhaps there should be separate
+.Fn pcq_peek_acq
+and
+.Fn pcq_get_acq
+operations if this semantics is necessary.
+.Pc
 .Sh CODE REFERENCES
 The
 .Nm
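
The acquire upgrade described in the new "Memory ordering" subsection
would look roughly like this on the consumer side (a sketch reusing the
names from the example above, not text from the commit):

	/* consumer, when A must also happen before F */
	foo = pcq_get(pcq);
	if (foo == NULL)
		return;
	membar_acquire();		/* pairs with the release in pcq_put */
	x = foo->x;			// E
	KASSERT(mumble == 42);		// F, now guaranteed to observe A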

Index: src/sys/kern/subr_pcq.c
diff -u src/sys/kern/subr_pcq.c:1.13 src/sys/kern/subr_pcq.c:1.13.18.1
--- src/sys/kern/subr_pcq.c:1.13	Mon Feb  8 09:31:05 2021
+++ src/sys/kern/subr_pcq.c	Sun Jul 30 12:13:02 2023
@@ -1,4 +1,4 @@
-/*	$NetBSD: subr_pcq.c,v 1.13 2021/02/08 09:31:05 wiz Exp $	*/
+/*	$NetBSD: subr_pcq.c,v 1.13.18.1 2023/07/30 12:13:02 martin Exp $	*/
 
 /*-
  * Copyright (c) 2009, 2019 The NetBSD Foundation, Inc.
@@ -31,10 +31,94 @@
 
 /*
  * Lockless producer/consumer queue.
+ *
+ * Summary of the producer algorithm in pcq_put (may run many in
+ * parallel with each other and with a consumer):
+ *
+ *	P1. initialize an item
+ *
+ *	P2. atomic_cas(&pcq->pcq_pc) loop to advance the producer
+ *	    pointer, reserving a space at c (fails if not enough space)
+ *
+ *	P3. atomic_store_release(&pcq->pcq_items[c], item) to publish
+ *	    the item in the space it reserved
+ *
+ * Summary of the consumer algorithm in pcq_get (must be serialized by
+ * caller with other consumers, may run in parallel with any number of
+ * producers):
+ *
+ *	C1. atomic_load_relaxed(&pcq->pcq_pc) to get the consumer
+ *	    pointer and a snapshot of the producer pointer, which may
+ *	    point to null items or point to initialized items (fails if
+ *	    no space reserved for published items yet)
+ *
+ *	C2. atomic_load_consume(&pcq->pcq_items[c]) to get the next
+ *	    unconsumed but potentially published item (fails if item
+ *	    not published yet)
+ *
+ *	C3. pcq->pcq_items[c] = NULL to consume the next unconsumed but
+ *	    published item
+ *
+ *	C4. membar_producer
+ *
+ *	C5. atomic_cas(&pcq->pcq_pc) loop to advance the consumer
+ *	    pointer
+ *
+ *	C6. use the item
+ *
+ * Note that there is a weird bare membar_producer which is not matched
+ * by membar_consumer.  This is one of the rare cases of a memory
+ * barrier on one side that is not matched by a memory barrier on
+ * another side, but the ordering works out, with a somewhat more
+ * involved proof.
+ *
+ * Some properties that need to be proved:
+ *
+ *	Theorem 1.  For pcq_put call that leads into pcq_get:
+ *	Initializing item at P1 is dependency-ordered before usage of
+ *	item at C6, so items placed by pcq_put can be safely used by
+ *	the caller of pcq_get.
+ *
+ *	Proof sketch.
+ *
+ *		Assume load/store P2 synchronizes with load/store C1
+ *		(if not, pcq_get fails in `if (p == c) return NULL').
+ *
+ *		Assume store-release P3 synchronizes with load-consume
+ *		C2 (if not, pcq_get fails in `if (item == NULL) return
+ *		NULL').
+ *
+ *		Then:
+ *
+ *		- P1 is sequenced before store-release P3
+ *		- store-release P3 synchronizes with load-consume C2
+ *		- load-consume C2 is dependency-ordered before C6
+ *
+ *		Hence transitively, P1 is dependency-ordered before C6,
+ *		QED.
+ *
+ *	Theorem 2.  For pcq_get call followed by pcq_put: Nulling out
+ *	location at store C3 happens before placing a new item in the
+ *	same location at store P3, so items are not lost.
+ *
+ *	Proof sketch.
+ *
+ *		Assume load/store C5 synchronizes with load/store P2
+ *		(otherwise pcq_put starts over the CAS loop or fails).
+ *
+ *		Then:
+ *
+ *		- store C3 is sequenced before membar_producer C4
+ *		- membar_producer C4 is sequenced before load/store C5
+ *		- load/store C5 synchronizes with load/store P2 at &pcq->pcq_pc
+ *		- P2 is sequenced before store-release P3
+ *
+ *		Hence transitively, store C3 happens before
+ *		store-release P3, QED.
  */
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: subr_pcq.c,v 1.13 2021/02/08 09:31:05 wiz Exp $");
+__KERNEL_RCSID(0, "$NetBSD: subr_pcq.c,v 1.13.18.1 2023/07/30 12:13:02 martin Exp $");
 
 #include <sys/param.h>
 #include <sys/types.h>
@@ -100,7 +184,7 @@ pcq_put(pcq_t *pcq, void *item)
 	KASSERT(item != NULL);
 
 	do {
-		v = pcq->pcq_pc;
+		v = atomic_load_relaxed(&pcq->pcq_pc);
 		pcq_split(v, &op, &c);
 		p = pcq_advance(pcq, op);
 		if (p == c) {
@@ -116,10 +200,7 @@ pcq_put(pcq_t *pcq, void *item)
 	 * that the caller made to the data item are globally visible
 	 * before we put it onto the list.
 	 */
-#ifndef __HAVE_ATOMIC_AS_MEMBAR
-	membar_producer();
-#endif
-	pcq->pcq_items[op] = item;
+	atomic_store_release(&pcq->pcq_items[op], item);
 
 	/*
 	 * Synchronization activity to wake up the consumer will ensure
@@ -135,14 +216,13 @@ pcq_put(pcq_t *pcq, void *item)
 void *
 pcq_peek(pcq_t *pcq)
 {
-	const uint32_t v = pcq->pcq_pc;
+	const uint32_t v = atomic_load_relaxed(&pcq->pcq_pc);
 	u_int p, c;
 
 	pcq_split(v, &p, &c);
 
 	/* See comment on race below in pcq_get(). */
-	return (p == c) ? NULL :
-	    (membar_datadep_consumer(), pcq->pcq_items[c]);
+	return (p == c) ? NULL : atomic_load_consume(&pcq->pcq_items[c]);
 }
 
 /*
@@ -157,15 +237,13 @@ pcq_get(pcq_t *pcq)
 	u_int p, c;
 	void *item;
 
-	v = pcq->pcq_pc;
+	v = atomic_load_relaxed(&pcq->pcq_pc);
 	pcq_split(v, &p, &c);
 	if (p == c) {
 		/* Queue is empty: nothing to return. */
 		return NULL;
 	}
-	/* Make sure we read pcq->pcq_pc before pcq->pcq_items[c].  */
-	membar_datadep_consumer();
-	item = pcq->pcq_items[c];
+	item = atomic_load_consume(&pcq->pcq_items[c]);
 	if (item == NULL) {
 		/*
 		 * Raced with sender: we rely on a notification (e.g. softint
@@ -174,21 +252,32 @@ pcq_get(pcq_t *pcq)
 		 */
 		return NULL;
 	}
+	/*
+	 * We have exclusive access to this slot, so no need for
+	 * atomic_store_*.
+	 */
 	pcq->pcq_items[c] = NULL;
 	c = pcq_advance(pcq, c);
 	nv = pcq_combine(p, c);
 
 	/*
-	 * Ensure that update to pcq_items[] becomes globally visible
+	 * Ensure that update to pcq_items[c] becomes globally visible
 	 * before the update to pcq_pc.  If it were reordered to occur
 	 * after it, we could in theory wipe out a modification made
-	 * to pcq_items[] by pcq_put().
+	 * to pcq_items[c] by pcq_put().
+	 *
+	 * No need for load-before-store ordering of membar_release
+	 * because the only load we need to ensure happens first is the
+	 * load of pcq->pcq_items[c], but that necessarily happens
+	 * before the store to pcq->pcq_items[c] to null it out because
+	 * it is at the same memory location.  Yes, this is a bare
+	 * membar_producer with no matching membar_consumer.
 	 */
 #ifndef __HAVE_ATOMIC_AS_MEMBAR
 	membar_producer();
 #endif
 	while (__predict_false(atomic_cas_32(&pcq->pcq_pc, v, nv) != v)) {
-		v = pcq->pcq_pc;
+		v = atomic_load_relaxed(&pcq->pcq_pc);
 		pcq_split(v, &p, &c);
 		c = pcq_advance(pcq, c);
 		nv = pcq_combine(p, c);
@@ -201,7 +290,8 @@ pcq_create(size_t nitems, km_flag_t kmfl
 {
 	pcq_t *pcq;
 
-	KASSERT(nitems > 0 && nitems <= PCQ_MAXLEN);
+	KASSERT(nitems > 0);
+	KASSERT(nitems <= PCQ_MAXLEN);
 
 	pcq = kmem_zalloc(offsetof(pcq_t, pcq_items[nitems]), kmflags);
 	if (pcq != NULL) {
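
As a rough, free-standing illustration of the algorithm summarized in
the new comment above (steps P1-P3 and C1-C6), here is a user-space
sketch using C11 <stdatomic.h> in place of the kernel atomic_*() and
membar_*() primitives.  The sketch_* names, the 16/16-bit split of the
combined index word, and the use of a release CAS where the kernel uses
a bare membar_producer() are simplifications for illustration, not the
kernel implementation:

	#include <stdatomic.h>
	#include <stdbool.h>
	#include <stddef.h>
	#include <stdint.h>

	struct sketch_pcq {
		_Atomic uint32_t pc;		/* producer index | consumer index */
		uint32_t	 nitems;
		void *_Atomic	 items[];	/* nitems slots, NULL = empty */
	};

	static void
	sketch_split(uint32_t v, uint32_t *p, uint32_t *c)
	{
		*p = v >> 16;
		*c = v & 0xffff;
	}

	static uint32_t
	sketch_combine(uint32_t p, uint32_t c)
	{
		return (p << 16) | c;
	}

	static uint32_t
	sketch_advance(const struct sketch_pcq *q, uint32_t i)
	{
		return (i + 1 == q->nitems) ? 0 : i + 1;
	}

	/* Producer (P1 is the caller initializing *item before this call). */
	bool
	sketch_put(struct sketch_pcq *q, void *item)
	{
		uint32_t v, nv, p, c;

		/* P2: reserve a slot by advancing the producer index. */
		do {
			v = atomic_load_explicit(&q->pc, memory_order_relaxed);
			sketch_split(v, &p, &c);
			if (sketch_advance(q, p) == c)
				return false;	/* no space */
			nv = sketch_combine(sketch_advance(q, p), c);
		} while (!atomic_compare_exchange_weak_explicit(&q->pc, &v, nv,
		    memory_order_relaxed, memory_order_relaxed));

		/* P3: publish; release orders the caller's initialization. */
		atomic_store_explicit(&q->items[p], item, memory_order_release);
		return true;
	}

	/* Consumer (caller must serialize consumers, per the API contract). */
	void *
	sketch_get(struct sketch_pcq *q)
	{
		uint32_t v, nv, p, c;
		void *item;

		/* C1: snapshot of the combined index word, unordered. */
		v = atomic_load_explicit(&q->pc, memory_order_relaxed);
		sketch_split(v, &p, &c);
		if (p == c)
			return NULL;		/* empty */

		/* C2: consume load pairs with the release store at P3. */
		item = atomic_load_explicit(&q->items[c], memory_order_consume);
		if (item == NULL)
			return NULL;		/* slot reserved, not yet published */

		/*
		 * C3: consume the slot; a relaxed store is enough, since
		 * no producer reuses this slot until we advance below.
		 */
		atomic_store_explicit(&q->items[c], NULL, memory_order_relaxed);

		/*
		 * C4+C5: advance the consumer index; the release CAS
		 * orders C3 before the index update (the kernel uses
		 * membar_producer plus a plain CAS instead).
		 */
		do {
			sketch_split(v, &p, &c);
			nv = sketch_combine(p, sketch_advance(q, c));
		} while (!atomic_compare_exchange_weak_explicit(&q->pc, &v, nv,
		    memory_order_release, memory_order_relaxed));

		/* C6: the caller may now use the item. */
		return item;
	}

Allocation, the pcq_peek equivalent, and error handling are omitted;
the point is only to show where the release store, the consume load,
and the two CAS loops sit relative to each other.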
