On 2014-02-15 16:18:00 +0100, Andres Freund wrote:
> On 2014-02-15 10:06:41 -0500, Tom Lane wrote:
> > Andres Freund <and...@2ndquadrant.com> writes:
> > > My current conclusion is that backporting barriers.h is by far the most
> > > reasonable way to go. The compiler problems have been ironed out by
> > > now...
> > 
> > -1.  IMO that code is still quite unproven, and what's more, the
> > problem we're discussing here is completely hypothetical.  If it
> > were real, we'd have field evidence of it.  We've not had that
> > much trouble seeing instances of even very narrow race-condition
> > windows in the past.
> 
> Well, the problem is that few of us have access to interesting !x86
> machines to run tests, and that's where we'd see problems (since x86
> gives enough guarantees to avoid this unless the compiler reorders
> stuff). I am personally fine with just using volatiles to avoid
> reordering in the older branches, but Florian argued against it.

Here are patches doing that. The 9.3 version also applies to 9.2; the 9.1
version applies back to 8.4.
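
For illustration, a minimal sketch of the distinction (invented names, not
PostgreSQL code, only mimicking the lwWaitLink/lwWaiting pattern): the
compiler may emit two plain stores in either order, whereas accesses through
a volatile-qualified pointer keep their program order. Only the compiler is
constrained, though, not an out-of-order CPU, which is why x86's store
ordering matters for this approach.

#include <stddef.h>

typedef struct WaiterSketch
{
	struct WaiterSketch *next;	/* stands in for lwWaitLink */
	int			waiting;		/* stands in for lwWaiting */
} WaiterSketch;

static void
wake_plain(WaiterSketch *w)
{
	/* The compiler is free to emit these two stores in either order. */
	w->next = NULL;
	w->waiting = 0;
}

static void
wake_volatile(volatile WaiterSketch *w)
{
	/*
	 * Volatile accesses may not be reordered against each other by the
	 * compiler, so the store to next is emitted before the store to
	 * waiting.  A weakly ordered CPU can still swap them; x86's store
	 * ordering makes that a non-issue there.
	 */
	w->next = NULL;
	w->waiting = 0;
}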

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 0fe7ce4..a8d5b7f 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -647,12 +647,27 @@ LWLockRelease(LWLockId lockid)
 	 */
 	while (head != NULL)
 	{
+		/*
+		 * Use volatile to prevent the compiler from reordering the store to
+		 * lwWaitLink with the store to lwWaiting, which could cause problems
+		 * when the to-be-woken-up backend wakes up spuriously and writes to
+		 * lwWaitLink while acquiring a new lock. That could corrupt the list
+		 * this backend is traversing, leading to backends stuck waiting for
+		 * a lock.
+		 *
+		 * That's not necessarily sufficient for out-of-order architectures,
+		 * but there have been no field reports of problems. The proper
+		 * solution would be to use a write barrier, but those are not
+		 * available in the back branches.
+		 */
+		volatile PGPROC *vp = proc;
+
 		LOG_LWDEBUG("LWLockRelease", lockid, "release waiter");
-		proc = head;
-		head = proc->lwWaitLink;
-		proc->lwWaitLink = NULL;
-		proc->lwWaiting = false;
-		PGSemaphoreUnlock(&proc->sem);
+		vp = head;
+		head = vp->lwWaitLink;
+		vp->lwWaitLink = NULL;
+		vp->lwWaiting = false;
+		PGSemaphoreUnlock(&vp->sem);
 	}
 
 	/*
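
To make the hazard described in the comment above concrete, here is a hedged
sketch of the interleaving the patch defends against (invented names, not
PostgreSQL code; the semaphore wait is reduced to a spin for brevity):

#include <stddef.h>

typedef struct ProcSketch
{
	struct ProcSketch *next;	/* stands in for lwWaitLink */
	int			waiting;		/* stands in for lwWaiting */
} ProcSketch;

/* Models the wakeup loop in LWLockRelease(). */
static void
releaser(ProcSketch *head)
{
	while (head != NULL)
	{
		ProcSketch *cur = head;

		head = cur->next;
		cur->next = NULL;		/* (A) unlink from this lock's wait list */

		/*
		 * If (B) becomes visible before (A), a backend whose semaphore was
		 * already signalled can observe waiting == 0, link itself onto
		 * another lock's wait list, and then have that link wiped out when
		 * (A) finally lands, corrupting the other list.
		 */
		cur->waiting = 0;		/* (B) release the waiter */
		/* PGSemaphoreUnlock(&cur->sem) would go here */
	}
}

/* Models a backend woken spuriously by a leftover semaphore count. */
static void
woken_backend(ProcSketch *self, ProcSketch **other_queue)
{
	/* Stands in for PGSemaphoreLock() plus the lwWaiting recheck. */
	while (self->waiting)
		;

	/* Queue on another lock, reusing the next field. */
	self->waiting = 1;
	self->next = *other_queue;
	*other_queue = self;
}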
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 4f88d3f..cff631d 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -27,6 +27,7 @@
 #include "commands/async.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
+#include "storage/barrier.h"
 #include "storage/ipc.h"
 #include "storage/predicate.h"
 #include "storage/proc.h"
@@ -831,10 +832,21 @@ LWLockRelease(LWLockId lockid)
 	 */
 	while (head != NULL)
 	{
+		/*
+		 * Use a write barrier to prevent the compiler or CPU from reordering
+		 * the store to lwWaitLink with the store to lwWaiting, which could
+		 * cause problems when the to-be-woken-up backend wakes up spuriously
+		 * and writes to lwWaitLink while acquiring a new lock. That could
+		 * corrupt the list this backend is traversing, leading to backends
+		 * stuck waiting for a lock. A write barrier is sufficient, as the
+		 * read side only accesses the data while holding a spinlock, which
+		 * acts as a full barrier.
+		 */
 		LOG_LWDEBUG("LWLockRelease", lockid, "release waiter");
 		proc = head;
 		head = proc->lwWaitLink;
 		proc->lwWaitLink = NULL;
+		pg_write_barrier();
 		proc->lwWaiting = false;
 		PGSemaphoreUnlock(&proc->sem);
 	}
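
For reference, pg_write_barrier() only has to order stores against later
stores. The real definitions live in src/include/storage/barrier.h and vary
by platform and compiler; as a rough, GCC-only sketch of what such a macro
can amount to (the name is made up to avoid clashing with the real one):

#if defined(__x86_64__) || defined(__i386__)
/*
 * x86 does not reorder stores with other stores, so it is enough to keep
 * the compiler from doing so.
 */
#define sketch_write_barrier()	__asm__ __volatile__("" : : : "memory")
#else
/*
 * Weakly ordered hardware needs a real fence; GCC's full-barrier builtin
 * is a stronger-than-necessary stand-in.
 */
#define sketch_write_barrier()	__sync_synchronize()
#endif

The patched loops use it exactly that way: the lwWaitLink store, then the
barrier, then the lwWaiting store.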
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 85a0ce9..22f8540 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1872,9 +1872,11 @@ WakeupWaiters(XLogRecPtr EndPos)
 	 */
 	while (head != NULL)
 	{
+		/* check comment in LWLockRelease() about barrier usage */
 		proc = head;
 		head = proc->lwWaitLink;
 		proc->lwWaitLink = NULL;
+		pg_write_barrier();
 		proc->lwWaiting = false;
 		PGSemaphoreUnlock(&proc->sem);
 	}
@@ -1966,9 +1968,11 @@ WALInsertSlotReleaseOne(int slotno)
 	 */
 	while (head != NULL)
 	{
+		/* check comment in LWLockRelease() about barrier usage */
 		proc = head;
 		head = proc->lwWaitLink;
 		proc->lwWaitLink = NULL;
+		pg_write_barrier();
 		proc->lwWaiting = false;
 		PGSemaphoreUnlock(&proc->sem);
 	}
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 82ef440..98c4845 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -28,6 +28,7 @@
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "replication/slot.h"
+#include "storage/barrier.h"
 #include "storage/ipc.h"
 #include "storage/predicate.h"
 #include "storage/proc.h"
@@ -944,10 +945,21 @@ LWLockRelease(LWLock *l)
 	 */
 	while (head != NULL)
 	{
+		/*
+		 * Use a write barrier to prevent the compiler or CPU from reordering
+		 * the store to lwWaitLink with the store to lwWaiting, which could
+		 * cause problems when the to-be-woken-up backend wakes up spuriously
+		 * and writes to lwWaitLink while acquiring a new lock. That could
+		 * corrupt the list this backend is traversing, leading to backends
+		 * stuck waiting for a lock. A write barrier is sufficient, as the
+		 * read side only accesses the data while holding a spinlock, which
+		 * acts as a full barrier.
+		 */
 		LOG_LWDEBUG("LWLockRelease", T_NAME(l), T_ID(l), "release waiter");
 		proc = head;
 		head = proc->lwWaitLink;
 		proc->lwWaitLink = NULL;
+		pg_write_barrier();
 		proc->lwWaiting = false;
 		PGSemaphoreUnlock(&proc->sem);
 	}