On 21.11.2010 15:18, Robert Haas wrote:
> On Sat, Nov 20, 2010 at 4:07 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmh...@gmail.com> writes:
>>> So what DO we need to guard against here?
>>
>> I think the general problem can be stated as "process A changes two or
>> more values in shared memory in a fairly short span of time, and process
>> B, which is concurrently examining the same variables, sees those
>> changes occur in a different order than A thought it made them in".
>>
>> In practice we do not need to worry about changes made with a kernel
>> call in between, as any sort of context swap will cause the kernel to
>> force cache synchronization.
>>
>> Also, the intention is that the locking primitives will take care of
>> this for any shared structures that are protected by a lock. (There
>> were some comments upthread suggesting maybe our lock code is not
>> bulletproof; but if so that's something to fix in the lock code, not
>> a logic error in code using the locks.)
>>
>> So what this boils down to is being an issue for shared data structures
>> that we access without using locks. As, for example, the latch
>> structures.
>
> So is the problem case a race involving owning/disowning a latch vs.
> setting that same latch?

No (or maybe that as well, but that's not what we've been concerned
about here). As far as I understand it, the problem is that process A
does something like this:
/* set a shared variable */
((volatile bool *) shmem)->variable = true;
/* Wake up process B to notice that we changed the variable */
SetLatch();
And process B does this:
for (;;)
{
    ResetLatch();
    if (((volatile bool *) shmem)->variable)
        DoStuff();
    WaitLatch();
}
This is the documented usage pattern of latches. The problem arises if
process A runs just before ResetLatch, but the effect of setting the
shared variable doesn't become visible until after the if-test in
process B. Process B will clear the is_set flag in ResetLatch(), but it
will not DoStuff(), so it in effect misses the wakeup from process A and
goes back to sleep even though it would have work to do.
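
To make the interleaving concrete, here is a small single-threaded toy model of it. It is only a sketch: ToyShmem, drain_store() and lost_wakeup() are made-up names for illustration, not anything in the tree, and the "store_pending" flag is a crude stand-in for a store by process A that has been issued but is not yet visible to process B. It models only the writer side; it is not a proposal for how SetLatch() should actually be implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state; not the real latch or shared-memory layout. */
typedef struct
{
    bool variable;      /* shmem->variable as currently seen by process B */
    bool is_set;        /* latch flag as currently seen by process B */
    bool store_pending; /* A's store to variable is still in flight */
} ToyShmem;

/* Make A's in-flight store visible to B (what a write barrier would force). */
static void
drain_store(ToyShmem *s)
{
    assert(s != NULL);
    if (s->store_pending)
    {
        s->variable = true;
        s->store_pending = false;
    }
}

/*
 * Replay the problem interleaving once.  Returns true if B is about to
 * sleep in WaitLatch() although there is work to do and it never ran
 * DoStuff(), i.e. the wakeup was lost.
 */
bool
lost_wakeup(bool barrier_in_setlatch)
{
    ToyShmem s = {false, false, false};
    bool     did_stuff = false;

    /* Process A: set the shared variable, then SetLatch(). */
    s.store_pending = true;     /* store issued, not yet visible to B */
    if (barrier_in_setlatch)
        drain_store(&s);        /* store becomes visible before the flag */
    s.is_set = true;            /* SetLatch() */

    /* Process B: one iteration of the loop. */
    s.is_set = false;           /* ResetLatch() */
    if (s.variable)             /* the if-test */
        did_stuff = true;       /* DoStuff() */
    drain_store(&s);            /* without a barrier: visible only now */

    /* WaitLatch() would block here whenever is_set is false. */
    return !s.is_set && s.variable && !did_stuff;
}
```

In this model lost_wakeup(false) reports a lost wakeup, while lost_wakeup(true) does not, because the store is forced to be visible before the latch flag is set.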
This situation doesn't arise in the current use of latches, because the
shared state comparable to shmem->variable in the above example is
protected by a spinlock. But it might become an issue in some future use
case.
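
For comparison, a rough sketch of why a lock-protected variable is safe, written with C11 atomics rather than our own spinlock code. ToyShared and the toy_* functions are hypothetical names used only here. The point is just that acquiring the lock has acquire semantics and releasing it has release semantics, so a reader that takes the lock after the writer released it sees the writer's store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state guarded by a toy spinlock. */
typedef struct
{
    atomic_flag mutex;      /* stand-in for a real spinlock */
    bool        variable;   /* plays the role of shmem->variable */
} ToyShared;

static void
toy_lock(ToyShared *s)
{
    /* Acquire: later reads and writes cannot move above this point. */
    while (atomic_flag_test_and_set_explicit(&s->mutex, memory_order_acquire))
        ;                   /* spin until the flag was previously clear */
}

static void
toy_unlock(ToyShared *s)
{
    /* Release: earlier reads and writes cannot move below this point. */
    atomic_flag_clear_explicit(&s->mutex, memory_order_release);
}

/* Writer side: what process A would do before calling SetLatch(). */
void
toy_set_variable(ToyShared *s, bool value)
{
    assert(s != NULL);
    toy_lock(s);
    s->variable = value;
    toy_unlock(s);
}

/* Reader side: what process B would do after calling ResetLatch(). */
bool
toy_get_variable(ToyShared *s)
{
    bool value;

    assert(s != NULL);
    toy_lock(s);
    value = s->variable;
    toy_unlock(s);
    return value;
}
```

A caller would initialize the struct with something like "ToyShared s = { ATOMIC_FLAG_INIT, false };" and then use toy_set_variable() and toy_get_variable() in place of the bare volatile accesses in the example above.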
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com