Re: [BUGS] LISTEN/NOTIFY race condition?

Tom Lane Mon, 10 Mar 2008 15:53:43 -0700

Laurent Birtz <[EMAIL PROTECTED]> writes:
> The command backend begins a transaction, then the event backend begins a
> transaction for LISTEN. The event backend updates the pg_listener table
> in mutual exclusion, but it releases the mutex before the transaction is
> commited. The command thread does its INSERT then starts its COMMIT. The
> command backend scans the pg_listener table and doesn't find anyone to
> notify. Before the command backend has the time to actually commit, the
> event backend commits and the listen thread has the time to execute the
> SELECT statement. Since the command thread didn't commit yet, the SELECT
> returns nothing. Finally, the command thread commits, without notifying
> anyone.


Hmm ... yup, that's a race condition all right, and of sufficiently low
probability that it's not surprising it's gone undetected for so long.

>    In Async_Listen(): change
>    'heap_close(lRel, ExclusiveLock);' for 'heap_close(lRel, NoLock);'.

This solution is pretty ugly, though, because we allow people to
execute LISTEN/UNLISTEN in transaction blocks, which means that the
ExclusiveLock could be held for quite some time.  Not only is that bad
for performance but it poses significant risks of deadlocks.

What I am thinking is that we need to change LISTEN/UNLISTEN so that
they don't attempt to modify the pg_listener table until COMMIT time.
It would go about like this:

1. LISTEN and UNLISTEN would make entries in a transaction-lifespan
list, much as NOTIFY does, with suitable logic to cancel each others'
effects on the list as needed.  They don't touch pg_listener at all.

2. If we abort the transaction then the list of pending actions is just
thrown away.  (Hmm, we'd also need to account for subtransactions...)

3. If we commit, then AtCommit_Notify is responsible for applying any
pg_listener insertions and deletions indicated by the pending-actions
list, after it grabs ExclusiveLock and before it starts the notify loop.
(A CommandCounterIncrement between this step and the notify loop will
assure that LISTEN and NOTIFY in the same transaction result in a
self-notify just as before.)

Since AtCommit_Notify doesn't release the lock, the race condition is
fixed, and since it only happens immediately before commit, there is no
added risk of deadlock.

The only externally visible difference in behavior is that it's less
likely for failed LISTENing transactions to leave dead tuples in
pg_listener.  There is an *internally* visible difference, in that
a sequence like

        BEGIN;
        LISTEN foo;
        SELECT ... FROM pg_listener ...
        COMMIT;

will fail to see any tuple from the LISTEN in pg_listener, where
historically it did.  I suppose it's conceivable that some application
out there depends on this, but it doesn't seem very likely.  (Slony guys
want to speak up here?)

Comments?  Have I missed anything?

Looking at this sketch, my first reaction is that it's a lot of code
to write in support of an implementation we hope to throw away soon.
But my second reaction is that we're going to need most of that code
anyway in a pg_listener-less implementation, because the module will
have to emulate the current transactional behavior of LISTEN/UNLISTEN
without any table to store rows in.  So there should be something
salvageable from the effort.

Assuming that we implement this (I'm willing to write the code),
is it sane to back-patch such a non-trivial change?  It's a bit
scary but on the other hand we've back-patched larger changes when
we had to.  Leaving the race condition in place isn't appetizing,
and neither is introducing deadlock risks that weren't there before.

                        regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Re: [BUGS] LISTEN/NOTIFY race condition?

Reply via email to