On 21.11.2013 17:08, Merlin Moncure wrote:
On Thu, Nov 21, 2013 at 9:02 AM, Andres Freund <and...@2ndquadrant.com> wrote:
On 2013-11-21 16:25:02 +0200, Heikki Linnakangas wrote:
Hmm. All callers of RecoveryInProgress() must be prepared to handle the case
that RecoveryInProgress() returns true, but the system is no longer in
recovery. No matter what locking we do in RecoveryInProgress(), the startup
process might finish recovery just after RecoveryInProgress() has returned.
True.
What about the attached? It reads the shared variable without a lock or
barrier. If it returns 'true', but the system in fact just exited recovery,
that's OK. As explained above, all the callers must tolerate that anyway.
But if it returns 'false', then it performs a full memory barrier, which
should ensure that it sees any other shared variables as it is after the
startup process cleared SharedRecoveryInProgress (notably,
XLogCtl->ThisTimeLineID).
I'd argue that we should also remove the spinlock in StartupXLOG and
replace it with a write barrier. Obviously not for performance reasons,
but because somebody might add more code to run under that spinlock.
Looks good otherwise, although a read memory barrier ought to suffice.
This code is in a very hot code path. Are we *sure* that the read
barrier is fast enough that we don't want to provide an alternate
function that only returns the local flag? I don't know enough about
them to say either way.
In my patch, I put the barrier inside the if (!LocalRecoveryInProgress)
block. That codepath can only execute once in a backend, so performance
is not an issue there. Does that look sane to you?
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers