Re: [HACKERS] Re: We have got a serious problem with pg_clog/WAL synchronization

[EMAIL PROTECTED] Fri, 13 Aug 2004 16:02:06 -0700

On Thu, Aug 12, 2004 at 01:13:46PM -0400, Tom Lane wrote:
> Kenneth Marshall <[EMAIL PROTECTED]> writes:
> > On Thu, Aug 12, 2004 at 09:58:56AM -0400, Tom Lane wrote:
> >> How would a read-only action work to block out the checkpoint?
>
> > The latch+version number is used by the checkpoint process. The
> > other processes can do a read of the latch to determine if it has
> > been set. This does not cause a cache invalidation hit. If the
> > latch is set, the competing processes read until it has been
> > cleared and the version updated. This makes the general case of
> > no checkpoint not incur a write and the consequent cache-line
> > invalidation and reload by all processors on an SMP system.
>
> Except that reading the latch and finding it clear offers no guarantee
> that a checkpoint isn't about to start.  The problem is that we are
> performing two separate actions (write a COMMIT xlog record and update
> transaction status in clog) and we have to prevent a checkpoint from
> starting in between those actions.  I don't see that there's any way to
> do that with a read-only latch.
>


...just caught up on this.

ISTM that shifting more of the load onto the checkpoint process IS possible
if the checkpoint uses a two-phase lock. For the committer that would replace
1 write lock with 2 lock reads, which is likely to be beneficial for SMP,
given I have faith that the other two problems you mention will succumb to
some solution in the mid-term. The first phase is an "intent lock"; the second
is a heavyweight lock, just as you now have it.

Committer:
1. prior to COMMIT: reads for an intent lock; if one is found, it attempts to
take the heavyweight lock. If that is not possible, the commit waits until
after the checkpoint, just as you currently suggest
2. prior to update clog: reads for an intent lock; if one is found, it takes
the heavyweight lock. If that is not possible, it reports a server error

Checkpointer: (straight to step 4 for a shutdown checkpoint)
1. writes an intent lock (it always can)
2. waits for the group commit timeout
3. waits a further 0.5 seconds
4. begins to wait on an exclusive heavyweight lock, before starting the
checkpoint proper
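
To make the two step lists concrete, here is a minimal single-process sketch
in Python. It is only a model of the proposal, not PostgreSQL code: every name
in it (TwoPhaseLock, committer_before_commit, CheckpointRaceError and so on)
is hypothetical, blocking waits are replaced by try-style calls that return
immediately, and Checkpointer steps 2 and 3 (the timed waits) are left out.

```python
from dataclasses import dataclass, field


class CheckpointRaceError(RuntimeError):
    """Stands in for the 'server error' of Committer step 2 (hypothetical)."""


@dataclass
class TwoPhaseLock:
    intent: bool = False      # phase 1: flag written only by the checkpointer
    exclusive: bool = False   # phase 2: heavyweight lock held by checkpointer
    shared_holders: set = field(default_factory=set)  # committers holding it

    def try_shared(self, who: str) -> bool:
        # A committer can take the heavyweight lock unless a checkpoint runs.
        if self.exclusive:
            return False
        self.shared_holders.add(who)
        return True

    def try_exclusive(self) -> bool:
        # The checkpointer must wait for every committer to let go first.
        if self.exclusive or self.shared_holders:
            return False
        self.exclusive = True
        return True

    def release_shared(self, who: str) -> None:
        self.shared_holders.discard(who)


def committer_before_commit(lock: TwoPhaseLock, who: str) -> str:
    """Committer step 1: a read-only check in the common case."""
    if not lock.intent:
        return "proceed"          # no checkpoint announced, nothing written
    if lock.try_shared(who):
        return "proceed"          # announced but not started: lock it out
    return "wait"                 # checkpoint running: commit after it


def committer_before_clog(lock: TwoPhaseLock, who: str) -> str:
    """Committer step 2: same check, but failure is an error, not a wait."""
    if not lock.intent:
        return "proceed"
    if lock.try_shared(who):      # idempotent if already taken in step 1
        return "proceed"
    raise CheckpointRaceError("checkpoint started between COMMIT and clog")


def checkpointer_announce(lock: TwoPhaseLock) -> None:
    """Checkpointer step 1: writing the intent flag always succeeds."""
    lock.intent = True


def checkpointer_try_start(lock: TwoPhaseLock) -> bool:
    """Checkpointer step 4, modelled as a non-blocking attempt."""
    return lock.try_exclusive()


def checkpointer_finish(lock: TwoPhaseLock) -> None:
    lock.exclusive = False
    lock.intent = False
```

With no checkpoint pending, both committer calls only read the flag. Once
checkpointer_announce() has run, a committer that gets in before
checkpointer_try_start() succeeds holds the checkpoint off until it calls
release_shared().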

This is not a provably correct state machine, but the error message should
not occur under current "normal" situations. (It is possible that the
Checkpointer writes an intent lock (step 1) just after a Committer reads for
it (step 1), and that a very long delay then occurs before the Committer's
step 2, such that Checkpointer step 4 begins before Committer step 2.) It is
very likely that this would be noticed by Committer step 2 and reported, in
the unlikely event that it occurs.
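
The problematic interleaving can be replayed deterministically. The sketch
below is an illustration in Python under the same reading of the steps as
above; the replay() function and the event names ("committer.1",
"checkpointer.4", etc.) are made up for this example, and the timed waits in
Checkpointer steps 2 and 3 are assumed to have already elapsed.

```python
def replay(schedule):
    """Replay one Committer against one Checkpointer, in the given order.

    Events: "committer.1", "committer.2", "checkpointer.1", "checkpointer.4".
    Returns a short description of the first decisive outcome.
    """
    intent = False            # written by Checkpointer step 1
    exclusive = False         # heavyweight lock held by the Checkpointer
    committer_shared = False  # heavyweight lock held by the Committer

    for event in schedule:
        if event == "committer.1":
            if intent:
                if exclusive:
                    return "commit waits for checkpoint"
                committer_shared = True
        elif event == "checkpointer.1":
            intent = True
        elif event == "checkpointer.4":
            if committer_shared:
                return "checkpoint waits for committer"
            exclusive = True
        elif event == "committer.2":
            if intent and not committer_shared:
                if exclusive:
                    # The window described in the text: the intent lock
                    # appeared after step 1, and the checkpoint began
                    # before step 2.
                    return "server error"
                committer_shared = True
            return "committed"
        else:
            raise ValueError(f"unknown event: {event}")
    return "incomplete"


# The race: intent written after Committer step 1, long delay, then
# Checkpointer step 4 runs before Committer step 2.
race = ["committer.1", "checkpointer.1", "checkpointer.4", "committer.2"]
outcome = replay(race)
```

Here outcome is "server error", i.e. the race is detected at Committer step 2
rather than silently corrupting state. Moving "committer.2" ahead of
"checkpointer.4" in the schedule gives "committed" instead.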

Is a longer term solution for pg to use a background log writer? That would
make group commit much easier to perform automatically without the
false-delay model currently available.

Best Regards, Simon Riggs

