Re: [HACKERS] corrupt pages detected by enabling checksums

Simon Riggs Thu, 04 Apr 2013 05:31:19 -0700

On 4 April 2013 02:39, Andres Freund <and...@2ndquadrant.com> wrote:

> Ok, I think I see the bug. And I think its been introduced in the
> checkpoints patch.


Well spotted. (I think you mean checksums patch).

> If by now the first backend has proceeded to PageSetLSN() we are writing
> different data to disk than the one we computed the checksum of
> before. Boom.

Right, so nothing else we were doing was wrong, that's why we couldn't
spot a bug. The problem is that we aren't replaying enough WAL because
the checksum on the WAL record is broke.

> I think the whole locking interactions in MarkBufferDirtyHint() need to
> be thought over pretty carefully.

When we write out a buffer with checksums enabled, we take a copy of
the buffer so that the checksum is consistent, even while other
backends may be writing hints to the same bufer.

I missed out on doing that with XLOG_HINT records, so the WAL CRC can
be incorrect because the data is scanned twice; normally that would be
OK because we have an exclusive lock on the block, but with hints we
only have share lock. So what we need to do is take a copy of the
buffer before we do XLogInsert().

Simple patch to do this attached for discussion. (Not tested).

We might also do this by modifying the WAL record to take the whole
block and bypass the BkpBlock mechanism entirely. But that's more work
and doesn't seem like it would be any cleaner. I figure lets solve the
problem first then discuss which approach is best.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

copy_before_XLOG_HINT.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] corrupt pages detected by enabling checksums

Reply via email to