On 2013-05-29 03:56:38 +0200, Andres Freund wrote: > On 2013-05-28 21:36:17 -0400, Robert Haas wrote: > > On Tue, May 28, 2013 at 1:58 PM, Andres Freund <and...@2ndquadrant.com> > > wrote: > > > Guessing around I looked and noticed the following problematic pattern: > > > 1) A: wants to do an update, doesn't have enough freespace > > > 2) A: extends the relation on the filesystem level > > > (RelationGetBufferForTuple) > > > 3) A: does PageInit (RelationGetBufferForTuple) > > > 4) A: aborts, e.g. due to a serialization failure (heap_update) > > > > > > At this point the page is initialized in memory, but not wal logged. It > > > isn't pinned or locked either. > > > > > > 5) B: vacuum finds that page and it's empty. So it marks it all > > > visible. But since the page wasn't written out (we haven't even marked > > > it dirty in 3.) the standby doesn't know that and reports the page as > > > being uninitialized. > > > > > > ISTM the best backbranchable fix for this is to teach lazy_scan_heap to > > > log an FPI for the heap page via visibilitymap_set in that rather > > > limited case. > > > > > > Happy to provide a patch unless somebody has a better idea? > > > > Good catch. However, I would suggest using log_newpage() before > > visibilitymap_set() rather than trying to stick extra logic into > > visibilitymap_set(). I think that will be cleaner and simpler. > > Thought about that, but given that 9.3's visibilitymap_set already will > already FPI heap pages I concluded it wouldn't really be an improvement > since it's only one ||log_heap_page or so there. Not sure what's > better. Will write the patch and see how it goes.
Ended up using log_newpage_buffer since reusing visibilitymap_set's record would break the wal format as we currently do not accept an FPI on the heap pages during replay when < 9.3. Forcing to upgrade the client first would be rather unfriendly... That has the disadvantage of logging a full heap page since it doesn't use the hole optimization but this happens really infrequently, so ... I don't think this can happen < 9.2 in at least this code path. I wonder though if there are other codepaths were this can happen although I haven't found any on a cursory inspection. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
>From 62e1c6fc679af80f8f4d26f1fe9d2fae8f430100 Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Wed, 29 May 2013 15:37:05 +0200 Subject: [PATCH] Ensure that all_visible WAL records operate on an intialized page There was a race condition were lazy_scan_heap could cause all_visible records to be emitted although the page marked all visible wasn't logged in an initialized state yet causing a PANIC during WAL replay. --- src/backend/commands/vacuumlazy.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c index 9d30415..4eff3c1 100644 --- a/src/backend/commands/vacuumlazy.c +++ b/src/backend/commands/vacuumlazy.c @@ -663,6 +663,31 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, /* empty pages are always all-visible */ if (!PageIsAllVisible(page)) { + /* + * There's a race condition here we need to prevent: Another + * backend might have extended the heap and PageInit'ed but + * ERRORed before filling the page with actual content. + * + * Since relation extension is not WAL logged (and we don't + * even dirty the buffer) it's quite possible that the page's + * content hasn't made it to disk yet, especially on a + * standby. During WAL replay we then could find an + * uninitialized page when trying to set the all visible flag + * which will cause a PANIC lateron. + * + * If there already is a WAL record covering this page + * (i.e. its LSN is valid) the respective record it will + * initialize the page, so we don't need to do anything in that + * case. + * + * XXX: It would be nice to use a logging method supporting + * standard buffers here since log_newpage_buffer() will write + * the full block instead of omitting the hole. + */ + if (RelationNeedsWAL(onerel) && + PageGetLSN(page) == InvalidXLogRecPtr) + log_newpage_buffer(buf); + PageSetAllVisible(page); MarkBufferDirty(buf); visibilitymap_set(onerel, blkno, buf, InvalidXLogRecPtr, -- 1.7.10.4
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers