On 2015-12-03 22:09:43 +0100, Andres Freund wrote: > On 2015-11-20 16:11:15 +0900, Michael Paquier wrote: > > + if (bkpb.fork == INIT_FORKNUM) > > + { > > + SMgrRelation srel; > > + srel = smgropen(bkpb.node, InvalidBackendId); > > + smgrimmedsync(srel, INIT_FORKNUM); > > + smgrclose(srel); > > + } > > A smgrwrite() instead of a smgrimmedsync() should be sufficient here.
More importantly, the smgrimmedsync() won't actually achieve anything - RestoreBackupBlockContents() will just have written the data into shared buffers. smgrimmedsync() doesn't flush that. And further, just flushing the buffer directly to disk using smgrwrite(), without reflecting that in the in-memory state, doesn't seem prudent. At the very least we'll write all those blocks twice. It also likely misses dealing with checksums and such. What I think we really ought to do instead is to use "proper" buffer functionality, e.g. flush the buffer via FlushBuffer(). It seems slightly better not to directly expose FlushBuffer() and instead add a tiny wrapper. Couldn't come up with a grand name tho. For me the attached, preliminary, patch, fixes the problem in master; previous branches ought to look mostly similar, except the flush moved to RestoreBackupBlockContents/RestoreBackupBlock. Does anybody have a better idea? Suitable for the back-branches? I'm kinda wondering if it wouldn't have been better to go through shared buffers in ResetUnloggedRelationsInDbspaceDir() instead of using copy_file(). Regareds, Andres
>From 9c4a7691595c1edfc12c5583cd007bb569f9db94 Mon Sep 17 00:00:00 2001 From: Andres Freund <and...@anarazel.de> Date: Tue, 8 Dec 2015 13:56:18 +0100 Subject: [PATCH] Preliminary-unlogged-rel-recovery-fix --- src/backend/access/transam/xlogutils.c | 9 +++++++++ src/backend/storage/buffer/bufmgr.c | 21 +++++++++++++++++++++ src/include/storage/bufmgr.h | 1 + 3 files changed, 31 insertions(+) diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index a5003c3..9073a85 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -368,6 +368,15 @@ XLogReadBufferForRedoExtended(XLogReaderState *record, MarkBufferDirty(*buf); + /* + * At the end of crash recovery the init forks of unlogged relations + * are copied, without going through shared buffers. So we need to + * force the on-disk state of init forks to always be in sync with the + * state in shared buffers. + */ + if (forknum == INIT_FORKNUM) + FlushOneBuffer(*buf); + return BLK_RESTORED; } else diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c index 63188a3..b32543b 100644 --- a/src/backend/storage/buffer/bufmgr.c +++ b/src/backend/storage/buffer/bufmgr.c @@ -2989,6 +2989,27 @@ FlushDatabaseBuffers(Oid dbid) } /* + * Flush a previously, shared or exclusively, locked and pinned buffer to the + * OS. + */ +void +FlushOneBuffer(Buffer buffer) +{ + BufferDesc *bufHdr; + + /* currently not needed, but no fundamental reason not to support */ + Assert(!BufferIsLocal(buffer)); + + Assert(BufferIsPinned(buffer)); + + bufHdr = GetBufferDescriptor(buffer - 1); + + LWLockHeldByMe(bufHdr->content_lock); + + FlushBuffer(bufHdr, NULL); +} + +/* * ReleaseBuffer -- release the pin on a buffer */ void diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h index 0f59201..09b2b11 100644 --- a/src/include/storage/bufmgr.h +++ b/src/include/storage/bufmgr.h @@ -176,6 +176,7 @@ extern void CheckPointBuffers(int flags); extern BlockNumber BufferGetBlockNumber(Buffer buffer); extern BlockNumber RelationGetNumberOfBlocksInFork(Relation relation, ForkNumber forkNum); +extern void FlushOneBuffer(Buffer buffer); extern void FlushRelationBuffers(Relation rel); extern void FlushDatabaseBuffers(Oid dbid); extern void DropRelFileNodeBuffers(RelFileNodeBackend rnode, -- 2.6.0.rc3
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers