Hi,

On 2015-12-10 18:36:32 +0100, Andres Freund wrote:
> On 2015-12-10 12:19:12 -0500, Robert Haas wrote:
> > > The real problem there imo isn't that the copy_relation_data() doesn't
> > > deal with 0 block tables, but that ATExecSetTableSpace() doesn't have a
> > > unlogged table specific codepath like heap_create_with_catalog() has.
> > 
> > It looks to me like somewhere we need to do log_smgrcreate(...,
> > INIT_FORKNUM) in the unlogged table case.
> 
> Yes.
> 
> > RelationCreateStorage()
> > skips this for the main forknum of an unlogged table, which seems OK,
> > but there's nothing that even attempts it for the init fork, which
> > does not seem OK.
> 
> We unfortunately can't trivially delegate that work to
> RelationCreateStorage(). E.g. heap_create() documents that only the main
> fork is created :(
> 
> > I guess that logic should happen in
> > ATExecSetTableSpace just after smgrcreate(dstrel, forkNum, false).
> 
> Looks like it's the easiest place.

> > > A second problem is that the smgrimmedsync() in copy_relation_data()
> > > isn't called for the init fork of unlogged relations, even if it needs
> > > to.

Here's a patch doing that. It's not yet fully polished, but I wanted to
get it out, because I noticed one thing:

In ATExecSetTableSpace(), for !main forks, we currently call
smgrcreate(), but not log_smgrcreate(). Even for PERSISTENT
relations. That seems a bit odd to me. It currently seems to be without
further consequence because, if there's actual data in the fork, we'll
just create the relation in _mdfd_getseg(); or we can cope with the
relation not being there.  But to me that feels wrong.

It seems better to do the log_smgrcreate() for RELPERSISTENCE_PERMANENT,
not just INIT_FORKNUM. What do you guys think?

> It sounds worthwhile to check that other locations rewriting tables,
> e.g. cluster/vacuum full/reindex are safe.

Seems to be ok, on a first glance.

Andres
>From a823b8aa2a24e61786ce924a3b237d846063f5fb Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Thu, 10 Dec 2015 20:14:39 +0100
Subject: [PATCH] WIP: Fix ALTER TABLE ... SET TABLESPACE for unlogged
 relations.

XXX: ATST over streaming rep didn't create the init fork on the other
side. Fix that. Also fix syncing of the init fork of unlogged relations
on the primary. Additionally always WAL log file creation for !main
forks of persistent relations.

Symptom:
ERROR:  58P01: could not open file "pg_tblspc/...": No such file or directory
on a promoted standby.

Reported-By: Michael Paquier
Author: Michael Paquier and Andres Freund
Backpatch: 9.1, where unlogged relations were introduced
---
 src/backend/commands/tablecmds.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 0ddde72..6df4939 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -42,6 +42,7 @@
 #include "catalog/pg_type.h"
 #include "catalog/pg_type_fn.h"
 #include "catalog/storage.h"
+#include "catalog/storage_xlog.h"
 #include "catalog/toasting.h"
 #include "commands/cluster.h"
 #include "commands/comment.h"
@@ -9659,6 +9660,15 @@ ATExecSetTableSpace(Oid tableOid, Oid newTableSpace, LOCKMODE lockmode)
 		if (smgrexists(rel->rd_smgr, forkNum))
 		{
 			smgrcreate(dstrel, forkNum, false);
+
+			/*
+			 * WAL log creation if the relation is persistent, or this is the
+			 * init fork of an unlogged relation.
+			 */
+			if (rel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT ||
+				(rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
+				 forkNum == INIT_FORKNUM))
+				log_smgrcreate(&newrnode, forkNum);
 			copy_relation_data(rel->rd_smgr, dstrel, forkNum,
 							   rel->rd_rel->relpersistence);
 		}
@@ -9878,6 +9888,7 @@ copy_relation_data(SMgrRelation src, SMgrRelation dst,
 	char	   *buf;
 	Page		page;
 	bool		use_wal;
+	bool		copying_initfork;
 	BlockNumber nblocks;
 	BlockNumber blkno;
 
@@ -9891,10 +9902,19 @@ copy_relation_data(SMgrRelation src, SMgrRelation dst,
 	page = (Page) buf;
 
 	/*
+	 * The init fork for an unlogged relation in many respects has to be
+	 * treated the same as normal relation, changes need to be WAL logged and
+	 * it needs to be synced to disk.
+	 */
+	copying_initfork = relpersistence == RELPERSISTENCE_UNLOGGED &&
+		forkNum == INIT_FORKNUM;
+
+	/*
 	 * We need to log the copied data in WAL iff WAL archiving/streaming is
 	 * enabled AND it's a permanent relation.
 	 */
-	use_wal = XLogIsNeeded() && relpersistence == RELPERSISTENCE_PERMANENT;
+	use_wal = XLogIsNeeded() &&
+		(relpersistence == RELPERSISTENCE_PERMANENT || copying_initfork);
 
 	nblocks = smgrnblocks(src, forkNum);
 
@@ -9949,7 +9969,7 @@ copy_relation_data(SMgrRelation src, SMgrRelation dst,
 	 * wouldn't replay our earlier WAL entries. If we do not fsync those pages
 	 * here, they might still not be on disk when the crash occurs.
 	 */
-	if (relpersistence == RELPERSISTENCE_PERMANENT)
+	if (relpersistence == RELPERSISTENCE_PERMANENT || copying_initfork)
 		smgrimmedsync(dst, forkNum);
 }
 
-- 
2.6.0.rc3

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to