Hi,

On 2018-11-21 10:36:42 -0800, Andres Freund wrote:
> >pg_upgrade against old versions, by the look of it. Even after I drop 
> >oids from user tables, I get errors like this when running pg_dumpall 
> >against a pg_upgraded REL_9_4_STABLE datadir:

Not pg_upgrade in general, I did test that. "Just" large objects.
pg_largeobject and pg_largeobject_metadata are the only catalog tables
that we copy over from the old cluster. I'd checked that pg_largeobject
doesn't contain an oid column, but somehow missed that pg_upgrade also
copies pg_largeobject_metadata - which does contain an oid.

Part of that is me just missing that fact, but part of it is also that
a binary-upgrade pg_dump already makes migrating
pg_largeobject_metadata almost entirely unnecessary: in binary upgrade
mode pg_dump *does* issue lo_create() calls, and also emits ALTER
LARGE OBJECT .. OWNER TO .. statements, which together make copying
pg_largeobject_metadata over largely redundant.  Only grants were
skipped, but that's trivial to change.
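
For illustration, in binary upgrade mode pg_dump restores each large
object with statements roughly like these (oid, owner and grantee
invented for the example; the GRANT is the part that was skipped so
far):

  SELECT pg_catalog.lo_create('16385');
  ALTER LARGE OBJECT 16385 OWNER TO joe;
  GRANT SELECT ON LARGE OBJECT 16385 TO bob;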

The attached patch fixes this issue "simply" by enabling the dumping
of blob ACLs in binary upgrade mode, and by not copying the underlying
files over anymore.  As we already emit multiple statements for each
large object, additionally doing so for grants ought not to matter too
much.  The right way to optimize that, if we wanted to, would be to
stop dumping large object metadata via pg_dump altogether in binary
upgrade mode, and to instead transfer it with a single COPY (SELECT)
command that's restored via COPY.
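
A rough sketch of that, under the assumption that pg_upgrade itself
would run the two commands (untested, and writing into a catalog like
this relies on binary-upgrade-mode allowances):

  -- on the old cluster:
  COPY (SELECT oid, lomowner, lomacl
        FROM pg_catalog.pg_largeobject_metadata) TO STDOUT;
  -- on the new cluster:
  COPY pg_catalog.pg_largeobject_metadata (oid, lomowner, lomacl)
      FROM STDIN;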

There's one comment atop pg_upgrade.c that I'm not sure about anymore:
 *      We control all assignments of pg_authid.oid because these oids are stored
 *      in pg_largeobject_metadata. XXX still

That's no longer true after this change.  It turns out we currently
don't prevent regrole from being used as a column type (I'll start a
separate thread about insufficient reg* verification), but we could
just prohibit that; see the hypothetical example below.
Any comments?
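
To illustrate the regrole hazard with a hypothetical example: a user
table like

  CREATE TABLE uses_regrole (r regrole);
  INSERT INTO uses_regrole VALUES ('some_role'::regrole);

stores the role's pg_authid.oid on disk, so if pg_upgrade stopped
preserving role oids, that stored value would afterwards point at a
different (or nonexistent) role.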


Note that this doesn't fix the < 9.0 path, where we recreate
pg_largeobject_metadata.  I basically think we should just remove
support for upgrading from those versions.  But even if we can't agree
on that, I first want to know whether this fixes your full set of
problems.


> >2018-11-21 13:01:58.582 EST [11861] ERROR:  large object 10 does not exist
> >2018-11-21 13:01:58.637 EST [11686] ERROR:  could not open file "base/17486/2840_vm": No such file or directory
> >2018-11-21 13:01:58.637 EST [11686] CONTEXT:  writing block 0 of relation base/17486/2840_vm
> >2018-11-21 13:01:59.638 EST [11686] ERROR:  could not open file "base/17486/2840_vm": No such file or directory
> >2018-11-21 13:01:59.638 EST [11686] CONTEXT:  writing block 0 of relation base/17486/2840_vm
> >2018-11-21 13:01:59.638 EST [11686] WARNING:  could not write block 0 of base/17486/2840_vm

I was not able to reproduce this issue.  Could you check whether you
still encounter it after applying the attached fix?  And if so, could
you provide more detailed instructions on how to reproduce?

Greetings,

Andres Freund
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 9c4e1dc32a6..a8e2daf0ed1 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -2866,8 +2866,8 @@ dumpDatabase(Archive *fout)
 					 NULL, NULL);
 
 	/*
-	 * pg_largeobject and pg_largeobject_metadata come from the old system
-	 * intact, so set their relfrozenxids and relminmxids.
+	 * pg_largeobject comes from the old system intact, so set its
+	 * relfrozenxid and relminmxid.
 	 */
 	if (dopt->binary_upgrade)
 	{
@@ -2912,47 +2912,6 @@ dumpDatabase(Archive *fout)
 
 		PQclear(lo_res);
 
-		/*
-		 * pg_largeobject_metadata
-		 */
-		if (fout->remoteVersion >= 90000)
-		{
-			resetPQExpBuffer(loFrozenQry);
-			resetPQExpBuffer(loOutQry);
-
-			if (fout->remoteVersion >= 90300)
-				appendPQExpBuffer(loFrozenQry, "SELECT relfrozenxid, relminmxid\n"
-								  "FROM pg_catalog.pg_class\n"
-								  "WHERE oid = %u;\n",
-								  LargeObjectMetadataRelationId);
-			else
-				appendPQExpBuffer(loFrozenQry, "SELECT relfrozenxid, 0 AS relminmxid\n"
-								  "FROM pg_catalog.pg_class\n"
-								  "WHERE oid = %u;\n",
-								  LargeObjectMetadataRelationId);
-
-			lo_res = ExecuteSqlQueryForSingleRow(fout, loFrozenQry->data);
-
-			i_relfrozenxid = PQfnumber(lo_res, "relfrozenxid");
-			i_relminmxid = PQfnumber(lo_res, "relminmxid");
-
-			appendPQExpBufferStr(loOutQry, "\n-- For binary upgrade, set pg_largeobject_metadata relfrozenxid and relminmxid\n");
-			appendPQExpBuffer(loOutQry, "UPDATE pg_catalog.pg_class\n"
-							  "SET relfrozenxid = '%u', relminmxid = '%u'\n"
-							  "WHERE oid = %u;\n",
-							  atooid(PQgetvalue(lo_res, 0, i_relfrozenxid)),
-							  atooid(PQgetvalue(lo_res, 0, i_relminmxid)),
-							  LargeObjectMetadataRelationId);
-			ArchiveEntry(fout, nilCatalogId, createDumpId(),
-						 "pg_largeobject_metadata", NULL, NULL, "",
-						 "pg_largeobject_metadata", SECTION_PRE_DATA,
-						 loOutQry->data, "", NULL,
-						 NULL, 0,
-						 NULL, NULL);
-
-			PQclear(lo_res);
-		}
-
 		destroyPQExpBuffer(loFrozenQry);
 		destroyPQExpBuffer(loOutQry);
 	}
@@ -3266,18 +3225,16 @@ getBlobs(Archive *fout)
 			binfo[i].dobj.dump &= ~DUMP_COMPONENT_ACL;
 
 		/*
-		 * In binary-upgrade mode for blobs, we do *not* dump out the data or
-		 * the ACLs, should any exist.  The data and ACL (if any) will be
-		 * copied by pg_upgrade, which simply copies the pg_largeobject and
-		 * pg_largeobject_metadata tables.
+		 * In binary-upgrade mode for blobs, we do *not* dump out the data.
+		 * The data will be copied by pg_upgrade, which simply copies the
+		 * pg_largeobject table.
 		 *
-		 * We *do* dump out the definition of the blob because we need that to
-		 * make the restoration of the comments, and anything else, work since
-		 * pg_upgrade copies the files behind pg_largeobject and
-		 * pg_largeobject_metadata after the dump is restored.
+		 * We *do* dump out everything but the data, as pg_upgrade copies just
+		 * pg_largeobject, but not pg_largeobject_metadata, after the dump is
+		 * restored.
 		 */
 		if (dopt->binary_upgrade)
-			binfo[i].dobj.dump &= ~(DUMP_COMPONENT_DATA | DUMP_COMPONENT_ACL);
+			binfo[i].dobj.dump &= ~DUMP_COMPONENT_DATA;
 	}
 
 	/*
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 8d66fd2ee7b..2afd950591b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -2791,9 +2791,9 @@ my %tests = (
 			data_only              => 1,
 			section_pre_data       => 1,
 			test_schema_plus_blobs => 1,
+			binary_upgrade         => 1,
 		},
 		unlike => {
-			binary_upgrade => 1,
 			no_blobs       => 1,
 			no_privs       => 1,
 			schema_only    => 1,
diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index fd0b44c3ce9..95a00a49132 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -441,8 +441,7 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
 	 *
 	 * pg_largeobject contains user data that does not appear in pg_dump
 	 * output, so we have to copy that system table.  It's easiest to do that
-	 * by treating it as a user table.  Likewise for pg_largeobject_metadata,
-	 * if it exists.
+	 * by treating it as a user table.
 	 */
 	snprintf(query + strlen(query), sizeof(query) - strlen(query),
 			 "WITH regular_heap (reloid, indtable, toastheap) AS ( "
@@ -458,10 +457,8 @@ get_rel_infos(ClusterInfo *cluster, DbInfo *dbinfo)
 			 "                        'binary_upgrade', 'pg_toast') AND "
 			 "      c.oid >= %u::pg_catalog.oid) OR "
 			 "     (n.nspname = 'pg_catalog' AND "
-			 "      relname IN ('pg_largeobject'%s) ))), ",
-			 FirstNormalObjectId,
-			 (GET_MAJOR_VERSION(old_cluster.major_version) >= 900) ?
-			 ", 'pg_largeobject_metadata'" : "");
+			 "      relname IN ('pg_largeobject') ))), ",
+			 FirstNormalObjectId);
 
 	/*
 	 * Add a CTE that collects OIDs of toast tables belonging to the tables
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index b777f9d651b..abb4171431e 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -29,7 +29,7 @@
  *	in user tables as enum values.
  *
  *	We control all assignments of pg_authid.oid because these oids are stored
- *	in pg_largeobject_metadata.
+ *	in pg_largeobject_metadata. XXX still
  */
 
 
