On Thu, Jul 17, 2025 at 09:24:02AM -0400, Robert Haas wrote: > On Mon, Jul 7, 2025 at 3:27 PM Noah Misch <n...@leadboat.com> wrote: > > Let's get rid of pg_dump's need to sort by OID, apart from catalog > > corruption > > scenarios. > > +1. I had at one point believed that sorting by OID was a good way to > make dumps stable, but this disproves that theory. Sorting by logical > properties of the object is better.
Sorting by OID was a reasonable approximation, for its time. > > Adding an assert found a total of seven affected object types. > > See the second attached patch. The drawback is storing five more fields in > > pg_dump memory: oprleft, oprright, opcmethod, opfmethod, and collencoding. > > That seems minor relative to existing pg_dump memory efficiency. Since this > > is a source of test flakes in v18, I'd like to back-patch to v18. I'm not > > sure why the buildfarm hasn't seen the above diff, but I expect the diff > > could > > happen there. This is another nice win for the new test from commit > > 172259afb563d35001410dc6daad78b250924038. The order instability was always > > bad for users, but the test brought it to the forefront. One might argue > > for > > back-patching $SUBJECT further, too. > > I agree with back-patching it at least as far as v18. I think it > probably wouldn't hurt anything to back-patch further, and it might > avoid future buildfarm failures. Against that, there's a remote > possibility that someone who is currently saving pg_dump output for > later comparison, say in a case where OIDs are always stable in > practice, could be displeased to see the pg_dump order change in a > minor release. But that seems like a very weak argument against > back-patching. I can't see us ever deciding to put up with buildfarm > instability on such grounds. Thanks for reviewing. I agree with those merits of back-patching further. A back-patch to v13 has no pg_dump_sort.c conflicts, while pg_dump.c has mechanical conflicts around retrieving the extra sort inputs. If there are no objections in the next 72h, I'll likely back-patch. > Reviewing: > > + * Sort by name. This differs from "Name:" in plain format output, which > + * is a _tocEntry.tag. For example, DumpableObject.name of a constraint > + * is pg_constraint.conname, but _tocEntry.tag of a constraint is relname > + * and conname joined with a space. > > This comment is useful, but if I were to be critical, it does a better > job saying what this field isn't than what it is. True. I've changed it to this: * Sort by name. With a few exceptions, names here are single catalog * columns. To get a fuller picture, grep pg_dump.c for "dobj.name = ". * Names here don't match "Name:" in plain format output, which is a * _tocEntry.tag. For example, DumpableObject.name of a constraint is * pg_constraint.conname, but _tocEntry.tag of a constraint is relname and * conname joined with a space. The patch's original change to the comment was a reaction to my own surprise. Reading "pg_dump regression|grep Name:|sort|grep CONSTRAINT" I saw unique "Name:" output for constraints, which felt at odds with the instability in DOTypeNameCompare() sorting them. But it wasn't the name I was looking for: - getConstraints() sets DumpableObject.name = conname - DOTypeNameCompare() sorts by DumpableObject.name - dumpConstraint() sets _tocEntry.tag = "relname conname" - _tocEntry.tag becomes the "Name:" in pg_dump output Long-term, in a post-scarcity world, I'd do one of these or similar: a. Change what we store in DumpableObject.name so it matches _tocEntry.tag. b. Rename field DumpableObject.name, so there's no longer a field called "name" with contents different from the "Name:" values in pg_dump output. > + * Sort by encoding, per pg_collation_name_enc_nsp_index. This is > + * mostly academic, because CREATE COLLATION has restrictions to make > + * (nspname, collname) uniquely identify a collation within a given > + * DatabaseEncoding. pg_import_system_collations() bypasses those > + * restrictions, but pg_dump+restore fails after a > + * pg_import_system_collations('my_schema') that creates collations > + * for a blend of encodings. > > This comment is also useful, but if I were to be critical again, it > does a better job saying why we shouldn't do what the code then does > than why we should. I've tried the following further refinement. If it's worse, I can go back to the last version. diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c index ffae7b3..f7d6a03 100644 --- a/src/bin/pg_dump/pg_dump_sort.c +++ b/src/bin/pg_dump/pg_dump_sort.c @@ -248,7 +250,19 @@ DOTypeNameCompare(const void *p1, const void *p2) if (cmpval != 0) return cmpval; - /* To have a stable sort order, break ties for some object types */ + /* + * To have a stable sort order, break ties for some object types. Most + * catalogs have a natural key, e.g. pg_proc_proname_args_nsp_index. + * Where the above "namespace" and "name" comparisons don't cover all + * natural key columns, compare the rest here. + * + * The natural key usually refers to other catalogs by surrogate keys. + * Hence, this translates each of those references to the natural key of + * the referenced catalog. That may descend through multiple levels of + * catalog references. For example, to sort by pg_proc.proargtypes, + * descend to each pg_type and then further to its pg_namespace, for an + * overall sort by (nspname, typname). + */ if (obj1->objType == DO_FUNC || obj1->objType == DO_AGG) { FuncInfo *fobj1 = *(FuncInfo *const *) p1; @@ -312,13 +326,16 @@ DOTypeNameCompare(const void *p1, const void *p2) CollInfo *cobj2 = *(CollInfo *const *) p2; /* - * Sort by encoding, per pg_collation_name_enc_nsp_index. This is - * mostly academic, because CREATE COLLATION has restrictions to make - * (nspname, collname) uniquely identify a collation within a given - * DatabaseEncoding. pg_import_system_collations() bypasses those - * restrictions, but pg_dump+restore fails after a - * pg_import_system_collations('my_schema') that creates collations - * for a blend of encodings. + * Sort by encoding, per pg_collation_name_enc_nsp_index. Wherever + * this changes dump order, restoring the dump fails anyway. CREATE + * COLLATION can't create a tie for this to break, because it imposes + * restrictions to make (nspname, collname) uniquely identify a + * collation within a given DatabaseEncoding. While + * pg_import_system_collations() can create a tie, pg_dump+restore + * fails after pg_import_system_collations('my_schema') does so. + * There's little to gain by ignoring one natural key column on the + * basis of those limitations elsewhere, so respect the full natural + * key like we do for other object types. */ cmpval = cobj1->collencoding - cobj2->collencoding; if (cmpval != 0)