On Tue, Oct 29, 2019 at 1:52 AM Peter Eisentraut <peter.eisentr...@2ndquadrant.com> wrote: > As mentioned in [0], pg_upgrade currently does not preserve the version > of collation objects created by initdb. Here is an attempt to fix that. > > The way I deal with this here is by having the binary-upgrade mode in > pg_dump delete all the collations created by initdb and then dump out > CREATE COLLATION commands with version information normally.
This seems to be basically OK. It does mean that the target database has collation OIDs >= FirstNormalObjectId. That is, they don't look like initdb-created objects, which is OK because they aren't, I'm just highlighting this to see if anyone else sees a problem with it. Suppose you pg_upgrade again: now you'll dump these collations just as you did the first time around, because they look exactly like user-defined collations. It also means that if you pg_upgrade to a target cluster created by a build without ICU we'll try to create ICU collations and that'll fail ("ICU is not supported in this build"), whereas before if had ICU collations and didn't ever make use of them, you'd be able to do such an upgrade; again this doesn't seem like a major problem, it's just an observation about an edge case. One more thing to note is if you upgrade from 12 to 13 on a glibc system, I think we'll automatically pick up the *current* version when creating the collations in the target DB, which seems to be OK but it is a choice to default to assuming that the database's indexes are not corrupted. Another observation is that you finish up with different OIDs in each database you upgrade, which again doesn't seem like a problem in itself. It is slightly odd that template1 finishes up with the old initdb's template1 collatoins, rather than the new initdb's opinion of the current set of collations, but I am not sure if it's a problem. I think it has to be like that, because you might have created other stuff that depends on those collations in your source template1 database, and so you have to preserve the versions. > I had originally imagined doing some kind of ALTER COLLATION (or perhaps > a direct UPDATE pg_collation) to update the version information, but > that doesn't really work because we don't know whether the collation > object with a given name in the new cluster is the same as the one in > the old cluster. So it seems more robust to just delete all existing > collations and create them from scratch. > > Thoughts? Seems to work as described with -E UTF-8, but it fails with clusters using -E SQL_ASCII. That causes the pg_upgrade check to fail on machines where that is the default encoding chosen by initdb (where unpatched master succeeds): pg_restore: creating COLLATION "pg_catalog.af-NA-x-icu" pg_restore: while PROCESSING TOC: pg_restore: from TOC entry 1700; 3456 12683 COLLATION af-NA-x-icu tmunro pg_restore: error: could not execute query: ERROR: collation "pg_catalog.af-NA-x-icu" for encoding "SQL_ASCII" does not exist Command was: ALTER COLLATION pg_catalog."af-NA-x-icu" OWNER TO tmunro;