On Mon, Nov 4, 2019 at 11:13 AM Julien Rouhaud <rjuju...@gmail.com> wrote: > > On Mon, Nov 4, 2019 at 4:58 AM Thomas Munro <thomas.mu...@gmail.com> wrote: > > > > Unfortunately I haven't had time to work on it seriously, but here's a > > quick rebase to get the proof-of-concept back into working shape.
Thanks! I also removed the test for REFRESH VERSION command that was forgotten in the patch set, and run a pgindent. > > Here are some problems to think about: > > > > * We'd need to track dependencies on the default collation once we > > have versioning for that (see > > https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com). > > That is how most people actually consume collations out there in real > > life, and yet we don't normally track dependencies on the default > > collation and I don't know if that's simply a matter of ripping out > > all the code that looks like "xxx != DEFAULT_COLLATION_ID" in the > > dependency analysis code or if there's more to it. > > This isn't enough. What would remain is: > > - teach get_collation_version_for_oid() to return the default > collation name, which is simple > - have recordDependencyOnVersion() actually records the dependency, > which wouldn't happen as the default collation is pinned. > > An easy fix would be to teach isObjectPinned() to ignore > CollationRelationId / DEFAULT_COLLATION_OID, but that's ugly and would > allow too many dependencies to be stored. Not pinning the default > collation during initdb doesn't sound a good alternative either. > Maybe adding a force flag or a new DependencyType that'd mean "normal > but forced" would be ok? Attached 4th patch handles default collation. I went with an ignore_systempin flag in recordMultipleDependencies. > > > * Andres mentioned off-list that pg_depend rows might get blown away > > and recreated in some DDL circumstances. We need to look into that. I tried various flavour of DDL but I couldn't wipe out the pg_depend rows without having an index rebuild triggered (like changing the underlying column datatype). Do you have any scenario where the index rebuild wouldn't be triggered? > > * Another is that pg_upgrade won't preserve pg_depend rows, so you'd > > need some catalog manipulation (direct or via new DDL) to fix that. Attached 5th patch add a new "ALTER INDEX idx_name DEPENDS ON COLLATION coll_oid VERSION coll_version_text" that can only be executed in binary upgrade mode, and teach pg_dump to generate such commands (including for indexes created for constraints). One issue is that older versions don't have pg_depend information, so pg_dump can't find the dependencies to generate such commands and override the version with anything else. It means that the upgraded cluster will show all indexes as depending on the current collation provider version. I'm not sure if that's the best thing to do, or if we should change all pg_depend rows to mention "unknown" version or something like that. It would generate so much noise that it's probably not worth it. I didn't do anything for the spurious warning when running a reindex, and kept original approach of pg_depend catalog.
0002-Add-pg_depend.refobjversion.patch
Description: Binary data
0005-Preserve-index-dependencies-on-collation-during-pg_u.patch
Description: Binary data
0001-Remove-pg_collation.collversion.patch
Description: Binary data
0004-Also-track-default-collation-version.patch
Description: Binary data
0003-Track-collation-versions-for-indexes.patch
Description: Binary data