Hello Thomas, On Mon, Nov 4, 2019 at 4:58 AM Thomas Munro <thomas.mu...@gmail.com> wrote: > > On Fri, Nov 1, 2019 at 2:21 AM Julien Rouhaud <rjuju...@gmail.com> wrote: > > Are you planning to continue working on it? For the record, that's > > something needed to be able to implement a filter in REINDEX command > > [1]. > > Bonjour Julien, > > Unfortunately I haven't had time to work on it seriously, but here's a > quick rebase to get the proof-of-concept back into working shape. > It's nice to see progress in other bits of the problem-space. I hope > to have time to look at this patch set again soon, but if you or > someone else would like hack on or think about it too, please feel > free!
Thanks! I already did some hack on it when looking at the code so I can try to make some progress. > Yes indeed this is exactly the same problem that you're trying to > solve, approached from a different starting point. > > Here are some problems to think about: > > * We'd need to track dependencies on the default collation once we > have versioning for that (see > https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com). > That is how most people actually consume collations out there in real > life, and yet we don't normally track dependencies on the default > collation and I don't know if that's simply a matter of ripping out > all the code that looks like "xxx != DEFAULT_COLLATION_ID" in the > dependency analysis code or if there's more to it. This isn't enough. What would remain is: - teach get_collation_version_for_oid() to return the default collation name, which is simple - have recordDependencyOnVersion() actually records the dependency, which wouldn't happen as the default collation is pinned. An easy fix would be to teach isObjectPinned() to ignore CollationRelationId / DEFAULT_COLLATION_OID, but that's ugly and would allow too many dependencies to be stored. Not pinning the default collation during initdb doesn't sound a good alternative either. Maybe adding a force flag or a new DependencyType that'd mean "normal but forced" would be ok? > * Andres mentioned off-list that pg_depend rows might get blown away > and recreated in some DDL circumstances. We need to look into that. > * Another is that pg_upgrade won't preserve pg_depend rows, so you'd > need some catalog manipulation (direct or via new DDL) to fix that. > * Some have expressed doubt that pg_depend is the right place for > this; let's see if any counter-proposals appear. When working on the REINDEX FILTER, I totally missed this thread and wrote a POC saving the version in pg_index. That's not ideal though, as you need to record multiple version strings. In my version I used a json type, using the collprovider as the key, but that's not enough for ICU as each collation can have a different version string. I'm not a huge fan of using pg_depend to record the version, but storing a collprovider/collname -> version per row in pg_index is definitely a no go, so I don't have any better counter-proposal. > > > # reindex table t1; > > WARNING: 01000: index "t1_val_idx" depends on collation 13330 version > > "a153.97.35.8", but the current version is "153.97.35.8" > > DETAIL: The index may be corrupted due to changes in sort order. > > HINT: REINDEX to avoid the risk of corruption. > > LOCATION: index_check_collation_version, index.c:1263 > > Duh. Yeah, that's stupid and needs to be fixed somehow. I don't have a clever solution for that either.