On Thu, Jun 9, 2022 at 10:54 AM Jeremy Schneider <schnei...@ardentperf.com> wrote:
> I’m probably just going to end up rehashing the old threads I haven’t read
> yet…
>
> One challenge with this approach is you have things like sort-merge joins
> that require the same collation across multiple objects. So I think you’d
> need to keep all the old indexes around until you have new indexes available
> for all objects in a database, and somehow the planner would need to be smart
> enough to dynamically figure out old vs new versions on a query-by-query
> basis.

I don't think that it would be fundamentally difficult to have the planner deal with collations at the level required to avoid incorrect query plans. I'm not suggesting that this is an easy project, or that the end result would be totally free of caveats, such as the issue with merge joins. I am only suggesting that something like this seems doable. There aren't that many distinct high level approaches that could possibly decouple upgrading Postgres/the OS from reindexing. This is one.

> And my opinion is that the problems caused by depending on OS libraries for
> collation need to be addressed on a shorter timeline than what’s realistic
> for inventing a new way for a relational database to offer transparent or
> online upgrades of linguistic collation versions.

But what does that really mean? You can use ICU collations as the default for the entire cluster now. Where do we still fall short? Do you mean that there is still a question of actively encouraging using ICU collations? I don't understand what you're arguing for. Literally everybody agrees that the current status quo is not good. That much seems settled to me.

> Also I still think folks are overcomplicating this by focusing on linguistic
> collation as the solution.

I don't think that's true; I think that everybody understands that being on the latest linguistic collation is only very rarely a compelling feature. The whole way that BCP47 tags are so forgiving is entirely consistent with that view of things. But what difference does it make? As long as you accept that any collation *might* need to be updated, or the default ICU version might change on OS upgrade, then you have to have some strategy for dealing with the transition. Not being on a very old obsolete version of ICU will eventually become a "compelling feature" in its own right. I believe that EDB adopted ICU many years ago, and stuck with one vendored version for quite a few years.
And eventually being on a very old version of ICU became a real problem.

--
Peter Geoghegan