On Wed, 2025-03-19 at 08:46 -0400, Robert Haas wrote:
> I see your point, but most people don't use the builtin collation
> provider.

The other providers aren't affected by us updating Unicode, so I think we got off track somehow. I suppose what I meant was: "If you are concerned about inconsistencies, and you move to the builtin provider, then 99% of the inconsistency problem is gone. We can remove the last 1% of the problem if we do all the work listed above."

> When an EDB customer asks "if I do X,
> will anything break," it's often the case that answering "maybe" is
> the same as answering "yes".

That's a good point. However, note that "doesn't break primary keys" is a nice guarantee, even if there are still some remaining doubts about expression indexes, etc.

> They want a hard guarantee that the behavior will not
> change.

My understanding of this thread so far was that we were mostly concerned about internal inconsistencies of stored structures; e.g. indexes that could return different results than a seqscan.

Not changing query results at all between major versions is a valid concern, but a fairly strict one that doesn't seem limited to immutable functions or collation issues. Surely, at least the results of "SELECT version()" should change from release to release ;-)

> Again, I'm not trying to oblige
> you to deliver that behavior and I confess to ignorance on how we
> could realistically get there.

FWIW I'm not complaining about doing the work. But I think the results will be better if we can get a few people aligned on a general plan and collaborating. I will try to kick that off.

> and to be able to easily know exactly what they need to reindex.

That's the main one, I think. The upgrade check offers that for the builtin provider, though admittedly it's not a very user-friendly solution, and we can do better.

> And from that point of view -- and again, I'm not volunteering to
> implement it and I'm not telling you to do it either -- Joe's proposal
> of supporting multiple versions sounds fantastic.

I certainly don't oppose giving users that choice.
But I view it as a burden we are placing on the users -- better than breakage, but not really great, either. So if we do put in a ton of work, I'd like it if we could arrive at a better destination.

If we actually want the BEST user experience possible, they'd not even really know that their index was ever inconsistent. Autovacuum would come along and just find the few entries in the index that need fixing, and reindex just those few tuples. In theory, it should be possible: there are a finite number of codepoints that change each Unicode version, and we can just search for them in the data and fix up derived structures.

Regards,
	Jeff Davis
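
A minimal sketch of the "search for the changed codepoints" idea from the last paragraph, written in Python only for illustration. Nothing here is PostgreSQL code: the function name, the row representation, and the contents of the `changed` set are all hypothetical, and the example code points are placeholders rather than a claim about what any particular Unicode release changed.

```python
from typing import Iterable, Iterator, Set, Tuple


def rows_touching_changed_codepoints(
    rows: Iterable[Tuple[int, str]],
    changed: Set[int],
) -> Iterator[int]:
    """Yield the ids of rows whose text contains any changed code point."""
    for row_id, text in rows:
        # A row needs attention only if at least one of its characters
        # is in the (finite) set of code points that changed.
        if any(ord(ch) in changed for ch in text):
            yield row_id


# Hypothetical input: placeholder code points standing in for "code points
# whose behavior differs between two Unicode versions".
changed = {0x1E9E, 0xA7C0}

# Hypothetical table contents as (row id, text value) pairs.
rows = [(1, "plain ascii"), (2, "GRO\u1e9eE"), (3, "caf\u00e9")]

affected = list(rows_touching_changed_codepoints(rows, changed))
```

Only the rows reported in `affected` would need their derived index entries rebuilt under this scheme; all other rows are untouched by the Unicode update and can be left alone.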