On Wed, Jun 15, 2022 at 7:10 AM Jeremy Schneider
<schnei...@ardentperf.com> wrote:
> > On Jun 14, 2022, at 14:10, Peter Eisentraut
> > <peter.eisentr...@enterprisedb.com> wrote:
> > Conversely, why are we looking at the ICU version instead of the collation
> > version. If we have recorded the collation as being version 1234, we need
> > to look through the available ICU versions (assuming we can load multiple
> > ones somehow) and pick the one that provides 1234. It doesn't matter
> > whether it's the same ICU version that the collation was originally created
> > with, as long as the collation version stays the same.

One difference would be the effect if ICU ever ships a minor library
version update that changes the reported collversion.

1. With the code I proposed in my v4 patch, our version mismatch
warnings would kick in, but otherwise everything would continue to
work (and corrupt indexes, if they really moved anything around).

2. With a system that (somehow) opens all available libraries and
looks for a match, it would fail to find one.

That is assuming that you are using the typical major-versioned
packages we can see in software distributions like Debian. I don't
know if minor version changes actually do that, though I have wondered
out loud a few times in these threads. I might go and poke at some
ancient packages to see if that's happened before. To defend against
that, we could instead do major + minor versioning, but so far I have
worried about major only because that's the way they ship 'em in
Debian and (AFAICS) RHEL etc, so you can't easily install 68.0 and
68.1 at the same time. On the other hand, you could always "pin" (or
similar concepts) the libicu68 package to a specific minor release, to
fix the problem (whether you failed like 1 or like 2 above).

> (Common mistake I’ve seen folks make when comparing OS glibc versions is only
> looking at locale data, not realizing there have been changes to root
> behavior that didn’t involve any changes to local data files)

Yeah, I've wondered idly before if libc projects and ICU couldn't just
offer a way to ask for versions explicitly, and ship historical data.
With some system of symlinks to make it all work with defaults for
those who don't care, a libc could have
/usr/share/locale/en...@cldr34.utf-8 etc so you could
setlocale(LC_COLLATE, "en_US@CLDR34"), or something. I suppose they
don't want to promise to be able to interpret the old data in future
releases, and, as you say, sometimes the changes are in C code, due to
bugs or algorithm changes, not the data.
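
For illustration only, here is a minimal sketch of the two lookup
strategies numbered 1 and 2 above. It is not code from the v4 patch and
it makes no real ICU calls: the IcuLib struct, both function names, and
the idea of an in-memory table of installed libraries are all
hypothetical stand-ins for whatever would actually open the libraries
and ask them for their collation version.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one installed ICU library (illustration only). */
typedef struct
{
    int         major;        /* library major version, e.g. 68 */
    int         minor;        /* library minor version, e.g. 1 */
    const char *collversion;  /* collation version that library reports */
} IcuLib;

/*
 * Strategy 1: choose the library by major version alone.  A library is
 * still returned when its collation version differs from the recorded
 * one; *mismatch is set so the caller can emit a warning.
 */
static const IcuLib *
pick_by_major(const IcuLib *libs, size_t nlibs, int major,
              const char *recorded, int *mismatch)
{
    *mismatch = 0;
    for (size_t i = 0; i < nlibs; i++)
    {
        if (libs[i].major == major)
        {
            *mismatch = strcmp(libs[i].collversion, recorded) != 0;
            return &libs[i];
        }
    }
    return NULL;                /* no library with that major version */
}

/*
 * Strategy 2: search every available library for one that reports the
 * recorded collation version.  Returns NULL when none matches.
 */
static const IcuLib *
pick_by_collversion(const IcuLib *libs, size_t nlibs, const char *recorded)
{
    for (size_t i = 0; i < nlibs; i++)
    {
        if (strcmp(libs[i].collversion, recorded) == 0)
            return &libs[i];
    }
    return NULL;
}
```

Suppose the collation was recorded as version "1234" under major
version 68, and a minor update makes the installed 68 report "1235"
(made-up values). pick_by_major() still returns the 68 library with
the mismatch flag set, which is case 1; pick_by_collversion() returns
NULL, which is case 2.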