On Sat, 2022-10-22 at 14:22 +1300, Thomas Munro wrote:
> Problem 2: If ICU 67 ever decides to report a different version for a
> given collation (would it ever do that? I don't expect so, but ...),
> we'd be unable to open the collation with the search-by-collversion
> design, and potentially the database. What is a user supposed to do
> then? Presumably our error/hint for that would be "please insert the
> correct ICU library into drive A", but now there is no correct
> library

Let's say that Postgres is compiled against version 67.X, and the
sysadmin upgrades the ICU package to 67.Y, which reports a different
collation version for some locale.

Your current patch makes this impossible for the administrator to fix,
because there's no way to have two different libraries loaded with the
same major version number, so it will always pick the compiled-in ICU.
The user will be forced to accept the new version of the collation, see
WARNINGs in their logs, and possibly corrupt their indexes.

Search-by-collversion would still be frustrating for the admin, but at
least it would be possible to fix by compiling their own 67.X and
asking Postgres to search that library, too. We could make it slightly
more friendly by having an error that reports the libraries searched
and the collation versions found, if none of the versions match. We can
have a GUC that controls whether a failure to find the right version is
a WARNING or an ERROR.

On Sat, 2022-11-19 at 07:38 +1300, Thomas Munro wrote:
> > * We'll need some clearer instructions on how to build/install
> > extra ICU versions that might not be provided by the distribution
> > packaging. For instance, I got a cryptic error until I used
> > --enable-rpath, which might not be obvious to all users.
>
> Suggestions welcome. No docs at all yet...

I tried to write up some docs. It's hard to explain why we are exposing
to the user the collation version and the library version in these
different ways, and what effects they have.

The current patch feels like it hasn't decided whether the collation
version is ucol_getVersion() (collversion) or u_getVersion() (library
version). The collversion is more prominent in the UI (with its own
syntax), yet it's just a cross-check for whether to issue a WARNING or
not; while the library version is hidden in the locale field and it
actually decides which symbol is called.

> Yeah. I just don't like the way it *appears* to be doing something
> clever, but it doesn't solve any fundamental problem at all because
> the collversion information is under human control and so it's really
> doing something stupid.

I assume by "human control" you mean "ALTER COLLATION ... REFRESH
VERSION". I agree that relying on the admin's declaration is dubious,
especially when we provide no good advice on how to actually do that
safely.

But I don't see what using the library version instead buys us here,
except that library version is part of the LOCALE, and there's no ALTER
command for that. You could just as easily deprecate/eliminate the
ALTER COLLATION REFRESH VERSION, and then say that the collversion is
out of human control, too.

By introducing multiple libraries, I think we need to change that
syntax anyway, to be something like:

  ALTER COLLATION ... SET VERSION TO '...'

or even:

  ALTER COLLATION ... FORCE VERSION TO '...'

> Hence desire to build something that at least admits that it's
> primitive and just gives you some controls, in a first version.

Using either the library version or the collation version seems
reasonably simple to me. But from a documentation and usability
standpoint, the way they are currently mixed seems confusing.

-- 
Jeff Davis
PostgreSQL Contributor Team - AWS
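
Editorial illustration (not part of the original message, and not code
from the patch): the search-by-collversion idea discussed above, with a
report of the libraries searched and a WARNING/ERROR switch, could look
roughly like the sketch below. Everything in it is an assumption made
for illustration: the IcuLibrary record, the find_library() helper, the
on_mismatch parameter standing in for the proposed GUC, and the
fallback to the first (compiled-in) library. The version strings are
plain placeholders, not values reported by any real ICU release.

```python
import warnings
from dataclasses import dataclass, field


@dataclass
class IcuLibrary:
    """Hypothetical stand-in for one ICU library Postgres could load."""
    path: str
    library_version: str  # stand-in for what u_getVersion() would report
    # locale -> stand-in for what ucol_getVersion() would report
    collversions: dict = field(default_factory=dict)


class CollationVersionNotFound(Exception):
    """No searched library reports the recorded collation version."""


def find_library(libraries, locale, recorded_collversion, on_mismatch="error"):
    """Return the first library whose collation version for `locale`
    equals the recorded one.

    If none match, build a message listing every library searched and
    the version it reported, then raise (on_mismatch="error") or warn
    and fall back to the first library (on_mismatch="warning").
    """
    searched = []
    for lib in libraries:
        found = lib.collversions.get(locale)
        if found == recorded_collversion:
            return lib
        searched.append(f"{lib.path} (ICU {lib.library_version}): {found}")

    message = (
        f"no ICU library provides collation version {recorded_collversion} "
        f"for locale {locale}; searched: " + "; ".join(searched)
    )
    if on_mismatch == "error":
        raise CollationVersionNotFound(message)
    warnings.warn(message)
    # Assumption: the first entry plays the role of the compiled-in ICU.
    return libraries[0] if libraries else None


if __name__ == "__main__":
    # Placeholder data mirroring the 67.X / 67.Y scenario in the message.
    libs = [
        IcuLibrary("compiled-in", "67.Y", {"en-US": "new"}),
        IcuLibrary("/opt/icu-67.X", "67.X", {"en-US": "old"}),
    ]
    print(find_library(libs, "en-US", "old").path)
```

In this sketch the admin's self-built 67.X is found because the search
keys on the collation version rather than on the major library version,
which is the property the message argues for.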