On 6/3/22 9:21 AM, Tom Lane wrote:
>
> According to that document, they changed it in macOS 11, which came out
> a year and a half ago. Given the lack of complaints, it doesn't seem
> like this is urgent enough to mandate a post-beta change that would
> have lots of downside (namely, false-positive warnings for every other
> macOS update).
Sorry, I'm going to rant for a minute... it is my very strong opinion that using language like "false positive" here is misguided and dangerous.

If a new version of a sort order is released, for example when they recently updated backwards-secondary sorting in French [CLDR-2905] or matching of v and w in Swedish and Finnish [CLDR-7088], it is very dangerous to use language like "false positive" to describe a database where there just didn't happen to be any rows with accented French characters at the point in time where PostgreSQL magically changed which version of sort order it was using from the 2010 French version to the 2020 French version.

No other piece of software that calls itself a database would do what PostgreSQL is doing: just give users a "warning" after suddenly changing the sort order algorithm (most users won't even read warnings in their logs). Oracle, DB2, SQL Server and even MySQL carefully version collation data, hardcode a pseudo-linguistic collation into the DB (like PG does for timezones), and if they provide updates to linguistic sort order (from Unicode CLDR) then they allow the user to explicitly specify which version of French or German ICU sorting they want to use. Different versions are treated as different sort orders; they are not conflated.

I have personally seen PostgreSQL databases where an update to an old version of glibc was applied (I'm not even talking 2.28 here) and it resulted in data loss because crash recovery couldn't replay WAL records and the user had to do a PITR. That's aside from the more common issues of segfaults, duplicate records that violate unique constraints, or wrong query results like missing data. And it's not just updates - people can set up a hot standby on a different version and see many of these problems too.
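To make the "missing data" and "duplicate records" failure modes concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code: `key_v1` and `key_v2` are hypothetical stand-ins for two versions of "the same" collation, and the sorted list stands in for a btree index. The only assumption is that the index was built under one ordering and is later searched under another.

```python
import bisect

# Hypothetical stand-ins for two versions of "the same" collation.
# They are illustrative only and do not model real glibc or ICU behavior.
def key_v1(s):
    # "Old" version: plain code point order ("Zebra" sorts before "apple").
    return s

def key_v2(s):
    # "New" version: case-insensitive order, code points as a tiebreak.
    return (s.lower(), s)

def build_index(rows, key):
    # Toy btree: a list kept sorted under the collation in use at build time.
    return sorted(rows, key=key)

def index_contains(index, value, key):
    # Binary search; only correct if `index` is really sorted under `key`.
    keys = [key(v) for v in index]
    i = bisect.bisect_left(keys, key(value))
    return i < len(index) and index[i] == value

def insert_unique(index, value, key):
    # Toy unique constraint: refuse the insert if the lookup finds the value.
    if index_contains(index, value, key):
        raise ValueError("duplicate key: " + value)
    keys = [key(v) for v in index]
    index.insert(bisect.bisect_left(keys, key(value)), value)

# Case 1: the data happens to sort identically under both versions.
# Nothing breaks, but only by luck of what rows exist today.
lucky = build_index(["apple", "banana", "cherry"], key_v1)
lucky_ok = all(index_contains(lucky, v, key_v2) for v in lucky)

# Case 2: one row sorts differently under the new version.
unlucky = build_index(["Zebra", "apple", "banana"], key_v1)
apple_found = index_contains(unlucky, "apple", key_v2)  # existing row is missed
insert_unique(unlucky, "apple", key_v2)                 # "unique" check passes
apple_count = unlucky.count("apple")                    # duplicate now stored
```

In the first case a version-mismatch warning would look like a "false positive", yet the same index silently misbehaves in the second case as soon as one affected row exists. Nothing about the index itself tells you which case you are in.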
Collation versioning absolutely must be first class and directly controlled by users, and it's very dangerous to allow users - at all - to take an index and then use it with a different collation version than the one the index was built with. Not to mention all the other places in the DB where collation is used... partitioning, constraints, and any other place where persisted data can make an assumption about any sort of string comparison.

It feels to me like we're still not really thinking clearly about this within the PG community, and that the seriousness of this issue is not fully understood.

-Jeremy Schneider

-- 
http://about.me/jeremy_schneider