Robert Haas wrote:

> For someone who is currently defaulting to es_ES.utf8 or fr_FR.utf8,
> a change to C.utf8 would be a much bigger problem, I would
> think. Their alphabet isn't in code point order, and so things would
> be alphabetized wrongly.
> That might be OK if they don't care about ordering for any purpose
> other than equality lookups, but otherwise it's going to force them
> to change the default, where today they don't have to do that.

Sure, in whatever collation setup we expose, we need to keep
it possible and even easy to sort properly with linguistic rules.

But some reasons to use $LANG as the default locale/collation
are no longer as good as they used to be, I think.

Starting with v10/ICU we have many pre-created ICU locales with
fixed names, and starting with v16, we can simply write "ORDER BY
textfield COLLATE unicode" which is good enough in most cases. So
the configuration "bytewise sort by default" / "linguistic sort
on-demand" has become more realistic.

By contrast in the pre-v10 days with only libc collations, an
application could have no idea which collations were going to be
available on the server, and how they were named precisely, as this
varies across OSes and across installs even with the same OS.
On Windows, I think that before v16 initdb did not create any libc
collation beyond C/POSIX and the default language/region of the OS.

In that libc context, if a db wants the C locale by default for
performance and truly immutable indexes, but the client app needs to
occasionally do in-db linguistic sorts, the app needs to figure out
which collation name will work for that. This is hard if you don't
target a specific installation that guarantees that such or such
collation is going to be installed.
Whereas if the linguistic locale is the default, the app never needs
to know its name or anything about it. So it's done that way,
linguistic by default. But that leaves databases with many indexes
sorted linguistically instead of bytewise for fields that semantically
never need any linguistic sort.


Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite