On Thu, 2023-06-15 at 19:15 +1200, Thomas Munro wrote:
> Hmm, OK let's explore that. What could we do that would be helpful
> here, without affecting users of the "true" C.UTF-8 for the rest of
> time?

Where is the "true" C.UTF-8 defined?

I assume you mean that the collation order can't (shouldn't, anyway)
change. But what about the ctype (upper/lower/initcap) behavior? Is
that also locked down for all time, or could it change if some new
Unicode characters are added?

Would it be correct to interpret LC_COLLATE=C.UTF-8 as LC_COLLATE=C,
but leave LC_CTYPE=C.UTF-8 as-is?

Regards,
	Jeff Davis