On 12/28/23 6:57 PM, Jeff Davis wrote:
> On Wed, 2023-12-27 at 17:26 -0800, Jeff Davis wrote:
> Attached a more complete version that fixes a few bugs, stabilizes the
> tests, and improves the documentation. I optimized the performance, too
> -- now it's beating both libc's "C.utf8" and ICU "en-US-x-icu" for both
> collation and case mapping (numbers below).
>
> It's really nice to finally be able to have platform-independent tests
> that work on any UTF-8 database.

I think we missed something in psql, pretty sure I applied all the
patches but I see this error:

  =# \l
  ERROR:  42703: column d.datlocale does not exist
  LINE 8:   d.datlocale as "Locale",
            ^
  HINT:  Perhaps you meant to reference the column "d.daticulocale".
  LOCATION:  errorMissingColumn, parse_relation.c:3720

=====

This is interesting. Jeff your original email didn't explicitly show any
other initcap() results, but on Ubuntu 22.04 (glibc 2.35) I see
different results:

  =# SELECT initcap('axxE áxxÉ DŽxxDŽ Džxxx džxxx');
           initcap
  --------------------------
   Axxe Áxxé DŽxxdž DŽxxx DŽxxx

  =# SELECT initcap('axxE áxxÉ DŽxxDŽ Džxxx džxxx' COLLATE C_UTF8);
           initcap
  --------------------------
   Axxe Áxxé Džxxdž Džxxx Džxxx

The COLLATE sql syntax feels awkward to me. In this example, we're just
using it to attach locale info to the string, and there's not actually
any collation involved here. Not sure if COLLATE comes from the
standard, and even if it does I'm not sure whether the standard had
upper/lowercase in mind.

That said, I think the thing that mainly matters will be the CREATE
DATABASE syntax and the database default.

I want to try a few things with table-level defaults that differ from
database-level defaults, especially table-level ICU defaults because I
think a number of PostgreSQL users set that up in the years before we
supported DB-level ICU. Some people will probably keep using their
old/existing schema-creation scripts even after they begin provisioning
new systems with new database-level defaults.

-Jeremy

--
http://about.me/jeremy_schneider