On 2021-Oct-20, Mark Dilger wrote:

> I tried testing how this plays out by handing `createdb` the name é
> (U+00E9 "LATIN SMALL LETTER E WITH ACUTE") and then again the name é
> (U+0065 "LATIN SMALL LETTER E" followed by U+0301 "COMBINING ACUTE
> ACCENT".)  That results in two distinct databases, not an error about
> a duplicate database name:
>
> # select oid, datname, datdba, encoding, datcollate, datctype from
> pg_catalog.pg_database where datname IN ('é', 'é');
>   oid  | datname | datdba | encoding | datcollate  |  datctype
> -------+---------+--------+----------+-------------+-------------
>  37852 | é       |     10 |        6 | en_US.UTF-8 | en_US.UTF-8
>  37855 | é       |     10 |        6 | en_US.UTF-8 | en_US.UTF-8
> (2 rows)
>
> But that doesn't seem to prove much, as other tools in my locale don't
> treat those as equal either.  (Testing with perl's "eq" operator, they
> compare as distinct.)  I expected to find regression tests providing
> better coverage for this somewhere, but did not.  Anybody know more
> about it?
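
For reference, the distinction in the quoted test can be reproduced outside the server. The following is only a minimal sketch in Python, not anything PostgreSQL does today: the helper name `same_identifier` is hypothetical, and the use of NFC as the normalization form is an assumption.

```python
import unicodedata

# The two spellings from the quoted test, written as escapes so the
# difference survives copy and paste.
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_identifier(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two names after Unicode normalization.

    NFC is an assumed choice of form; any of NFC, NFD, NFKC, NFKD is accepted.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Plain code-point comparison treats the two names as different strings.
print(precomposed == decomposed)                 # False
# After normalization they compare equal.
print(same_identifier(precomposed, decomposed))  # True
```

With a comparison of this kind, the second name would be reported as a duplicate rather than stored as a separate entry.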

I think it would be appropriate to normalize identifiers that are going
to be stored in catalogs.  As presented, this is a bit ridiculous and I
see no reason to continue to support it.

-- 
Álvaro Herrera              PostgreSQL Developer — https://www.EnterpriseDB.com/
"Ed is the standard text editor."
      http://groups.google.com/group/alt.religion.emacs/msg/8d94ddab6a9b0ad3