On 2021-Oct-20, Mark Dilger wrote:

> I tried testing how this plays out by handing `createdb` the name é
> (U+00E9 "LATIN SMALL LETTER E WITH ACUTE") and then again the name é
> (U+0065 "LATIN SMALL LETTER E" followed by U+0301 "COMBINING ACUTE
> ACCENT".)  That results in two distinct databases, not an error about
> a duplicate database name:
> 
> # select oid, datname, datdba, encoding, datcollate, datctype from 
> pg_catalog.pg_database where datname IN ('é', 'é');
>   oid  | datname | datdba | encoding | datcollate  |  datctype   
> -------+---------+--------+----------+-------------+-------------
>  37852 | é       |     10 |        6 | en_US.UTF-8 | en_US.UTF-8
>  37855 | é       |     10 |        6 | en_US.UTF-8 | en_US.UTF-8
> (2 rows)
> 
> But that doesn't seem to prove much, as other tools in my locale don't
> treat those as equal either.  (Testing with perl's "eq" operator, they
> compare as distinct.)  I expected to find regression tests providing
> better coverage for this somewhere, but did not.  Anybody know more
> about it?

I think it would be appropriate to normalize identifiers that are going
to be stored in catalogs.  As presented, this is a bit ridiculous and I
see no reason to continue to support it.
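
To illustrate the effect of normalizing, here is a minimal sketch in
Python rather than PostgreSQL code.  The helper name
normalize_identifier is hypothetical, and NFC is only an assumption:
nothing above says which normalization form should be used.  It shows
the two spellings of the name comparing as distinct when taken as-is
and as equal once both are normalized.

```python
import unicodedata

# The two spellings from the createdb test quoted above.
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def normalize_identifier(name: str) -> str:
    """Hypothetical helper: fold a name to NFC before it would be stored.

    NFC is an assumed choice for illustration only.
    """
    return unicodedata.normalize("NFC", name)


# Compared code point by code point, the strings differ.
raw_equal = precomposed == decomposed

# After normalization both collapse to the same single code point.
nfc_equal = normalize_identifier(precomposed) == normalize_identifier(decomposed)

print(raw_equal, nfc_equal)  # prints: False True
```

With something along these lines applied before the catalog insert, the
second createdb would hit the ordinary duplicate-name check.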

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Ed is the standard text editor."
      http://groups.google.com/group/alt.religion.emacs/msg/8d94ddab6a9b0ad3
