On 07.08.24 22:44, Peter Eisentraut wrote:
> (Now that I look at it, pg_tolower() has some short-circuiting for ASCII
> letters, so it would not handle Turkish-i correctly if that had been the
> global locale. By removing the use of pg_tolower(), we fix that issue
> in passing.)

It occurred to me that this issue also surfaces in a more prominent
place. These arguably-wrong pg_tolower() and pg_toupper() calls were
also used by the normal SQL lower() and upper() functions before commit
e9931bfb751 if you used a single-byte encoding.

For example, in PG17, multi-byte encoding:

    initdb --locale=tr_TR.utf8
    select upper('hij'); --> HİJ

PG17, single-byte encoding:

    initdb --locale=tr_TR  # uses LATIN5
    select upper('hij'); --> HIJ

With current master, after commit e9931bfb751, you get the first result
in both cases.
So this could break indexes across pg_upgrade in such configurations.
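
To make the failure mode concrete, here is a rough Python model of the two behaviors and of what happens to an index built under the old one. This is only a sketch, not PostgreSQL code: the function names are hypothetical, the Turkish case mapping is hard-coded instead of coming from the C library locale, and a plain dict stands in for an expression index on upper(name).

```python
# Illustrative model only; none of these names exist in PostgreSQL.
# Assumption: a tr_TR locale maps 'i' to U+0130 (capital I with dot above).
TURKISH_UPPER = {"i": "\u0130", "\u0131": "I"}


def upper_ascii_shortcut(s: str) -> str:
    """Model of the old single-byte path: ASCII letters bypass the locale."""
    out = []
    for ch in s:
        if "a" <= ch <= "z":
            # ASCII fast path: the locale is never consulted.
            out.append(chr(ord(ch) - ord("a") + ord("A")))
        else:
            out.append(TURKISH_UPPER.get(ch, ch.upper()))
    return "".join(out)


def upper_locale_aware(s: str) -> str:
    """Model of the locale-aware path: Turkish rules apply to every letter."""
    return "".join(TURKISH_UPPER.get(ch, ch.upper()) for ch in s)


# "Index" keys computed before the upgrade, using the old behavior.
index_built_before_upgrade = {upper_ascii_shortcut(v): v for v in ["hij"]}

# After the upgrade, the lookup key is computed with the new behavior.
probe_key = upper_locale_aware("hij")
found_after_upgrade = probe_key in index_built_before_upgrade

print(upper_ascii_shortcut("hij"))  # HIJ
print(upper_locale_aware("hij"))    # HİJ
print(found_after_upgrade)          # False
```

The stored key and the probe key no longer agree, so the lookup misses a row that is actually there. In this model, rebuilding the dict with the new function is the analogue of reindexing.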