On Fri, Jul 05, 2024 at 02:38:45PM -0700, Jeff Davis wrote:
> On Thu, 2024-07-04 at 14:26 -0700, Noah Misch wrote:
> > I think you're saying that if some Unicode update changes the results
> > of a STABLE function but does not change the result of any IMMUTABLE
> > function, we may as well import that update.  Is that about right?
> > If so, I agree.
>
> If you are proposing that Unicode updates should not be performed if
> they affect the results of any IMMUTABLE function, then that's a new
> policy.
>
> For instance, the results of NORMALIZE() changed from PG15 to PG16 due
> to commit 1091b48cd7:
>
>   SELECT NORMALIZE(U&'\+01E030',nfkc)::bytea;
>
>   Version 15: \xf09e80b0
>
>   Version 16: \xd0b0

As a released feature, NORMALIZE() has a different set of remedies to choose
from, and I'm not proposing one.  I may have sidetracked this thread by talking
about remedies without an agreement that pg_c_utf8 has a problem.  My question
for the PostgreSQL maintainers is this:

  textregexeq(... COLLATE pg_c_utf8, '[[:alpha:]]') and lower(), despite being
  IMMUTABLE, will change behavior in some major releases.  pg_upgrade does not
  have a concept of IMMUTABLE functions changing, so index scans will return
  wrong query results after upgrade.  Is it okay for v17 to release a
  pg_c_utf8 planned to behave that way when upgrading v17 to v18+?

If the answer is yes, the open item closes.  If the answer is no, determining
the remedy can come next.

In case concrete details help anyone reading, here are some affected objects:

  CREATE TABLE t (s text COLLATE pg_c_utf8);
  INSERT INTO t VALUES (U&'\+00a7dc'), (U&'\+001dd3');
  CREATE INDEX iexpr ON t ((lower(s)));
  CREATE INDEX ipred ON t (s) WHERE s ~ '[[:alpha:]]';

v17 can simulate the Unicode aspect of a v18 upgrade, like this:

  sed -i 's/^UNICODE_VERSION.*/UNICODE_VERSION = 16.0.0/' src/Makefile.global.in
  # ignore test failures (your ICU likely doesn't have the Unicode 16.0.0 draft)
  make -C src/common/unicode update-unicode
  make
  make install
  pg_ctl restart

Behavior after that:

  -- 2 rows w/ seq scan, 0 rows w/ index scan
  SELECT 1 FROM t WHERE s ~ '[[:alpha:]]';
  SET enable_seqscan = off;
  SELECT 1 FROM t WHERE s ~ '[[:alpha:]]';

  -- bt_index_parent_check() comes from the amcheck extension
  -- ERROR:  heap tuple (0,1) from table "t" lacks matching index tuple within index "iexpr"
  SELECT bt_index_parent_check('iexpr', heapallindexed => true);
  -- ERROR:  heap tuple (0,1) from table "t" lacks matching index tuple within index "ipred"
  SELECT bt_index_parent_check('ipred', heapallindexed => true);
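
For anyone who wants the mechanism without building a patched server, the seq
scan versus index scan discrepancy can be sketched as a toy model in plain
Python.  This is only an illustration, not PostgreSQL code: the function names
and the two "alpha" sets are hypothetical stand-ins.  The sets are chosen to
match the results reported above (neither value qualified when ipred was
built, both qualify after the Unicode update).

```python
# Toy model only: plain Python standing in for a partial index.
# None of these names are PostgreSQL internals.

# Hypothetical stand-ins for "matches [[:alpha:]]" under two Unicode versions.
# Assumption taken from the reported results: neither value matched when the
# index was built, and both match after the update.
ALPHA_AT_BUILD = frozenset()
ALPHA_AFTER_UPDATE = frozenset({"\ua7dc", "\u1dd3"})


def is_alpha(s, alpha_set):
    """Stand-in for an IMMUTABLE predicate whose tables changed underneath it."""
    return s in alpha_set


def build_partial_index(rows, alpha_set):
    """Record positions of rows that satisfy the predicate at build time."""
    return [i for i, s in enumerate(rows) if is_alpha(s, alpha_set)]


def seq_scan(rows, alpha_set):
    """Re-evaluate the predicate against every row using the current tables."""
    return [s for s in rows if is_alpha(s, alpha_set)]


def index_scan(rows, index):
    """Trust the index: return only what was recorded at build time."""
    return [rows[i] for i in index]


rows = ["\ua7dc", "\u1dd3"]
ipred = build_partial_index(rows, ALPHA_AT_BUILD)  # built before the update

seq_result = seq_scan(rows, ALPHA_AFTER_UPDATE)    # after the update
idx_result = index_scan(rows, ipred)               # stale index contents
print(len(seq_result), len(idx_result))
```

The two scans disagree because nothing rebuilt the index when the predicate's
answer changed, which is the situation pg_upgrade leaves behind.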