On Mon, 2024-07-01 at 16:03 -0700, Noah Misch wrote:
> I agree the v17 code is fine.  Today, a user can (with difficulty) choose
> dependency libraries so regexp_matches() is IMMUTABLE, as marked.  I don't
> want $SUBJECT to be the ctype that, at some post-v17 version, can't achieve
> that with unpatched PostgreSQL.

We aren't forcing anyone to use the builtin "C.UTF-8" locale. Anyone can
still use the builtin "C" locale (which never changes), or another
provider if they can sort out the difficulties (and live with the
consequences) of pinning the dependencies to a specific version.

> Let's change the documentation to say this provider uses a particular
> snapshot of Unicode data, taken around PostgreSQL 17.  We plan never to
> change that data, so IMMUTABLE functions can rely on the data.

We can discuss this in the context of version 18 or the next time we
plan to update Unicode. I don't think we should make such a promise in
version 17.

> If we provide a newer Unicode data set in the future, we'll provide it
> in such a way that DDL must elect the new data.  How well would that
> suit your vision for this feature?

Thomas tried tracking collation versions along with individual objects,
and it had to be reverted (ec48314708). It fits my vision to do
something like that as a way of tightening things up. But there are
some open design questions we need to settle, along with a lot of work.
So I don't think we should pre-emptively block all Unicode updates
waiting for it.

> An alternative would be to make pg_upgrade reject operating on a
> cluster that contains use of $SUBJECT.

That wouldn't help anyone.

Regards,
	Jeff Davis