On Mon, Jul 01, 2024 at 12:24:15PM -0700, Jeff Davis wrote:
> On Sat, 2024-06-29 at 15:08 -0700, Noah Misch wrote:
> > lower(), initcap(), upper(), and regexp_matches() are
> > PROVOLATILE_IMMUTABLE.
> > Until now, we've delegated that responsibility to the user.  The user is
> > supposed to somehow never update libc or ICU in a way that changes outcomes
> > from these functions.
>
> To me, "delegated" connotes a clear and organized transfer of
> responsibility to the right person to solve it. In that sense, I
> disagree that we've delegated it.

Good point.

> > Now that postgresql.org is taking that responsibility
> > for builtin C.UTF-8, how should we govern it?  I think the above text and [1]
> > convey that we'll update the Unicode data between major versions, making
> > functions like lower() effectively STABLE.  Is that right?
>
> Marking them STABLE is not a viable option, that would break a lot of
> valid use cases, e.g. an index on LOWER().

I agree.

> I don't think we need code changes for 17. Some documentation changes
> might be helpful, though. Should we have a note around LOWER()/UPPER()
> that users should REINDEX any dependent indexes when the provider is
> updated?

I agree the v17 code is fine.  Today, a user can (with difficulty) choose
dependency libraries so regexp_matches() is IMMUTABLE, as marked.  I don't
want $SUBJECT to be the ctype that, at some post-v17 version, can't achieve
that with unpatched PostgreSQL.  Let's change the documentation to say this
provider uses a particular snapshot of Unicode data, taken around PostgreSQL
17.  We plan never to change that data, so IMMUTABLE functions can rely on the
data.  If we provide a newer Unicode data set in the future, we'll provide it
in such a way that DDL must elect the new data.  How well would that suit your
vision for this feature?  An alternative would be to make pg_upgrade reject
operating on a cluster that contains use of $SUBJECT.
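
For illustration only, here is a toy Python sketch of the hazard discussed
above: an index keyed on lower() silently misses rows once the case-mapping
data changes underneath it, and finds them again after a rebuild.  Nothing
here is PostgreSQL code.  The two mapping tables, and the names lower(),
build_index() and lookup(), are hypothetical stand-ins for a Unicode data
snapshot, the SQL function, index creation and an index scan.

```python
# Toy model of an expression index on lower().  The two "snapshots" are
# hypothetical case-mapping tables, not real Unicode data.
OLD_SNAPSHOT = {"A": "a", "B": "b"}
# Assumption for the sketch: a later data set adds a mapping for U+00C9.
NEW_SNAPSHOT = {**OLD_SNAPSHOT, "\u00c9": "\u00e9"}


def lower(text, snapshot):
    """Lowercase text using only the mappings present in the snapshot."""
    return "".join(snapshot.get(ch, ch) for ch in text)


def build_index(rows, snapshot):
    """Map lower(value) to the list of row ids holding that value."""
    index = {}
    for row_id, value in enumerate(rows):
        index.setdefault(lower(value, snapshot), []).append(row_id)
    return index


def lookup(index, needle, snapshot):
    """Probe the index with lower(needle), as an index scan would."""
    return index.get(lower(needle, snapshot), [])


rows = ["AB", "\u00c9B"]

# Index built under the old data, then queried after the data changed.
stale_index = build_index(rows, OLD_SNAPSHOT)
stale_hit = lookup(stale_index, "\u00c9B", NEW_SNAPSHOT)

# Rebuilding under the new data (the REINDEX step) restores the match.
fresh_index = build_index(rows, NEW_SNAPSHOT)
fresh_hit = lookup(fresh_index, "\u00c9B", NEW_SNAPSHOT)

print("stale index:", stale_hit)
print("rebuilt index:", fresh_hit)
```

Rows whose keys are unaffected by the data change (here "AB") are still found
through the stale index, which is part of why this kind of breakage is easy
to miss.  Freezing the snapshot, as proposed above, corresponds to never
swapping OLD_SNAPSHOT for NEW_SNAPSHOT unless DDL asks for it.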