On Sat, Jul 06, 2024 at 04:19:21PM -0400, Tom Lane wrote:
> Noah Misch <n...@leadboat.com> writes:
> > As a released feature, NORMALIZE() has a different set of remedies to choose
> > from, and I'm not proposing one.  I may have sidetracked this thread by
> > talking about remedies without an agreement that pg_c_utf8 has a problem.  My
> > question for the PostgreSQL maintainers is this:
> 
> >   textregexeq(... COLLATE pg_c_utf8, '[[:alpha:]]') and lower(), despite being
> >   IMMUTABLE, will change behavior in some major releases.  pg_upgrade does not
> >   have a concept of IMMUTABLE functions changing, so index scans will return
> >   wrong query results after upgrade.  Is it okay for v17 to release a
> >   pg_c_utf8 planned to behave that way when upgrading v17 to v18+?
> 
> I do not think it is realistic to define "IMMUTABLE" as meaning that
> the function will never change behavior until the heat death of the
> universe.  As a counterexample, we've not worried about applying
> bug fixes or algorithm improvements that change the behavior of
> "immutable" numeric computations.

True.  There's a continuum from "releases can change any IMMUTABLE function"
to "index integrity always wins, even if a function is as wrong as 1+1=3".
I'm less concerned about the recent "Incorrect results from numeric round"
thread, even though it's proposing to back-patch.  I'm thinking about these
aggravating factors for $SUBJECT:

- $SUBJECT is planning an annual cadence of this kind of change.

- We already have ICU providing collation support for the same functions.
  Unlike $SUBJECT, ICU integration gives packagers control over when to accept
  corruption at pg_upgrade time.

- SQL Server, DB2 and Oracle do their Unicode updates in a non-corrupting way.
  (See Jeremy Schneider's reply concerning DB2 and Oracle.)

- lower() and regexp are more popular in index expressions than
  high-digit-count numeric calculations.
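
To make the failure mode concrete, here is a minimal Python sketch of an
expression index on lower(col) that survives an upgrade unchanged while
lower() itself changes.  Everything in it is an assumption for illustration:
lower_old/lower_new, the chosen code points, and the sorted-list "index" are
hypothetical stand-ins, not PostgreSQL code or the actual pg_c_utf8 mappings.

```python
import bisect

# Hypothetical code point pair: assume the old release has no case mapping
# for UPPER, and the new release maps UPPER to LOWER.
UPPER = "\u1c89"
LOWER = "\u1c8a"


def lower_old(s):
    # Old release: lowercases ASCII only, leaves UPPER untouched.
    return "".join(c.lower() if c.isascii() else c for c in s)


def lower_new(s):
    # New release: same as before, plus the newly added mapping.
    return lower_old(s).replace(UPPER, LOWER)


def build_index(rows, fn):
    # Stand-in for CREATE INDEX ON t (lower(col)): sorted (key, row id) pairs.
    return sorted((fn(v), i) for i, v in enumerate(rows))


def index_lookup(index, fn, value):
    # Stand-in for an index scan on lower(col) = lower(value).
    key = fn(value)
    pos = bisect.bisect_left(index, (key, -1))
    hits = []
    while pos < len(index) and index[pos][0] == key:
        hits.append(index[pos][1])
        pos += 1
    return hits


def seq_scan(rows, fn, value):
    # Stand-in for a sequential scan evaluating the same predicate.
    return [i for i, v in enumerate(rows) if fn(v) == fn(value)]


rows = ["apple", UPPER + "x", "zebra", "Mango"]
probe = UPPER + "X"

index = build_index(rows, lower_old)            # built before the upgrade
before = index_lookup(index, lower_old, probe)  # index scan, old release
after = index_lookup(index, lower_new, probe)   # same index, new release
truth = seq_scan(rows, lower_new, probe)        # what the query should return
rebuilt = index_lookup(build_index(rows, lower_new), lower_new, probe)
```

In this sketch the stale index silently misses the row that a sequential scan
finds, and only rebuilding the index with the new function brings the two
back into agreement.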

> I'd say a realistic policy is "immutable means we don't intend to
> change it within a major release".  If we do change the behavior,
> either as a bug fix or a major-release improvement, that should
> be release-noted so that people know they have to rebuild dependent
> indexes and matviews.

It sounds like you're very comfortable with $SUBJECT proceeding in its current
form.  Is that right?

