On Thu, Nov 10, 2022 at 11:47:01AM +0100, Karsten Hilbert wrote:
> Thanks, Julien, for your explanation.
> 
> > > regarding changed collation versions this
> > >
> > >   https://www.postgresql.org/docs/devel/sql-altercollation.html
> > >
> > > says:
> > >
> > >   The following query can be used to identify all
> > >   collations in the current database that need to be
> > >   refreshed and the objects that depend on them:
> > >
> > >   SELECT pg_describe_object(refclassid, refobjid, refobjsubid) AS 
> > > "Collation",
> > >          pg_describe_object(classid, objid, objsubid) AS "Object"
> > >     FROM pg_depend d JOIN pg_collation c
> > >          ON refclassid = 'pg_collation'::regclass AND refobjid = c.oid
> > >     WHERE c.collversion <> pg_collation_actual_version(c.oid)
> > >     ORDER BY 1, 2;
> > >
> > > I feel the result of that query can be slightly surprising
> > > because it does not return (to my testing) any objects
> > > depending on the database default collation, nor the database
> > > itself (as per a collation version mismatch in pg_database).
> >
> > Indeed.  The default collation is "pinned", so we don't record any 
> > dependency
> > on it.
> 
> Indirectly we do, don't we ?  Or else
> 
> > >   WHERE
> > >           collprovider IN ('d', 'c')
> 
> would not make much sense, right ?

What I meant is that we don't insert record in pg_depend to track dependencies
on pinned object, including the default collation.  The collprovider here comes
from pg_index.indcollation which is a different thing.  It can indeed store the
default collation, but it's only a small step toward less false negative.

Try that query with e.g.

CREATE INDEX ON sometable ( (somecol > 'somevalue') );

or

CREATE INDEX ON sometable (someid) WHERE somecol > 'somevalue';

Both clearly can get corrupted if the underlying collation library changes the
result of somecol > 'somevalue', but wouldn't be detected by that query.  There
are likely a lot more cases that would be missed, you can refer to the
discussions from a couple years ago when we tried to properly track all index
collation dependencies.

> The comment above the query in the official documentation is rather assertive
> (even if may true to the letter) and may warrant some more cautionary
> wording ?   Added, perhaps, some variation of this:
> 
> > For now, the only safe way to go is either reindex everything, or everything
> > except some safe cases (non-partial indexes on plain-non-collatable 
> > datatypes
> > only).

I think the comment is very poorly worded, as it leads readers to believe that
objects with a pg_depend dependency on a collation are the only one that would
get corrupted in case of glibc/ICU upgrade.

I agree that there should be a big fat red warning saying something like
"reindex everything if there's any discrepancy between the recorded collation
version and the currently reported one unless you REALLY know what you're
doing."


Reply via email to