Re: Collation versioning

Stephen Frost Tue, 18 Sep 2018 05:49:12 -0700

Greetings,

* Thomas Munro (thomas.mu...@enterprisedb.com) wrote:
> On Mon, Sep 17, 2018 at 9:02 AM Stephen Frost <sfr...@snowman.net> wrote:
> > * Thomas Munro (thomas.mu...@enterprisedb.com) wrote:
> > > On Mon, Sep 17, 2018 at 6:13 AM Douglas Doole <dougdo...@gmail.com> wrote:
> > > > On Sun, Sep 16, 2018 at 1:20 AM Thomas Munro 
> > > > <thomas.mu...@enterprisedb.com> wrote:
> > > >> 3.  Fix the tracking of when reindexes need to be rebuilt, so that you
> > > >> can't get it wrong (as you're alluding to above).
> > > >
> > > > I've mentioned this in the past, but didn't seem to get any traction, 
> > > > so I'll try it again ;-)
> > >
> > > Probably because we agree with you, but don't have all the answers :-)
> >
> > Agreed.
> >
> > > > The focus on indexes when a collation changes is, in my opinion, the 
> > > > least of the problems. You definitely have to worry about indexes, but 
> > > > they can be easily rebuilt. What about other places where collation is 
> > > > hardened into the system, such as constraints?
> > >
> > > We have to start somewhere and indexes are the first thing that people
> > > notice, and are much likely to actually be a problem (personally I've
> > > encountered many cases of index corruption due to collation changes in
> > > the wild, but never a constraint corruption, though I fully understand
> > > the theoretical concern).  Several of us have observed specifically
> > > that the same problems apply to CHECK constraints and PARTITION
> > > boundaries, and there may be other things like that.  You could
> > > imagine tracking collation dependencies on those, requiring a RECHECK
> > > or REPARTITION operation to update them after a depended-on collation
> > > version changes.
> > >
> > > Perhaps that suggests that there should be a more general way to store
> > > collation dependencies -- something more like pg_depend, rather than
> > > bolting something like indcollversion onto indexes and every other
> > > kind of catalog that might need it.  I don't know.
> >
> > Agreed.  If we start thinking about pg_depend then maybe we realize
> > that this all comes back to pg_attribute as the holder of the
> > column-level information and maybe what we should be thinking about is a
> > way to encode version information into the typmod for text-based
> > types...
> 
> So to be more concrete: pg_depend could have a new column
> "refobjversion".  Whenever indexes are created or rebuilt, we'd
> capture the current version string in the pg_depend rows that link
> index attributes and collations.  Then we'd compare those against the
> current value when we first open an index and complain if they don't
> match.  (In this model there would be no "collversion" column in the
> pg_collation catalog.)


I'm really not sure why you're pushing to have this in pg_depend..

> That'd leave a place for other kinds of database objects (CHECKs,
> PARTITIONS, ...) to store their version dependency, if someone later
> wants to add support for that.

Isn't what matters here where the data's stored, as in, in a column..?

All of those would already have dependencies on the column so that they
can be tracked back there.

> I'm not sure if my idea about updating the default collation row in
> newly created databases has legs though.  Any thoughts on that?

My initial reaction is that we should have a version included basically
everywhere and then let users decide how they want to change it.  For a
new cluster, I'd agree with using the latest available (while allowing
it to be chosen if a user wishes for something else) but I'm not sure
I'd go farther than that.

Thanks!

Stephen

signature.asc
Description: PGP signature

Re: Collation versioning

Reply via email to