Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6

Andres Freund Thu, 02 May 2019 08:31:32 -0700

Hi,

On 2019-05-02 10:49:00 -0400, Tom Lane wrote:
> Andres Freund <[email protected]> writes:
> > On 2019-05-01 22:01:53 -0400, Tom Lane wrote:
> >> I think that argument is pretty pointless considering that "REINDEX TABLE
> >> pg_class" does it this way, and that code is nearly old enough to
> >> vote.
>
> > IMO the reindex_relation() case isn't comparable.
>
> IMV it's the exact same case: we need to perform a pg_class update while
> one or more of pg_class's indexes shouldn't be touched.  I am kind of
> wondering why it didn't seem to be necessary to cover this for REINDEX
> INDEX back in 2003, but it clearly is necessary now.
>
> > That's not pretty either :(
>
> So, I don't like your patch, you don't like mine.  Anybody else
> want to weigh in?


Well, I think I can live with your fix. I think it's likely to hide
future bugs, but this is an active bug. And, as you say, we don't have a
lot of time.


ISTM that if we go down this path, we should split (not now, but either
still in v12, or *early* in v13), the sets of indexes that are intended
to a) not being used for catalog queries b) may be skipped for index
insertions. It seems pretty likely that somebody will otherwise soon
introduce an heap_update() somewhere into the index build process, and
it'll work just fine in testing due to HOT.


We already have somewhat separate and half complimentary mechanisms
here:
1) When REINDEX_REL_SUPPRESS_INDEX_USE is set (only cluster.c), we mark
   indexes on tables as unused by SetReindexPending(). That prevents them
   from being used for catalog queries. But it disallows new inserts
   into them.

2) When reindex_index() starts processing an index, it marks it as being
   processed. Indexes on this list are not alowed to be inserted to
   (enforced by assertions).  Note that this currently removes the
   specific index from the list set by 1).

   It also marks the heap as being reindexed, which then triggers (as
   the sole effect afaict), some special case logic in
   index_update_stats(), that avoids the syscache and opts for a direct
   manual catalog scan. I'm a bit confused as to why that's necessary.

3) Just for pg_class, reindex_relation(), just hard-sets the list of
   indexes that are alreday rebuilt. This allows index insertions into
   the the indexes that are later going to be rebuilt - which is
   necessary because we currently update pg_class in
   RelationSetNewRelfilenode().

Seems odd to resort to RelationSetIndexList(), when we could just mildly
extend the SetReindexPending() logic instead.

I kinda wonder if there's not a third approach hiding somewhere here. We
could just stop updating pg_class in RelationSetNewRelfilenode() in
pg_class, when it's an index on pg_class. The pg_class changes for
mapped indexes done aren't really crucial, and are going to be
overwritten later by index_update_stats().  That'd have the big
advantage that we'd afaict not need the logic of having to allow
catalog modifications at all during the reindex path at all.


> We do not have the luxury of time to argue about this.  If we commit
> something today, we *might* get a useful set of CLOBBER_CACHE_ALWAYS
> results for all branches by Sunday.

Yea. I think I'll also just trigger a manual CCA run of check-world for
all branches (thank god for old workstations). And CCR for at least a
few crucial bits.


> Those regression tests will have to come out of the back branches on
> Sunday, because we are not shipping minor releases with unstable
> regression tests, and I've heard no proposal for avoiding the
> occasional-deadlock problem.

Yea, I've just proposed the same in a separate thread.

Greetings,

Andres Freund

Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6

Reply via email to