Hi,

On 2019-04-30 00:50:20 -0400, Tom Lane wrote:
> Andres Freund <and...@anarazel.de> writes:
> > On April 29, 2019 9:37:33 PM PDT, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> Seems like putting reindexes of pg_class into a test script that runs
> >> in parallel with other DDL wasn't a hot idea.
> 
> > Saw that. Will try to reproduce (and if necessary either run separately or 
> > revert). But isn't that somewhat broken? They're not run in a transaction, 
> > so the locking shouldn't be deadlock prone.
> 
> Hm?  REINDEX INDEX is deadlock-prone by definition, because it starts
> by opening/locking the index and then it has to open/lock the index's
> table.  Every other operation locks tables before their indexes.

We claim to have solved that:

/*
 * ReindexIndex
 *              Recreate a specific index.
 */
void
ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)


        /*
         * Find and lock index, and check permissions on table; use callback to
         * obtain lock on table first, to avoid deadlock hazard.  The lock level
         * used here must match the index lock obtained in reindex_index().
         */
        indOid = RangeVarGetRelidExtended(indexRelation,
                                                                          
concurrent ? ShareUpdateExclusiveLock : AccessExclusiveLock,
                                                                          0,
                                                                          
RangeVarCallbackForReindexIndex,
                                                                          (void 
*) &heapOid);

and I don't see an obvious hole in the general implementation. Minus the
comment that code exists back to 9.4.

I suspect the problem isn't REINDEX INDEX in general, it's REINDEX INDEX
over catalog tables modified during reindex. The callback acquires a
ShareLock lock on the index's table, but *also* during the reindex needs
a RowExclusiveLock on pg_class, etc. E.g. in RelationSetNewRelfilenode()
on pg_class, and on pg_index in index_build(). Which means there's a
lock-upgrade hazard (Share to RowExclusive - well, that's more a
side-grade, but still deadlock prone).

I can think of ways to fix that (e.g. if reindex is on pg_class or
index, use SHARE ROW EXCLUSIVE, rather than SHARE), but we'd probably
not want to backpatch that.

I'll try to reproduce tomorrow.

Greetings,

Andres Freund


Reply via email to