On 13.08.2020 07:38, Michael Paquier wrote:
Hi all,

While working on support for REINDEX for partitioned relations, I have
noticed an old bug in the logic of ReindexMultipleTables(): the list
of relations to process is built in a first transaction, and then each
table is done in an independent transaction, but we don't actually
check that the relation still exists when doing the real work.  I
think that a complete fix involves two things:
1) Addition of one SearchSysCacheExists1() at the beginning of the
loop processing each item in the list in ReindexMultipleTables().
This would protect from a relation dropped, but that would not be
enough if ReindexMultipleTables() is looking at a relation dropped
before we lock it in reindex_table() or ReindexRelationConcurrently().
Still that's simple, cheaper, and would protect from most problems.
2) Be completely water-proof and adopt a logic close to what we do for
VACUUM where we try to open a relation, but leave if it does not
exist.  This can be achieved with just some try_relation_open() calls
with the correct lock used, and we also need to have a new
REINDEXOPT_* flag to control this behavior conditionally.

2) and 1) are complementary, but 2) is invasive, so based on the lack
of complaints we have seen that does not seem really worth
backpatching to me, and I think that we could also just have 1) on
stable branches as a minimal fix, to take care of most of the
problems that could show up to users.

Attached is a patch to fix all that, with a cheap isolation test that
fails on HEAD with a cache lookup failure.  I am adding that to the
next CF for now, and I would rather fix this issue before moving on
with REINDEX for partitioned relations as fixing this issue reduces
the use of session locks for partition trees.

Any thoughts?
--
Michael

Hi,
I reviewed the patch. It does work and the code is clean and sane. It implements a logic that we already had in CLUSTER, so I think it was simply an oversight. Thank you for fixing this.

I noticed that REINDEXOPT_MISSING_OK can be passed to the TOAST table reindex. I think it would be better to reset the flag in this reindex_relation() call, as we don't expect a concurrent DROP here.

    /*
     * If the relation has a secondary toast rel, reindex that too while we
     * still hold the lock on the main table.
     */
    if ((flags & REINDEX_REL_PROCESS_TOAST) && OidIsValid(toast_relid))
        result |= reindex_relation(toast_relid, flags, options);


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Reply via email to