On Tue, Jan 1, 2019 at 2:11 AM Andrey Borodin <x4...@yandex-team.ru> wrote:
> > This isn't consistent with what you do within verify_nbtree.c, which
> > deliberately avoids ever holding more than a single buffer lock at a
> > time, on general principle. That isn't necessarily a reason why you
> > have to do the same, but it's not clear why you do things that way.
> > Why isn't it enough to have a ShareLock on the relation? Maybe this is
> > a sign that it would be a good idea to always operate on a palloc()'d
> > copy of the page, by introducing something equivalent to
> > palloc_btree_page(). (That would also be an opportunity to do very
> > basic checks on every page.)
> If we unlock parent page before checking child, some insert can adjust tuple 
> on parent, sneak into child and insert new tuple.
> This can trigger false positive. I'll think about it more.

I think that holding a buffer lock on an internal pages for as long as
it takes to check all of the child pages is a non-starter. If you
can't think of a way of not doing that that's race free with a
relation-level AccessShareLock, then a relation-level ShareLock (which
will block VACUUM) seems necessary.

I think that you should duplicate some of what's in
bt_index_check_internal() -- lock the heap table as well as the index,
to avoid deadlocks. We might not do this with contrib/pageinspect, but
that's not really intended to be used in production in the same way
this will be.

> > * Maybe gist_index_check() should be called gist_index_parent_check(),
> > since it's rather like the existing verification function
> > bt_index_parent_check().
> >
> > * Alternatively, you could find a way to make your function only need
> > an AccessShareLock -- that would make gist_index_check() an
> > appropriate name. That would probably require careful thought about
> > VACUUM.
>
> I've changed lock level to AccessShareLock. IMV scan is just as safe as 
> regular GiST index scan.
> There is my patch with VACUUM on CF, it can add deleted pages. I'll update 
> one of these two patches accordingly, if other is committed.

Maybe you should have renamed it to gist_index_parent_check() instead,
and kept the ShareLock. I don't think that this business with buffer
locks is acceptable. And it is mostly checking parent/child
relationships. Right?

You're not really able to check GiST tuples against anything other
than their parent, unlike with B-Tree (you can only do very simple
things, like the IndexTupleSize()/lp_len cross-check). Naming the
function gist_index_parent_check() seems like the right way to go,
even if you could get away with an AccessShareLock (which I now tend
to doubt). It was way too optimistic to suppose that there might be a
clever way of locking down race conditions that allowed you to not
couple buffer locks, but also use an AccessShareLock. After all, even
the B-Tree amcheck code doesn't manage to do this when verifying
parent/child relationships (it only does something clever with the
sibling page cross-check).

> > * Why is it okay to do this?:
> >
> >> +       if (GistTupleIsInvalid(idxtuple))
> >> +           ereport(LOG,
> >> +                   (errmsg("index \"%s\" contains an inner tuple marked 
> >> as invalid",
> >> +                           RelationGetRelationName(rel)),
> >> +                    errdetail("This is caused by an incomplete page split 
> >> at crash recovery before upgrading to PostgreSQL 9.1."),
> >> +                    errhint("Please REINDEX it.")));
> >
> > You should probably mention the gistdoinsert() precedent for this.
> This invalid tuple will break inserts, but will not affect select. I do not 
> know, should this be error or warning in amcheck?

It should be an error. Breaking inserts is a serious problem.

-- 
Peter Geoghegan

Reply via email to