On Sun, Sep 8, 2019 at 1:21 AM Andrey Borodin <x4...@yandex-team.ru> wrote: > Maybe we should PageGetItemIdCareful() to amcheck.c too? > I think it can be reused later by GIN entry tree and to some extent by > SP-GiST. > SP-GiST uses redirect tuples, but do not store this information in line > pointer.
Well, the details are slightly different in each case in at least one way -- we use the size of the special area to determine the exact bounds that it is safe for a tuple to appear in. The size of the special area varies based on the access method. (Actually, pg_filedump relies on this difference when inferring which access method a particular page image is based on -- it starts out by looking at the standard pd_special field that appears in page headers. So clearly they're often different.) > > My main concern now is the heavyweight lock strength needed by the new > > function. I don't feel particularly qualified to sign off on the > > concurrency aspects of the patch. Heikki's v6 used a ShareLock, like > > bt_index_parent_check(), but you went back to an AccessShareLock, > > Andrey. Why is this safe? I see that you do gist_refind_parent() in > > your v9 a little differently to Heikki in his v6, which you seemed to > > suggest made this safe in your e-mail on March 28, but I don't > > understand that at all. > > Heikki's version was reading childblkno instead of parentblkno, thus never > refinding parent tuple. > When we suspect key consistency violation, we hold lock on page with some > tuple. Then we take pin and lock on page that was parent for current some > time before. > For example of validity see gistfinishsplit(). Comments state "On entry, the > caller must hold a lock on stack->buffer", line 1330 acquires > LockBuffer(stack->parent->buffer, GIST_EXCLUSIVE); > This function is used during inserts, but we are not going to modify data and > place row locks, thus neither ROW EXCLUSIVE, not ROW SHARE is necessary. But gistfinishsplit() is called when finishing a page split -- the F_FOLLOW_RIGHT bit must be set on the leaf. Are you sure that you can generalize from that, and safely relocate the parent? I would be a lot more comfortable with this if Heikki weighed in. I am also concerned about the correctness of this because of this paragraph from the GiST README file: """ This page deletion algorithm works on a best-effort basis. It might fail to find a downlink, if a concurrent page split moved it after the first stage. In that case, we won't be able to remove all empty pages. That's OK, it's not expected to happen very often, and hopefully the next VACUUM will clean it up. """ Why is this not a problem for the new amcheck checks? Maybe this is a very naive question. I don't claim to be a GiST expert. -- Peter Geoghegan