On 20/05/2023 05:17, Peter Geoghegan wrote:
It has come to my attention that there is a remaining issue of the
same general nature in nbtree VACUUM's page deletion code. Though this
remaining issue seems significantly less likely to come up in
practice, there is no reason to take any chances here.

Yeah, let's be consistent. Any idea what might cause this corruption?

Attached patch fixes it.
+       /*
+        * Validate target's right sibling page's left link points back to 
target.
+        *
+        * This is known to fail in the field in the presence of index 
corruption,
+        * so we go to the trouble of avoiding a hard ERROR.  This is fairly 
close
+        * to what we did earlier on when we located the target's left sibling
+        * (iff target has a left sibling).
+        */

This comment notes that this is similar to what we did with the left sibling, but there isn't really any mention at the left sibling code about avoiding hard ERRORs. Feels a bit backwards. Maybe move the comment about avoiding the hard ERROR to where the left sibling is handled. Or explain it in the function comment and just have short "shouldn't happen, but avoid hard ERROR if the index is corrupt" comment here.

Also attached is a bugfix for a minor issue in amcheck's
bt_index_parent_check() function, which I noticed in passing, while I
tested the first patch. We assumed that we'd always land on the
leftmost page on each level first (the leftmost according to internal
pages one level up). That assumption is faulty because page deletion
of the leftmost page is quite possible. Page deletion can be
interrupted, leaving a half-dead leaf page (possibly the leftmost leaf
page) without any downlink one level up, while still leaving a left
sibling link on the leaf level (in the leaf page that isn't about to
become the leftmost, but won't until the interrupted page deletion can
be completed).

You could check that the left sibling is indeed a half-dead page.

        ereport(DEBUG1,
                (errcode(ERRCODE_NO_DATA),
                 errmsg("block %u is not leftmost in index \"%s\"",
                current, RelationGetRelationName(state->rel))));

ERRCODE_NO_DATA doesn't look right. Let's just leave out the errcode.

--
Heikki Linnakangas
Neon (https://neon.tech)



Reply via email to