On Tue, 28 Apr 2020 at 11:35, Peter Geoghegan <p...@bowt.ie> wrote: > > On Mon, Apr 27, 2020 at 11:02 AM Peter Geoghegan <p...@bowt.ie> wrote: > > I would like to backpatch both patches to all branches that have > > commit 857f9c36cda -- v11, v12, and master. The second issue isn't > > serious, but it seems worth keeping v11+ in sync in this area. Note > > that any backpatch theoretically creates an ABI break for callers of > > the _bt_pagedel() function. > > I plan to commit both fixes, while backpatching to v11, v12 and the > master branch on Friday -- barring objections.
Thank you for the patches and for starting the new thread. I agree with both patches. For the first fix it seems better to push down the logic to the page deletion code as your 0001 patch does so. The following change changes the page deletion code so that it emits LOG message indicating the index corruption when a deleted page is passed. But we used to ignore in this case. @@ -1511,14 +1523,21 @@ _bt_pagedel(Relation rel, Buffer buf) for (;;) { - page = BufferGetPage(buf); + page = BufferGetPage(leafbuf); opaque = (BTPageOpaque) PageGetSpecialPointer(page); /* * Internal pages are never deleted directly, only as part of deleting * the whole branch all the way down to leaf level. + * + * Also check for deleted pages here. Caller never passes us a fully + * deleted page. Only VACUUM can delete pages, so there can't have + * been a concurrent deletion. Assume that we reached any deleted + * page encountered here by following a sibling link, and that the + * index is corrupt. */ - if (!P_ISLEAF(opaque)) + Assert(!P_ISDELETED(opaque)); + if (!P_ISLEAF(opaque) || P_ISDELETED(opaque)) { /* * Pre-9.4 page deletion only marked internal pages as half-dead, @@ -1537,13 +1556,21 @@ _bt_pagedel(Relation rel, Buffer buf) errmsg("index \"%s\" contains a half-dead internal page", RelationGetRelationName(rel)), errhint("This can be caused by an interrupted VACUUM in version 9.3 or older, before upgrade. Please REINDEX it."))); - _bt_relbuf(rel, buf); + + if (P_ISDELETED(opaque)) + ereport(LOG, + (errcode(ERRCODE_INDEX_CORRUPTED), + errmsg("index \"%s\" contains a deleted page with a live sibling link", + RelationGetRelationName(rel)))); + + _bt_relbuf(rel, leafbuf); return ndeleted; } I agree with this change but since I'm not sure this change directly relates to this bug I wonder if it can be a separate patch. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services