I wrote:
> Right, but _bt_getstackbuf is working from a search stack created by
> a standard search for the victim page's high key.  If that search
> descended through a page to the right of the victim page's actual
> parent, _bt_getstackbuf isn't able to recover.

What I'm tempted to do, at least in the back branches, is simply adjust
_bt_pagedel to be able to recover from _bt_getstackbuf failure in this
scenario.  It could use the same method that _bt_insert_parent does in
the concurrent-root-split case, ie (untested):

      ItemPointerSet(&(stack->bts_btentry.t_tid), target, P_HIKEY);
      pbuf = _bt_getstackbuf(rel, stack, BT_WRITE);
      if (pbuf == InvalidBuffer)
+     {
+         /* Find the leftmost page at the next level up */
+         pbuf = _bt_get_endpoint(rel, opaque->btpo.level + 1, false);
+         stack->bts_blkno = BufferGetBlockNumber(pbuf);
+         stack->bts_offset = InvalidOffsetNumber;
+         _bt_relbuf(rel, pbuf);
+         /* and repeat search from there */
+         pbuf = _bt_getstackbuf(rel, stack, BT_WRITE);
+         if (pbuf == InvalidBuffer)
              elog(ERROR, "failed to re-find parent key in \"%s\"",
                   RelationGetRelationName(rel));
+     }
      parent = stack->bts_blkno;
      poffset = stack->bts_offset;

The question is whether we want a cleaner answer for future development,
and if so what that answer ought to look like.  It seems like the
alternatives we've been discussing may not end up any simpler/shorter
than the current code, and it seems hard to justify giving up some
concurrency in the name of a simplification that doesn't simplify much.

Thoughts?

			regards, tom lane
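
For anyone who wants to see the retry idea in isolation, here is a toy model of it, not nbtree code. It assumes a parent level represented as an array of pages chained by right-links, and every name in it (ToyPage, toy_scan_right, toy_find_parent) is hypothetical. The search first scans rightwards from the page the stack remembers; if that fails because the real parent lies to the left, it restarts from the leftmost page of the level and scans again.

```c
#include <assert.h>
#include <stddef.h>

#define INVALID_PAGE   (-1)
#define MAX_DOWNLINKS  4

/* Toy stand-in for one internal page at the parent level (hypothetical). */
typedef struct ToyPage
{
    int right;                      /* index of right sibling, or INVALID_PAGE */
    int ndownlinks;                 /* number of valid entries in downlinks[] */
    int downlinks[MAX_DOWNLINKS];   /* child block numbers */
} ToyPage;

/*
 * Scan rightwards from page 'start' looking for a downlink to 'child'.
 * Returns the page index holding it and sets *offset, or returns
 * INVALID_PAGE if the right-link chain ends without a match.
 */
static int
toy_scan_right(const ToyPage *level, int start, int child, int *offset)
{
    assert(level != NULL && offset != NULL);

    for (int p = start; p != INVALID_PAGE; p = level[p].right)
    {
        for (int i = 0; i < level[p].ndownlinks; i++)
        {
            if (level[p].downlinks[i] == child)
            {
                *offset = i;
                return p;
            }
        }
    }
    return INVALID_PAGE;
}

/*
 * Try the remembered page ('hint') first.  If the downlink is not found
 * at or to the right of it, repeat the search from the leftmost page of
 * the level.  Only if that also fails do we report failure.
 */
static int
toy_find_parent(const ToyPage *level, int leftmost, int hint,
                int child, int *offset)
{
    int p = toy_scan_right(level, hint, child, offset);

    if (p == INVALID_PAGE)
        p = toy_scan_right(level, leftmost, child, offset);
    return p;
}
```

With three pages holding children {10, 11}, {12, 13} and {14, 15}, a search for child 12 that starts from the third page finds nothing to the right, falls back to the leftmost page, and locates the downlink on the second page. The cost is a scan of the whole level in the worst case, which only matters on the failure path.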