Sorry. That's partially wrong.

At Wed, 03 Aug 2016 17:31:16 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote in <20160803.173116.111915228.horiguchi.kyot...@lab.ntt.co.jp>
> I had an inquiry about the following log messages.
>
> 2016-07-20 10:16:58.294 JST,,,3240,,578ed102.ca8,1,,2016-07-20 10:16:50
> JST,30/75,0,LOG,00000,"no left sibling (concurrent deletion?) in
> ""some_index_rel""",,,,,,,,"_bt_unlink_halfdead_page, nbtpage.c:1643",""
> 2016-07-20 10:16:58.294 JST,,,3240,,578ed102.ca8,2,,2016-07-20 10:16:50
> JST,30/75,0,ERROR,XX000,"lock main 13879 is not held",,,,,"automatic vacuum
> of table ""db.nsp.tbl""",,,"LWLockRelease, lwlock.c:1137",""
>
> These are gotten after pg_upgrade from 9.1.13 to 9.4.
>
> The first line is emitted for simultaneous deletion of an index
> page, which is impossible by design in a consistent state so the
> complained situation should be the result of an index corruption
> before upgrading, specifically, inconsistent sibling pointers
> around a deleted page.
>
> I noticed the following part in nbtpage.c related to this. It is
> the same still in the master.
>
> nbtpage.c:1635@9.4.8:
> >   while (P_ISDELETED(opaque) || opaque->btpo_next != target)
> >   {
> >       /* step right one page */
> >       leftsib = opaque->btpo_next;
> >       _bt_relbuf(rel, lbuf);
> >       if (leftsib == P_NONE)
> >       {
> >           elog(LOG, "no left sibling (concurrent deletion?) in \"%s\"",
> >                RelationGetRelationName(rel));
> >           return false;
>
> With the condition for the while loop, if the just left sibling
> of target is (mistakenly, of course) in deleted state (and the
> target is somehow pointing to the deleted page as left sibling),
> lbuf finally goes beyond to right side of the target. This seems
> to result in unintentional releasing of the lock on target and
> the second log message.
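
To make the overshoot concrete, here is a toy model of that loop. It is
only a sketch, not the real nbtree code: ToyPage, NO_PAGE and
toy_find_leftsib are names made up for this illustration, pages are
plain array slots, and buffer pinning and locking are left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_PAGE (-1)            /* hypothetical stand-in for P_NONE */

/* Toy page holding only the two things the quoted loop looks at. */
typedef struct ToyPage
{
    int  next;                  /* stand-in for opaque->btpo_next */
    bool deleted;               /* stand-in for P_ISDELETED(opaque) */
} ToyPage;

/*
 * Walk right from leftsib using the same condition as the quoted loop.
 * Returns the page accepted as the left sibling of target, or NO_PAGE
 * when the chain runs out.  *overshot is set to true if the walk
 * stepped onto target itself on the way.
 */
int
toy_find_leftsib(const ToyPage *pages, int npages, int leftsib, int target,
                 bool *overshot)
{
    *overshot = false;
    assert(leftsib >= 0 && leftsib < npages);

    while (pages[leftsib].deleted || pages[leftsib].next != target)
    {
        /* step right one page */
        leftsib = pages[leftsib].next;
        if (leftsib == NO_PAGE)
            return NO_PAGE;     /* the "no left sibling" case */
        assert(leftsib >= 0 && leftsib < npages);
        if (leftsib == target)
            *overshot = true;   /* the walk is now on the target page */
    }
    return leftsib;
}
```

Take a chain 0 -> 1 -> 2 -> 3 with target = 2 and a starting leftsib
of 1. If page 1 is live, the function returns 1 without stepping. If
page 1 is marked deleted, the walk steps onto page 2 (the target),
then page 3, and returns NO_PAGE with *overshot set, which corresponds
to the situation described above.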
>
> My point here is that if concurrent deletion can't be performed by
> the current implementation, this while loop could be removed and
> immediately error out or log a message,
>
> >   if (P_ISDELETED(opaque) || opaque->btpo_next != target)
> >   {
> >       elog(ERROR, "no left sibling of page %d (concurrent deletion?) in \"%s\"",..

The above is the result of forgetting the main purpose of this loop,
which is to step right when the original left sibling has been split.
Please forget it. Still, it seems right to stop before the target.

> or, the while loop at least should stop before overshooting the
> target.
>
> >   while (P_ISDELETED(opaque) || opaque->btpo_next != target)
> >   {
> >       /* step right one page */
> >       leftsib = opaque->btpo_next;
> >       _bt_relbuf(rel, lbuf);
> >       if (leftsib == target || leftsib == P_NONE)
> >       {
> >           elog(ERROR, "no left sibling of page %d (concurrent deletion?) in \"%s\"",..

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
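
For illustration, the stop-before-target variant can be sketched on a
toy model as follows. This is not a patch against nbtpage.c: ToyPage,
NO_PAGE and toy_find_leftsib_guarded are hypothetical names, pages are
array slots, locking is not modelled, and the sketch returns NO_PAGE
where the quoted code would raise the error.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_PAGE (-1)            /* hypothetical stand-in for P_NONE */

/* Toy page holding only the two things the loop looks at. */
typedef struct ToyPage
{
    int  next;                  /* stand-in for opaque->btpo_next */
    bool deleted;               /* stand-in for P_ISDELETED(opaque) */
} ToyPage;

/*
 * Walk right from leftsib, but give up as soon as the next step would
 * land on target or run off the end of the chain.  Returns the page
 * accepted as the left sibling, or NO_PAGE on giving up.
 * *last_examined receives the rightmost page the walk looked at.
 */
int
toy_find_leftsib_guarded(const ToyPage *pages, int npages, int leftsib,
                         int target, int *last_examined)
{
    assert(leftsib >= 0 && leftsib < npages);
    *last_examined = leftsib;

    while (pages[leftsib].deleted || pages[leftsib].next != target)
    {
        int next = pages[leftsib].next;

        /*
         * next == target can only be seen here when the current page is
         * deleted, i.e. the sibling pointers are inconsistent.
         */
        if (next == target || next == NO_PAGE)
            return NO_PAGE;
        assert(next >= 0 && next < npages);

        /* step right one page */
        leftsib = next;
        *last_examined = leftsib;
    }
    return leftsib;
}
```

On a chain 0 -> 1 -> 2 -> 3 with target = 2, starting from page 0
still moves right to page 1 and returns it, so the split case keeps
working. If page 1 is marked deleted and the walk starts there, it
gives up immediately and never examines the target or anything to its
right.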