On Wed, Jan 14, 2015 at 8:50 PM, Peter Geoghegan <p...@heroku.com> wrote:
> I am mistaken on one detail here - blocks 2 and 9 are actually fully
> identical. I still have no idea why, though.

So, I've looked at it in more detail, and it appears that the page of block 2 split at some point, thereby creating a new page (the block 9 page). There is a sane downlink in the root page for the new right page. The root page looks totally sane, as does every other page - as I said, the problem is only that block 9 is spuriously identical to block 2. So the (correct) downlink in the root page, to block 9, is the same as the (incorrect) high key value in block 9 - Oid value 69924. To be clear: AFAICT everything is perfect except block 9, which is bizarrely identical to block 2.

Now, since the sane page downlink located in the root (like every downlink, a lower bound on items in its child) is actually a copy of the high key on the child's left sibling (that is to say, it comes from the original target of a page split - it shares the target's high key value, Oid value 69924), there may never have been sane data in block 9, even though its downlink is sane (so maybe the page split patch is implicated). But it's hard to see how that could be true. The relevant code wasn't really what was changed about page splits in 9.4 anyway (plus this wasn't a non-leaf split, since there aren't enough pages for those to be a factor). There just aren't that many items on page 2 (or its bizarre identical twin, page 9), so a recent split seems unlikely. And the target and new right page are locked together throughout both the split and downlink insertion (even though there are two atomic operations/WAL inserts). So to reiterate, a nearby page split that explains the problem seems unlikely.

I'm going to focus on the page deletion patch for the time being. Merlin - it would be great if you could revert all the page split commits (which came after the page deletion fix).
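For readers who want the invariants above spelled out, here is a minimal toy sketch in Python. It is not nbtree code and does not reflect the on-disk page layout. `LeafPage` and `check_level` are hypothetical names, and every key other than 69924 is invented for illustration, as is the assumption that block 9 is the rightmost page. It checks that a downlink is a lower bound on its child's items, that a right sibling's downlink copies the left sibling's high key, and that no page is a byte-for-byte twin of its left sibling.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class LeafPage:
    """Toy stand-in for a B-tree leaf page (hypothetical, not nbtree's layout)."""
    items: Tuple[int, ...]    # keys stored on the page, ascending
    high_key: Optional[int]   # upper bound on items; None on the rightmost page


def check_level(downlinks: List[Tuple[Optional[int], int]],
                pages: Dict[int, LeafPage]) -> List[str]:
    """Check one parent/child level and return a description of each violation.

    downlinks holds the parent's (separator key, child block) pairs in key
    order; a separator of None stands for "minus infinity".
    """
    problems = []
    for i, (sep, blkno) in enumerate(downlinks):
        page = pages[blkno]
        # A downlink is a lower bound on the items in its child.
        if sep is not None and any(k < sep for k in page.items):
            problems.append(f"block {blkno}: item below downlink key {sep}")
        # The high key is an upper bound on the items on the page.
        if page.high_key is not None and any(k > page.high_key for k in page.items):
            problems.append(f"block {blkno}: item above high key {page.high_key}")
        if i > 0:
            left_blkno = downlinks[i - 1][1]
            left = pages[left_blkno]
            # The downlink to a right sibling copies the left sibling's high key.
            if left.high_key != sep:
                problems.append(
                    f"block {blkno}: downlink {sep} != high key of block {left_blkno}")
            # Two distinct blocks should never hold identical contents.
            if left == page:
                problems.append(f"block {blkno}: identical to block {left_blkno}")
    return problems


# Root downlinks: block 2 is the split target, block 9 the new right page.
downlinks = [(None, 2), (69924, 9)]

# What the level should look like after a sane split (invented keys).
healthy = {2: LeafPage((69900, 69910), 69924),
           9: LeafPage((69930, 69950), None)}

# The observed state: block 9 is a copy of block 2, high key included.
corrupt = {2: healthy[2], 9: healthy[2]}

healthy_problems = check_level(downlinks, healthy)
corrupt_problems = check_level(downlinks, corrupt)
```

In this toy model the downlink in the root passes its own check in both cases, which mirrors the observation above: the downlink to block 9 is sane, and only the contents of block 9 are wrong.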
All the follow-up page split commits [1] addressed fairly straightforward bugs with recovery, so it should be easy enough to totally remove the page split stuff from 9.4 for the purposes of isolating the bug.

[1] http://www.postgresql.org/message-id/cam3swzspj6m9hfhksjuiuof30auwxyyb56fjbw1_dogqkbe...@mail.gmail.com
-- 
Peter Geoghegan