Hitesh Bhambhani <hitesh.bhambh...@asg.com> writes: >> From: Alvaro Herrera [mailto:alvhe...@commandprompt.com] >> Sent: Wednesday, August 04, 2010 11:30 PM >> There probably is. What kind of relation are the ones unable to truncate? >> Please see in pg_class where relfilenode = '41274' in this >> case: >> > [HiteshB] the relation is called pmoinstance_idx_pmotypeid.
So the relation that couldn't be truncated is indeed the same one complained of in the later message. What it looks like to me is that page 30 was deleted, and then vacuum tried to truncate it off the index altogether, and that failed because of Windows randomness, and then later it's trying to delete page 30 again. Which naturally fails because the parent downlink entry is long gone. But it's odd that it tries to delete page 30 twice. The page should have been in BTP_DELETED state before the truncate attempt, and therefore should still be dead later, so why's it trying again? [ thinks for a bit ... ] I do see a way that could happen. The page could still be live on disk when we reach smgrtruncate(): the update to BTP_DELETED state might only exist in a dirty shared buffer. And lookee here what smgrtruncate does: /* * Get rid of any buffers for the about-to-be-deleted blocks. bufmgr will * just drop them without bothering to write the contents. */ DropRelFileNodeBuffers(reln->smgr_rnode, forknum, isTemp, nblocks); So we throw away the BTP_DELETED state update without ever writing it to disk, and then when the truncate fails, the old page state is still out there, ready to confuse us later. Seems like we need to think harder about recovering from a truncate failure. A few random ideas: 1. Write the dirty buffers before dropping them. Kind of ugly from a performance viewpoint, but simple and safe. 2. Treat truncation failure as a PANIC condition; then WAL replay will fix things. Unpleasant. Actually, worse than unpleasant: I think what the comment in RelationTruncate() is pointing out is that if the failure is repeatable, we'd fail to recover at all. 3. Don't discard the dirty buffers until after successfully truncating. The hard part here is to be sure no other process (like bgwriter) will try to write them in between. I don't see a way to guarantee that, at least not without interlock infrastructure that doesn't exist today. And see also that comment in RelationTruncate(). Seems like the whole problem of coping with truncation failure needs more thought than we've given it. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs