On Mon, 2013-04-01 at 19:51 -0700, Jeff Janes wrote: > I've reproduced the problem, this time in block 74 of relation > base/16384/4931589, and a tarball of the data directory is here: > > > https://docs.google.com/file/d/0Bzqrh1SO9FcELS1majlFcTZsR0k/edit?usp=sharing > > > > (the table is in database jjanes under role jjanes, the binary is > commit 9ad27c215362df436f8c) > > > What I would probably really want is the data as it existed after the > crash but before recovery started, but since the postmaster > immediately starts recovery after the crash, I don't know of a good > way to capture this.
Can you just turn off the restart_after_crash GUC? I had a chance to look at this, and seeing the block before and after recovery would be nice. I didn't see a log file in the data directory, but it didn't go through recovery, so I assume it already did that. The block is corrupt as far as I can tell. The first third is written, and the remainder is all zeros. The header looks like this: (These numbers are mostly from pageinspect. The checksum value that comes from pageinspect needed to be cast back to an unsigned short to match the error message above -- should that be changed in pageinspect?). lsn: 7/252E4080 checksum: 34212 flags: 1 lower: 1188 upper: 5952 special: 8192 pagesize: 8192 version: 4 prune_xid: 156833911 So the header looks good, but most of the page data is missing. I tried with pg_filedump (the 9.2.0 version, which should be fine), and it agrees with me that the page is corrupt. Interestingly, that doesn't result in a user-visible error when ignore_checksum_failure=on. That's because the item pointers appear to all be either not normal or they point to the zeroed region. Testing the visibility of a zeroed tuple header is always false, so no problem. You'd still think this would cause incorrect results, but I think what's happening is that this is a new page (otherwise it would have been written with something other than zeroes before). So, the tuples that are supposed to be there may be uncommitted anyway. So, the page may be corrupt without checksums as well, but it just happens to be hidden for the same reason. Can you try to reproduce it without -k? And on the checkin right before checksums were added? Without checksums, you'll need to use pg_filedump (or similar) to find whether an error has happened. To start speculating about the root cause: something is violating the WAL before data rule, or not writing a FPI when it's supposed to, or not properly restoring the FPI during recovery, or something sets the wrong LSN. This could still be caused by the checksums patch, but it seems a little less likely now. The reason I say that is because it's a new page with tuples on it, so that means something in the insert/update path ended up not writing the FPI before writing the page. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers