On Tue, 2013-03-26 at 02:50 +0900, Fujii Masao wrote: > Hi, > > I found that the regression test failed when I created the database > cluster with the checksum and set wal_level to archive. I think that > there are some bugs around checksum feature. Attached is the regression.diff.
Thank you for the report. This was a significant oversight, but simple to diagnose and fix. There were several places that were doing something like: PageSetChecksumInplace if (use_wal) log_newpage smgrextend Which is obviously wrong, because log_newpage set the LSN of the page, invalidating the checksum. We need to set the checksum after log_newpage. Also, I noticed that copy_relation_data was doing smgrread without validating the checksum (or page header, for that matter), so I also fixed that. Patch attached. Only brief testing done, so I might have missed something. I will look more closely later. Regards, Jeff Davis
*** a/src/backend/access/heap/rewriteheap.c --- b/src/backend/access/heap/rewriteheap.c *************** *** 273,286 **** end_heap_rewrite(RewriteState state) /* Write the last page, if any */ if (state->rs_buffer_valid) { - PageSetChecksumInplace(state->rs_buffer, state->rs_blockno); - if (state->rs_use_wal) log_newpage(&state->rs_new_rel->rd_node, MAIN_FORKNUM, state->rs_blockno, state->rs_buffer); RelationOpenSmgr(state->rs_new_rel); smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno, (char *) state->rs_buffer, true); } --- 273,287 ---- /* Write the last page, if any */ if (state->rs_buffer_valid) { if (state->rs_use_wal) log_newpage(&state->rs_new_rel->rd_node, MAIN_FORKNUM, state->rs_blockno, state->rs_buffer); RelationOpenSmgr(state->rs_new_rel); + + PageSetChecksumInplace(state->rs_buffer, state->rs_blockno); + smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno, (char *) state->rs_buffer, true); } *************** *** 616,623 **** raw_heap_insert(RewriteState state, HeapTuple tup) { /* Doesn't fit, so write out the existing page */ - PageSetChecksumInplace(page, state->rs_blockno); - /* XLOG stuff */ if (state->rs_use_wal) log_newpage(&state->rs_new_rel->rd_node, --- 617,622 ---- *************** *** 632,637 **** raw_heap_insert(RewriteState state, HeapTuple tup) --- 631,639 ---- * end_heap_rewrite. */ RelationOpenSmgr(state->rs_new_rel); + + PageSetChecksumInplace(page, state->rs_blockno); + smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno, (char *) page, true); *** a/src/backend/commands/tablecmds.c --- b/src/backend/commands/tablecmds.c *************** *** 51,56 **** --- 51,57 ---- #include "commands/tablespace.h" #include "commands/trigger.h" #include "commands/typecmds.h" + #include "common/relpath.h" #include "executor/executor.h" #include "foreign/foreign.h" #include "miscadmin.h" *************** *** 8902,8913 **** copy_relation_data(SMgrRelation src, SMgrRelation dst, smgrread(src, forkNum, blkno, buf); ! PageSetChecksumInplace(page, blkno); /* XLOG stuff */ if (use_wal) log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page); /* * Now write the page. We say isTemp = true even if it's not a temp * rel, because there's no need for smgr to schedule an fsync for this --- 8903,8923 ---- smgrread(src, forkNum, blkno, buf); ! if (!PageIsVerified(page, blkno)) ! ereport(ERROR, ! (errcode(ERRCODE_DATA_CORRUPTED), ! errmsg("invalid page in block %u of relation %s", ! blkno, ! relpathbackend(src->smgr_rnode.node, ! src->smgr_rnode.backend, ! forkNum)))); /* XLOG stuff */ if (use_wal) log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page); + PageSetChecksumInplace(page, blkno); + /* * Now write the page. We say isTemp = true even if it's not a temp * rel, because there's no need for smgr to schedule an fsync for this
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers