
Alvaro Herrera Tue, 15 Jul 2014 21:47:54 -0700

I'm not saying there is no multixact bug here, but I wonder if this part
of your crasher patch might be the cause:


--- 754,771 ----
                                 errmsg("could not seek to block %u in file \"%s\": %m",
                                                blocknum, FilePathName(v->mdfd_vfd))));
  
!         if (JJ_torn_page > 0 && counter++ > JJ_torn_page && !RecoveryInProgress()) {
!         nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ/3);
!               ereport(FATAL,
!                               (errcode(ERRCODE_DISK_FULL),
!                                errmsg("could not write block %u of relation %s: wrote only %d of %d bytes",
!                                               blocknum,
!                                               relpath(reln->smgr_rnode, forknum),
!                                               nbytes, BLCKSZ),
!                                errhint("JJ is screwing with the database.")));
!         } else {
!         nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ);
!       }

Wouldn't this BLCKSZ/3 business update the page's LSN but not the full
contents, meaning that the block wouldn't be rewritten the next time the
xlog is replayed?  That could leave the upper two thirds of the block
containing multixacts in xmax that had been removed by a vacuum round
before the crash.
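
To make the scenario concrete, here is a toy model in C of what a
BLCKSZ/3 write would leave on disk.  It is only a sketch of the concern
above, not PostgreSQL code: ToyPage and the toy_* functions are
hypothetical names, the page layout is reduced to an LSN followed by
payload bytes, and toy_looks_current() assumes a replay that decides
purely by comparing LSNs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ 8192   /* PostgreSQL's default block size */

/* Hypothetical toy page: the LSN sits at the very start of the block. */
typedef struct ToyPage
{
    uint64_t      lsn;
    unsigned char payload[BLCKSZ - sizeof(uint64_t)];
} ToyPage;

static_assert(sizeof(ToyPage) == BLCKSZ, "toy page must fill one block");

/* Overwrite only the first nbytes of the on-disk page, like a short write. */
void
toy_partial_write(ToyPage *disk, const ToyPage *buffer, size_t nbytes)
{
    if (nbytes > sizeof(ToyPage))
        nbytes = sizeof(ToyPage);
    memcpy(disk, buffer, nbytes);
}

/* Assumed LSN-only check: does the page look at least as new as the record? */
int
toy_looks_current(const ToyPage *disk, uint64_t record_lsn)
{
    return disk->lsn >= record_lsn;
}

/* Count payload bytes on disk that still differ from the intended buffer. */
size_t
toy_stale_bytes(const ToyPage *disk, const ToyPage *buffer)
{
    size_t stale = 0;

    for (size_t i = 0; i < sizeof(disk->payload); i++)
        if (disk->payload[i] != buffer->payload[i])
            stale++;
    return stale;
}
```

With an old disk page at LSN 100 and a new buffer at LSN 200 whose
payload differs in every byte, toy_partial_write(&disk, &buffer,
BLCKSZ/3) leaves disk.lsn at 200, so toy_looks_current(&disk, 200) is
true, while toy_stale_bytes() still reports 5462 stale bytes, which is
the upper two thirds of the block.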

(Maybe I'm just too tired and I'm failing to fully understand the torn
page protection.  I thought I understood how it worked, but now I'm not
sure -- I mean I don't see how it can possibly have any value at all.
Surely if the disk writes the first 512-byte sector of the page and then
forgets the updates to the next 15 sectors, the page will appear as not
needing the full page image to be restored ...)

Is the page containing the borked multixact value the one that was
half-written by this code?

Is the problem reproducible if you cause this path to ereport(FATAL)
without writing 1/3rd of the page?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



[HACKERS] Re: 9.3: more problems with "Could not open file "pg_multixact/members/xxxx"
