On Fri, 23 Jan 2004, Tom Lane wrote:

> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > Tom's answer will be undoubtedly better ...
>
> Nope, I think you got all the relevant points.
>
> The only thing I'd add after having had more time to think about it is
> that this seems very much like the problem we noticed recently with
> recovery-from-WAL being broken by the new code in bufmgr.c that tries to
> validate the header fields of any page it reads in. We had to add an
> escape hatch to disable that check while InRecovery, and I expect what
> we will end up with here is a few lines added to slru.c to make it treat
> read-past-EOF as a non-error condition when InRecovery. Now the clog
> code has always had all that paranoid error checking, but because it
> deals in such tiny volumes of data (only 2 bits per transaction), it's
> unlikely to suffer an out-of-disk-space condition. That's why we hadn't
> seen this failure mode before.

It seems that by adding the following to SlruPhysicalReadPage() we can
recover in a reasonable way here. Instead of:

    if (lseek(fd, (off_t) offset, SEEK_SET) < 0)
    {
        slru_errcause = SLRU_SEEK_FAILED;
        slru_errno = errno;
        return false;
    }

We have:

    if (lseek(fd, (off_t) offset, SEEK_SET) < 0)
    {
        if (!InRecovery)
        {
            slru_errcause = SLRU_SEEK_FAILED;
            slru_errno = errno;
            return false;
        }
        ereport(LOG,
                (errmsg("Short read from file \"%s\", reading as zeroes",
                        path)));
        MemSet(shared->page_buffer[slotno], 0, BLCKSZ);
        return true;
    }

Which is exactly how we recover from a missing pg_clog file.

>
> 			regards, tom lane

Gavin