> > Tom Lane wrote:
> > I said:
> > > If there wasn't disk space enough to hold the clog page, the checkpoint
> > > attempt should have failed. So it may be that allowing a short read in
> > > slru.c would be patching the symptom of a bug that is really elsewhere.
> >
> > After more staring at the code, I have a theory. SlruPhysicalWritePage
> > and SlruPhysicalReadPage are coded on the assumption that close() can
> > never return any interesting failure. However, it now occurs to me that
> > there are some filesystem implementations wherein ENOSPC could be
> > returned at close() rather than the preceding write(). (For instance,
> > the HPUX man page for close() states that this never happens on local
> > filesystems but can happen on NFS.) So it'd be possible for
> > SlruPhysicalWritePage to think it had successfully written a page when
> > it hadn't. This would allow a checkpoint to complete :-(
> >
> > Chris, what's your platform exactly, and what kind of filesystem are
> > you storing pg_clog on?
>
> We already have a TODO on fclose():
>
> * Add checks for fclose() failure

Tom was referring to close(), not fclose(). I once had an awful time
searching for a memory leak caused by a typo using close instead of
fclose. So adding checks for both is probably a good idea.
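
For illustration, here is a minimal sketch of what checking both calls
could look like. It is not taken from slru.c: the helper names
write_page_checked and write_text_checked are made up for this example,
and treating a short write() as ENOSPC is an assumption. The point is
only that the return values of close() and fclose() are tested and
reported like any other I/O failure.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): write len bytes to path and
 * report a failure from any step, including close().
 * Returns 0 on success, -1 on failure with errno set.
 */
int
write_page_checked(const char *path, const void *buf, size_t len)
{
    int     fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    ssize_t n;

    if (fd < 0)
        return -1;

    n = write(fd, buf, len);
    if (n != (ssize_t) len)
    {
        /* Assumption: a short write without an error means no space left. */
        int     save_errno = (n < 0) ? errno : ENOSPC;

        close(fd);              /* already failing; keep the first error */
        errno = save_errno;
        return -1;
    }

    /* On some filesystems (NFS, for instance) ENOSPC may only show up here. */
    if (close(fd) != 0)
        return -1;

    return 0;
}

/*
 * Hypothetical stdio counterpart: fclose() flushes buffered data, so a
 * write error can surface there even if fputs() appeared to succeed.
 */
int
write_text_checked(const char *path, const char *text)
{
    FILE   *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;

    if (fputs(text, fp) == EOF)
    {
        int     save_errno = errno;

        fclose(fp);
        errno = save_errno;
        return -1;
    }

    if (fclose(fp) != 0)
        return -1;

    return 0;
}
```

A caller would treat a -1 from either helper as "the data may not be on
disk" and refuse to proceed, which is the behaviour the checkpoint
discussion above is asking for.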

Regards,
Christoph