On 4 April 2018 at 13:29, Thomas Munro <thomas.mu...@enterprisedb.com> wrote:
> On Wed, Apr 4, 2018 at 2:44 PM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
> > On Wed, Apr 4, 2018 at 2:14 PM, Bruce Momjian <br...@momjian.us> wrote:
> >> Uh, are you sure it fixes our use-case?  From the email description it
> >> sounded like it only reported fsync errors for every open file
> >> descriptor at the time of the failure, but the checkpoint process might
> >> open the file _after_ the failure and try to fsync a write that happened
> >> _before_ the failure.
> >
> > I'm not sure of anything.  I can see that it's designed to report
> > errors since the last fsync() of the *file* (presumably via any fd),
> > which sounds like the desired behaviour:
> >
> > [..]
>
> Scratch that.  Whenever you open a file descriptor you can't see any
> preceding errors at all, because:
>
> /* Ensure that we skip any errors that predate opening of the file */
> f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
>
> https://github.com/torvalds/linux/blob/master/fs/open.c#L752
>
> Our whole design is based on being able to open, close and reopen
> files at will from any process, and in particular to fsync() from a
> different process that didn't inherit the fd but instead opened it
> later.  But it looks like that might be able to eat errors that
> occurred during asynchronous writeback (when there was nobody to
> report them to), before you opened the file?
>

Holy hell. So even PANICing on fsync() isn't sufficient, because the
kernel will deliberately hide writeback errors that predate our fsync()
call from us?

I'll see if I can expand my testcase for that. I'm presently dockerizing
it to make it easier for others to use, but that turns out to be a major
pain when using devmapper etc. Docker in privileged mode doesn't seem to
play nice with device-mapper.

Does that mean that the ONLY ways to do reliable I/O are:

- single-process, single-file-descriptor write() then fsync(); on
  failure, retry all work since last successful fsync()

or

- direct I/O

?

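To make the first option concrete, here is a rough sketch in C. It is an
illustration only, not anything PostgreSQL does: the names write_all_at()
and write_batch_durably() are made up, and it assumes the caller still
holds its own in-memory copy of everything written since the last
successful fsync(), so that a failed batch can be rewritten from that copy
rather than trusting whatever is left in the page cache. The same fd is
used for the write and the fsync(), which is the whole point of the
"single-file-descriptor" restriction.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the whole buffer at the given offset,
 * looping over short writes and EINTR. Returns 0 on success, -1 on error.
 */
static int write_all_at(int fd, const char *buf, size_t len, off_t off)
{
    while (len > 0) {
        ssize_t n = pwrite(fd, buf, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t) n;
        off += n;
    }
    return 0;
}

/*
 * Hypothetical helper: write one batch and fsync() it through the SAME fd.
 * If either step fails, redo the entire batch from the caller's copy,
 * because after a failed fsync() the dirty pages may no longer be retried
 * by the kernel. Returns 0 once a write+fsync pair succeeds, or -1 after
 * max_attempts failed attempts.
 */
static int write_batch_durably(int fd, const char *batch, size_t len,
                               off_t off, int max_attempts)
{
    assert(batch != NULL && max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (write_all_at(fd, batch, len, off) == 0 && fsync(fd) == 0)
            return 0;
        /* Fall through: rewrite everything since the last good fsync(). */
    }
    return -1;
}
```

A caller would keep the fd open for the whole life of the file and call
something like write_batch_durably(fd, buf, len, offset, 3) at each
checkpoint. Note that this does nothing for the case discussed above,
where a different process opens the file after the writeback error has
already happened.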
-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services