On Tue, Apr 10, 2018 at 2:22 AM, Anthony Iliopoulos <ail...@altatus.com> wrote:
> On Mon, Apr 09, 2018 at 03:33:18PM +0200, Tomas Vondra wrote:
>> Well, there seem to be kernels that seem to do exactly that already. At
>> least that's how I understand what this thread says about FreeBSD and
>> Illumos, for example. So it's not an entirely insane design, apparently.
>
> It is reasonable, but even FreeBSD has a big fat comment right
> there (since 2017), mentioning that there can be no recovery from
> EIO at the block layer and this needs to be done differently. No
> idea how an application running on top of either FreeBSD or Illumos
> would actually recover from this error (and clear it out), other
> than remounting the fs in order to force dropping of relevant pages.
> It does provide though indeed a persistent error indication that
> would allow Pg to simply reliably panic. But again this does not
> necessarily play well with other applications that may be using
> the filesystem reliably at the same time, and are now faced with
> EIO while their own writes succeed to be persisted.

Right. For anyone interested, here is the change you mentioned, and an
interesting one that came a bit earlier last year:

https://reviews.freebsd.org/rS316941 -- drop buffers after device goes away
https://reviews.freebsd.org/rS326029 -- update comment about EIO contract

Retrying may well be futile, but at least future fsync() calls won't
report success bogusly. There may of course be more space-efficient
ways to represent that state as the comment implies, while never lying
to the user -- perhaps involving filesystem level or (pinned) inode
level errors that stop all writes until unmounted. Something tells me
they won't resort to flakey fsync() error reporting.

I wonder if anyone can tell us what Windows, AIX and HPUX do here.

> [1] https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-pillai.pdf

Very interesting, thanks.

-- 
Thomas Munro
http://www.enterprisedb.com
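
For illustration only, here is a minimal sketch of the application-side policy this thread is circling around: treat a failed fsync() as fatal instead of retrying it and trusting a later success. It is not PostgreSQL code. The names write_durably and DurabilityError, and the injectable fsync parameter, are hypothetical and exist only so the failure path can be exercised without a faulty disk.

```python
import os


class DurabilityError(RuntimeError):
    """Hypothetical error type: fsync() failed, so the data may not be on disk."""


def write_durably(path, data, fsync=os.fsync):
    """Write data to path and fsync it, treating any fsync failure as fatal.

    The fsync parameter is an assumption of this sketch; it defaults to
    os.fsync and can be replaced to simulate an I/O error.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # os.write may write fewer bytes than requested, so loop until done.
        view = memoryview(data)
        while view:
            written = os.write(fd, view)
            view = view[written:]
        try:
            fsync(fd)
        except OSError as exc:
            # Do not retry: on some kernels a second fsync() may report
            # success even though the dirty pages were dropped. Escalate.
            raise DurabilityError(f"fsync failed for {path}") from exc
    finally:
        os.close(fd)
```

A caller would catch DurabilityError at the top level and stop (the analogue of a PANIC followed by crash recovery) rather than loop on fsync(). Whether a given kernel keeps reporting the error on later calls is exactly the open question above, which is why the sketch never consults a second fsync() result.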