On 9 April 2018 at 06:29, Bruce Momjian <br...@momjian.us> wrote:
> I think the big problem is that we don't have any way of stopping
> Postgres at the time the kernel reports the errors to the kernel log, so
> we are then returning potentially incorrect results and committing
> transactions that might be wrong or lost.

Right.

Specifically, we need a way to ask the kernel at checkpoint time "was
everything written to [this set of files] flushed successfully since the
last time I asked, no matter who did the writing and no matter how the
writes were flushed?"

If the result is "no" we PANIC and redo. If the hardware/volume is
screwed, the user can fail over to a standby, do PITR, etc.

But we don't have any way to ask that reliably at present.

-- 
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
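
To make the checkpoint-time question above concrete, here is a minimal sketch of the control flow being asked for. It is illustrative only and not PostgreSQL code: `CheckpointPanic`, `fsync_reports_ok` and `checkpoint` are hypothetical names, and the open-then-fsync probe is assumed purely as a placeholder. As the message says, there is no reliable way to get this answer at present, so the probe is passed in as a parameter rather than treated as trustworthy.

```python
import os
from typing import Callable, Iterable


class CheckpointPanic(Exception):
    """Hypothetical stand-in for PANIC followed by crash recovery (redo)."""


def fsync_reports_ok(path: str) -> bool:
    """Placeholder probe: open the file and fsync it.

    Assumption for illustration only. This is NOT a reliable answer to
    "was everything flushed since I last asked": an error raised during
    earlier writeback may never be reported on a freshly opened descriptor.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


def checkpoint(
    paths: Iterable[str],
    all_flushed: Callable[[str], bool] = fsync_reports_ok,
) -> None:
    """Ask the flush question for every file; refuse to complete on any "no"."""
    failed = [p for p in paths if not all_flushed(p)]
    if failed:
        # The checkpoint must not be recorded as complete. The caller is
        # expected to crash and replay WAL, or fail over / do PITR.
        raise CheckpointPanic("flush not confirmed for: " + ", ".join(failed))
```

The point of the sketch is the shape of the decision, not the probe: any "no" for any file in the set aborts the checkpoint instead of letting later transactions commit on top of possibly lost writes.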