Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS

Craig Ringer Sun, 08 Apr 2018 18:36:13 -0700

On 9 April 2018 at 06:29, Bruce Momjian <[email protected]> wrote:


> I think the big problem is that we don't have any way of stopping
> Postgres at the time the kernel reports the errors to the kernel log, so
> we are then returning potentially incorrect results and committing
> transactions that might be wrong or lost.


Right.

Specifically, we need a way to ask the kernel at checkpoint time "was
everything written to [this set of files] flushed successfully since the
last time I asked, no matter who did the writing and no matter how the
writes were flushed?"

If the answer is "no", we PANIC and redo, i.e. crash recovery replays WAL
from the last checkpoint. If the hardware/volume is screwed, the user can
fail over to a standby, do PITR, etc.
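
To make the shape of that check concrete, here is a minimal sketch in Python.
It is not PostgreSQL code: `checkpoint_flush` and `CheckpointPanic` are
hypothetical names, and the sketch simply assumes that fsync() on a freshly
opened descriptor would report every earlier write-back failure for the file.

```python
import os


class CheckpointPanic(Exception):
    """Hypothetical stand-in for PANIC: stop, then redo from WAL."""


def checkpoint_flush(paths, fsync=os.fsync):
    """Flush every file in `paths`; raise CheckpointPanic if any flush fails.

    `fsync` is injectable only so that a failure can be simulated in tests.
    """
    paths = list(paths)
    failed = []
    for path in paths:
        fd = os.open(path, os.O_RDWR)
        try:
            # Ask the kernel to flush the file and report any error.
            fsync(fd)
        except OSError as exc:
            # Keep going so that every failing file is reported at once.
            failed.append((path, exc.errno))
        finally:
            os.close(fd)
    if failed:
        # Any "no" means the checkpoint must not complete.
        raise CheckpointPanic(failed)
    return len(paths)
```

A caller would run this over the set of files dirtied since the previous
checkpoint and treat CheckpointPanic as fatal instead of retrying the flush.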

But we don't have any way to ask that reliably at present.

-- 
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
