On Sat, Dec 30, 2017 at 4:20 AM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Sat, Dec 30, 2017 at 04:30:07AM +0300, Sergey Burladyan wrote:
> > We use this scripts:
> > https://github.com/avito-tech/dba-utils/tree/master/pg_archive
> >
> > But I can reproduce problem with simple cp & mv:
> > archive_command:
> >   test ! -f /var/lib/postgresql/wals/%f && \
> >   test ! -f /var/lib/postgresql/wals/%f.tmp && \
> >   cp %p /var/lib/postgresql/wals/%f.tmp && \
> >   mv /var/lib/postgresql/wals/%f.tmp /var/lib/postgresql/wals/%f
>
> This is unsafe. PostgreSQL expects the WAL segment archived to be
> flushed to disk once the archive command has returned its result to the
> backend. Don't be surprised if you get corrupted instances or that you
> are not able to recover up to a consistent point if you need to roll in
> a backup. Note that the documentation of PostgreSQL provides a simple
> example of archive command, which is itself bad enough not to use.

True, but that doesn't explain the current situation: the problem
reproduces without an OS-level crash, so a missing sync would not be
relevant. (And on some systems, mv'ing a file will force it to be synced
under some conditions, so it might be safe anyway.)

I thought I'd seen something recently in the mailing lists or commit log
about an off-by-one error which causes it to re-fetch the previous file
rather than the current file if the previous file ends with just the
right type of record and amount of padding. But now I can't find it.

Cheers,

Jeff
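
P.S. For what it's worth, here is a minimal sketch of how the quoted cp & mv
command could be made to flush before returning. It only addresses the
durability concern raised above, not the re-fetch problem being reproduced.
The function name archive_wal and its arguments are hypothetical, and it
assumes GNU coreutils (dd with conv=fsync and status=none, and a sync that
accepts path arguments, which I believe needs coreutils 8.24 or later).

```shell
#!/bin/sh
# Hypothetical helper: archive_wal SRC DEST_DIR NAME
# Assumes GNU coreutils: dd conv=fsync/status=none, and sync with a path argument.
archive_wal() {
    src=$1
    dir=$2
    name=$3

    # Same guards as the quoted command: never overwrite an existing
    # segment or a leftover temporary file.
    [ ! -e "$dir/$name" ] || return 1
    [ ! -e "$dir/$name.tmp" ] || return 1

    # Copy the segment; conv=fsync makes dd fsync the output file
    # before it exits, so the data is on disk before the rename.
    dd if="$src" of="$dir/$name.tmp" bs=1M conv=fsync status=none || return 1

    # Rename into place, then fsync the directory so the rename
    # itself survives a crash.
    mv "$dir/$name.tmp" "$dir/$name" || return 1
    sync "$dir"
}
```

A wrapper script that ends with something like
archive_wal "$1" /var/lib/postgresql/wals "$2" could then be called from
archive_command with %p and %f as its two arguments. Any non-zero return
tells PostgreSQL the segment was not archived, so it will retry later.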