On Tue, Jul 31, 2018 at 02:55:58PM +0200, Emre Hasegeli wrote:
> == The Workarounds ==
>
> We can possibly work around this inside the "restore_command" or
> by delaying the archiving. Working around inside the "restore_command"
> would involve checking whether the file exists under pg_wal/. This
> should not be easy because the WAL file may be written partially. It
> should be easier for Postgres to do this as it knows where to stop
> processing the local WAL.
It is also not that complicated to check whether a WAL segment is properly
shaped by just running pg_waldump or such, so perhaps that would be fine
for all your cases with back-branches?

> == The Change ==
>
> This "restore_command" behavior is coming from the initial archiving
> and point-in-time-recovery implementation [2]. The code says
> "the reason is that the file in XLOGDIR could be an old, un-filled or
> partly-filled version that was copied and restored as part of
> backing up $PGDATA." This was probably a good reason in 2004, but
> I don't think it still is. AFAIK "pg_basebackup" eliminates this
> problem.

pg_basebackup is not the only backup solution. Though I'd like folks to
use it more, it can be a bottleneck and it still comes with its own
limitations, for example when streaming tar data with multiple
tablespaces...

> Also, with this reasoning, we should also try streaming from the
> master before trying the local WAL, but AFAIU we don't.

...

You have a point here, things are rather inconsistent by this argument.
I have not worked on that in detail, but at least
WaitForWALToBecomeAvailable(), which enforces XLOG_FROM_ARCHIVE when the
current source is XLOG_FROM_PG_WAL, would need to be changed.
--
Michael
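The pg_waldump check suggested above could be wrapped in a small script used as the "restore_command". The following is only a rough sketch under stated assumptions, not a tested recipe: the archive location, the script name and the helper names are hypothetical, and it assumes that pg_waldump exits with a non-zero status for a segment it cannot decode. How pg_waldump behaves when it reaches the end of a complete segment should be verified on the target release before relying on the exit status alone.

```python
#!/usr/bin/env python3
"""Hypothetical restore_command wrapper that prefers a valid local WAL segment."""
import shutil
import subprocess
import sys
from pathlib import Path


def segment_looks_valid(segment, checker=("pg_waldump",)):
    """Return True if running `checker <segment>` exits with status 0.

    Assumption: the checker exits non-zero for a segment it cannot decode.
    """
    if not segment.is_file():
        return False
    result = subprocess.run(
        [*checker, str(segment)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restore(wal_name, target, pg_wal, archive, checker=("pg_waldump",)):
    """Copy wal_name to target; return 0 on success, 1 if it is not available."""
    local = pg_wal / wal_name
    if segment_looks_valid(local, checker):
        # The local copy passed the check, so skip the archive entirely.
        shutil.copyfile(local, target)
        return 0
    archived = archive / wal_name
    if archived.is_file():
        shutil.copyfile(archived, target)
        return 0
    # A non-zero status tells the server the file could not be restored.
    return 1


if __name__ == "__main__":
    # Intended use (script name and archive path are hypothetical):
    #   restore_command = 'wal_restore.py %f %p'
    # The command runs with the data directory as its working directory.
    sys.exit(restore(sys.argv[1], Path(sys.argv[2]),
                     Path("pg_wal"), Path("/mnt/archive")))
```

The checker command is a parameter so that a different validation tool can be substituted without touching the copy logic.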