Hi Horiguchi-san,
(To recap: In a replication set using archive, startup tries to restore WAL files from archive before checking the pg_wal directory for the desired file. The behavior itself is intentionally designed and reasonable. However, the restore code notifies the archiver of a restored file regardless of whether it has already been archived or not. If archive_command is written so as to return an error on overwriting, as we suggest in the documentation, that behavior causes archive failure.)

After playing with this, I see the problem just by restarting a standby, even in a simple archive-replication set with no special prerequisites. So I think this is worth fixing.

With this patch, KeepFileRestoredFromArchive compares the contents of the just-restored file and the existing file for the same segment only when:

- archive_mode = always and
- the file to restore already exists in pg_wal and
- it has a .done and/or .ready status file.

which doesn't happen usually. Then the function skips archive notification if the contents are identical.

The included TAP test is working both on Linux and Windows.
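
For readers following the thread, the decision described above can be sketched roughly as follows. This is only an illustration in Python, not the patch's C code: the function names, the parameters, and the idea of passing archive_mode as a string are assumptions made for the example. The only layout detail taken from PostgreSQL is that status files live under pg_wal/archive_status.

```python
from pathlib import Path


def same_contents(a: Path, b: Path, chunk_size: int = 1 << 16) -> bool:
    """Return True if both files hold byte-for-byte identical data."""
    if a.stat().st_size != b.stat().st_size:
        return False
    with a.open("rb") as fa, b.open("rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files reached EOF together
                return True


def should_skip_archive_notification(
    archive_mode: str, restored: Path, pg_wal: Path, segment: str
) -> bool:
    """Hypothetical helper mirroring the three conditions in the recap."""
    # Condition 1: only relevant when archive_mode = always.
    if archive_mode != "always":
        return False
    # Condition 2: the segment must already exist in pg_wal.
    existing = pg_wal / segment
    if not existing.is_file():
        return False
    # Condition 3: a .done and/or .ready status file must exist.
    status_dir = pg_wal / "archive_status"
    has_status = any(
        (status_dir / f"{segment}{ext}").exists() for ext in (".done", ".ready")
    )
    if not has_status:
        return False
    # Skip the notification only if the contents are identical.
    return same_contents(restored, existing)
```

Because the content comparison runs only after all three cheap checks pass, the usual recovery path pays no extra I/O in this sketch.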

Thank you for the analysis and the patch. I'll try the patch tomorrow.

I just noticed that this thread is still tied to another thread (it's not an independent thread). To fix that, it may be better to create a new thread again.

Regards,
Tatsuro Yamada