On Fri, Sep 27, 2019 at 4:07 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Fri, Sep 27, 2019 at 3:36 AM David Steele <da...@pgmasters.net> wrote:
> >
> > On 9/24/19 1:25 AM, Fujii Masao wrote:
> > >
> > > When backup_label exists, the startup process enters archive recovery mode
> > > even if recovery.signal file doesn't exist. In this case, the startup process
> > > tries to retrieve WAL files by using restore_command. Then, at the beginning
> > > of the archive recovery, the contents of backup_label are copied to pg_control
> > > and backup_label file is removed. This would be an intentional behavior.
> > >
> > > But I think the problem is that, if the server shuts down during that
> > > archive recovery, the restart of the server may cause the recovery to fail
> > > because neither backup_label nor recovery.signal exist and the server
> > > doesn't enter an archive recovery mode. Is this intentional, too? Seems No.
> > >
> > > So the problematic scenario is;
> > >
> > > 1. the server starts with backup_label, but not recovery.signal.
> > > 2. the startup process enters an archive recovery mode because
> > >    backup_label exists.
> > > 3. the contents of backup_label are copied to pg_control and
> > >    backup_label is deleted.
> >
> > Do you mean deleted or renamed to backup_label.old?
> >
> > > 4. the server shuts down..
> >
> > This happens after the cluster has reached consistency?
> >
> > > 5. the server is restarted. neither backup_label nor recovery.signal exist.
> > > 6. the startup process starts just crash recovery because neither backup_label
> > >    nor recovery.signal exist. Since it cannot retrieve WAL files from archival
> > >    area, it may fail.
> >
> > I tried a few ways to reproduce this but was not successful without
> > manually removing WAL.
>
> Hmm me too. I think that since we enter crash recovery at step #6 we
> don't retrieve WAL files from archival area.
>
> But I reproduced the problem Fujii-san mentioned that the restart of
> the server during archive recovery causes to the crash recovery
> instead of resuming the archive recovery.

Yes, it's strange and unexpected to start crash recovery when restarting
archive recovery. Archive recovery should start again in that case, I think.

Regards,

--
Fujii Masao
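
For illustration, the scenario in steps 1-6 can be modeled with a small
Python sketch. This is only a toy model of the behavior as described in this
thread, not the actual startup code: DataDir, startup() and
control_has_backup_info are made-up names, and the shutdown in step 4 is
assumed to change nothing on disk.

```python
from dataclasses import dataclass, field


@dataclass
class DataDir:
    """Toy stand-in for a data directory (hypothetical, not PostgreSQL code)."""
    files: set = field(default_factory=set)
    # Stand-in for the backup_label contents copied into pg_control.
    control_has_backup_info: bool = False


def startup(d: DataDir) -> str:
    """Pick the recovery mode the way the thread describes it."""
    if "recovery.signal" in d.files or "backup_label" in d.files:
        if "backup_label" in d.files:
            # Step 3: contents go to pg_control and the file is removed
            # (or renamed; either way it is no longer found at next start).
            d.control_has_backup_info = True
            d.files.discard("backup_label")
        return "archive recovery"
    # Neither file exists, so only crash recovery is started.
    return "crash recovery"


d = DataDir(files={"backup_label"})  # step 1: backup_label, no recovery.signal
first = startup(d)                   # steps 2-3
# step 4: the server shuts down during archive recovery
second = startup(d)                  # steps 5-6
print(first, "->", second)           # archive recovery -> crash recovery
```

The second call is the restart in question: the model still remembers the
backup information, but nothing on disk asks for archive recovery any more.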