On 9/27/19 4:41 AM, Fujii Masao wrote:
> On Fri, Sep 27, 2019 at 4:07 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>>
>> On Fri, Sep 27, 2019 at 3:36 AM David Steele <da...@pgmasters.net> wrote:
>>>
>>> On 9/24/19 1:25 AM, Fujii Masao wrote:
>>>>
>>>> When backup_label exists, the startup process enters archive recovery mode
>>>> even if recovery.signal file doesn't exist. In this case, the startup process
>>>> tries to retrieve WAL files by using restore_command. Then, at the beginning
>>>> of the archive recovery, the contents of backup_label are copied to pg_control
>>>> and backup_label file is removed. This would be an intentional behavior.
>>>
>>>> But I think the problem is that, if the server shuts down during that
>>>> archive recovery, the restart of the server may cause the recovery to fail
>>>> because neither backup_label nor recovery.signal exist and the server
>>>> doesn't enter an archive recovery mode. Is this intentional, too? Seems not.
>>>>
>>>> So the problematic scenario is:
>>>>
>>>> 1. the server starts with backup_label, but not recovery.signal.
>>>> 2. the startup process enters an archive recovery mode because
>>>>    backup_label exists.
>>>> 3. the contents of backup_label are copied to pg_control and
>>>>    backup_label is deleted.
>>>
>>> Do you mean deleted or renamed to backup_label.old?
>>>
>>>> 4. the server shuts down.
>>>
>>> This happens after the cluster has reached consistency?
>>>
>>>> 5. the server is restarted. neither backup_label nor recovery.signal exist.
>>>> 6. the startup process starts just crash recovery because neither backup_label
>>>>    nor recovery.signal exist. Since it cannot retrieve WAL files from archival
>>>>    area, it may fail.
>>>
>>> I tried a few ways to reproduce this but was not successful without
>>> manually removing WAL.
>>
>> Hmm, me too. I think that since we enter crash recovery at step #6 we
>> don't retrieve WAL files from archival area.
>>
>> But I reproduced the problem Fujii-san mentioned that the restart of
>> the server during archive recovery causes crash recovery
>> instead of resuming the archive recovery.
>
> Yes, it's strange and unexpected to start crash recovery
> when restarting archive recovery. Archive recovery should
> start again in that case, I think.

+1

-- 
-David
da...@pgmasters.net
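
For illustration only, the scenario discussed in this thread can be modeled with a small Python sketch. It is not PostgreSQL's startup code: the DataDir class and both function names are hypothetical, and the model assumes only what the thread states, namely that the presence of either backup_label or recovery.signal selects archive recovery, and that backup_label is gone once archive recovery has begun.

```python
from dataclasses import dataclass


@dataclass
class DataDir:
    """Hypothetical stand-in for the two files discussed in the thread."""
    backup_label: bool = False
    recovery_signal: bool = False


def choose_recovery_mode(d: DataDir) -> str:
    # Per the thread: either file is enough to enter archive recovery;
    # with neither present, only crash recovery is started.
    if d.backup_label or d.recovery_signal:
        return "archive"
    return "crash"


def start_recovery(d: DataDir) -> str:
    mode = choose_recovery_mode(d)
    if mode == "archive" and d.backup_label:
        # Step 3: the contents go to pg_control and the file is no longer
        # present (whether deleted or renamed is not modeled here).
        d.backup_label = False
    return mode


# Steps 1-3: start with backup_label but without recovery.signal.
d = DataDir(backup_label=True)
first_start = start_recovery(d)

# Steps 4-6: shut down during recovery, then restart.
second_start = start_recovery(d)

print(first_start, second_start)
```

In this model the first start selects archive recovery and the restart falls back to crash recovery, which is the behavior reported above as unexpected.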