Hello, Leif, Peter.

At Thu, 21 Nov 2019 12:50:18 +0000, "Leif Gunnar Erlandsen" <l...@lako.no> wrote in
> Adding another patch which is not only for recovery_target_time but also for
> xid, name and lsn.
>
> > After studying this a bit more, I think the current behavior is totally
> > bogus and needs a serious rethink.
> >
> > If you specify a recovery target and it is reached, recovery pauses
> > (depending on recovery_target_action).
> >
> > If you specify a recovery target and it is not reached when the end of the
> > archive is reached (i.e., restore_command fails), then recovery ends and
> > the server is promoted, without any further information. This is clearly
> > wrong in multiple ways.
>
> Yes, that is why I have created the patch.

It seems to be premised on being used in repeated trial-and-error
recovery by well-experienced operators. When it is used that way, I
think the target goes back gradually through the repetitions, so in the
expected usage we need to start from a clean backup for each repetition
anyway. Unintended promotion does no harm in that case.

From this perspective, I don't think the behavior is totally wrong, but
FATAL'ing at end of WAL before reaching the target seems good to do.

> > I think what we should do is if we specify a recovery target and we don't
> > reach it, we should ereport(FATAL). Somewhere around
> >
> If recovery pauses or a FATAL error is reported, is not important, as long as
> it is possible to get some more WAL and continue recovery. Pause has the
> benefit of the possibility to inspect tables in the database.
>
> > in StartupXLOG(), where we already check for other conditions that are
> > undesirable at the end of recovery. Then a user can make fixes either by
> > getting more WAL files to restore and adjusting the recovery target and
> > starting again. I don't think pausing is the right behavior, but perhaps
> > an argument could be made to offer it as a nondefault behavior.
>
> Pausing was chosen in the patch as pause was the expected behavior if
> target was reached.
>
> And the patch does not interfere with any other functionality as far as I
> know.

With the current behavior, if the server promotes without stopping as
told by the target_action variables, it is a sign that something's
wrong. But if the server pauses before reaching the target, operators
may overlook the message if they don't know of the behavior. And if the
server pauses in that case, I think there's nothing to do.

So +1 for FATAL.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center