At Wed, 20 Oct 2021 21:35:44 +0530, Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote in
> Hi,
>
> The FATAL error "recovery ended before configured recovery target was
> reached" introduced by commit at [1] in PG 14 is causing the standby
> to go down after having spent a good amount of time in recovery. There
> can be cases where the arrival of required WAL (for reaching recovery
> target) from the archive location to the standby may take time and
> meanwhile the standby failing with the FATAL error isn't good.
> Instead, how about we make the standby wait for a certain amount of
> time (with a GUC) so that it can keep looking for the required WAL. If
> it gets the required WAL during the wait time, then it succeeds in
> reaching the recovery target (no FATAL error of course). If it
> doesn't, the timeout occurs and the standby fails with the FATAL
> error. The value of the new GUC can probably be set to the average
> time it takes for the WAL to reach archive location from the primary +
> from archive location to the standby, default 0 i.e. disabled.
>
> I'm attaching a WIP patch. I've tested it on my dev system and the
> recovery regression tests are passing with it. I will provide a better
> version later, probably with a test case.
>
> Thoughts?

That sounds like starting a server in non-hot standby mode that fetches
WAL only from the archive. The only difference is that such a setup has
no timeout. Doesn't that configuration meet your requirements?

Or, if the timeout matters, I agree with Jeff: retrying in
restore_command looks fine.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
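
For illustration only, retrying in restore_command could look roughly like the sketch below. It is not taken from the WIP patch or from anything posted in this thread. The script, the archive path, the timeout and the retry interval are all assumptions, and it assumes the archive is a directory visible from the standby. It keeps looking for the requested file until the timeout expires and only then reports failure. Timeline history files are skipped from the wait, because the server commonly asks for ones that do not exist.

```python
#!/usr/bin/env python3
"""Hypothetical restore_command wrapper: retry fetching a file until a timeout."""
import os
import shutil
import sys
import time

ARCHIVE_DIR = "/mnt/server/archivedir"  # assumption: archive is a mounted directory
TIMEOUT_S = 300.0   # assumption: how long to keep looking for one file
INTERVAL_S = 5.0    # assumption: pause between attempts


def restore_with_retry(archive_dir, wal_name, dest, timeout_s, interval_s,
                       sleep=time.sleep, clock=time.monotonic):
    """Copy wal_name from archive_dir to dest, retrying until timeout_s elapses.

    Returns True once the file has been copied, False if it never appeared.
    """
    src = os.path.join(archive_dir, wal_name)
    # Timeline history files are often legitimately absent, so do not wait for them.
    if wal_name.endswith(".history"):
        timeout_s = 0.0
    deadline = clock() + timeout_s
    while True:
        if os.path.isfile(src):
            shutil.copyfile(src, dest)
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Run only when invoked with both arguments, i.e. the file name and the target path.
if __name__ == "__main__" and len(sys.argv) == 3:
    found = restore_with_retry(ARCHIVE_DIR, sys.argv[1], sys.argv[2],
                               TIMEOUT_S, INTERVAL_S)
    # A non-zero exit status tells the server the file is not available.
    sys.exit(0 if found else 1)
```

It would be wired in with something like restore_command = 'python3 /path/to/restore_retry.py %f "%p"', where the script path is hypothetical and %f and %p are the usual file-name and target-path placeholders. If the file does not show up within the timeout, the command still fails, so the FATAL error is delayed by at most the timeout and is not avoided.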