On Sun, Aug 15, 2021 at 8:16 PM Michael Paquier <mich...@paquier.xyz> wrote:
>
> On Fri, Aug 13, 2021 at 05:59:21PM -0700, Soumyadeep Chakraborty wrote:
> > and passes with the code change, as expected. I can't explain why the
> > test doesn't freeze up in v3 in wait_for_catchup() at the end.
>
> It took me some time to understand why. If I am right, that's because
> of the intermediate test block working on $standby_2 and the two
> INSERT queries of the primary. In v1 and v4, we have no activity on
> the primary between the first set of tests and yours, meaning that
> $standby has nothing to do. In v2 and v3, the two INSERT queries run
> on the primary for the purpose of the recovery pause make $standby_1
> wait for the default value of recovery_min_apply_delay, aka 3s, in
> parallel. If the set of tests for $standby_2 is faster than that,
> we'd bump on the phase where the code still waited for 3s, not the 2
> hours set, visibly.

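To make the timing in the quoted explanation concrete, here is a rough Python model of it. This is only a sketch under stated assumptions, not PostgreSQL code and not part of the TAP test: the function name, its parameters, and the "rereads_delay" switch are all hypothetical, and the INSERTs are assumed to commit at t=0.

```python
# Rough timing model only; every name here is hypothetical.

DEFAULT_DELAY_S = 3.0           # the 3s delay mentioned in the explanation
LONG_DELAY_S = 2 * 60 * 60.0    # the 2 hours set later by the test


def replay_time(commit_time, old_delay, new_delay, reload_time, rereads_delay):
    """Return the time (in seconds) at which a delayed record is applied.

    old_delay is the delay in effect when the standby started waiting,
    new_delay is the value set at reload_time. rereads_delay=False models
    a wait that keeps using old_delay; True models a wait that picks up
    the new value after the reload.
    """
    old_deadline = commit_time + old_delay
    # The record is applied at the old deadline if the reload comes too
    # late, or if the wait never looks at the new setting.
    if reload_time >= old_deadline or not rereads_delay:
        return old_deadline
    return max(reload_time, commit_time + new_delay)


# $standby_2 block finishes after 1s, i.e. faster than the 3s wait.
print(replay_time(0.0, DEFAULT_DELAY_S, LONG_DELAY_S, 1.0, rereads_delay=False))
print(replay_time(0.0, DEFAULT_DELAY_S, LONG_DELAY_S, 1.0, rereads_delay=True))
```

In this model the first call finishes after 3 seconds, which matches a wait_for_catchup() that does not freeze, while the second only finishes after the full two hours.
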
I see, thanks a lot for the explanation. Thanks to your investigation, I
can now reuse some of the test mechanisms for the other patch that I am
working on [1]. There, we don't have multiple standbys getting in the
way, thankfully.

> After considering this stuff, the order dependency we'd introduce in
> this test makes the whole thing more brittle than it should be. And such
> an edge case does not seem worth spending extra cycles testing anyway,
> as if things break we'd finish with a test stuck for an unnecessarily
> long time by relying on wait_for_catchup("replay"). We could use
> something else, say based on a lookup of pg_stat_activity but this
> still requires extra run time for the wait phases needed. So at the
> end I have dropped the test, but backpatched the fix.
> --

Fair.

Regards,
Soumyadeep (VMware)

[1] https://www.postgresql.org/message-id/flat/CANXE4Tc3FNvZ_xAimempJWv_RH9pCvsZH7Yq93o1VuNLjUT-mQ%40mail.gmail.com