On Wed, Sep 29, 2021 at 01:18:40PM +0200, Daniel Gustafsson wrote:
> So there is one mention of a background WAL receiver already in there, but it's
> pretty inconsistent as to what we call it.  For now I've changed the messaging
> in this patch to say "background process", leaving making this all consistent
> for a follow-up patch.
>
> The attached fixes the above, as well as the typo mentioned off-list and is
> rebased on top of todays HEAD.

I have been looking a bit at this patch, and did some tests on Windows.
The patch is able to catch the failure of the thread streaming the WAL
segments in pg_basebackup and prevents the base backup from completing,
while HEAD waits until the backup finishes.  Testing this scenario is
actually simple: issue pg_terminate_backend() on the WAL sender that
streams the WAL with START_REPLICATION, while throttling the base backup.

Could you add a test to automate this scenario?  As far as I can see,
something like the following should be stable even for Windows:
1) Run a pg_basebackup in the background with IPC::Run, using
--max-rate with a minimal value to slow down the base backup, for slow
machines.  013_crash_restart.pl does that as one example with $killme.
2) Find the WAL sender doing START_REPLICATION in the backend, and
issue pg_terminate_backend() on it.
3) Use a variant of pump_until() on the pg_basebackup process and
check for one or more failure patterns.  We should refactor this part,
actually.  If this new test uses the same logic, that would make three
tests doing that, together with 022_crash_temp_files.pl and
013_crash_restart.pl.

The CI should be fine to provide any feedback with the test in place,
though I am also fine to test things on my box.
--
Michael
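
To illustrate the idea behind step 3), here is a minimal Python sketch of a pump_until()-style helper: it collects a child process's stderr until a failure pattern shows up, the stream closes, or a timeout expires. This is not the Perl helper used by the TAP tests. The stand-in child process, the error text it prints, and the names pump_until and run_demo are assumptions made for illustration only.

```python
import re
import subprocess
import sys
import threading
import time


def pump_until(proc, pattern, timeout=30.0):
    """Read proc.stderr until `pattern` matches, the stream closes,
    or `timeout` seconds pass.  Returns (matched, collected_text)."""
    chunks = []

    def reader():
        # Runs in a helper thread so the polling loop below never blocks.
        for line in proc.stderr:
            chunks.append(line)

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    regex = re.compile(pattern)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if regex.search("".join(chunks)):
            return True, "".join(chunks)
        if not thread.is_alive():
            # Stream closed: do one last check on everything collected.
            text = "".join(chunks)
            return bool(regex.search(text)), text
        time.sleep(0.05)
    return False, "".join(chunks)


def run_demo():
    """Spawn a stand-in for a failing pg_basebackup and wait for its error.
    The message below is a hypothetical example, not taken from the patch."""
    child_code = (
        "import sys, time\n"
        "time.sleep(0.2)\n"
        "sys.stderr.write('error: background process terminated unexpectedly\\n')\n"
        "sys.exit(1)\n"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stderr=subprocess.PIPE,
        text=True,
    )
    matched, output = pump_until(proc, r"terminated unexpectedly", timeout=10.0)
    returncode = proc.wait(timeout=10)
    return matched, output, returncode


if __name__ == "__main__":
    matched, output, returncode = run_demo()
    print("matched:", matched, "exit code:", returncode)
```

A shared helper of this shape is what the refactoring suggested in step 3) would provide, so that each test passes only its own process handle and failure pattern.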