> On 16 Feb 2022, at 08:27, Michael Paquier <mich...@paquier.xyz> wrote: > > On Wed, Sep 29, 2021 at 01:18:40PM +0200, Daniel Gustafsson wrote: >> So there is one mention of a background WAL receiver already in there, but >> it's >> pretty inconsistent as to what we call it. For now I've changed the >> messaging >> in this patch to say "background process", leaving making this all consistent >> for a follow-up patch. >> >> The attached fixes the above, as well as the typo mentioned off-list and is >> rebased on top of todays HEAD. > > I have been looking a bit at this patch, and did some tests on Windows > to find out that this is able to catch the failure of the thread > streaming the WAL segments in pg_basebackup, avoiding a completion of > the base backup, while HEAD waits until the backup finishes. Testing > this scenario is actually simple by issuing pg_terminate_backend() on > the WAL sender that streams the WAL with START_REPLICATION, while > throttling the base backup.
Great, thanks! > Could you add a test to automate this scenario? As far as I can see, > something like the following should be stable even for Windows: > 1) Run a pg_basebackup in the background with IPC::Run, using > --max-rate with a minimal value to slow down the base backup, for slow > machines. 013_crash_restart.pl does that as one example with $killme. > 2) Find out the WAL sender doing START_REPLICATION in the backend, and > issue pg_terminate_backend() on it. > 3) Use a variant of pump_until() on the pg_basebackup process and > check after one or more failure patterns. We should refactor this > part, actually. If this new test uses the same logic, that would make > three tests doing that with 022_crash_temp_files.pl and > 013_crash_restart.pl. The CI should be fine to provide any feedback > with the test in place, though I am fine to test things also in my > box. This is good idea, I was going in a different direction earlier with a test but this is cleaner. The attached 0001 refactors pump_until; 0002 fixes a trivial spelling error found while hacking; and 0003 is the previous patch complete with a test that passes on Cirrus CI. -- Daniel Gustafsson https://vmware.com/
v5-0001-Add-function-to-pump-IPC-process-until-string-mat.patch
Description: Binary data
v5-0002-Remove-duplicated-word-in-comment.patch
Description: Binary data
v5-0003-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patch
Description: Binary data