Hi,

On 2022-11-08 01:16:09 +1300, Thomas Munro wrote:
> So [1] on its own didn't fix this.  My next guess is that the attached
> might help.
>
> Hmm.  Following Michael's clue that this might involve log files and
> pg_ctl, I noticed one thing: pg_ctl implements
> wait_for_postmaster_stop() by waiting for kill(pid, 0) to fail, and
> our kill emulation does CallNamedPipe().  If the server is in the
> process of exiting and the kernel is cleaning up all the handles we
> didn't close, is there any reason to expect the signal pipe to be
> closed after the log file?

What is our plan here? This afaict is the most common "false positive" for
cfbot in the last few weeks.

E.g.:

https://api.cirrus-ci.com/v1/artifact/task/5462686092230656/testrun/build/testrun/pg_upgrade/002_pg_upgrade/log/regress_log_002_pg_upgrade
...
[00:02:58.761](93.859s) ok 10 - run of pg_upgrade for new instance
[00:02:58.808](0.047s) not ok 11 - pg_upgrade_output.d/ removed after pg_upgrade success
[00:02:58.815](0.007s) #   Failed test 'pg_upgrade_output.d/ removed after pg_upgrade success'
#   at C:/cirrus/src/bin/pg_upgrade/t/002_pg_upgrade.pl line 288.


Michael:

Why does 002_pg_upgrade.pl try to filter the list of files in
pg_upgrade_output.d for files ending in .log? And why does it print those
only after starting the new node?

How about moving the iteration through pg_upgrade_output.d to before the
->start and printing all the files, but only calling slurp_file() if the
filename ends with .log?

Minor nit: It seems off to have quite so many copies of
  $newnode->data_dir . "/pg_upgrade_output.d"
particularly where the test defines $log_path, but then still builds the
path from scratch afterwards (line 304).

Greetings,

Andres Freund