Hi,

A recent cfbot run caused CI on Windows to crash, on a patch that could not
conceivably cause this issue:
https://cirrus-ci.com/task/5646021133336576

The patch is just:
https://github.com/postgresql-cfbot/postgresql/commit/dbd4afa6e7583c036b86abe2e3d27b508d335c2b

regression.diffs:
https://api.cirrus-ci.com/v1/artifact/task/5646021133336576/testrun/build/testrun/regress/regress/regression.diffs

postmaster.log:
https://api.cirrus-ci.com/v1/artifact/task/5646021133336576/testrun/build/testrun/regress/regress/log/postmaster.log

crash info:
https://api.cirrus-ci.com/v1/artifact/task/5646021133336576/crashlog/crashlog-postgres.exe_1af0_2023-02-08_00-53-23-997.txt

00000085`f03ffa40 00007ff6`fd89faa8 ucrtbased!abort(void)+0x5a [minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 77]
00000085`f03ffa80 00007ff6`fd6474dc postgres!ExceptionalCondition(
    char * conditionName = 0x00007ff6`fdd03ca8 "PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED",
    char * fileName = 0x00007ff6`fdd03c80 "../src/backend/storage/ipc/pmsignal.c",
    int lineNumber = 0n329)+0x78 [c:\cirrus\src\backend\utils\error\assert.c @ 67]
00000085`f03ffac0 00007ff6`fd676eff postgres!MarkPostmasterChildActive(void)+0x7c [c:\cirrus\src\backend\storage\ipc\pmsignal.c @ 329]
00000085`f03ffb00 00007ff6`fd59aa3a postgres!InitProcess(void)+0x2ef [c:\cirrus\src\backend\storage\lmgr\proc.c @ 375]
00000085`f03ffb60 00007ff6`fd467689 postgres!SubPostmasterMain(
    int argc = 0n3,
    char ** argv = 0x000001c6`f3814e80)+0x33a [c:\cirrus\src\backend\postmaster\postmaster.c @ 4962]
00000085`f03ffd90 00007ff6`fda0e1c9 postgres!main(
    int argc = 0n3,
    char ** argv = 0x000001c6`f3814e80)+0x2f9 [c:\cirrus\src\backend\main\main.c @ 192]

So, somehow we ended up with a pmsignal slot for a new backend that's not
currently in PM_CHILD_ASSIGNED state. Obviously the first idea is to wonder
whether this is a problem introduced as part of the recent
postmaster-latchification work.

At first I thought we were failing to terminate running processes, due to the
following output:

parallel group (20 tests):  name char txid text varchar enum float8 regproc int2 boolean bit oid pg_lsn int8 int4 float4 uuid rangetypes numeric money
     boolean                      ... ok          684 ms
     char                         ... ok          517 ms
     name                         ... ok          354 ms
     varchar                      ... ok          604 ms
     text                         ... ok          603 ms
     int2                         ... ok          676 ms
     int4                         ... ok          818 ms
     int8                         ... ok          779 ms
     oid                          ... ok          720 ms
     float4                       ... ok          823 ms
     float8                       ... ok          628 ms
     bit                          ... ok          666 ms
     numeric                      ... ok         1132 ms
     txid                         ... ok          497 ms
     uuid                         ... ok          818 ms
     enum                         ... ok          619 ms
     money                        ... FAILED (test process exited with exit code 2)     7337 ms
     rangetypes                   ... ok          813 ms
     pg_lsn                       ... ok          762 ms
     regproc                      ... ok          632 ms

But now I realize that the reason none of the other tests failed is that the
crash took a long time, presumably due to the debugger creating the above
information.

2023-02-08 00:53:20.257 GMT client backend[4584] pg_regress/rangetypes STATEMENT: select '-[a,z)'::textrange;
TRAP: failed Assert("PMSignalState->PMChildFlags[slot] == PM_CHILD_ASSIGNED"), File: "../src/backend/storage/ipc/pmsignal.c", Line: 329, PID: 5948
[ quite a few lines ]
2023-02-08 00:53:27.420 GMT postmaster[872] LOG: server process (PID 5948) was terminated by exception 0xC0000354
2023-02-08 00:53:27.420 GMT postmaster[872] HINT: See C include file "ntstatus.h" for a description of the hexadecimal value.
2023-02-08 00:53:27.420 GMT postmaster[872] LOG: terminating any other active server processes
2023-02-08 00:53:27.434 GMT postmaster[872] LOG: all server processes terminated; reinitializing
2023-02-08 00:53:27.459 GMT startup[5800] LOG: database system was interrupted; last known up at 2023-02-08 00:53:19 GMT
2023-02-08 00:53:27.459 GMT startup[5800] LOG: database system was not properly shut down; automatic recovery in progress
2023-02-08 00:53:27.462 GMT startup[5800] LOG: redo starts at 0/20DCF08
2023-02-08 00:53:27.484 GMT startup[5800] LOG: could not stat file "pg_tblspc/16502": No such file or directory
2023-02-08 00:53:27.484 GMT startup[5800] CONTEXT: WAL redo at 0/20DCFB8 for Tablespace/DROP: 16502
2023-02-08 00:53:27.614 GMT startup[5800] LOG: invalid record length at 0/25353E8: wanted 24, got 0
2023-02-08 00:53:27.614 GMT startup[5800] LOG: redo done at 0/2534FE0 system usage: CPU: user: 0.04 s, system: 0.04 s, elapsed: 0.15 s

Nevertheless, clearly this should never be reached.

Greetings,

Andres Freund