Hello hackers,

1. In a nearby thread, I misdiagnosed a problem reported[1] by Justin Pryzby (though my misdiagnosis is probably still a thing to be fixed; see next). I think I just spotted the real problem he saw: if you execute a parallel query after a smart shutdown has been initiated, you wait forever in gather_readnext()! Maybe parallel workers can't be launched in this state, but we lack code to detect this case? I haven't dug into the exact mechanism or figured out what to do about it yet, and I'm tied up with something else for a bit, but I will come back to this later if nobody beats me to it.

2. Commit cfdf4dc4 on the master branch fixed up all known waits that didn't respond to postmaster death, and added an assertion to that effect. One of the cases fixed was in gather_readnext(), and initially I thought that's what Justin was telling us about (his report was from 11.x), until I reread his message and saw that it was SIGTERM and not e.g. SIGKILL. I should probably go and back-patch a fix for that case anyway... but now I'm wondering: was there a reason for that omission, and likewise for mq_putmessage()? (Another case of missing PM death detection in the back-branches is postgres_fdw.)

[1] https://www.postgresql.org/message-id/CAEepm%3D0kMunPC0hhuT0VC-5dfMT3K-xsToJHkTznA6yrSARsPg%40mail.gmail.com

-- 
Thomas Munro
https://enterprisedb.com