Re: We shouldn't signal process groups with SIGQUIT

2023-03-01 Thread Thomas Munro
On Thu, Mar 2, 2023 at 1:09 PM Andres Freund wrote: > On 2023-03-02 12:29:28 +1300, Thomas Munro wrote: > > ... Huh... what am I missing? I > > thought the only risk was handlers running in the opposite of send > > order because they 'overlapped', not non-handler code being allowed to > > run in

Re: We shouldn't signal process groups with SIGQUIT

2023-03-01 Thread Michael Paquier
On Wed, Mar 01, 2023 at 03:34:30PM -0800, Andres Freund wrote: > On 2023-02-28 13:45:41 +0900, Michael Paquier wrote: >> From what I can see, SIGTERM is actually received by the backends >> before SIGQUIT, and I can also see that the backends have enough room >> to process CFIs in some cases, espec

Re: We shouldn't signal process groups with SIGQUIT

2023-03-01 Thread Andres Freund
Hi, On 2023-03-02 12:29:28 +1300, Thomas Munro wrote: > In theory you could straighten this out by asking what else is pending > so that we imposed our own priority, if that were a problem, but there > is something I don't understand: you said we could handle SIGTERM and > then make it all the way

Re: We shouldn't signal process groups with SIGQUIT

2023-03-01 Thread Andres Freund
Hi, On 2023-02-28 13:45:41 +0900, Michael Paquier wrote: > On Tue, Feb 14, 2023 at 12:47:12PM -0800, Andres Freund wrote: > > Just naively hacking this behaviour change into the current code, would > > yield > > sending SIGQUIT to postgres, and then SIGTERM to the whole process > > group. Which s

Re: We shouldn't signal process groups with SIGQUIT

2023-03-01 Thread Thomas Munro
On Tue, Feb 28, 2023 at 5:45 PM Michael Paquier wrote: > On Tue, Feb 14, 2023 at 12:47:12PM -0800, Andres Freund wrote: > > Just naively hacking this behaviour change into the current code, would > > yield > > sending SIGQUIT to postgres, and then SIGTERM to the whole process > > group. Which see

Re: We shouldn't signal process groups with SIGQUIT

2023-02-27 Thread Michael Paquier
On Tue, Feb 14, 2023 at 12:47:12PM -0800, Andres Freund wrote: > Just naively hacking this behaviour change into the current code, would yield > sending SIGQUIT to postgres, and then SIGTERM to the whole process > group. Which seems like a reasonable order? quickdie() should _exit() > immediately

Re: We shouldn't signal process groups with SIGQUIT

2023-02-22 Thread Michael Paquier
On Wed, Feb 22, 2023 at 09:39:55AM -0500, Tom Lane wrote: > Michael Paquier writes: >> What would be the advantage of doing that for groups other than >> -StartupPID and -PgArchPID? These are the two groups of processes we >> need to worry about, AFAIK. > > No, we have the issue for regular back

Re: We shouldn't signal process groups with SIGQUIT

2023-02-22 Thread Tom Lane
Michael Paquier writes: > What would be the advantage of doing that for groups other than > -StartupPID and -PgArchPID? These are the two groups of processes we > need to worry about, AFAIK. No, we have the issue for regular backends too, since they could be executing COPY FROM PROGRAM or the li

Re: We shouldn't signal process groups with SIGQUIT

2023-02-21 Thread Michael Paquier
On Tue, Feb 14, 2023 at 12:47:12PM -0800, Andres Freund wrote: > Just naively hacking this behaviour change into the current code, would yield > sending SIGQUIT to postgres, and then SIGTERM to the whole process > group. Which seems like a reasonable order? quickdie() should _exit() > immediately

Re: We shouldn't signal process groups with SIGQUIT

2023-02-15 Thread Nathan Bossart
On Wed, Feb 15, 2023 at 10:12:58AM -0800, Andres Freund wrote: > On 2023-02-15 09:57:41 -0800, Nathan Bossart wrote: >> Oh, that's nifty. Any reason not to enable send_abort_for_crash, too? > > I think it'd be too noisy. Right now you get just a core dump of the crashed > process, but if we set s

Re: We shouldn't signal process groups with SIGQUIT

2023-02-15 Thread Andres Freund
Hi, On 2023-02-15 09:57:41 -0800, Nathan Bossart wrote: > On Tue, Feb 14, 2023 at 04:20:59PM -0800, Andres Freund wrote: > > On 2023-02-14 14:23:32 -0800, Nathan Bossart wrote: > >> On Tue, Feb 14, 2023 at 12:47:12PM -0800, Andres Freund wrote: > >> > Not really related: I do wonder how often we e

Re: We shouldn't signal process groups with SIGQUIT

2023-02-15 Thread Nathan Bossart
On Tue, Feb 14, 2023 at 04:20:59PM -0800, Andres Freund wrote: > On 2023-02-14 14:23:32 -0800, Nathan Bossart wrote: >> On Tue, Feb 14, 2023 at 12:47:12PM -0800, Andres Freund wrote: >> > Not really related: I do wonder how often we end up self deadlocking in >> > quickdie(), due to the ereport() n

Re: We shouldn't signal process groups with SIGQUIT

2023-02-14 Thread Andres Freund
Hi, On 2023-02-14 14:23:32 -0800, Nathan Bossart wrote: > On Tue, Feb 14, 2023 at 12:47:12PM -0800, Andres Freund wrote: > > Not really related: I do wonder how often we end up self-deadlocking in > > quickdie(), due to the ereport() not being reentrant. We'll "fix" it soon > > after, due to post

Re: We shouldn't signal process groups with SIGQUIT

2023-02-14 Thread Nathan Bossart
On Tue, Feb 14, 2023 at 12:47:12PM -0800, Andres Freund wrote: > Not really related: I do wonder how often we end up self-deadlocking in > quickdie(), due to the ereport() not being reentrant. We'll "fix" it soon > after, due to postmaster's SIGKILL. Perhaps we should turn on > send_abort_for_kill

Re: We shouldn't signal process groups with SIGQUIT

2023-02-14 Thread Andres Freund
Hi, On 2023-02-14 15:38:24 -0500, Tom Lane wrote: > Andres Freund writes: > > ISTM that signal_child() should downgrade SIGQUIT to SIGTERM when sending to > > the process group. That way we'd maintain the current behaviour for postgres > > itself, but stop core-dumping archive/restore scripts (as

Re: We shouldn't signal process groups with SIGQUIT

2023-02-14 Thread Tom Lane
Andres Freund writes: > ISTM that signal_child() should downgrade SIGQUIT to SIGTERM when sending to > the process group. That way we'd maintain the current behaviour for postgres > itself, but stop core-dumping archive/restore scripts (as well as other > subprocesses that e.g. trusted PLs might c

We shouldn't signal process groups with SIGQUIT

2023-02-14 Thread Andres Freund
Hi, The default reaction to SIGQUIT is to create core dumps. We use SIGQUIT to implement immediate shutdowns. We send the signal to the entire process group. The result of that is that we regularly produce core dumps for binaries like sh/cp. I regularly see this on my local system, I've seen it o
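
The proposal quoted in the replies above (have signal_child() downgrade SIGQUIT to SIGTERM when it signals the process group) could look roughly like the following. This is a minimal sketch, not the actual postmaster code: the names group_signal_for() and signal_child_sketch() are hypothetical, and it assumes the child is the leader of its own process group, so that kill(-pid, ...) reaches the child and any subprocesses it started.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>

/*
 * Hypothetical helper: pick the signal to deliver to the process group.
 * SIGQUIT is downgraded to SIGTERM so that helper programs such as sh or cp,
 * which keep the default SIGQUIT disposition, do not dump core.
 */
static int
group_signal_for(int signo)
{
    return (signo == SIGQUIT) ? SIGTERM : signo;
}

/*
 * Hypothetical sketch of the proposed behaviour, not the real signal_child().
 * Assumes "pid" is a child that is the leader of its own process group.
 */
static int
signal_child_sketch(pid_t pid, int signo)
{
    assert(pid > 0);

    /* The child itself still receives the original signal, e.g. SIGQUIT. */
    if (kill(pid, signo) < 0)
        return -1;

    /* The group (negative pid) receives the possibly downgraded signal. */
    return kill(-pid, group_signal_for(signo));
}
```

With this ordering the child is sent SIGQUIT first and then, as a member of its own group, SIGTERM as well, which is the sequence discussed in the replies above. Whether that order is always harmless is the open question in the thread.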