On 20/07/18 18:03, Andres Freund wrote:
On 2018-07-20 11:53:32 +0300, Heikki Linnakangas wrote:
But I think we have consensus on replacing the exit(2) calls
with _exit(2). If we do just that, it would be better than the status quo,
even if it doesn't completely fix the problem. This would prevent the case
that Asim reported, for starters.
Right. You're planning to backpatch?
Yeah. Committed and back-patched.
With _exit(), I think we wouldn't really need the "PG_SETMASK(&BlockSig);"
calls in the aux process *_quickdie handlers that don't do anything else
than call _exit(). But I didn't dare to remove them yet.
I think we should remove it at least in master.
Removed.
It's much less the exit() that's unsafe, than the callbacks themselves,
right? Perhaps just restate that we wouldn't want to trigger atexit
processing due to signal safety?
Well, in principle exit() is unsafe too, although you're right that in
practice it's more likely the callbacks that would cause trouble. I
reworded the comment to put more emphasis on the callbacks.
On 23/07/18 17:34, Robert Haas wrote:
+1 for trying to improve this by using _exit rather than exit, and for
not letting the perfect be the enemy of the good.
But -1 for copying the language "if some idiot DBA sends a manual
SIGQUIT to a random backend". I think that phrase could be deleted
from this comment -- and all of the other places where this comment
appears already today -- without losing any useful informational
content. Or it could be rephrased to "if this process receives a
SIGQUIT". It's just not necessary to call somebody an idiot to
communicate the point.
Rephrased to be less offensive.
So, with this commit, the various SIGQUIT quickdie handlers are in a
better shape. The regular backend's quickdie() handler still calls
ereport(), which is not safe, but it's a step in the right direction.
And we haven't addressed the original complaint, which was actually
about startup_die(), not quickdie().
What should we do about startup_die()? Ideas?
- Heikki