Andres Freund <and...@anarazel.de> writes:
> On 2019-10-11 14:56:41 -0400, Tom Lane wrote:
>> ... So it's really hard to explain
>> that as anything except a kernel bug: sometimes, the kernel
>> doesn't give us as much stack as it promised it would.  And the
>> machine is not loaded enough for there to be any rational
>> resource-exhaustion excuse for that.
> Linux expands stack space only on demand, thus it's possible to run out
> of stack space while there ought to be stack space. Unfortunately that
> happens during a stack expansion, which means there's no easy place to
> report that. I've seen this be hit in production on busy machines.

As I said, this machine doesn't seem busy enough for that to be a
tenable excuse; there's nobody but me logged in, and the buildfarm
critter isn't running.

> I wonder if the machine is configured with overcommit_memory=2,
> i.e. don't overcommit.  cat /proc/sys/vm/overcommit_memory would tell.

$ cat /proc/sys/vm/overcommit_memory
0

> What does grep -E '^(Mem|Commit)' /proc/meminfo show while it's
> happening?

idle:

$ grep -E '^(Mem|Commit)' /proc/meminfo
MemTotal:        2074816 kB
MemFree:           36864 kB
MemAvailable:    1779584 kB
CommitLimit:     1037376 kB
Committed_AS:     412480 kB

a few captures while regression tests are running:

$ grep -E '^(Mem|Commit)' /proc/meminfo
MemTotal:        2074816 kB
MemFree:            8512 kB
MemAvailable:    1819264 kB
CommitLimit:     1037376 kB
Committed_AS:     371904 kB

$ grep -E '^(Mem|Commit)' /proc/meminfo
MemTotal:        2074816 kB
MemFree:           32640 kB
MemAvailable:    1753792 kB
CommitLimit:     1037376 kB
Committed_AS:     585984 kB

$ grep -E '^(Mem|Commit)' /proc/meminfo
MemTotal:        2074816 kB
MemFree:           56640 kB
MemAvailable:    1695744 kB
CommitLimit:     1037376 kB
Committed_AS:     568768 kB

> What does the signal information say? You can see it with
> p $_siginfo
> after receiving the signal. A SIGSEGV here, I assume.

(gdb) p $_siginfo
$1 = {si_signo = 11, si_errno = 0, si_code = 128,
  _sifields = {_pad = {0 <repeats 28 times>},
    _kill = {si_pid = 0, si_uid = 0},
    _timer = {si_tid = 0, si_overrun = 0,
      si_sigval = {sival_int = 0, sival_ptr = 0x0}},
    _rt = {si_pid = 0, si_uid = 0,
      si_sigval = {sival_int = 0, sival_ptr = 0x0}},
    _sigchld = {si_pid = 0, si_uid = 0, si_status = 0,
      si_utime = 0, si_stime = 0},
    _sigfault = {si_addr = 0x0},
    _sigpoll = {si_band = 0, si_fd = 0}}}

> Yea, that seems like it might be good.
> But we have to be careful too, as
> there are some things we do want to be interruptible from within a
> signal handler. We start some processes from within one after all...

The proposed patch has zero effect on what the signal mask will be inside
a signal handler, only on the transient state during handler entry/exit.

			regards, tom lane
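
For reference, the commit headroom in the /proc/meminfo captures above can be computed mechanically. The following is only an illustrative sketch: the function names parse_meminfo and commit_headroom_kb are hypothetical, and the sample text is the capture with the highest Committed_AS quoted above. Note that with overcommit_memory=0, as reported on this machine, the kernel does not enforce CommitLimit; the figure is strictly binding only in mode 2.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key: value kB' lines into {key: value_in_kB}."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a 'Key:' prefix
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(fields):
    """Return CommitLimit minus Committed_AS, in kB."""
    return fields["CommitLimit"] - fields["Committed_AS"]


# Capture with the highest Committed_AS from the message above.
SAMPLE = """\
MemTotal: 2074816 kB
MemFree: 32640 kB
MemAvailable: 1753792 kB
CommitLimit: 1037376 kB
Committed_AS: 585984 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(commit_headroom_kb(info), "kB below CommitLimit")
```

Even in the busiest capture, Committed_AS stays well under CommitLimit, which is consistent with the point that the machine is not under memory pressure.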