(2018/11/09 14:39), Kyotaro HORIGUCHI wrote:
At Thu, 08 Nov 2018 21:52:31 +0900, Etsuro Fujita<fujita.ets...@lab.ntt.co.jp> wrote
in<5be4318f.4040...@lab.ntt.co.jp>
(2018/11/08 10:50), Thomas Munro wrote:
I take back what I said earlier about false positives from other
pipes. I think it's only traditional Unix programs designed for use
in pipelines and naive programs that let SIGPIPE terminate the
process. The accepted answer here gives a good way to think about
it:
https://stackoverflow.com/questions/8369506/why-does-sigpipe-exist
Thanks for the information!
A program sophisticated enough to be writing to other pipes is no
longer in that category and should be setting up signal dispositions
itself, so I agree that we should enable the default disposition and
ignore WTERMSIG(exit_code) == SIGPIPE, as proposed. That is pretty
close to the intended purpose of that signal AFAICS.
Great!
In the sense of "We don't care the reason", negligible reasons
are necessariry restricted to SIGPIPE, evan SIGSEGV could be
theoretically ignored safely. "theoretically" here means it is
another issue whether we rely on the output from a program which
causes SEGV (or any reason other than SIGPIPE, which we caused).
For the SIGSEGV case, I think it would be better that we don't rely on
the output data, IMO, because I think there might be a possibility
that
the program have generated that data incorrectly/unexpectedly.
+1
I don't think we should ignore termination by signals other than
SIGPIPE: that could hide serious problems from users. I want to know
if my program is crashing with SIGBUS, SIGTERM, SIGFPE etc, even if it
happens after we read enough data; there is a major problem that a
human needs to investigate!
I think so too.
Ok, I can live with that with no problem.
OK
As the result it doesn't report an error for SELECT * FROM ft2
LIMIT 1 on "main(void){puts("test1"); return 1;}".
=# select * from ft limit 1;
a
-------
test1
(1 row)
limit 2 reports the error.
=# select * from ft limit 2;
ERROR: program "/home/horiguti/work/exprog" failed
DETAIL: child process exited with exit code 1
I think this would be contrary to users expectations: if the SELECT
command works for limit 1, they would expect that the command would
work
for limit 2 as well. So, I think it would be better to error out that
command for limit 1 as well, as-is.
I think it's correct that LIMIT 1 gives no error but LIMIT 2 gives an
error. For LIMIT 1, we got all the rows we wanted, and then we closed
the pipe. If we got a non-zero non-signal exit code, or a signal exit
code and it was SIGPIPE (not any other signal!), then we should
consider that to be expected.
Maybe I'm missing something, but the non-zero non-signal exit code
means that there was something wrong with the called program, so I
think a human had better investigate that as well IMO, which would
probably be a minor problem, though. Too restrictive?
I think Thomas just saying that reading more lines can develop
problems. According to the current discussion, we should error
out if we had SEGV when limit 1.
Ah, I misread that. Sorry for the noise.
On Wed, Nov 7, 2018 at 4:44 PM Etsuro Fujita
<fujita.ets...@lab.ntt.co.jp> wrote:
(2018/11/06 19:50), Thomas Munro wrote:
On my FreeBSD system, I compared the output of procstat -i (= show
signal disposition) for two "sleep 60" processes, one invoked from the
shell and the other from COPY ... FROM PROGRAM. The differences were:
PIPE, TTIN, TTOU and USR2. For the first and last of those, the
default action would be to terminate the process, but the COPY PROGRAM
child ignored them; for TTIN and TTOU, the default action would be to
stop the process, but again they are ignored. Why do bgwriter.c,
startup.c, ... set SIGTTIN and SIGTTOU back to SIG_DFL, but not
regular backends?
So, we should revert SIGUSR2 as well to default processing?
I don't think it matters in practice, but it might be nice to restore
that just for consistency.
Agreed.
I'm not sure what to think about the TTIN,
TTOU stuff; I don't understand job control well right now but I don't
think it really applies to programs run by a PostgreSQL backend, so if
we restore those it'd probably again be only for consistency. Then
again, there may be a reason someone decided to ignore those in the
postmaster + regular backends but not the various auxiliary processes.
Anyone?
I don't have any idea about that.
In my understanding processes not connected to a
terminal(tty/pts) cannot receive TTIN/TTOU (unless someone sent
it artifically). Since child processes are detached by setsid()
(on Linux), programs called in that way also won't have a
controlling terminal at the start time and I suppose they have no
means to connect to one since they are no longer on the same
session with postmaster.
For TTIN and TTOU, we would first need to make clear the reason for the
inconsistency Thomas pointed out. I'm wondering if we should leave the
TTIN/TTOU stuff for future work.
Best regards,
Etsuro Fujita