Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT

Etsuro Fujita Tue, 06 Nov 2018 04:09:49 -0800

(2018/11/06 12:53), Kyotaro HORIGUCHI wrote:

At Fri, 02 Nov 2018 22:05:36 +0900, Etsuro Fujita<fujita.ets...@lab.ntt.co.jp>  wrote 
in<5bdc4ba0.7050...@lab.ntt.co.jp>

(2018/10/29 15:58), Kyotaro HORIGUCHI wrote:

At Tue, 23 Oct 2018 13:21:31 +0100, Tom Lane<t...@sss.pgh.pa.us>  wrote
in<18397.1540297...@sss.pgh.pa.us>

It's just a POC because there are some things that need more thought
than I've given them:


1. Is it OK to revert SIGPIPE to default processing for *all* programs
launched through OpenPipeStream?  If not, what infrastructure do we
need to add to control that?  In particular, is it sane to revert
SIGPIPE for a pipe program that we will write to not read from?
(I thought probably yes, because that is the normal Unix behavior,
but it could be debated.)

ISTM that it would be OK to inherit SIG_DFL in both cases, because I
think it would be the responsibility of the called program to handle
SIGPIPE properly (if necessary) in both cases.  Maybe I'm missing
something, though.


So, I think we *should* (not just OK to) restore SIGPIPE to
SIG_DFL in any case here to prevent undetermined situation for
the program.

OK

2. Likewise, is it always OK for ClosePipeToProgram to ignore a
SIGPIPE failure?  (For ordinary COPY, maybe it shouldn't, since
we don't intend to terminate that early.)


Since the SIGPIPE may come from another pipe, I think we
shouldn't generally.


Agreed; if ClosePipeToProgram ignores that failure, we would fail to
get a better error message in CopySendEndOfRow if the called program
(inheriting SIG_DFL for SIGPIPE) was terminated on SIGPIPE:

                 if (cstate->is_program)
                 {
                     if (errno == EPIPE)
                     {
                         /*
                          * The pipe will be closed automatically on error at
                          * the end of transaction, but we might get a better
                          * error message from the subprocess' exit code than
                          * just "Broken Pipe"
                          */
                         ClosePipeToProgram(cstate);

                         /*
                          * If ClosePipeToProgram() didn't throw an error, the
                          * program terminated normally, but closed the pipe
                          * first. Restore errno, and throw an error.
                          */
                         errno = EPIPE;
                     }
                     ereport(ERROR,
                             (errcode_for_file_access(),
                              errmsg("could not write to COPY program: %m")));
                 }


Mmm, that's EPIPE of fwrite, not SIGNALLED&SIGPIPE of called
program's exit status. So it is irrelevant to called program's
SIGPIPE setup. It requires SIGPIPE to be kept to SIG_IGN on the
backend side.

My explanation might not be enough. Let me explain. If the calledprogram that inherited SIG_DFL for SIGPIPE terminated on SIGPIPE forsome reason, ClosePipeToProgram *as-is* would create an error messagefrom that program's exit code. But if we modify ClosePipeToProgram likethe original POC patch, that function would not create that message forthat termination. To avoid that, I think it would be better forClosePipeToProgram to ignore the SIGPIPE failure only in the case wherethe caller is a COPY FROM PROGRAM that is allowed to terminate early.(Maybe we could specify that as a new argument for BeginCopyFrom.)

But iff we are explicitly stop reading from
the pipe before detecting an error, it can be ignored since we
aren't interested in further failure.


You mean that we should ignore other failures of the called program
when we stop reading from the pipe early?


Yes, we have received sufficient data from the pipe then closed
it successfully. The program may want write more but we don't
want it. We ragher should ignore SIGPIPE exit code of the program
since closing our writing end of a pipe is likely to cause it and


I think so too.

even if it comes from another pipe, we can assume that the
SIGPIPE immediately stopped the program before returning any
garbage to us.


Sorry, I don't understand this.

4. Are there any other signals we ought to be reverting to default
behavior before launching a COPY TO/FROM PROGRAM?


All handlers other than SIG_DEF and IGN are reset on exec().
Only SIGUSR2 seems to be set to SIG_IGN, but it's not likely to
harm anything. Perhaps it would be safer against future changes
if we explicitly reset all changed actions to SIG_DEF, but it
might be too-much..


Not sure, but reverting SIGPIPE to default would be enough as a fix
for the original issue, if we go with the POC patch.


Agreed. I wonder why there's no API that resets all handlers to
SIG_DFL at once or some flag telling to exec() that it should
start with default signal handler set.

Such API would be an improvement, but IMO I think that would go beyondthe scope of this fix.

Finally in the attached PoC, I set cstate->eof_reached on failure
of NextCopyFromRawFields then if eof_reached we don't ingore
SIGPIPE. For a program like
"puts("test1");fflush(stdout);kill(getpid(), SIGPIPE);", I had
the following behavior. (I got an error without the fflush() but
it is inevitable).

=# select * from ft2 limit 1;
    a
-------
  test1

=# select * from ft2 limit 2;
ERROR:  program "/home/horiguti/work/exprog" failed
DETAIL:  child process was terminated by signal 13: Broken pipe

For the original case:

=# select * from ft1 limit 1;
   a
------
  test
=# select * from ft1 limit 2;
   a
------
  test
(1 row)


I didn't confirmed whether it is complete.

Sorry, I don't understand this fully, but the reason to add theeof_reached stuff is to avoid ignoring the SIGPIPE failure in normalCOPY FROM PROGRAM cases?


Best regards,
Etsuro Fujita

Re: BUG #15449: file_fdw using program cause exit code error when using LIMIT

Reply via email to