Oh and, ...
On Mon, 28 Nov 2022, Pádraig Brady wrote:
I'm presuming the input generator is depending on the compiler runs
(written to by tee) to exit cleanly, before exiting / generating more?
Hence the hangs?
If that were the case then there might still be the potential for hangs
even if tee detected closed pipes. I.e. if the compiler runs hung rather
than exited, tee could not distinguish this from normal operation, as the
pipe outputs would remain open.
This is a good point too.
So perhaps the generally useful cases are (1) if the intermittent input is
from a tty, but also (2) if the input program (that writes to a pipe) has
similar logic to what we are proposing to add to tee here; i.e., that it
detects when its output becomes a broken pipe.
More generally for (2), if the whole command pipeline is written that way,
this will propagate all the way back. In that sense there might even be
some merit in adding this type of logic to all coreutils programs that
filter stdin to stdout.
For instance, take this oversimplified example with cat:
cat | cat | cat | cat | true
If this is run on a terminal, you have to hit Enter *4 times* before
control returns to the shell, as it takes that many separate writes to
cause all the pipes to break in sequence (right-to-left) and finally get
the write failure in the left-most cat.
If this kind of detect-broken-output-pipe logic were added to filter utils
generally, the above example (with 4 cats) would return to the shell
immediately.
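To make the idea concrete, here is a rough sketch of a cat-like filter with that logic, written in Python rather than C for brevity. The function name and return strings are made up for illustration and this is not how any coreutils program is implemented; it only assumes a system where poll(2) reports an error or hangup on the write end of a pipe whose reader has gone away.

```python
import os
import select

# Conditions poll() reports on an output whose reader has gone away.
BROKEN = select.POLLERR | select.POLLHUP


def cat_with_pipe_check(in_fd: int, out_fd: int) -> str:
    """Copy in_fd to out_fd (hypothetical helper, for illustration only).

    Returns "eof" when the input ends, or "broken-output" as soon as the
    output is seen to be broken, even if no input has arrived.
    """
    poller = select.poll()
    poller.register(in_fd, select.POLLIN)
    poller.register(out_fd, 0)  # errors/hangups are reported with an empty mask

    while True:
        events = dict(poller.poll())  # block until input arrives or output breaks
        if events.get(out_fd, 0) & BROKEN:
            return "broken-output"
        if in_fd in events:
            data = os.read(in_fd, 65536)
            if not data:
                return "eof"
            view = memoryview(data)
            try:
                while view:  # handle short writes
                    written = os.write(out_fd, view)
                    view = view[written:]
            except BrokenPipeError:
                return "broken-output"
```

With every stage of a pipeline behaving like this, the exit of the right-most reader would be noticed by each writer in turn without any further input being needed.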
Carl
On Tue, 29 Nov 2022, Carl Edquist via GNU coreutils General Discussion wrote:
Hi all,
On Mon, 28 Nov 2022, Arsen Arsenović wrote:
Pádraig Brady <p...@draigbrady.com> writes:
Trying to understand your use case better,
...
The bug we observed is that on occasion, for instance when running with a
tty, or with a script that (for some reason) has a pipe on stdin, the
tee-based "compiler" would hang. To replicate this, try:
tee >(gcc test.c -o a.out.1) >(gcc test.c -o a.out.2)
in a tty (here, the stdin is meant to be irrelevant).
If I may try to provide a simple example of the problem, consider the
following command line:
tee >(sleep 3) | sleep 5
Let tee's stdin be a terminal to supply the "intermittent input".
You'll see that after 5 seconds, this will hang indefinitely until you hit
Enter.
For the first 3 seconds, when hitting the Enter key, tee will successfully
write the line to each pipe. Between 3 and 5 seconds, the pipe to "sleep 3"
will be broken, which tee will notice, and then tee will continue writing the
lines to the "sleep 5" pipe.
But after 5 seconds, when "sleep 5" has terminated and that pipe becomes
broken, tee will continue to "hang" waiting for input (in this case the
intermittent input from the terminal) indefinitely, despite the fact that all
of tee's outputs are now broken pipes. tee will only "notice" that the
"sleep 5" pipe is broken when it receives input after that point, because
then the write to that pipe fails with EPIPE (and/or a SIGPIPE is delivered).
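The "only noticed on write" behaviour is easy to reproduce in isolation. The snippet below is a small illustration in Python (the function name is made up): it opens a pipe, closes the read end, and shows that the failure only surfaces at the write. Note that Python ignores SIGPIPE by default, so the write fails with EPIPE instead of killing the process.

```python
import errno
import os


def write_to_closed_pipe() -> int:
    """Return the errno seen when writing to a pipe whose reader is gone."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the reader exits, like "sleep 5" terminating
    try:
        # Nothing has signalled the broken pipe so far; only this write does.
        os.write(write_fd, b"a line of input\n")
    except OSError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return 0


print(errno.errorcode[write_to_closed_pipe()])
```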
...
It seems the ideal thing to happen here is for tee to terminate once it
determines that all of its outputs are broken pipes. It comes close to this
already, but it only learns about this when write attempts fail, and it only
attempts a write when it has input to tee.
As I suppose was suggested in the patch, perhaps poll(2) could be used to
wait for POLLIN from fd 0, and POLLHUP for outputs (perhaps limited to pipes
/ sockets).
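As a rough sketch of that poll(2) idea (in Python for brevity, with a made-up function name, and not taken from the patch): wait on the input for POLLIN and on each output for an error or hangup. One detail worth hedging on: on Linux the write end of a pipe whose reader has closed is reported as POLLERR rather than POLLHUP, so the sketch checks for both.

```python
import os
import select


def wait_for_input_or_broken_outputs(in_fd, out_fds, timeout_ms=None):
    """Block until in_fd has an event or an output reports an error/hangup.

    Hypothetical helper for illustration. Returns (input_ready, broken),
    where broken is the set of output fds that are no longer writable.
    """
    poller = select.poll()
    poller.register(in_fd, select.POLLIN)
    for fd in out_fds:
        poller.register(fd, 0)  # POLLERR/POLLHUP are reported regardless of mask

    input_ready = False
    broken = set()
    for fd, revents in poller.poll(timeout_ms):
        if fd == in_fd:
            input_ready = True  # data available, or EOF/hangup on the input
        elif revents & (select.POLLERR | select.POLLHUP):
            broken.add(fd)
    return input_ready, broken
```

A tee-like loop could call this before each read, drop any outputs returned in the broken set, and exit once no outputs remain, instead of waiting for the next line of input to discover the failure.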
The patch subject suggests adding --pipe-check as an option, but at first
blush it seems like this would actually be a good thing to do by default...
Cheers,
Carl