severity 34488 wishlist
retitle 34488 doc: sort: expand on "broken pipe" (SIGPIPE) behavior
stop
Hello,
On 2019-02-15 7:43 a.m., 積丹尼 Dan Jacobson wrote:
> Things start out cheery, but quickly get ugly,
> $ for i in 9 99 999 9999 99999; do seq $i|sort -n|sed 5q|wc -l; done
> 5
> 5
> 5
> 5
> sort: write failed: 'standard output': Broken pipe
> sort: write error
> 5
> sort: write failed: 'standard output': Broken pipe
> sort: write error
> Therefore, kindly add a sort --limit=n,
I don't think this is wise, as "head -n5" does exactly that in a much more
generic way.
> and/or on (info "(coreutils) sort invocation")
> admit the problem, and give some workarounds, lest
> our scripts occasionally spew error messages seemingly randomly,
> just when the boss is looking.
Just to clarify: why do you think this is a "problem"?
This is the intended behavior of most proper programs:
Upon receiving SIGPIPE they should terminate with an error,
unless SIGPIPE is explicitly ignored.
The errors are not "random" - they happen because you explicitly
cut short the output of a program.
It is an important indication about how your pipe works,
and sort is not to blame, e.g.:
$ seq 100000 | head -n1
1
seq: write error: Broken pipe
$ seq 1000000| cat | head -n1
1
cat: write error: Broken pipe
seq: write error: Broken pipe
This is a good indication that the entire output was not consumed,
and is very useful and important in some cases, e.g. when a program
crashes before consuming all input.
Here's a contrived example:
$ seq 1000000 | sort -S 200 -T /foo/bar
sort: cannot create temporary file in '/foo/bar': No such file or directory
seq: write error: Broken pipe
I force "sort" to fail (by limiting its memory usage and pointing it to a
non-existent temporary directory).
It is then good to know that seq's output was cut short and not consumed.
If you know in advance you will trim the output of a program,
either hide the stderr with "2>/dev/null",
or use the shell's "trap PIPE" mechanism.
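A minimal sketch of the first option, assuming you only want to silence sort
and nothing else in the pipeline. The function name first_sorted is made up
for illustration and is not part of coreutils. Be aware that this hides every
other diagnostic from sort as well, not just the "Broken pipe" one.

```shell
# Hypothetical helper: print the first $2 lines of the numbers 1..$1, sorted.
# Only sort's stderr is discarded, so a "Broken pipe" message (if any)
# is not shown. Other sort errors are hidden too.
first_sorted() {
    seq "$1" | sort -n 2>/dev/null | sed "${2}q"
}

first_sorted 99999 5 | wc -l
```

The redirection is attached to sort alone, so errors from seq or sed still
reach the terminal.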
> And no fair saying "just save the output" (could be big) "into a file
> first, and do head(1) or sed(1) on that."
If you want to consume all input and just print the first 5 lines,
you can use "sed -n 1,5p" instead of "sed 5q" - no need for a temporary
file.
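For illustration, here is the loop case from the original report rewritten
that way, as a sketch. The variable names count and err are only there for
the example. Because sed keeps reading until end of input, sort never writes
to a closed pipe; the cost is that the whole output is still produced and
read.

```shell
# sed -n 1,5p prints lines 1-5 but keeps reading to the end of its input,
# so the writer (sort) is never cut short.
count=$(seq 99999 | sort -n | sed -n 1,5p | wc -l)
echo "$count"

# Capture anything the pipeline writes to stderr; it is expected to be empty.
err=$( { seq 99999 | sort -n | sed -n 1,5p >/dev/null; } 2>&1 )
echo "stderr: '$err'"
```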
I'm marking this as a documentation "wishlist" item,
and patches are always welcome.
regards,
- assaf