On 2022-04-25 13:03, Alexey via Bug reports for the GNU Bourne Again
SHell wrote:
There is one more problem with pipes — they are extremely slow.
Examples:
GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
1) Preparation: create two files with ASCII content:
one that ends up as a file and one that ends up as a pipe in the
redirection constructs below.
rm /tmp/testaP; for i in {1..65535}; do echo -n a >> /tmp/testaP; done
rm /tmp/testaF; for i in {1..65536}; do echo -n a >> /tmp/testaF; done
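The preparation loops above append one byte per iteration, which takes a while. A quicker way to get files of the same two sizes is sketched below, assuming `head -c` and `tr` are available (as with GNU coreutils). The helper name `make_test_file` is made up for this example and is not part of the original message.

```shell
# make_test_file PATH SIZE: write SIZE bytes of the letter "a" to PATH.
# (Hypothetical helper, for illustration only.)
make_test_file() {
    head -c "$2" /dev/zero | tr '\0' 'a' > "$1"
}

make_test_file /tmp/testaP 65535   # same size as the first loop produces
make_test_file /tmp/testaF 65536   # same size as the second loop produces

# Confirm the sizes before running any timings.
wc -c /tmp/testaP /tmp/testaF
```

Neither file ends in a newline, which matches what `echo -n a` produces in the loops.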
2) read with here-string:
a) from auto-pipe:
time for i in {1..100}; do read -r a <<<"$(cat /tmp/testaP)"; done
real 0m1.224s
b) from auto-file:
time for i in {1..100}; do read -r a <<<"$(cat /tmp/testaF)"; done
real 0m0.403s
3) read from process substitution (forced pipe):
time for i in {1..100}; do read -r a < <(cat /tmp/testaF); done
real 0m1.165s
The loop is only there to get a more precise timing.
I see similar results in Bash (that is, reading from a pipe appears
significantly slower than reading from a file), but in ksh (93u+) reading
from a pipe or socket appears a bit faster than reading from a file, and
all the ksh tests run significantly faster than in Bash.
I know ksh does some optimizations with sockets (buffer peeking, etc.)
and there are trade-offs with software that does stuff like check
whether the I/O streams are pipes (specifically) to detect whether
they're run as part of a pipeline. Still, I was surprised to see that
the ksh results on pipes are actually faster than on sockets, and that
both are faster than temp files (/tmp on this system is part of the root
filesystem, not a tmpfs).
Point being, unless I've gotten something wrong in how I've set up my
test, it doesn't appear to be an inherent limitation of pipes, but
rather a consequence of how Bash works with them.
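One way to rule out a setup mistake is to confirm that every redirection style actually hands the whole file to `read`. The following is only a sketch for Bash: the temporary file stands in for the test data, and unlike the files above it ends with a newline (an assumption made here so that `read` returns success in all three cases). The variable names are invented for the example.

```shell
#!/usr/bin/env bash
# Sanity check (illustrative only): each form should deliver the same line.
data=$(mktemp)   # stand-in for the test data file
# 249999 letters plus a trailing newline = 250000 bytes
{ head -c 249999 /dev/zero | tr '\0' 'a'; echo; } > "$data"

# Pipeline: in Bash, `read` runs in a subshell here, so the length is
# reported from inside that subshell.
len_pipeline=$(cat "$data" | { read -r a; echo "${#a}"; })

# Process substitution: `read` runs in the current shell.
read -r a < <(cat "$data")
len_procsub=${#a}

# Here-string: $(...) strips the trailing newline, <<< adds one back.
read -r a <<<"$(cat "$data")"
len_herestring=${#a}

echo "pipeline=$len_pipeline procsub=$len_procsub herestring=$len_herestring"
rm -f "$data"
```

If the three lengths agree, the timing differences cannot be explained by one form reading less data than the others.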
Test data follows:
(test_data_2 is a 250000-byte file)
ksh:
# socket
$ time for i in {1..100}; do cat test_data_2 | read -r a; done
real 0m0.76s
user 0m0.32s
sys 0m0.53s
# pipe
$ time for i in {1..100}; do read -r a < <(cat test_data_2); done
real 0m0.63s
user 0m0.26s
sys 0m0.50s
# file
$ time for i in {1..100}; do read -r a <<<"$(cat test_data_2)"; done
real 0m2.00s
user 0m1.60s
sys 0m0.40s
bash:
# pipe: takes approx 80 times longer than ksh
$ time for i in {1..100}; do cat test_data_2 | read -r a; done
real 0m55.002s
user 0m15.676s
sys 0m39.696s
# file: takes approx 4.5 times longer than ksh
$ time for i in {1..100}; do read -r a <<<"$(cat test_data_2)"; done
real 0m8.986s
user 0m7.558s
sys 0m1.563s