On 03/20/2014 02:39 AM, Eldon wrote:
> Hello all,
> I am attempting the following in a bash shell on a 3.8.13 linux kernel:
> sudo tcpdump -nn
> |grep --line-buffered NTPv2
> |split -u --lines=10 --filter=date
>
> Clearly date would be replaced with some more useful script, but for the
> mean time I am trying to use it to debug what I see as unexpected
> buffering. Since the traffic I am looking at is fairly consistent (when
> I pipe to cat instead, I see a steady stream), I would expect to see
> regular ticks as date executions each time 10 lines are sent to
> split. Instead I see flashes of many executions that seem to me to be a
> buffer flush. I looked in the code in the git repo, and it seems that
> the buffer is in fact filled prior to execution. Is it worth making a
> patch or expanding the meaning of -u to pass this through in an
> unbuffered fashion? Would there be a reason to reject such a patch if
> well-formed?
>
> Thoughts?

split(1) doesn't use stdio so we're not hitting that buffering.
What's happening is that we're explicitly buffering input
(into a 64K buffer on my x86_64 system) when using full_read()
since coreutils 4.5.8 with this commit:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=23f6d41

Now we really shouldn't be delaying processing like that,
so I propose we switch back to safe_read().
With the attached, the following command line gives immediate output:

  while true; do seq 5; sleep 1; done | src/split --lines=5 --filter=date

I'll do a full patch later after checking if safe_read()
is appropriate elsewhere, and adding NEWS and maybe a test.

thanks,
Pádraig.

From 61036dc0d7565f24d5dbb7ded217dc411e7e3840 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]>
Date: Thu, 20 Mar 2014 10:00:13 +0000
Subject: [PATCH] split: with --lines, process input immediately

* src/split.c (lines_split): s/full_read/safe_read/.
---
 src/split.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/split.c b/src/split.c
index 29d3dbf..7ff97e3 100644
--- a/src/split.c
+++ b/src/split.c
@@ -584,8 +584,8 @@ lines_split (uintmax_t n_lines, char *buf, size_t bufsize)
 
   do
     {
-      n_read = full_read (STDIN_FILENO, buf, bufsize);
-      if (n_read < bufsize && errno)
+      n_read = safe_read (STDIN_FILENO, buf, bufsize);
+      if (n_read == SAFE_READ_ERROR)
         error (EXIT_FAILURE, errno, "%s", infile);
       bp = bp_out = buf;
       eob = bp + n_read;
@@ -614,7 +614,7 @@ lines_split (uintmax_t n_lines, char *buf, size_t bufsize)
             }
         }
     }
-  while (n_read == bufsize);
+  while (n_read);
 }
 
 /* Split into pieces that are as large as possible while still not more
-- 
1.7.7.6
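
For anyone following along without the gnulib sources to hand, here is a
rough sketch of the behavioural difference the patch relies on. The names
read_once() and read_fully() are hypothetical stand-ins written for this
illustration, not the gnulib implementations of safe_read() and
full_read(); they only assume the usual semantics (one read(2) retried on
EINTR, versus looping until the buffer is full or EOF is reached).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for safe_read(): issue a single read(2),
   retrying only if it is interrupted by a signal.  Returns the number
   of bytes read, 0 at EOF, or (size_t) -1 on error.  It returns as
   soon as *any* data is available, which is what lets split react to
   each small burst arriving on a pipe.  */
static size_t
read_once (int fd, void *buf, size_t count)
{
  assert (buf != NULL);
  for (;;)
    {
      ssize_t n = read (fd, buf, count);
      if (n >= 0)
        return (size_t) n;
      if (errno != EINTR)
        return (size_t) -1;
    }
}

/* Hypothetical stand-in for full_read(): keep reading until COUNT
   bytes have been collected, or EOF or an error occurs.  On a slow
   pipe this blocks until the whole buffer fills, which is the delay
   reported in this thread.  errno is set to 0 on EOF so a caller can
   tell a short read at EOF from a failure.  */
static size_t
read_fully (int fd, void *buf, size_t count)
{
  assert (buf != NULL);
  char *p = buf;
  size_t total = 0;

  while (count > 0)
    {
      size_t n = read_once (fd, p, count);
      if (n == (size_t) -1)
        break;                  /* errno left as set by read(2) */
      if (n == 0)
        {
          errno = 0;            /* plain EOF, not an error */
          break;
        }
      total += n;
      p += n;
      count -= n;
    }
  return total;
}
```

With a 64K buffer and a producer emitting a few lines per second,
read_fully() would not return until 64K had accumulated (or the producer
exited), whereas read_once() hands back each burst as it arrives. That is
also why the loop condition in the patch changes from comparing against
bufsize to simply testing for a non-zero read.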
