On Tue, Mar 12, 2024 at 12:42 PM Kaz Kylheku <k...@kylheku.com> wrote:
> stdbuf is a hack/workaround for programs that ignore the
> issue of buffering. Specifically, programs which send information
> to one of the three standard streams, such that the information
> is required in a timely way.  Those streams become fully buffered
> when not connected to a terminal.

When we're talking about very simple programs, like expand, stdbuf is
probably the best solution we're ever going to actually get.

> There can be a performance issue also, though! Suppose
> we run "find" to find certain files over a large file tree.
> It finds only a small number of files: all the file paths
> identified fit into a single buffer, which is not flushed
> until the program terminates (when sent to a pipe).
>
> We pipe this to some program which does some processing
> on those files. We would like the processing to start as
> soon as the first file has been identified, not when find is done!
> It could be that find discovers all the relevant files
> early in its execution and then spends a minute finding
> nothing else. That minute is added to the processing time
> of the files that were found.
>
> That is the compelling reason for wanting file names to
> be flushed individually, whether they are newline terminated
> or null terminated.

An ideal solution for this situation, from the perspective of a
relative layperson, would be to flush a fixed-size buffer once data has
sat in it unflushed for a given period of time. So, if a buffer fills
very quickly, it just gets flushed upon being filled. If data sits in
the buffer for too long, it gets flushed right then. I imagine there
would be some overhead to implementing that, which I don't have a good
feel for.
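
To make the idea concrete, here is a rough sketch in Python of what I
have in mind. It is only an illustration under my own assumptions: the
class name TimedFlushWriter and the size/max_age parameters are made
up, and nothing like this exists in stdio or coreutils as far as I
know. It flushes when the buffer fills, or when the oldest unflushed
data has waited max_age seconds, whichever comes first.

```python
import io
import threading
import time


class TimedFlushWriter:
    """Buffer writes; flush when full or when data has waited too long.

    Hypothetical helper for illustration only, not an existing API.
    """

    def __init__(self, raw, size=4096, max_age=0.2):
        self._raw = raw            # underlying binary stream, e.g. a pipe
        self._size = size          # flush once this many bytes are pending
        self._max_age = max_age    # seconds data may sit unflushed
        self._buf = bytearray()
        self._lock = threading.Lock()
        self._timer = None         # running only while data is pending

    def write(self, data):
        with self._lock:
            self._buf += data
            if len(self._buf) >= self._size:
                # Buffer filled quickly: flush right away.
                self._flush_locked()
            elif self._timer is None:
                # First unflushed data: start the age clock.
                self._timer = threading.Timer(self._max_age, self.flush)
                self._timer.daemon = True
                self._timer.start()
        return len(data)

    def flush(self):
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        # Caller must hold self._lock.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        if self._buf:
            self._raw.write(bytes(self._buf))
            self._raw.flush()
            self._buf.clear()


if __name__ == "__main__":
    sink = io.BytesIO()
    out = TimedFlushWriter(sink, size=8, max_age=0.05)
    out.write(b"abc")            # too small to fill the buffer
    time.sleep(0.3)              # the timer flushes it in the meantime
    print(sink.getvalue())
```

The overhead in this sketch is a lock taken on every write plus one
timer thread whenever data is pending. A C implementation would need
some equivalent (a timer signal or a helper thread), and I can't say
how that cost would compare in practice.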
