On Tue, Mar 12, 2024 at 12:42 PM Kaz Kylheku <k...@kylheku.com> wrote:
> stdbuf is a hack/workaround for programs that ignore the
> issue of buffering. Specifically, programs which send information
> to one of the three standard streams, such that the information
> is required in a timely way. Those streams become fully buffered
> when not connected to a terminal.

When we're talking about very simple programs, like expand, stdbuf is
probably the best solution we're ever going to actually get.

> There can be a performance issue also, though! Suppose
> we run "find" to find certain files over a large file tree.
> It finds only a small number of files: all the file paths
> identified fit into a single buffer, which is not flushed
> until the program terminates (when sent to a pipe).
>
> We pipe this to some program which does some processing
> on those files. We would like the processing to start as
> soon as the first file has been identified, not when find is done!
> It could be that find discovers all the relevant files
> early in its execution and then spends a minute finding
> nothing else. That minute is added to the processing time
> of the files that were found.
>
> That is the compelling reason for wanting file names to
> be flushed individually, whether they are newline terminated
> or null terminated.

An ideal solution for this situation, from the perspective of a relative
layperson, would be to flush a fixed-size buffer once data has sat in it
unflushed for a given length of time. So, if a buffer fills very quickly,
it just gets flushed upon being filled. If data sits in the buffer for a
few too many processor cycles or what have you, it gets flushed right
then. I imagine there would be some overhead to implementing that, which
I don't have a good feel for.
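
To make the flush-on-fill-or-timeout idea concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions: the class name TimedFlushWriter and the size and max_delay parameters are invented for this example and are not part of stdio or stdbuf, and a real implementation inside a C library would look quite different.

```python
import threading


class TimedFlushWriter:
    """Hypothetical wrapper: flush when the buffer fills OR when data has waited too long."""

    def __init__(self, raw, size=4096, max_delay=0.05):
        self._raw = raw              # underlying binary stream, e.g. a pipe
        self._size = size            # flush as soon as this many bytes are pending
        self._max_delay = max_delay  # seconds that data may sit unflushed
        self._buf = bytearray()
        self._lock = threading.Lock()
        self._timer = None           # deadline for the oldest unflushed byte

    def write(self, data):
        with self._lock:
            self._buf += data
            if len(self._buf) >= self._size:
                # Buffer filled quickly: flush right away, as plain full buffering would.
                self._flush_locked()
            elif self._timer is None:
                # First pending byte since the last flush: start the deadline.
                self._timer = threading.Timer(self._max_delay, self.flush)
                self._timer.daemon = True
                self._timer.start()
        return len(data)

    def flush(self):
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        # Caller must hold self._lock.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        if self._buf:
            self._raw.write(bytes(self._buf))
            self._raw.flush()
            self._buf.clear()
```

In the find scenario, a writer like this wrapped around the pipe would hand the first path to the consumer roughly max_delay seconds after it was written, instead of at exit. The cost in this sketch is a lock on every write plus one timer thread per deadline, which is the sort of overhead I was wondering about.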