On 30/06/2025 00:42, Bruno Haible via GNU coreutils Bug Reports wrote:
Jim Meyering wrote:
That is an option no GNU system needs, since they've all had tac since
before 1992-era textutils.
But 'tac' does not have a line-number-limit argument.
The POSIX rationale [1] has
"While both
tail -n$n | tac
and
tac | head -n$n
can be used to output a fixed length of reversed line output, the
standard developers decided that it was preferable to have a single
utility tail -r -n$n for the same purpose."
Right, these produce the same output, so it's only worth considering
the more efficient one: tail -n$n | tac
The second of these alternatives, 'tac | head -n$n' will not work well
with non-seekable files: it requires 'tac' to buffer the *entire* input
(as huge as it may be), before extracting a few lines of it.
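
The memory difference between the two pipelines can be sketched in Python. This is only a model of the pipelines' behaviour, not how tail or tac are implemented, and the function names are made up for illustration. The first variant never holds more than n lines; the second has to hold the whole input before it can emit anything.

```python
from collections import deque
from typing import Iterable, List


def tail_then_reverse(lines: Iterable[str], n: int) -> List[str]:
    """Model of 'tail -n$n | tac': keep only the last n lines, then reverse."""
    # A bounded deque discards the oldest line whenever a new one arrives,
    # so at most n lines are ever held in memory.
    last_n = deque(lines, maxlen=n)
    return list(reversed(last_n))


def reverse_then_head(lines: Iterable[str], n: int) -> List[str]:
    """Model of 'tac | head -n$n': buffer everything, reverse, keep n."""
    everything = list(lines)  # the entire input is held in memory here
    everything.reverse()
    return everything[:n]
```

Both functions return the same list for the same input; only the peak memory use differs.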
The first alternative looks better: 'tail -n$n | tac'. But thinking
through it, it seems the logic that 'tail' uses for 'tail -n$n' is
also nearly suitable for 'tail -r -n$n':
- In function file_lines(), instead of calling dump_remainder at the
end, the loop would call xwrite_stdout once for each line (with
special considerations for lines that span more than 1 buffer).
- In function pipe_lines(), all the relevant data is in memory at
the end. It's only a question of doing the xwrite_stdout calls
on smaller pieces and in reverse order.
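
The second case above (all relevant data already in memory) can be sketched in Python as follows. This is a simplified illustration, not the coreutils code: it assumes a single contiguous buffer, so the "lines that span more than 1 buffer" complication does not arise, and last_lines_reversed is a hypothetical name.

```python
from typing import Iterator


def last_lines_reversed(buf: bytes, n: int) -> Iterator[bytes]:
    """Yield up to n trailing lines of buf, last line first.

    Each piece keeps its terminating newline, except possibly the final
    line of the input if that line was not newline-terminated.
    """
    end = len(buf)
    while n > 0 and end > 0:
        # Find the newline that ends the previous line, skipping the
        # newline (if any) at end - 1 that terminates the current line.
        start = buf.rfind(b"\n", 0, end - 1) + 1  # 0 if no earlier newline
        yield buf[start:end]
        end = start
        n -= 1
```

Writing each yielded piece separately, for example with sys.stdout.buffer.writelines(last_lines_reversed(data, n)), corresponds to the idea of one output call per line in reverse order.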
When implemented this way, this will be more efficient than spawning
'tac' as a separate subprocess.
That's not really the Unix model, though.
Having separate processes also implicitly leverages multiple processors,
so you'd have to account for that when comparing efficiency.
Having said all that, I'm not strongly against it,
especially since POSIX standardised it,
but I'm just surprised they standardised it.
Note there are cases where merging functionality can have algorithmic
advantages,
in which case there is a much stronger argument for merging.
For example, we have previously mentioned that sort --tail=$n or --head=$n
would be useful (and more commonly required) functionality.
See: https://lists.gnu.org/archive/html/bug-coreutils/2004-04/msg00157.html
It would be especially useful if implemented in O(N log n) complexity
(N input lines, n output lines), rather than the O(N log N) of a full sort.
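
As a rough illustration of that algorithmic advantage, here is a Python sketch using the standard heapq module, which keeps only n candidates at a time instead of sorting all N input lines. The --head and --tail options are only a proposal in this thread, not existing sort flags; sort_head and sort_tail are hypothetical names, and the comparison is Python's plain string ordering rather than locale collation.

```python
import heapq
from typing import Iterable, List


def sort_head(lines: Iterable[str], n: int) -> List[str]:
    """Return the first n lines of the sorted input, like 'sort | head -n$n'."""
    # nsmallest tracks only n candidates while scanning the input once.
    return heapq.nsmallest(n, lines)


def sort_tail(lines: Iterable[str], n: int) -> List[str]:
    """Return the last n lines of the sorted input, like 'sort | tail -n$n'."""
    # nlargest returns its result in descending order, so reverse it
    # to match the ascending order that sort would print.
    return heapq.nlargest(n, lines)[::-1]
```

For example, sort_head(["pear", "apple", "fig", "kiwi"], 2) gives ["apple", "fig"] without ever fully ordering the four inputs.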
cheers,
Padraig