* Paul Eggert (egg...@cs.ucla.edu) wrote:
> On 9/8/22 14:23, Andreas Dilger wrote:
> > GNU "ls" has this unfortunate misfeature of reading the whole directory to 
> > read+stat every entry in a directory to determine field width before 
> > displaying anything (even if unsorted), and this is a real problem for 
> > large directories on a network filesystem. Please do not inflict this on 
> > tar.
> 
> Yes. In ls's defense, ls output is normally sorted, and the 'ls -U' use case
> is so rare that I expect we haven't bothered to optimize it. Tar is
> different, though, since it never sorts its output.
> 
> The algorithm Tar uses is: output "user/group size" in a field of width 19.
> If that's too narrow, widen the field width to the minimum necessary. This
> may yield stairstepping, but if we increased 19 to a larger value we'd be
> more likely to waste valuable screen real estate. See Tar's
> simple_print_header function for details.
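
To make the quoted description concrete, here is a rough Python sketch of a
"user/group size" field that starts at width 19 and widens only when an entry
does not fit. It is an illustration of the behaviour described above, not a
transcription of tar's C code: the class name UgsField, the left/right
alignment, and the assumption that the widened width persists for later
entries (which is what would produce the stairstepping) are mine.

```python
class UgsField:
    """Hypothetical model of a combined 'user/group size' field.

    Assumption: once widened, the field stays widened for later entries.
    """

    def __init__(self, width=19):
        self.width = width  # initial width taken from the description above

    def format(self, user, group, size):
        owner = f"{user}/{group}"
        size_text = str(size)
        # Minimum room needed: owner, one separating space, then the size.
        needed = len(owner) + 1 + len(size_text)
        if needed > self.width:
            self.width = needed  # widen to the minimum necessary
        # Owner on the left, size right-aligned in the remaining space.
        return owner + size_text.rjust(self.width - len(owner))


if __name__ == "__main__":
    field = UgsField()
    for entry in [("root", "root", 0),
                  ("averylongusername", "averylonggroup", 123456),
                  ("root", "root", 0)]:
        print(field.format(*entry))
```

Running it shows the first line at width 19 and the third line at the wider
width forced by the second entry, even though the first and third entries are
identical.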

Why 'user/group size' and not two separate fields of 'user/group' and 'size'?
For user/group I'd think remembering the largest width seen so far might be
a reasonable heuristic.
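
A minimal Python sketch of that idea, purely for illustration: two columns,
each remembering the widest value it has printed so far. The class name
SplitColumns and the choice to track the size column the same way are
assumptions, not anything tar does today.

```python
class SplitColumns:
    """Hypothetical two-column layout: 'user/group' and 'size'.

    Each column remembers the widest value seen so far, so no
    directory pre-scan is needed before printing the first entry.
    """

    def __init__(self):
        self.owner_width = 0
        self.size_width = 0

    def format(self, user, group, size):
        owner = f"{user}/{group}"
        size_text = str(size)
        # Widths only ever grow, one column at a time.
        self.owner_width = max(self.owner_width, len(owner))
        self.size_width = max(self.size_width, len(size_text))
        return f"{owner.ljust(self.owner_width)} {size_text.rjust(self.size_width)}"


if __name__ == "__main__":
    cols = SplitColumns()
    for entry in [("root", "root", 5),
                  ("dave", "users", 1234),
                  ("root", "root", 5)]:
        print(cols.format(*entry))
```

Output can still shift when a wider name or size turns up, but a long owner
name no longer changes where short sizes line up relative to each other beyond
that one column.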

Dave

> For what it's worth, people who want reproducible builds should be running
> tar with options like "--owner=0 --group=0 --numeric-owner" anyway, and
> tarballs created that way avoid the stairstepping problem.
> 
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/
