On 9/8/22 14:23, Andreas Dilger wrote:
GNU "ls" has this unfortunate misfeature of needing to read+stat every entry
in a directory to determine field widths before displaying anything
(even if unsorted), and this is a real problem for large directories on a network
filesystem. Please do not inflict this on tar.

Yes. In ls's defense, ls output is normally sorted, and the 'ls -U' use case is so rare that I expect we haven't bothered to optimize it. Tar is different, though, since it never sorts its output.

The algorithm Tar uses is: output "user/group size" in a field of width 19. If that's too narrow for an entry, widen the field to the minimum necessary and keep the wider field for later entries. This may yield stairstepping, but if we increased 19 to a larger value we'd be more likely to waste valuable screen real estate. See Tar's simple_print_header function for details.
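
To make the stairstepping concrete, here is a minimal Python sketch of a grow-only field width. It is an illustration only, not a transcription of simple_print_header: the class name UgsField, the method name render, and the exact padding layout are assumptions made for this example. Only the starting width of 19 and the "widen to the minimum necessary" rule come from the description above.

```python
class UgsField:
    """Hypothetical sketch of a grow-only "user/group size" field.

    Not tar's actual code; it only mirrors the rule described above.
    """

    def __init__(self, width=19):
        # 19 is the initial field width mentioned in the message.
        self.width = width

    def render(self, user, group, size):
        owner = f"{user}/{group}"
        size_text = str(size)
        # Minimum room needed: owner, one separating space, then the size.
        needed = len(owner) + 1 + len(size_text)
        if needed > self.width:
            # Widen only as far as necessary; the field never shrinks again.
            self.width = needed
        # Owner on the left, size right-aligned in the remaining space.
        return owner + size_text.rjust(self.width - len(owner))


if __name__ == "__main__":
    field = UgsField()
    entries = [
        ("root", "root", 1024),
        ("averyveryverylongusername", "staff", 5),
        ("root", "root", 2048),
    ]
    for user, group, size in entries:
        print(field.render(user, group, size))
```

Each entry is printed as soon as it is seen, so nothing has to be read ahead. The cost is that entries printed before the long owner name keep the narrower field, while every later entry uses the widened one.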

For what it's worth, people who want reproducible builds should be running tar with options like "--owner=0 --group=0 --numeric-owner" anyway, and tarballs created that way avoid the stairstepping problem.
