On 9/8/22 14:23, Andreas Dilger wrote:
GNU "ls" has this unfortunate misfeature of reading the whole directory to read+stat every entry in a directory to determine field width before displaying anything (even if unsorted), and this is a real problem for large directories on a network filesystem. Please do not inflict this on tar.
Yes. In ls's defense, ls output is normally sorted, and the 'ls -U' use case is so rare that I expect we haven't bothered to optimize it. Tar is different, though, since it never sorts its output.
The algorithm Tar uses is: output "user/group size" in a field of width 19. If that's too narrow, widen the field to the minimum width necessary. This may yield stairstepping, but if we increased 19 to a larger value we'd be more likely to waste valuable screen real estate. See Tar's simple_print_header function for details.
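To make the stairstepping concrete, here is a rough Python sketch of that widening rule. It is not Tar's actual code: the class name UgsFormatter and its format method are made up for illustration, and the sketch assumes that once the field has been widened it stays widened for the entries that follow.

```python
class UgsFormatter:
    """Illustrative only: formats "user/group size" in a self-widening field."""

    def __init__(self, width=19):
        # Initial field width, taken from the description above.
        self.width = width

    def format(self, user, group, size):
        owner = f"{user}/{group}"
        size_text = str(size)
        # Minimum width that fits the owner, one separating space, and the size.
        needed = len(owner) + 1 + len(size_text)
        if needed > self.width:
            # Too narrow: widen to the minimum necessary (assumed to persist).
            self.width = needed
        # Left-justify the owner and right-justify the size within the field.
        padding = self.width - len(owner) - len(size_text)
        return owner + " " * padding + size_text


if __name__ == "__main__":
    fmt = UgsFormatter()
    for user, group, size in [
        ("root", "root", 1024),
        ("averyverylonguser", "averylonggroup", 123456789),
        ("root", "root", 1024),
    ]:
        print(fmt.format(user, group, size))
```

In this sketch no entry is read ahead of time: each line is formatted as soon as it is seen, and the only state carried forward is the current width. The third entry therefore comes out wider than the first, which is the stairstep.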
For what it's worth, people who want reproducible builds should be running tar with options like "--owner=0 --group=0 --numeric-owner" anyway, and tarballs created that way avoid the stairstepping problem.