Hi all,

I'm trying to create deterministic tarballs across different systems (Linux
and macOS primarily) using GNU tar 1.29. It looks like many people do this
successfully with a few options like --owner=0, --group=0, and
--numeric-owner (among others like --mtime that aren't relevant to this
email).

That usually works fine, but when long filenames come into play,
--numeric-owner no longer keeps user and group names out of the archive.
Here's a simple example:

touch $(printf 't%.0s' {1..101}) && printf 't%.0s' {1..101} | tar -T - \
  --owner=0 --group=0 --numeric-owner -c | grep root

This should create a simple tar with a single file in it whose name is 101
t characters in a row. Given the --numeric-owner flag I pass in, I'd expect
"root" not to appear in the result, but the pipeline above actually does
produce a match. And indeed, if I change my /etc/group to map 0 to another
name, that name will appear in the output. This starts getting painful on
macOS, where group 0 is actually called "wheel".

Poking around in the tar source
(https://github.com/Distrotech/tar/blob/release_1_29/src/create.c#L540), I
see that the write_gnu_long_link function doesn't seem to pay any attention
to --numeric-owner and unconditionally writes the names of uid 0 and gid 0
into the long-name (././@LongLink) header. Is there a reason for that? It
introduces slightly context-sensitive behavior to a process that is
otherwise pretty much pure and fully reproducible (with the flags I mention
above).
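
In case it helps with reproducing this, here is a minimal Python sketch for
checking which header block carries the name. It is not part of tar, and
header_names is a hypothetical helper name. It assumes the standard ustar
field offsets (size at 124, typeflag at 156, uname at 265, gname at 297) and
plain octal size fields, so it will not cope with base-256 sizes or sparse
files.

```python
BLOCK = 512  # tar headers and payloads are laid out in 512-byte blocks


def _field(block, start, length):
    """Return a NUL-terminated text field from a header block."""
    raw = block[start:start + length].split(b"\0", 1)[0]
    return raw.decode("utf-8", "replace")


def header_names(data):
    """Yield (typeflag, name, uname, gname) for each header in a tar byte string.

    Assumption: size fields are octal text, as in ordinary ustar/GNU headers.
    """
    offset = 0
    while offset + BLOCK <= len(data):
        block = data[offset:offset + BLOCK]
        if block == b"\0" * BLOCK:  # an all-zero block marks end of archive
            break
        typeflag = block[156:157].decode("ascii", "replace")
        size = int(block[124:136].strip(b" \0") or b"0", 8)
        yield typeflag, _field(block, 0, 100), _field(block, 265, 32), _field(block, 297, 32)
        # Skip this header plus its payload, rounded up to a whole block.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
```

Feeding it the bytes produced by the tar command above (without the grep)
lists every header in order. The long-name entry is the one with typeflag
"L" and name "././@LongLink", so its uname and gname fields are the ones to
compare against the regular file header that follows it.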

Thanks,
Dan
