On 12/26/24 17:06, Bruno Haible via Gnulib discussion list wrote:
> Rather than assign a silly
> EPOCH_DATE or some other nonsense [1], the idea would be to take
>   timestamp(Y) := max (vmtime(X1), vmtime(X2), ..., vmtime(Xn)).
> where vmtime(X) is defined as:
>   - if X is under version control and not modified locally:
>       $(git log -1 --format=%ct X),
>     that is, the last time file X was modified under version control,
>   - otherwise: mtime(X).
This is roughly what I use for TZDB <https://iana.org/tz>, except with
the following additions:
* timestamp(Y) is 1 + max (...), not merely max (...). This is for
portability to 'make' implementations that consider equal timestamps to
mean that the target needs rebuilding (POSIX suggests this behavior).
* If X is built and is not under version control, then vmtime(X) is
timestamp(X) not mtime(X). This is needed for reproducibility when there
are long dependency chains.
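
For readers who want something concrete, here is a minimal sketch of the rule above in Python. It is not the code TZDB uses. The function names and the `built` mapping (generated files that are not under version control, mapped to their already-computed timestamps) are assumptions made for illustration, and the git helper assumes it is run from inside the work tree.

```python
import os
import subprocess


def git_commit_time(path):
    """Return the commit time of the last commit touching path, or None.

    None means the file is untracked, locally modified, or git failed.
    """
    try:
        # Any status output means the file is untracked or modified locally.
        status = subprocess.run(
            ["git", "status", "--porcelain", "--", path],
            capture_output=True, text=True, check=True,
        ).stdout
        if status.strip():
            return None
        out = subprocess.run(
            ["git", "log", "-1", "--format=%ct", "--", path],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None
    return int(out) if out else None


def vmtime(path, built=None):
    """Version-control-aware mtime of path, in seconds since the epoch.

    built (hypothetical) maps generated files that are not under version
    control to the timestamps already computed for them.
    """
    built = built or {}
    if path in built:
        # Built file: use its computed timestamp, not its on-disk mtime.
        return built[path]
    t = git_commit_time(path)
    if t is not None:
        return t
    return int(os.stat(path).st_mtime)


def timestamp(sources, built=None):
    """Timestamp for a target built from sources: 1 + the newest vmtime."""
    return 1 + max(vmtime(x, built) for x in sources)
```

A target is then processed after its prerequisites, and its result is recorded in `built` so that later targets in a long dependency chain see the computed timestamp rather than whatever mtime the build happened to leave on disk.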
I generate TZDB tarballs with the following GNU Tar options, and
nobody has complained about the tarballs for years so they appear to be
portable:
  --format=pax --pax-option=delete=atime,delete=ctime
  --numeric-owner --owner=0 --group=0
  --mode=go+u,go-w --sort=name
--format=ustar stops working in the year 2242, which is why I changed
TZDB to use --format=pax. All files in the tarball have timestamps with
1-second resolution so the resulting tarball does not use any extensions
to ustar format when all files predate the year 2242.
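
The 2242 figure can be checked with a little arithmetic. The sketch below assumes the usual ustar layout, in which the 12-byte mtime header field holds 11 octal digits plus a terminator; the names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 11 usable octal digits in the ustar mtime field.
USTAR_MTIME_DIGITS = 11
USTAR_MAX_MTIME = 8 ** USTAR_MTIME_DIGITS - 1


def ustar_limit():
    """Return the last instant representable in a ustar mtime field (UTC)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Add a timedelta instead of calling fromtimestamp(), which can
    # reject large values on some platforms.
    return epoch + timedelta(seconds=USTAR_MAX_MTIME)


print(USTAR_MAX_MTIME)            # 8589934591
print(ustar_limit().isoformat())  # 2242-03-16T12:56:31+00:00
```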
My experience with TZDB reproducibility is what led me to write the GNU
Tar manual section quoted at the start of this thread.