Hi Paul,

> Thanks for looking into this. My first reaction is that it's a bit
> complicated and I would omit --set-mtime-format as being more trouble
> than it's worth,

The idea was to use strptime, which is faster than parse_datetime. But
compared to the time consumed by git itself (see below), the saving is
negligible, so I agree.

> > 1. When COMMAND returns empty output (e.g. when given a file that is
> > not in the repository), tar issues a warning message.
>
> In the tzdb example this warning is unnecessary. Some files in a
> tarball are generated by 'Make' and these should use their own
> timestamps (with mtime truncated to integer seconds). We don't need to
> see a warning about this. I expect other projects using this approach
> to be similar.

Yes, perhaps so.

> This is all getting a bit complicated.
>
> Some time ago Bruno Haible suggested a simpler approach, which may be
> better for GNU tar's users. Here's the idea, which requires two passes
> over the input files, the first to collect timestamp info:

I like the idea, although I doubt that its implementation would be less
complicated than the above.

> If GNU Tar had a --reproducible-mtime flag that did the above, that
> would be convenient to use. Ideally I wouldn't have to specify
> --set-mtime-command for the special case of Git; GNU Tar would just do
> the right thing by default for Git.

So we assume that the --reproducible-mtime option would work only for
git, right?

Returning to git timings, I noticed that running "make tarballs" in the
tz tree using the git version of tar runs three times slower than using
tar 1.35. Profiling showed that the additional time was consumed by
git log. To get timings on a more complex repository, I tried gnulib
and was stunned by the results: while simple archiving of the entire
working tree took 0.01 second, archiving it with
--set-mtime-command=git log -1 --format='tformat:%cI' took more than
15 minutes.

Finally, to prove that the problem was on the git side, I did:

  time while read name
  do
    git log -1 --format='tformat:%cI' $name
  done < flist >/dev/null

and got this:

  real    15m0.698s
  user    13m2.866s
  sys     1m58.233s

In short, it looks like we have to find a better way to extract mtimes
from git.

Regards,
Sergey
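
One possible direction, offered only as a rough sketch and not as
anything tar does today: walk the history once with a single git log
call and keep the first (newest) commit date seen for each file, instead
of starting one git log per file. The helper name newest_mtimes and the
'@' marker in front of the date lines are assumptions made for this
illustration. The result can differ from the per-file "git log -1"
output (merge commits, renames, file names that start with '@' or that
git prints quoted), so it would need checking before being relied upon.

```shell
# Hypothetical helper (not part of tar or git). It reads, on stdin, the
# output of
#   git log --format='tformat:@%cI' --name-only
# and prints "DATE<TAB>FILE" for the first commit that lists each file.
# git log emits commits newest first by default, so the first hit is
# normally the most recent commit touching that file.
newest_mtimes() {
    awk '
        /^@/ { stamp = substr($0, 2); next }   # date line: remember it
        /^$/ { next }                          # blank separator line
        !($0 in seen) {                        # first time this file is seen
            seen[$0] = 1
            printf "%s\t%s\n", stamp, $0
        }
    '
}

# Intended use, run from the top of a working tree (assumption):
#   git log --format='tformat:@%cI' --name-only | newest_mtimes > mtimes.tsv
```

The point of the sketch is that the cost becomes one traversal of the
history rather than one traversal per archived file, which is where the
15 minutes above appear to go.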