On Sat, Nov 16, 2024 at 09:35:57PM +0000, Gavin Smith wrote:
> >
> > The downside is that if tar is different from the tar that generated the
> > tarball and the tarball is different even though it contains the same
> > information it will lead to spurious differences in git. I have no idea
> > to what extent different tar give byte-compatible output. Most
> > developpers will be using a recent GNU tar, but we do not know for sure.
>
> Good point. If the file continually shows differences in git then
> we should untrack it.

I regenerated the tarball on macOS and the result is different. Looking
at a binary diff, the format is different (ustar from GNU/Linux, pax
from macOS), the file order is different, and so are the user
identifiers. I did the same on Debian testing: the format is ustar
there too, but there are differences in the headers (even for the first
files, which are the same files), and the user names and the file order
differ as well.

I am wondering whether we will have to compare the contents of the
tarballs when a new one is about to be generated and one already
exists.

There is also the issue of reproducible distributed sources. It would
be good if we could make sure that the distributed sources can be
reproduced, at least with a specified tar that is present on every
platform. We may also have to allow the user to customize the tar
program name, if only to use the same one as the maintainers for a
reproducible distribution.

-- 
Pat
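
P.S. To make the idea of comparing tarball contents concrete, here is a
rough sketch in Python using the standard tarfile module. It is only an
illustration, not a proposal for the build system: the function names
tar_content_digest and same_content are made up for this example. It
assumes that only member names, file data, member types and link
targets matter, and it ignores exactly the things that differed in my
tests (archive format, file order, user names and identifiers).

```python
import hashlib
import tarfile


def tar_content_digest(path):
    """Map each member name to a summary of its content.

    Hypothetical helper for illustration. Header fields such as the
    archive format (ustar or pax), member order, user names, user ids
    and timestamps are deliberately ignored.
    """
    digests = {}
    # "r:*" lets tarfile detect the format and any compression itself.
    with tarfile.open(path, "r:*") as archive:
        for member in archive:
            if member.isfile():
                data = archive.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
            else:
                # Directories, symlinks, etc.: keep type and link target.
                digests[member.name] = (member.type, member.linkname)
    return digests


def same_content(old_tarball, new_tarball):
    """Return True if both tarballs hold the same files with the same data."""
    return tar_content_digest(old_tarball) == tar_content_digest(new_tarball)
```

With something like this, a newly generated tarball would replace the
existing one only when same_content() returns False, which should avoid
spurious differences in git. It does not address reproducibility of the
distributed tarball itself.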