"Theodore Ts'o" <ty...@mit.edu> writes:

> On Tue, Nov 26, 2024 at 04:27:37PM +0100, Simon Josefsson wrote:
>> I have never understood what value there is in duplicating the uploaded
>> tarball in the git repository.
>
> The actual cost of storing the pristine tarball is quite small.

I think this is a good example of us talking past each other in this
thread: some people question the value of pristine-tar in the first
place (and I've been persuaded by those arguments), while others argue
that the cost is small and there are no bugs (or at least no bug
reports).

> For example:
>
> commit 91c7ab39337da63371b4814bef2b2aaf85a9e37c (origin/pristine-tar, pristine-tar)
> Author: Theodore Ts'o <ty...@mit.edu>
> Date:   Mon May 20 23:12:54 2024 -0400
>
>     pristine-tar data for e2fsprogs_1.47.1.orig.tar.gz
>
>  e2fsprogs_1.47.1.orig.tar.gz.asc   |  11 +++++++++++
>  e2fsprogs_1.47.1.orig.tar.gz.delta | Bin 0 -> 63961 bytes
>  e2fsprogs_1.47.1.orig.tar.gz.id    |   1 +
>  3 files changed, 12 insertions(+)
>
> Compare if I had to keep all of the old release tarballs around:
>
>     9.5M e2fsprogs-1.47.1.tar.gz
>
> The reason why I find pristine-tar *super* valuable is because it
> stashes the signed tarball and tarball in a highly efficient way, and
> which can be easily backed up by just doing a "git push" to github /
> git.kernel.org / salsa.  I can then just kick off a git-buildpackage
> in a super-convenient way, so the tooling is quite mature and
> convenient for development velocity.
>
> I could imagine an alternate way of generating data for
> git-buildpackage, by replacing the pristine with something that stores
> the detached GPG signature, and then a shell script which generates
> the orig.tar.gz, for example at [1].  But now we'd have third-party
> users who want to rebuild the debian packages from source executing an
> arbitrary shell script found in the git repository to generate the
> orig.tar.gz file, which would be a security nightmare.  Pristine-tar
> is a much better from that perspective.
>
> [1] https://github.com/tytso/e2fsprogs/blob/master/util/gen-git-tarball

Yeah, this is nice, but I appear to have all of that already today,
with git-buildpackage, uscan, origtargz, etc. downloading the upstream
tarball automatically.

If we are worried about malicious upstreams replacing tarballs, or
man-in-the-middle attacks, I think my debian/upstream/*SUMS approach is
a more effective solution to that problem.  Pristine-tar seems like a
tool-centric solution that isn't used elsewhere in the FOSS ecosystem.
Hash checksums are widely used to address these security concerns, and
people know about those concepts even without learning anything about
Debian, let alone git-buildpackage or pristine-tar.
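
To make the checksum idea concrete, here is a minimal sketch of
verifying a downloaded tarball against a checksum file.  It assumes the
file uses the plain sha256sum(1) layout ("<hex digest>  <filename>");
the function names are made up for illustration and are not part of any
existing Debian tool.

```python
import hashlib
from pathlib import Path


def parse_sums(text: str) -> dict[str, str]:
    """Parse sha256sum(1)-style lines: '<hex digest>  <filename>'."""
    sums = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(None, 1)
        # sha256sum marks binary mode with a leading '*' on the name.
        sums[name.lstrip("*")] = digest.lower()
    return sums


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large tarballs need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify(path: Path, sums: dict[str, str]) -> bool:
    """True only if the file is listed and its digest matches."""
    expected = sums.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

A file that is missing from the list is treated as a failure rather
than silently accepted, which seems like the safer default here.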

If we are worried about upstreams going away so that the tarball URLs
no longer work, I like the Guix approach of 1) storing hash checksums
and 2) using a mirror system that falls back to Software Heritage.
That also uses known, established concepts (SHA256 hashes + a URL list)
to solve the problem, without having to learn git-buildpackage or
pristine-tar.
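
The "SHA256 hash + URL list" idea can be sketched roughly as below.
This is not how Guix implements it; it only illustrates the concept,
and the function names and the injectable `fetch` parameter are
assumptions made for the example.

```python
import hashlib
import urllib.request
from typing import Callable, Iterable, Optional


def http_fetch(url: str) -> bytes:
    """Download a URL into memory (fine for a sketch, not for huge files)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def fetch_verified(
    urls: Iterable[str],
    expected_sha256: str,
    fetch: Callable[[str], bytes] = http_fetch,
) -> Optional[bytes]:
    """Try each URL in order; return the first body whose hash matches."""
    for url in urls:
        try:
            data = fetch(url)
        except OSError:
            # Dead upstream or unreachable mirror: move on to the next URL.
            continue
        if hashlib.sha256(data).hexdigest() == expected_sha256.lower():
            return data
        # Hash mismatch: treat this source as untrusted and keep trying.
    return None
```

Because the hash is pinned in the packaging, it does not matter which
mirror ends up serving the bytes; an archive of last resort can simply
go at the end of the URL list.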

/Simon
