Re: Security review of tag2upload

2024-06-26 Thread Salvo Tomaselli
> 
> Building a source package is a lot more opaque and gives the attacker a
> lot more room to hide.  Adding malicious code to tar to inject something
> into source packages is a lot quieter

How many packages have a pubkey for the orig file?

Perhaps we should encourage upstreams to sign more?

I guess that means giving up pypi as a place to download from, since they have 
removed support for signatures.

But for example kde tarballs are all signed.

-- 
Salvo Tomaselli

"I do not feel obliged to believe that the same God who has endowed us with
sense, reason, and intellect has intended us to forgo their use."
-- Galileo Galilei

https://ltworf.codeberg.page/




Re: Security review of tag2upload

2024-06-26 Thread Jonas Smedegaard
Quoting Salvo Tomaselli (2024-06-26 09:25:37)
> > 
> > Building a source package is a lot more opaque and gives the attacker a
> > lot more room to hide.  Adding malicious code to tar to inject something
> > into source packages is a lot quieter
> 
> How many packages have a pubkey for the orig file?
> 
> Perhaps we should encourage upstreams to sign more?

What I have learned from all this is that we should not encourage signing
more, but encourage cautiously signing more.

Both ourselves and our upstreams.

My point being that signatures have little value if automated or done
manually without related examination.

I am sure that's also what you meant, Salvo, I just find it quite
relevant to be explicit that it is the care that needs a boost, not the
amount of signatures.

 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



Re: Security review of tag2upload

2024-06-26 Thread Salvo Tomaselli
> I am sure that's also what you meant, Salvo, I just find it quite
> relevant to be explicit that it is the care that needs a boost, not the
> amount of signatures.

Well if you manually check very carefully every line and then don't sign… it's 
harder to discover it got modified.

-- 
Salvo Tomaselli

"I do not feel obliged to believe that the same God who has endowed us with
sense, reason, and intellect has intended us to forgo their use."
-- Galileo Galilei

https://ltworf.codeberg.page/




Re: Summary of the current state of the tag2upload discussion

2024-06-26 Thread Matthias Urlichs

On 25.06.24 23:14, Salvo Tomaselli wrote:

> I think that the very same people who never check what's in a tarball are very
> unlikely to start checking diffs.


IMHO you're mistaken.

(a) checking the source package is not a one-liner. You need to untar to 
someplace temporary, run a recursive diff (remembering to not skip new 
files), then clean up the tempdir.


On the other hand, "git log --patch up..deb" is one simple command; you 
even can add a shell alias or git alias for it.


(b) people (both the maintainer and others) routinely look at git 
changelogs, including with --patch or --stat.
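
To illustrate why the check in (a) is more than a one-liner, here is a minimal sketch in Python. It assumes the source package has already been unpacked into a directory (for example with `dpkg-source -x`) and that a clean checkout of the git tree sits next to it. The function names `file_digests` and `tree_differences` are hypothetical, and the sketch skips symlinks and ignores file modes, which a real check would have to cover.

```python
import hashlib
import os


def file_digests(root):
    """Map each regular file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git metadata so a checkout can be compared with an unpacked package.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # symlinks are not compared in this sketch
            with open(full, "rb") as fh:
                digests[os.path.relpath(full, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def tree_differences(checkout_dir, unpacked_dir):
    """Return (only in checkout, only in package, changed) as sorted path lists."""
    a = file_digests(checkout_dir)
    b = file_digests(unpacked_dir)
    only_in_checkout = sorted(set(a) - set(b))
    only_in_package = sorted(set(b) - set(a))  # new files are easy to overlook
    changed = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_in_checkout, only_in_package, changed
```

The second list is the one that a plain diff invocation tends to hide unless new files are requested explicitly.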


I have no idea how unlikely my personal preferred workflow is, being a 
sample size of one, but I have literally never examined a just-assembled 
source package. On the other hand I run various "git log" commands 
habitually, and based on the nonsense I did find on several of those 
occasions I believe I'd notice strange changes pretty soon(ish).


--
-- kind regards
--
-- Matthias Urlichs





Re: Summary of the current state of the tag2upload discussion

2024-06-26 Thread Didier 'OdyX' Raboud
On Tuesday, 25 June 2024, 19:14:53 CEST, Russ Allbery wrote:
> tho...@goirand.fr writes:
> > Watch the Kosovo lightning talk where Didier shows what he did. It is a
> > proven concept.
> 
> If this is the proof of concept where the *.dsc file is encoded in a Git
> tag (sorry, there have been several proofs of concept and I lose track of
> which person was associated with which one), I understand it and have
> already said why I don't think it will work reliably enough in its current
> form.  (Summary: It relies on the reproducibility of tar and compression
> programs.)  We should measure reproducible source packages by comparing
> the unpacked source packages.  It's a lot more robust.

For the record: I agree with Russ.

My PoC was written out of the desire to let Salsa CI upload _after successful 
tests_, and I took the "inline the source package as built locally in a git 
object" route for that. It inlines the .dsc and the _source.changes. As Russ 
(and others) have demonstrated, this kind of concept is going to be very 
fragile and very susceptible to breakage (with origins in varying versions of 
dpkg, tar, git-archive, pristine-tar, source building workflows, etc.): in
other words, it relies on the reproducibility of _source packages_, that is,
reproducibility of the conversion from a git repository to a source package
(which is out of scope for the Reproducible Builds team, last I checked).

It also falls short on its original promise: given that (a modified, inlined 
version of) the signed source package is pushed to Salsa, the uploader has 
already vouched for the upload to Debian, even before the "required" tests
have run.

The workflow I proposed was to delegate the actual upload to a CI job, but it
has no git-centric or security-centric properties beyond those of an uploader
pushing a git tag and uploading to DELAYED/1 simultaneously, and then
promoting or cancelling depending on the CI test results (besides a bit less
manual work).

Another caveat with that approach, is that it clearly needs a Debian-specific 
tool to build the inline data and add it to git (and then another to unpack it 
and upload). As I wanted something auditable, I inlined text signatures and 
refused to inline things out of the source tree, but it would be easy to 
commit the actual signed source binaries to git and get the same process (and 
effect). But then there's no point in using git, just upload it!

I think the tag2upload is a much better and stronger approach than my PoC for 
these reasons:
- the dgit tools cover way more ground in terms of supported git workflows, to 
get closer to reproducible source package building from git
- the PoC's promise to "only upload after successful CI" is nice, and not 
covered by tag2upload (yet?), but is not really important in the big picture; 
it's not such a fundamental property for a git workflow that it is worth 
blocking the tag2upload proposal for. (and it's likely quite simple to get 
that with tag2upload)
- tag2upload (especially with multiple, parallel instances of it) would test 
reproducibility of the git-to-source process that is currently absolutely 
untested. I totally concur with the impression that almost nobody is actually 
dissecting source packages (as signed, I really mean the orig tar and 
.debian.tar files) before upload; as well as the fact that even those doing so 
are very likely to miss anything unusual in there, making the check mostly 
useless.

OdyX




Re: Summary of the current state of the tag2upload discussion

2024-06-26 Thread Didier 'OdyX' Raboud
On Tuesday, 25 June 2024, 22:13:53 CEST, Philip Hands wrote:
> Aigars Mahinovs  writes:
> > Do you actually check that the contents of the source *package* (after all
> > operations done by dpkg-source and possibly other tools) actually match
> > what you were looking at before in your source work tree folder?
> 
> Until this thread, the idea that doing so might be prudent had not even
> occurred to me TBH.
> 
> Now that it has, it also occurs to me that if I actually were subject to
> an attack that was attempting to sneak something in at this point, my
> system might well have been tampered with to render it unable to detect
> the change (by replacing diff with a version blind to the changes etc.)

Following on the red team idea from Russ; if dpkg-source added a "# report a 
bug to dpkg-source if you see me" comment in debian/rules at build time 
(hidden in the .debian.tar, but not present in the local directory), I would 
not be surprised if this was only detected by casual readers of 
sources.debian.org, or NMUers, but not by any uploaders. And I'd bet that this 
would span several hundreds of uploads before being detected (and of course, 
this would affect tag2upload similarly).

But if this is done not as an attack on the dpkg-source package, but just as a 
local compromise of $PATH on a DD's laptop, who would detect it? I certainly 
wouldn't have.

-- 
OdyX




Re: Security review of tag2upload

2024-06-26 Thread Jonas Smedegaard
Quoting Salvo Tomaselli (2024-06-26 10:29:53)
> > I am sure that's also what you meant, Salvo, I just find it quite
> > relevant to be explicit that it is the care that needs a boost, not the
> > amount of signatures.
> 
> Well if you manually check very carefully every line and then don't sign…
> it's harder to discover it got modified.

Yes, I agree.

We can encourage upstream to...

a) sign
b) sign what is carefully checked
c) carefully check (and not sign)

Assuming good faith, I believe that in your earlier email you meant b)
although you only wrote a).  Now in your followup email you say that c)
is not as good.

I agree with your earlier, explicit point that a) is good.

I agree with your earlier (assumed) implied point, that b) is good.

I agree with your new point, that c) is not as good.

My point is that *explicitly* encouraging b) is better than *implicitly*
encouraging it by explicitly saying only a).


 - Jonas

-- 
 * Jonas Smedegaard - idealist & Internet-arkitekt
 * Tlf.: +45 40843136  Website: http://dr.jones.dk/
 * Sponsorship: https://ko-fi.com/drjones

 [x] quote me freely  [ ] ask before reusing  [ ] keep private



tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Simon Richter

Hi,

We basically have three cases:

1. upstream has an official .orig.tar.* file we can use

In my opinion, we'd want to use this because we don't need to explain 
how it was generated, and any signature from upstream can be used to 
verify that we are shipping exactly their release.


I'm aware that there is disagreement over this point, and there is a 
faction that would like us to rebuild upstream archives from git tags to 
avoid problems like we had with xz-utils, but without an easy way for 
users to verify that an archive corresponds to upstream git, we're 
mainly introducing an explanation why signatures do not match and should 
be disregarded.


In this case, I'd like a tag2upload service to have a mechanism to 
ensure the upload will use the correct file -- i.e. a mismatch in 
pristine-tar settings will not cause the file to be rebuilt differently 
and subsequently uploaded because there is no verification step between 
constructing the source package and uploading it.


2. upstream has no official release, but a git tag we can use

Here, it obviously makes sense to use git-archive.

3. upstream needs to be repacked for dfsg reasons

So far, I believe this has no good representation in git, and the 
packages that do this basically generate a dfsg orig.tar.* file and 
reimport this into git -- which is pretty much the least ideal 
situation, because we have no links to the upstream repo.


For 1. and 2., what I'd like to kind of see as part of the interface to 
a tag2upload service is a way to explicitly specify what kind of .orig 
archive should be constructed, and this needs to become a condition for 
actually uploading, so the magic tag containing maintainer intent would 
explicitly say "the .orig archive needs to have this sha256sum" for the 
pristine-tar case, and "the .orig archive needs to have a git extended 
pax header containing this sha1sum/sha256sum" for the second.
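
As a rough illustration of the first condition, the sketch below checks a constructed .orig archive against a digest pinned by the maintainer. It is written in Python and is not part of any existing tag2upload interface: the function names `sha256_of_file` and `orig_matches_pin` are hypothetical, and how the pinned digest would be carried in the tag is left open. The git pax header condition for the second case is not covered here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def orig_matches_pin(orig_path, pinned_sha256):
    """True only if the archive on disk has exactly the pinned digest."""
    return sha256_of_file(orig_path) == pinned_sha256.strip().lower()
```

A service following this idea would refuse to upload whenever `orig_matches_pin` returns False, rather than silently shipping a differently rebuilt archive.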


I have no good idea for dfsg repacks so far.

   Simon



Re: Summary of the current state of the tag2upload discussion

2024-06-26 Thread Sam Hartman
> "Matthias" == Matthias Urlichs  writes:

Matthias> A reproducibility checker for t2u seems like child's play,
Matthias> compared to that effort. While no t2u checker currently
Matthias> exists, somebody might be motivated enough to write
Matthias> one. (Hint, hint …)

You don't even need a reproducibility checker; you just need to have a
verification checker.  I.E. some independent tool that takes the dsc and
git tag and confirms that the transformations between the two of them
are acceptable.

This is I think the security property you actually want, not
reproducibility.
Assuming no undesired changes have been introduced into tag2upload, it's
reasonably easy to argue that reproducibility gives you this property.
It is not the only way to approach verification though.
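
One possible shape for such a verification checker, as a minimal Python sketch. It assumes both sides have already been reduced to manifests mapping relative paths to content digests, and it treats "acceptable" as "identical, except for files under an explicit allow-list of generated paths". The function name `verify_transformation`, the `allowed_generated` parameter, and that policy are all assumptions made for illustration.

```python
def verify_transformation(tag_files, dsc_files, allowed_generated=()):
    """Return a list of problems; an empty list means the result is acceptable.

    tag_files and dsc_files map relative paths to content digests.
    allowed_generated lists path prefixes the conversion may add (assumed policy).
    """
    problems = []
    # Everything in the git tag must arrive in the source package unchanged.
    for path, digest in sorted(tag_files.items()):
        if path not in dsc_files:
            problems.append(f"missing from source package: {path}")
        elif dsc_files[path] != digest:
            problems.append(f"content differs: {path}")
    # Anything extra must be explained by the allow-list.
    for path in sorted(set(dsc_files) - set(tag_files)):
        if not any(path.startswith(prefix) for prefix in allowed_generated):
            problems.append(f"unexpected file in source package: {path}")
    return problems
```

Unlike a reproducibility check, this does not rebuild anything; it only judges the relationship between two existing artifacts.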




Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Matthias Urlichs

On 26.06.24 13:18, Simon Richter wrote:

> 3. upstream needs to be repacked for dfsg reasons
>
> So far, I believe this has no good representation in git […]


IMHO this is a mostly-solved problem.

You can feed hashes of the offenders to "git filter-repo 
--strip-blobs-with-ids ‹filename›". This operation is idempotent and 
deterministic.


If we add these hashes to a file, let's say d/source/dfsg-filtered, we 
can thus reproducibly generate a dfsg-compliant version of whichever 
upstream commit or tag we want, and of course generate a tarball from 
there if required.
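
The idea can be sketched without running git at all. The Python below computes the object id git assigns to a blob (SHA-1 over a `blob <length>\0` header plus the content, so this assumes a SHA-1 repository) and drops every file whose id is on an exclusion list. It does not call git filter-repo; the function names `git_blob_id` and `dfsg_filter` are hypothetical, and the in-memory mapping of path to bytes stands in for a real tree.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Compute the SHA-1 object id git assigns to a blob with this content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def dfsg_filter(files, excluded_ids):
    """Drop every file whose blob id is listed; files maps path -> bytes."""
    excluded = set(excluded_ids)
    return {path: data for path, data in files.items()
            if git_blob_id(data) not in excluded}
```

Because the decision depends only on content hashes, applying the filter twice gives the same result as applying it once, which is the idempotence property mentioned above.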


--
-- regards
--
-- Matthias Urlichs





Any reference of ftpmaster does not want to accept tag2upload (Was: [RFC] General Resolution to deploy tag2upload)

2024-06-26 Thread Andreas Tille
Hi folks,

replying to some "random" mail in this thread since I do not manage to
follow everything.  My pragmatic thought when browsing my mailbox is that
I'm honestly wondering how many source packages the contributors in this
thread could have created manually in the time that was used to write
those emails.  It's possibly the price we have to pay for some progress.

IMHO we should really make sure that the payback period for this (time)
investment is less than 10 years.  So what about a short-cut in this discussion?


On Sun, Jun 23, 2024 at 10:16:38AM -0700, Russ Allbery wrote:
> So far as I know, no one in this discussion has ever asked for the FTP
> team to deploy tag2upload.  The only hard request of the FTP team is to
> not block uploads made with it.  If the FTP team refuses to do any work
> whatsoever on anything related to tag2upload, it is still possible to
> deploy it (with some assistance from other teams such as DSA, of course).

I would really love to see some mails / logs of discussion between
tag2upload developers and ftpmaster team.  Is there any chance that we
could bring the involved parties in one (virtual) room and discuss the
thing directly?  I'd love to serve as moderator in this room (with my
DPL hat on).

Kind regards
   Andreas. 

-- 
https://fam-tille.de




Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Russ Allbery
Simon Richter  writes:

> 1. upstream has an official .orig.tar.* file we can use

> In my opinion, we'd want to use this because we don't need to explain
> how it was generated, and any signature from upstream can be used to
> verify that we are shipping exactly their release.

> I'm aware that there is disagreement over this point, and there is a
> faction that would like us to rebuild upstream archives from git tags to
> avoid problems like we had with xz-utils, but without an easy way for
> users to verify that an archive corresponds to upstream git, we're
> mainly introducing an explanation why signatures do not match and should
> be disregarded.

I think there are two cases here: upstream produces a tarball release as
their official release artifact and produces a Git tag as a side effect or
doesn't make a Git tag at all, or upstream produces both a tarball release
and a Git tag and treats them both as first-class release artifacts.

The first case is the weakest case for tag2upload until it has support for
upstream tarballs.  I think there are various ways to add that support
that aren't too bad (git-lfs for instance) and don't require pristine-tar,
but it's future work and they're not supported now.

The second case seems fine with tag2upload?  Particularly if upstream
signs the Git tag.  Instead of pointing to a possibly signed release
tarball, the tag2upload tag points to a signed upstream Git tag.  We get
basically the same properties and avoid dealing with opaque upstream
tarballs.

Obviously this depends on what things are added to the release tarball,
and there are a bunch of cases with gnulib, etc., where it's difficult to
reproduce what upstream does during the release process for one reason or
another.  But there are a lot of upstreams for which this is not the case.

-- 
Russ Allbery (r...@debian.org)  



Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Ian Jackson
Simon Richter writes ("tag2upload, reproducible .orig and dfsg repacks"):
> For 1. and 2., what I'd like to kind of see as part of the interface to 
> a tag2upload service is a way to explicitly specify what kind of .orig 
> archive should be constructed,

This is already there.

Firstly, everything I'm about to say applies only to a new upstream
version upload.  For an existing upstream version, the existing .orig
from the archive is obtained and reused, so no orig construction takes
place.

Secondly, there is only currently one supported way to generate an
orig: git-deborig aka git-archive.  So the protocol in this area is
fairly simple, and the possible extension to support other orig
generation modes is not described in the document.

The relevant protocol elements are the `upstream=` and `upstream-tag=`
keywords in the tag2upload tag metadata.  tag2upload(5) says:

 | The orig tarball will be generated with "git archive", as invoked
 | by "git deborig".
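
For illustration only, here is a Python sketch of how a consumer might pick those two keywords out of a tag message. It assumes the metadata appears as whitespace-separated `key=value` tokens on a bracketed line, and that both keywords must be present for a new upstream version; the authoritative format is the one in tag2upload(5), which is not reproduced here. The function names `parse_metadata` and `orig_request` are hypothetical.

```python
import re


def parse_metadata(tag_message):
    """Collect key=value tokens from bracketed lines of a tag message."""
    found = {}
    for line in tag_message.splitlines():
        match = re.fullmatch(r"\[(.*)\]", line.strip())
        if not match:
            continue
        for token in match.group(1).split():
            key, sep, value = token.partition("=")
            # Ignore bare words and option-style tokens such as --name=value.
            if sep and not key.startswith("-"):
                found[key] = value
    return found


def orig_request(tag_message):
    """Return (upstream tag, upstream commit id) or raise if either is absent."""
    meta = parse_metadata(tag_message)
    missing = [k for k in ("upstream", "upstream-tag") if k not in meta]
    if missing:
        raise ValueError("missing keywords: " + ", ".join(missing))
    return meta["upstream-tag"], meta["upstream"]
```

With a made-up message line such as `[example please-upload upstream-tag=v1.0 upstream=0123abc]`, `orig_request` would return the tag name and the commit id that the orig is to be generated from.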

If we were to support pristine-tar we would include that information
in the tag.  Probably, we'd specify the pristine-tar branch commitid,
and a ref to obtain it from (maybe the pristine-tar branch).  But, I
haven't designed this in detail.

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Simon Richter

Hi,

On 6/27/24 00:41, Russ Allbery wrote:


> The second case seems fine with tag2upload?  Particularly if upstream
> signs the Git tag.  Instead of pointing to a possibly signed release
> tarball, the tag2upload tag points to a signed upstream Git tag.  We get
> basically the same properties and avoid dealing with opaque upstream
> tarballs.


The one property we don't get is "our orig archive is bitwise identical 
with what is on upstream's release page" -- which is a *very* important 
property if I'm being asked to sponsor a package, as it saves me a long 
investigation.



> Obviously this depends on what things are added to the release tarball,
> and there are a bunch of cases with gnulib, etc., where it's difficult to
> reproduce what upstream does during the release process for one reason or
> another.  But there are a lot of upstreams for which this is not the case.


In my packages the git tree does not contain any autogenerated files, 
which means that people using it will have to run autogen.sh. I think 
pretty much everyone else using autotools is doing the same.


   Simon



Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Ian Jackson
Ian Jackson writes ("Re: tag2upload, reproducible .orig and dfsg repacks"):
...
> Secondly, there is only currently one supported way to generate an
> orig: git-deborig aka git-archive.

An implication, which I should state explicitly, is that if you want
something else (eg, to be sure to use unmodified upstream tarballs), you
cannot use tag2upload for your new-upstream-version uploads, until
some kind of support for that scenario is implemented.

(While I'm here, what Russ said in his reply is correct.)

Ian.

-- 
Ian Jackson   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.



Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Simon Richter

Hi,

On 6/27/24 01:16, Ian Jackson wrote:


> Secondly, there is only currently one supported way to generate an
> orig: git-deborig aka git-archive.  So the protocol in this area is
> fairly simple, and the possible extension to support other orig
> generation modes is not described in the document.


So if I use pristine-tar, it is very important that new upstream 
versions are not uploaded through tag2upload, or future uploads until 
the next upstream release also have to go through tag2upload, and the 
.orig archive will fail validation if we later on get a service to check 
the archive contents against git?


Would it make sense to lock this out for the time being so we don't 
accidentally upload a repacked .orig after taking a lot of care to store 
the upstream archive in pristine-tar?


That happens way too often already when I do this manually -- my git 
workflow includes building the source package and verifying that it is 
indeed reproduced correctly, and that is one of the reasons I find the 
git workflows tedious if I'm the sole maintainer of a package, and why I 
end up force-pushing quite a lot to salsa.


   Simon



Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Russ Allbery
Simon Richter  writes:
> On 6/27/24 00:41, Russ Allbery wrote:

>> The second case seems fine with tag2upload?  Particularly if upstream
>> signs the Git tag.  Instead of pointing to a possibly signed release
>> tarball, the tag2upload tag points to a signed upstream Git tag.  We
>> get basically the same properties and avoid dealing with opaque
>> upstream tarballs.

> The one property we don't get is "our orig archive is bitwise identical
> with what is on upstream's release page" -- which is a *very* important
> property if I'm being asked to sponsor a package, as it saves me a long
> investigation.

Instead, we have "our orig archive is treesame to the upstream signed Git
tag."  This seems equivalent?  We don't have as simple of tools right now
to *check* this property, but that's a fixable problem, and the amount of
*information* is the same.
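
A tool for checking that property could be fairly small. The Python sketch below reads an orig tarball, strips the single top-level directory, and compares the file contents with the files of the upstream tag. It assumes the tag's tree is already available as a mapping of path to bytes, that the archive has exactly one top-level directory, and it ignores file modes and symlinks, so it is weaker than git's own notion of treesame. The function names `tarball_manifest` and `is_treesame` are hypothetical.

```python
import hashlib
import io
import tarfile


def tarball_manifest(fileobj):
    """Map member paths (top-level directory stripped) to SHA-256 digests."""
    manifest = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # directories, symlinks etc. are not compared here
            _, _, rel = member.name.partition("/")
            data = tar.extractfile(member).read()
            manifest[rel] = hashlib.sha256(data).hexdigest()
    return manifest


def is_treesame(orig_fileobj, tag_files):
    """True if the archive holds exactly the files of the tag, byte for byte."""
    expected = {p: hashlib.sha256(d).hexdigest() for p, d in tag_files.items()}
    return tarball_manifest(orig_fileobj) == expected
```

Since only unpacked contents are compared, the result does not depend on the tar or compression implementation that produced the archive.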

By all means, don't use tag2upload if you don't like its upstream tarball
handling, and I think supporting pristine-lfs in tag2upload is a good idea
to handle the cases where upstream is tarball-centric.  (I personally
would not be eager to support pristine-tar only because I don't think it's
sustainable, but it is very widely used, so my personal opinion may be
wrong.)  But I do think using the signed upstream Git tag even when they
also have a signed tarball release is defensible and a matter of personal
preference.

> In my packages the git tree does not contain any autogenerated files,
> which means that people using it will have to run autogen.sh. I think
> pretty much everyone else using autotools is doing the same.

Right, but that's a feature, not a bug.  If it's just a matter of running
the autotools, it's *better*, in my opinion, to start from the Git tag so
that you don't have the already-generated files that you are trying to
discard sitting around obscuring the situation and possibly still
lingering because we have some bug in the code that's supposed to move
them aside.

I do not currently practice what I preach and currently base the Debian
packages for code for which I'm also upstream on signed tarballs, but
that's because I'm a creature of habit and haven't gotten around to
changing that yet.

-- 
Russ Allbery (r...@debian.org)  



Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Brian May
Matthias Urlichs  writes:

> IMHO this is a mostly-solved problem.
>
> You can feed hashes of the offenders to "git filter-repo 
> --strip-blobs-with-ids ‹filename›". This operation is idempotent and 
> deterministic.
>
> If we add these hashes to a file, let's say d/source/dfsg-filtered, we 
> can thus reproducibly generate a dfsg-compliant version of whichever 
> upstream commit or tag we want, and of course generate a tarball from 
> there if required.

Sometimes files have to be edited and/or created in order to make the
tarball DFSG-compliant and keep the build from failing.  Just deleting a list
of files is not sufficient.

For example, if an individual file contains a mixture of non-dfsg stuff
and dfsg stuff that is required for building.

For more details, see this really old discussion, from 2008.
https://lists.debian.org/debian-devel/2008/06/msg00233.html

I hope I haven't just opened a can of worms here :-)
-- 
Brian May @ Debian



Re: Any reference of ftpmaster does not want to accept tag2upload

2024-06-26 Thread Sean Whitton
Hello Andreas,

On Wed 26 Jun 2024 at 04:17pm +02, Andreas Tille wrote:

> Is there any chance that we could bring the involved parties in one
> (virtual) room and discuss the thing directly?  I'd love to serve as
> moderator in this room (with my DPL hat on).

Thanks for the idea, but we don't think this would be effective at this
late stage.  Thanks to the mailing list discussion, we've improved our
collective understanding of the key points of disagreement, and it
relied on a lot of details that are more easily expressed in writing.

-- 
Sean Whitton




Re: tag2upload, reproducible .orig and dfsg repacks

2024-06-26 Thread Andreas Tille
Hi Matthias,

On Wed, Jun 26, 2024 at 03:25:34PM +0200, Matthias Urlichs wrote:
> You can feed hashes of the offenders to "git filter-repo
> --strip-blobs-with-ids ‹filename›". This operation is idempotent and
> deterministic.
> 
> If we add these hashes to a file, let's say d/source/dfsg-filtered, we can
> thus reproducibly generate a dfsg-compliant version of whichever upstream
> commit or tag we want, and of course generate a tarball from there if
> required.

Your suggestion sounds sensible.  However, I'd prefer if we would not
invent a file that might duplicate the content of the d/copyright
Files-Excluded field - but this seems to be some implementation detail.

Kind regards
Andreas.

-- 
https://fam-tille.de