On 5/2/05, Matt Zimmerman <[EMAIL PROTECTED]> wrote:
> Another option would be to leave the source package maintainer the same (to
> retain proper credit, etc.), but override the binary package maintainer
> during the build (to reflect that it is a different build, and also display
> a more appropriate name in "apt-cache show" etc.).
>
> What do you think about this approach?
Personally, when I rebuild a package that might get handed to someone else -- even if I didn't touch the source, but am rebuilding in a known environment so I can reproduce it later -- I change the Maintainer field to an e-mail address that reaches me, and add a debian/changelog entry with an explanation of why it was rebuilt and an appropriate suffix on the version number. Otherwise, I'm risking:

1) Implying that the Debian maintainer is part of my organization, since it appears that he/she was the last person to touch the package;

2) Suggesting that bug reports should be sent directly to the Debian maintainer and/or BTS, possibly annoying him/her and probably leaving me and my organization out of an interaction that we ought to know about;

3) Violating some licenses (the GPL, for instance), at least in spirit, by making it hard to determine who is responsible for meeting obligations to provide source code (and, again at least in spirit, detailed instructions about reproducing the build environment).

When I am distributing unaltered Debian source packages alone, or bit-exact copies of Debian binary packages, I don't worry as much about these things. Actually, in principle I ought to have a cache of the source packages associated with all binary packages I distribute, although for one-offs I usually assume I can get them from snapshot.debian.net if I need them. (snapshot has saved my bacon more than once -- thank you Ukai-san and FSIJ!)

If I had Ubuntu's resources, I'd handle it differently. Relying on people (or even an automated process) to touch up debian/control and debian/changelog on rebuild is so 1990s. A Debian upload isn't acceptable without a signed changes file, and an autobuilt package doesn't make it onto ftpmaster without a signed buildd log (as I understand it, anyway). Soon it will be practical to install only signed binary packages (what gets signed in apt 0.6, actually? md5sums?) on a Debian / Debian-derived system.
I would like to see all binary packages accompanied by information equivalent to the contents of a changes file, signed in a way that allows bug reporting tools to check the chain of trust and choose a bug report destination accordingly.

I believe that the right way to handle this (no, I don't have code in my back pocket -- yet) is to use a token for package integrity that can be multiply signed, and on which those signatures can be revoked, so that an organization can easily delegate release engineering / update tracking to an internal guru or a consultant they trust, or spread it across multiple roles and automated processes. These integrity tokens should be distributed using a mechanism that makes it easy to check the current signature set of the token and to add and revoke signatures in any order, and this mechanism should be proven to scale to millions of tokens with thousands of signatures on each.

Stating it this way should make it obvious that I have in mind using single-use GPG keys as integrity tokens and distributing them with a network of keyservers. (Not, obviously, the public keyservers, on which keys that represent things rather than people have no place.)

Single-use keys would be generated at the conclusion of the package build cycle, similar to a changes file except one per .deb. The sha1sums of the .deb and .dsc would appear in the key's userid, and full vital data for the binary package, others built in the same dpkg-buildpackage run, and the source from which it was built go in, say, the Notation field for the self-signature. The sha1sum of a .deb can thus be used to look up sha1sums for its parent source and its sibling .debs, and given an sha1sum index to snapshot.debian.net or the Debian derivative's equivalent, the single-use public key is a reliable clue to fetch the packages themselves.
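
To make the lookup concrete, here is a minimal Python sketch of the userid and the sha1sum index. It is an illustration only, not code I have: the "deb-sha1:... dsc-sha1:..." userid layout, the TokenIndex class and all function names are assumptions, and a plain in-memory dictionary stands in for the package keyserver network. No GPG key is actually generated here.

```python
import hashlib
from collections import defaultdict


def sha1_hex(data: bytes) -> str:
    """Return the hex sha1sum of a file's contents."""
    return hashlib.sha1(data).hexdigest()


def token_userid(deb_bytes: bytes, dsc_bytes: bytes) -> str:
    # Hypothetical layout: the proposal only says that both sha1sums
    # appear in the userid, not how they are spelled.
    return f"deb-sha1:{sha1_hex(deb_bytes)} dsc-sha1:{sha1_hex(dsc_bytes)}"


def parse_userid(userid: str) -> dict:
    """Split a userid built by token_userid() back into its fields."""
    return dict(item.split(":", 1) for item in userid.split())


class TokenIndex:
    """In-memory stand-in for a keyserver searchable by sha1sum."""

    def __init__(self):
        self._dsc_of_deb = {}
        self._debs_of_dsc = defaultdict(set)

    def add(self, userid: str) -> None:
        fields = parse_userid(userid)
        self._dsc_of_deb[fields["deb-sha1"]] = fields["dsc-sha1"]
        self._debs_of_dsc[fields["dsc-sha1"]].add(fields["deb-sha1"])

    def source_of(self, deb_sha1: str) -> str:
        """sha1sum of the .dsc that a given .deb was built from."""
        return self._dsc_of_deb[deb_sha1]

    def siblings_of(self, deb_sha1: str) -> set:
        """sha1sums of the other .debs built from the same source."""
        dsc_sha1 = self._dsc_of_deb[deb_sha1]
        return self._debs_of_dsc[dsc_sha1] - {deb_sha1}
```

Given the sha1sum of an installed .deb, source_of() and siblings_of() return the sha1sums one would then feed to a snapshot-style archive index to fetch the files themselves.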
The sha1sum of the .dsc is also in the userid to make it easy to find other binary packages built from the same source, which facilitates use cases like the "M out of N security experts" mentioned below, in which some roleplayers' auditing of packages built from the same source is "good enough".

Once the public half of the key is self-signed with vital data embedded, the private half is discarded, and the public half is uploaded to the package keyserver network. Thereafter, its primary function is to accumulate signatures (and revocations), which represent the audit trail through whatever processes, human and automated, anyone who cares to use the package sees fit. (Note that the key isn't used to sign anything but itself, and the sha1sums in its userid make leakage of the private key harmless except for possibly tampering with the self-signature on the public half, which isn't that big a deal anyway; see below.)

A sysadmin for a large network of machines might have an automated regression test setup that pulls the package as soon as the build is done, before any human bothers to audit it; that system can autosign with a machine-level key, which is signed by the sysadmin; the sysadmin's signature can be revoked later if it is discovered that the system was compromised in some way.

A paranoid organization might want to security audit source code changes (presumably also running their own autobuilder), and would require that the signatures of M out of N known security experts (and no revocations from the other N-M) be present on keys with matching source package sha1sums before a production machine will install that package. (This obviously involves more complex lookups than just "chain of trust", but the keyserver has the necessary data in an easily gathered form.) And so forth.

Role-level keys and/or signature notations can specify what aspect of the package's integrity is being signed off on.
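
The "M out of N" rule above can be sketched in a few lines of Python. This is only my illustration under stated assumptions: the Token record and may_install() are hypothetical names, signer identities are plain strings rather than verified key fingerprints, and I read "no revocations from the other N-M" as "no listed expert has revoked a signature on any token built from the same source".

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """Hypothetical view of one integrity token as seen on the keyserver."""
    deb_sha1: str
    dsc_sha1: str
    signed_by: set = field(default_factory=set)
    revoked_by: set = field(default_factory=set)


def may_install(target: Token, tokens: list, experts: set, m: int) -> bool:
    """Apply the M-out-of-N rule across all tokens sharing target's source."""
    # Signatures on any binary built from the same .dsc count toward the quorum.
    same_source = [target] + [t for t in tokens if t.dsc_sha1 == target.dsc_sha1]
    signed = set().union(*(t.signed_by for t in same_source))
    revoked = set().union(*(t.revoked_by for t in same_source))

    objections = experts & revoked
    approvals = (experts & signed) - objections
    # Any revocation by a listed expert vetoes the install outright.
    return not objections and len(approvals) >= m
```

A production machine would run a check like this after fetching the current signature set for every token whose userid carries the matching .dsc sha1sum.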
A "Report-Bugs-To" key/notation could guide bug tracking tools in selecting the appropriate destinations; a Debian maintainer who is interested in getting bug reports for the Ubuntu versions of his/her packages could sign a role key stewarded by an Ubuntu person (authorized to feed it to a system that signs autobuilder output at a specified stage of QA), and if he/she decides that the Ubuntu package has diverged too much, the signature on the role key can be revoked. Signature-aware bug reporting tools would automatically pick up the current set of appropriate recipients by using the sha1sum of the package (stashed by dpkg at install time) to fetch its integrity token and its chain of trust.

Note that there is no need for an additional data signature covering the .deb, or its md5sums, either as a separate file or as an appendix to the .deb; the tokens can of course be put on install CDs for convenience of installation, but you really want integrity tokens, and the keys used to sign them, hot off the keyserver most of the time. And now that it's got the sha1sum in the userid (which I left out the last time I suggested something like this), I see no reason to object to the single-use key as a form of "detached signature", because the token -> deb mapping is practically impossible to subvert.

When people or auditing tools sign the single-use key, what they're really approving is the sha1sum in its userid. So the worst that an attacker could do (assuming that SHA1 is not incredibly broken) is to substitute a key with the same userid (and hence the same sha1sum) but different data in the self-signature notation. But presumably one of the "role" signatures I require in my trust analysis is the autobuilder maintainer's (signed in turn by an ftpmaster), and that's my guarantee that the auxiliary data in the self-signature is correct (and that the private key was discarded so that the self-signature can't be tampered with).
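
Going back to the "Report-Bugs-To" idea, here is a rough Python sketch of how a signature-aware bug reporting tool might pick recipients. Everything in it is an assumption made for illustration: the RoleKey record, the bug_recipients() function, the example addresses, and the simplification that signatures have already been verified and reduced to plain names.

```python
from dataclasses import dataclass, field


@dataclass
class RoleKey:
    """Hypothetical role key that signs integrity tokens at some QA stage."""
    name: str
    report_bugs_to: str                              # assumed notation value
    endorsed_by: set = field(default_factory=set)    # who signed this role key
    revoked_by: set = field(default_factory=set)     # who later revoked


def bug_recipients(token_signers: set, role_keys: list, endorser_addresses: dict) -> list:
    """Collect bug report destinations for one integrity token."""
    recipients = []
    for key in role_keys:
        if key.name not in token_signers:
            continue  # this role never signed off on the package
        if key.report_bugs_to not in recipients:
            recipients.append(key.report_bugs_to)
        # Endorsers who have not revoked still want to hear about bugs.
        for endorser in sorted(key.endorsed_by - key.revoked_by):
            address = endorser_addresses.get(endorser)
            if address and address not in recipients:
                recipients.append(address)
    return recipients
```

In this model, the Debian maintainer drops off the recipient list the moment his/her signature on the Ubuntu role key is revoked, with no change to the package or to the reporting tool.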
Is this in some sense an abuse of GPG/PGP and keyservers? Not any more than serving HTTP/1.0 via apache run from inetd is an abuse of TCP and inetd. Yes, the use cases for which PGP/GPG and keyservers were conceived and designed involved long-lived keys that represented real people, just as TCP and inetd were conceived for long telnet/FTP sessions. But there's no particular reason not to use the same robust design and mature implementations for a different set of application-level use cases, as long as you use a different port.

Implementations welcome. :-) Given the number and variety of urgent things at the day job, and the amount of time and attention that I have found my nine-month-old daughter needs, it looks like it may take me a while longer to create enough breathing room to get around to it myself.

Cheers,
- Michael