Hi,

On 6/14/24 00:50, Russ Allbery wrote:

We have several 90% solutions of mapping Debian packaging onto git, but
all of these are incomplete and annoying to use because we disagree with
git on what constitutes data, and what constitutes metadata, so the data
model does not match reality or requirements, and from a security
standpoint that concerns me more than improved forensics.

This is why people are working on incremental improvements.  I think such
improvements are more likely to get us closer to where we want to be than
a boil-the-ocean approach that attempts wholesale change to how Debian
works.  It's easy to come up with new designs that in theory would be more
coherent and straightforward, and very hard in practice to avoid that
turning into <https://xkcd.com/927/>.

The reason we have multiple git workflows is that they are incremental designs that do not try to change the way Debian works, or the way git works.

With the current Debian archive we have a well-defined (documented in Policy) interface for uploads, and the git workflows are implementation details that the archive need not be concerned with. This has allowed us to use git in the first place.

By creating an upload service, we elevate git to "interface" status. That would be a good thing if there were a single interface. However, we have three (that I know of), none of which was designed to talk to anything but itself, and the service uses a heuristic to determine which one is in use.

At the very least, we need to make explicit which repository layout is in use, version and document that interface, and then support it for several years even as we make incremental changes, because we want to be able to regenerate packages from the git archive.
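
To illustrate what "explicit and versioned" could look like, here is a minimal sketch in Python. Everything in it is an assumption for the sake of the example: the "Layout:" field in the tag message, the placeholder layout names (layout-a and so on) and the version numbers are hypothetical, not anything tag2upload or dgit defines today. The point is only that the service reads a declaration and refuses to guess when there is none.

```python
from dataclasses import dataclass

# Hypothetical placeholder names; the real layouts would be named in the
# documented interface.
KNOWN_LAYOUTS = {"layout-a", "layout-b", "layout-c"}
# Hypothetical set of interface versions the service still supports.
SUPPORTED_VERSIONS = {1}


@dataclass(frozen=True)
class LayoutDeclaration:
    layout: str
    version: int


def parse_layout_declaration(tag_message: str) -> LayoutDeclaration:
    """Read an explicit 'Layout: <name> <version>' line from a tag message."""
    for line in tag_message.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "layout":
            parts = value.split()
            if len(parts) != 2 or not parts[1].isdecimal():
                raise ValueError(f"malformed layout declaration: {line!r}")
            name, version = parts[0], int(parts[1])
            if name not in KNOWN_LAYOUTS:
                raise ValueError(f"unknown layout: {name!r}")
            if version not in SUPPORTED_VERSIONS:
                raise ValueError(f"unsupported layout version: {version}")
            return LayoutDeclaration(name, version)
    # A missing declaration is an error, not a cue to fall back on a heuristic.
    raise ValueError("no layout declaration found")
```

Keeping old version numbers in the supported set is what "support it for several years" would mean in practice: a tag made today would still be interpreted the same way when the package is regenerated later.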

Tag2upload is an increment over an increment over something that was not designed as an interface, and while each increment is technically sound, the overall design needs to be revisited because it needs to support all these incremental changes.

I think that with existing git it is difficult to represent the history of packages well. We need to record a history of what are effectively rebases, and representing them as merges paints a wrong picture for git, because git assumes that everything upstream of a merge is already accounted for.

One _incremental_ change I'd like to see would be archive support for .orig.bundle.* (containing a shallow copy of the upstream commit) and .debian.bundle.* (containing the differences between the upstream commit and the package). That would be an absolute game changer for git integration. The archive side would probably be fairly simple to implement, and it would allow us to ship the "preferred form for modification" for a lot of projects more easily.

Mirrors would still get a size-minimal representation. The format does not impose a particular workflow, and it can be easily generated from and validated against the full tree.

   Simon
