On Tue, Feb 02, 2021 at 01:03:23PM +0000, Robie Basak wrote:
> Question: before I land this branch, I'd like to make sure that there
> aren't any issues with the "spec", as follows.
>
> For (developer) users, you'd run something like:
>
>     dpkg-buildpackage $(git ubuntu push-for-upload) <your usual flags here>
>
> The push-for-upload[1] command would push your branch to Launchpad and
> also output some -Dfield=value arguments that would get passed through
> to dpkg-genchanges.
>
> Further/better wrappers could come later - especially for newcomers
> where I'd like to wrap away the dpkg-buildpackage stuff.
>
> Technically, the mechanism is:
>
> 1) The uploader pushes their commits somewhere.
>
> 2) The uploader includes a reference to the commits in the changes
> file.
>
> 3) The uploader dputs as normal.
>
> 4) When git-ubuntu sees the upload, it pulls the commits from the
> repository listed in the changes file.
>
> 5) If the commits pass sanity checks (eg. the final commit matches the
> upload exactly), then it uses the commits provided instead of
> synthesizing its own.
>
> What goes into the changes file is three fields. Example:
> https://launchpadlibrarian.net/516799033/hello_2.10-2ubuntu3~ppa1_source.changes
>
>     Vcs-Git: https://git.launchpad.net/~racb/ubuntu/+source/hello
>     Vcs-Git-Commit: 4511fdfc01cbfd5bc351e1da294d6acb44e8a4a2
>     Vcs-Git-Refs: refs/heads/test
>
> We need the Refs field because git is designed not to be able to fetch a
> commit by hash, but by a ref that can reach it only. So Vcs-Git-Refs
> must specify what ref(s), when fetched will make the commit given to be
> reachable. In practice this could just be the branch name prefixed by
> 'refs/heads/' as in this example.

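To make sure I'm reading steps 4 and 5 of the quoted mechanism
correctly, here is a minimal sketch in Python of how an importer might
read the three fields and derive the git commands to run.  The function
names (parse_vcs_fields, fetch_commands) are hypothetical and are not
taken from git-ubuntu; the field names are the ones from the example
above.  The sketch only builds the command lines and leaves running
them, and the remaining sanity checks, to the caller.

```python
import re

# Field names as given in the quoted example .changes file.
REQUIRED = ("Vcs-Git", "Vcs-Git-Commit", "Vcs-Git-Refs")


def parse_vcs_fields(changes_text):
    """Extract the three Vcs-Git* fields from the text of a .changes file.

    Hypothetical helper: only single-line "Name: value" fields are read.
    """
    fields = {}
    for line in changes_text.splitlines():
        # Continuation lines start with whitespace and cannot begin a field.
        if not line or line[0].isspace() or ":" not in line:
            continue
        name, _, value = line.partition(":")
        if name in REQUIRED:
            fields[name] = value.strip()
    missing = [name for name in REQUIRED if name not in fields]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if not re.fullmatch(r"[0-9a-f]{40}", fields["Vcs-Git-Commit"]):
        raise ValueError("Vcs-Git-Commit is not a full SHA-1 hash")
    return fields


def fetch_commands(fields):
    """Return the git command lines for fetching and checking the commit.

    The first command fetches the listed ref(s) from the uploader's
    repository; the second exits non-zero unless the named commit object
    is present locally afterwards.
    """
    refs = fields["Vcs-Git-Refs"].split()
    return [
        ["git", "fetch", fields["Vcs-Git"], *refs],
        ["git", "cat-file", "-e", fields["Vcs-Git-Commit"] + "^{commit}"],
    ]
```

Each command list could be handed to subprocess.run(cmd, check=True)
inside the importer's working repository.  Comparing the fetched
commit's tree against the uploaded source (the sanity check in step 5)
is deliberately left out of this sketch.
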
I don't see why Vcs-Git-Refs would ever need to be plural: if a commit
isn't reachable from any single ref, then adding another ref will never
be helpful.  This should just be Vcs-Git-Ref, singular, which also
simplifies code that uses it (since it will in practice always be a
single ref, any code to handle more than one ref there would be
poorly-tested).

The spec should probably also say that the commit in question isn't
guaranteed to be reachable from that ref in the long term, but only by
the git-ubuntu importer in the short term (however exactly you define
that).  Since the repository may be owned by the uploader, it doesn't
really seem practical to impose stronger lifetime constraints.

What happens if the repository is temporarily unreachable?  Presumably
you back off and retry later, but that does mean you need a reasonably
robust way to distinguish temporary failures from permanent ones.

What happens if the repository is renamed before git-ubuntu gets to it?
For example, the uploader might choose to change their Launchpad
username, which would invalidate the original repository URL.  (In
practice this will be rare, partly because renames are fairly rare to
start with, and partly because users with PPAs currently can't change
their username due to technical constraints in Launchpad; but on
principle I believe that users should generally be free to change their
usernames with minimal bureaucracy and I would like to avoid adding
further technical constraints that we need to solve before allowing them
to do so.)

What happens if the repository is private, as might be the case if the
upload is a security upload whose contents are embargoed before it hits
the archive?

dgit avoids all these problems by having a push model rather than a pull
model: "dgit push-source" pushes the appropriate commit to a specialized
git server, which can then make sure not to lose it.
Now, at present in Debian this has the side-effect of restricting the
set of people who can push, and it's certainly not entirely obvious how
we would go about such a thing in Ubuntu with Launchpad (I think we
should avoid any design that requires setting up another git server that
needs to know about developer identities and permissions etc.), but I
think it's at least worth thinking about before committing ourselves to
a pull model.  What would we need in Launchpad if we were going to try
to do this on a push basis?

Brainstorming a bit, these are some approaches that came to mind,
bearing in mind that some of these ideas may be terrible:

 * Might it be practical to tell Launchpad to reserve some kind of token
   corresponding to the commit in question guaranteeing that that commit
   would be reachable until the token is consumed, which git-ubuntu
   could then pass in the .changes file and the importer could consume?

 * Perhaps the upload could include a git bundle relative to some other
   version already in the archive?  (This could be large, though.)

 * Could we work out a way to allow any contributor to push to some kind
   of holding area associated with the importer-owned repository, and
   then the importer would only point a ref at that once the upload has
   been processed?  (I'm not sure how we would prevent an attacker from
   being able to force such a repository to grow without bound, though.)

 * Could we use merge proposals for this somehow?  An upload is in some
   sense a proposal to merge some changes into the primary archive, and
   I know "git ubuntu submit" already integrates with merge proposals on
   an experimental basis.  That might allow Launchpad to know that a
   given commit is interesting and should be made available - indeed we
   already have plans to expose virtual refs that correspond to merge
   proposals, although I don't think those are quite done yet.  If we
   were to take this approach, then the ref that we point to could be
   made to appear in the target repository instead, which avoids the
   collection of issues around the source repository disappearing or
   moving.

Of these, albeit with only half an hour's thought, I think my favourite
is the last one: using merge proposals feels quite elegant, and is
perhaps only a change in how your spec would be used rather than a
format-level change.  I may have missed something, though.  What do you
think?

> Notably there's a Dgit field defined by Debian Policy against dsc files,
> which is used for a very similar purpose[2].

I'm only a dgit user rather than an expert in its implementation, but I
believe that the Dgit field is used by dgit to retrospectively work out
the commit that represents a given source package version in the archive
as part of preparing a newer version that ought to be a descendant of
that commit, rather than as part of an upload instruction.  That's why
it lives in the .dsc file rather than the .changes: after an upload has
been processed, the .changes is not stored in any authenticatable way.
But it's true that the current specification of Dgit explicitly relies
on the repository being at a well-known and persistent location.

-- 
Colin Watson (he/him)                              [cjwat...@ubuntu.com]