After a discussion on IRC, I organized a BoF at DebConf10 to discuss new source formats, specifically 3.0 (git). Below are the notes from that discussion. I tried to take reasonably comprehensive notes, but I'm sure that I missed things. Other participants, please add any additional bits that I forgot!
Notes for source package format ad hoc session
Friday, 2010-08-06, 10:30 - 11:30 (EDT)

Agenda

* 3.0 (git)
* ftp-team worries about VCS-embedded source package formats
* git push as an upload mechanism
* how much does debcheckout make this irrelevant?
* 4.0?

ftp-team is concerned about doing license checks across the entire git
archive

Colin points out that we're in the same situation with Alioth for
redistributability.  However, it is easier to withdraw things from Alioth
than from the archive.  And redistributability (the legal requirement) is
a lot less of a bar than what we check for DFSG.

- shallow clones do bound the amount of work that has to be done here
  * Colin thinks that people may want to upload a lot more than that, but
    Joey doesn't think they will.
- Colin: straw man: why is the answer not a shallow clone containing one
  revision?
  * You can pick how much to include
  * ftp-master can make their own policy, only allow for native packages,
    limit to shallow clones
    - Lintian for that check if possible
  * Remember internal use where 3.0 (git) may be a lot more attractive
  * The default should be something ftp-master would accept

10 revisions doesn't really multiply the work by 10; it's equivalent to a
3.0 (quilt) package with 10 patches.  Is that enough history?  How do we
manage the size of the history versus the ftp-master review to keep from
imposing lots of additional work on ftp-master?

* Having policy in advance is important; decisions on the fly cause
  arguments and frustration.

Concern that 3.0 (git) formats require Git to unpack.

Colin has heard from people auditing what's in Debian packages that they
don't want to have to troll through version control repositories and
would rather see something more akin to what you get from a patch system.

- Question: isn't that what you get from VCS log?
- Answer: no, because a single change may be spread across lots of
  commits

stgit/topgit, bzr loom and hg patch queues do something like this, but
those are all still vaguely experimental

People who are trying to figure out what we're doing, without following
what we're doing with VCS, may be better served by 3.0 (quilt).  Joey
says he pretty much agrees right now, but tools will improve and change.

You can sign a tag or branch: would it be possible for ftp-master to use
a signature or tag to mark where they've done the review to?  This would
require re-signing the source package, so probably not.

- It's unlikely that ftp-master would be doing incremental checks

Pluses for the format:

* Working in a VCS and exporting to patches is really clumsy
* TopGit and the like are rather kludgy and kind of annoying
* Part of Joey's motivation is that if you look at GitHub, the people
  using it a lot consider Git to be a source package format, and Debian
  should think about how we attract those people and how we work with
  them.  That does suggest non-native may be key in the long run, though.
  - If we're trying to get these people to be Debian contributors, native
    may be okay.
* Colin talked to quite a lot of people trying to understand our source
  packages, and from their point of view they know what a patch is, but
  beyond that it gets fairly variable.
  - Joey: that may get back to whether we have it for non-native
    packages.  Maybe just start with native packages.

rra has tried TopGit and finds it rather annoying, and ends up using a
single Debian patch with 3.0 (quilt) and essentially having 1.0.

- One big patch is not really a seller.
- Sponsors have not been looking at the Git repository and just building
  that; they're in a mode of getting the source package and looking at
  it, which makes clean 3.0 (quilt) formats important.
- Packaging teams review from the VCS.

Colin uses a patch system and double-commits, because it becomes more
readable in the end.
Why not use debcheckout and some simple package format?

- I don't trust your repository will be up -- but if it's a shallow
  clone, you still don't get the history.  But at least you get
  something.
  * Does that give you anything that useful, though?  It's a lot less
    than what you get from debcheckout.  debcheckout doesn't work all
    that much, though.
- Colin wonders why we don't have a central directory of all the source
  package packaging repositories rather than putting it in package
  metadata.
  * Even with that, if you look at stuff in stable, the chances are that
    a lot of those repositories have gone away.
- debcheckout is only really useful if you're about to do development
  * There's no uniform way to get a particular revision of the package.
  * It may not be tagged, it may be on another branch, etc.

Joey would really rather upload his whole repository for things that he
knows are clean, but that's a problem for ftp-master review, and you have
to get into who you trust to make that determination.

You might be able to do a shallow clone of depth one and include every
signed tag that matches an entry in debian/changelog but it may be too
bloaty.  That might ease the review.

- How would topic branches fit into this scheme?
- The default is to include only master, but you can include other things
  if you want.
- This might be a good default behavior.

Best practices for Git repository layout?

- git-buildpackage documentation is closest to that

git push as an upload mechanism

- Attractive because over time it builds a Git repository for the package
- However, it assumes binaryless uploads, which we currently don't allow.
- You can't have a smart server that tries the build and rejects it if it
  fails because then how do you sign the package?
  * You could do this personally and then use debsign -r.
  * You would really have to trust this if it were done centrally; easier
    if you do it yourself with your own upload server.
- If anyone is interested, they should develop the software and try it
  for their own uploads and go from there.

If you're implementing a 3.0 format, please don't hard-code the
extensions that you "know" will be found in source packages, because as
we add additional files listed in *.dsc, we may add other types of files.

One issue with 3.0 (quilt) is that when you check it out when it's
maintained in a VCS, you have two choices: commit the .pc directory and
files, or leave it out and then have to run some magic after you fetch
the VCS in order to work on the quilt patches.  (Assuming you check in to
your VCS the results of having all the patches applied.)  Colin has been
talking with bzr people about having it notice it and collect it on
clone.

- Why don't you just check in with patches not applied?
  * Colin really hates that, because the great thing about 3.0 (quilt)
    for him is that patches are already applied.  It means that you can
    use vcs blame, vcs log, etc., and you get a consistent view of what
    changed this line in this file.  Without that, you don't get anything
    better than a traditional patch system.
  * If you use topic branches in the VCS and something like TopGit that
    generates them, you can get this, but then you can't maintain the
    patches in the VCS as well.
  * You can do this with a rebased patch branch, but then you don't get
    history on modifications to the patch.

What about building a unified VCS repository location (with whatever
VCSes) for all of Debian that everyone uses, which would simplify this a
lot?

- We lose a lot of flexibility for doing this, and Debian can't agree on
  doing this for everything.
- We do need to distribute source packages on CDs, so we need a source
  package format that includes all the source.
- We have lots of child distributions, and it's useful to have a source
  format that's useful when importing their packages when they may not
  have a central revision control system.
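To illustrate the request above not to hard-code extensions: a tool can
take the list of files straight from the Files field of the *.dsc and
accept whatever names appear there.  The following is only a minimal
Python sketch, not taken from dpkg or any existing tool.  The function
name dsc_files and the sample package and file names (including the
".futureext" extension) are made up for the example, and it assumes the
usual Files layout of checksum, size, and file name on each continuation
line.

```python
def dsc_files(dsc_text):
    """Return (md5sum, size, filename) tuples from the Files field.

    The file names are returned exactly as listed; nothing here assumes
    which extensions a source package may contain.
    """
    entries = []
    in_files = False
    for line in dsc_text.splitlines():
        # Stop at the signature block of a signed .dsc.
        if line.startswith("-----BEGIN PGP SIGNATURE-----"):
            break
        # Continuation lines start with whitespace and belong to the
        # field that was opened most recently.
        if line[:1] in (" ", "\t"):
            if in_files and line.strip():
                md5sum, size, name = line.split()
                entries.append((md5sum, int(size), name))
            continue
        # Any other line starts a new field (or is blank).
        in_files = line.startswith("Files:")
    return entries


# Hypothetical .dsc contents, for illustration only.
SAMPLE = """\
Format: 3.0 (git)
Source: example
Version: 1.0
Files:
 d41d8cd98f00b204e9800998ecf8427e 1234 example_1.0.tar.gz
 d41d8cd98f00b204e9800998ecf8427e 567 example_1.0.futureext
Maintainer: Example Person <person@example.org>
"""

for md5sum, size, name in dsc_files(SAMPLE):
    print(name, size)
```

A tool written this way keeps working if a future format lists a new
kind of file, since it never checks the name against a fixed set of
known extensions.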
What about repository size bloat if revision control history is included?

- That's one of the reasons why shallow clones are important.
- Also, we do have a reasonable amount of archive space, particularly for
  source.
- The history ftp-master will review is more of a bottleneck.
- Are the advantages written down anywhere?
- Joey finds that in his usage patterns, git blame is the only case where
  he really cares about the whole history; in other situations, he mostly
  looks at recent revisions.

Currently in 3.0 (git), origin points to the bundle and doesn't embed the
actual repository, but Joey is working on fixing that.  (Setting origin
based on Vcs-Git.)

3.0 (git) does ensure that you get whatever branch is the Debian build
branch, not possibly the upstream branch that you might get if you
debcheckout something with a repository that combines local and upstream.

source.debian.org is working on importing source packages into a Git
repository and storing the history as one revision per new source package
upload.

-- 
Russ Allbery (r...@debian.org) <http://www.eyrie.org/~eagle/>