On Sat, Nov 15, 2014 at 06:15:33PM +0000, Simon McVittie wrote:
> On 12/11/14 22:07, Ron wrote:
> > I am also interested to hear more
> > about whatever the confusion was you had with this was when you
> > started working with Tollef's systemd repo that you mentioned
> > in the previous thread.
>
> Having played with gitpkg some more, I'm reminded that the answer to
> this is that unlike (AIUI) both gbp-pq and git-dpm, it did not meet my
> assumption that the contents of the git tree were in a suitable form to
> run dpkg-buildpackage and have a 3.0 (quilt) Debian package fall out. I
> realise that's partly a property of 3.0 (quilt).

Ah, yes. That probably is something we could document a bit better.
I guess it got overlooked, partly because there was no format 3 when
gitpkg was first written (and this does work just fine for format 1,
or the single patch style of format 3), and partly because by the time
this was an issue the common workflow was probably not to build
packages in the (potentially dirty) working tree anyway, so it took
someone new coming along to notice it. This is actually the first
time anyone has mentioned tripping over this to me.

You should be guaranteed to get a functionally correct package if you
do this still, but not necessarily a source package with the /patches
split out individually. However ...

> For gitpkg, you can commit in the normal git way, but the cost is that
> you have to build in a way that isn't the normal dpkg thing (exporting
> with gitpkg and building the result).

... if you want to build in the local system like this (as opposed to
throwing them off to a buildd chroot), you can enable the dpkg-bp
hook, at which point something like `gitpkg master` will do the same
things as dpkg-bp in the local tree with the added advantage of also
respecting your .gitattributes et al. and ignoring any dirty state you
might have in the working tree.

Which on the one hand, if you know this, is just as easy, if not
easier to do, and gives you much stronger guarantees about what you're
actually building -- but on the other, if you don't, is something you
need to become aware of somehow.

The latter of which is probably somewhat unavoidable for just about
any tool if you're actually planning to push your changes to the repo
and not stuff up later users of the tool - but I agree this is another
thing it's worth thinking more about to make things easier for more
casual users.

Do you have any suggestions for something that might have made this
more immediately obvious to you?

I can think of a few things we could do, but I really do believe in
designing things around actual user experiences rather than trying to
guess blindly about problems nobody has actually ever had.

With the new tools for exporting a patch series, it would be possible
to export one into the (henceforth) 'dirty' working tree and build
directly in it with dpkg-bp as you tried. And for people who really
wanted to, it would even be possible to commit that to a branch where
a 'naive' checkout would work as you expected here -- but avoiding
that kind of cruft in the repos was sort of exactly the reason for
writing those tools in the first place :) It's generated source, the
same as autotools, it doesn't need to be in the VCS unless you really
have some personal reason to want it duplicated there.

It's not clear to me that any of the above is a better general
solution than just getting people on the "right track" for how things
work best to begin with - but I mention them because all of those
things are possible if you know you can do them.

I can think of some other far more "clever" magic that we could do to
simply make this work for a naive user that just does git clone &&
dpkg-bp -- but I don't think I'll mention it, because this is the
internet, and Poe's Law means someone might go "Wow! That's a Great
Idea, I'm going to do that too" and then it would be My Fault :)

I suspect the Best Answer is documenting this somehow and somewhere,
but it's not clear to me yet where the best place for that would have
been to have avoided what happened to you. Maybe a
README.DebianFromGit or something, that's export-ignored so it doesn't
clutter the package but is seen in a cloned repo? I can see a few
ways that could be less than ideal too though.

> gbp-pq and git-dpm are the other way round: the tree can be built with
> dpkg-buildpackage, but the cost is that you have to commit in a way that
> isn't the normal git thing (either using a specific tool, or for the
> gbp-pq layout, dropping in pre-prepared patches and hoping they don't
> have conflicts, in the same way you might for svn-buildpackage).
>
> I think I was also thrown by the fact that gitpkg does not encapsulate
> its configuration in what you commit: if two developers build the same
> tree, the debdiff might well be rather large, because one developer's
> .git/config results in separate git-debcherry patches and the other's
> .git/config results in a single large patch.
>
> git-buildpackage reads both debian/gbp.conf and .git/gbp.conf, with the
> latter taking precedence. That lets maintainers provide "executable
> documentation", in debian/gbp.conf, for "here is how I intend this repo
> to be used", which seems like something that could be rather useful for
> gitpkg: for instance, filter patterns for non-DFSG tarball imports can
> go in debian/gbp.conf as a way to avoid mistakes.

Yeah, there's a whole bunch of tradeoffs in that which we explored in
the very early development of gitpkg. Since working in git was still
a very new thing to do (both for upstream developers and debian), the
questions of what Best Practice (or even common practice) would be
were still very open ended.

It was quickly apparent that being unable to cope with anything that
was legal to do in git was going to be a fatal flaw in the tool,
sooner if not later.

It was also fairly quickly apparent that no one workflow or tool was
ever going to achieve unquestioned global domination over all others
so it was really important that repos managed with gitpkg didn't
somehow depend on it to remain useful into an uncertain and infinite
future.

It couldn't impose its own forced framework structure on them, it
shouldn't make them difficult to use with any other ostensibly sane
tool that did a similar thing, and because it should be able to export
packages from *any* even half-sane repo, including package versions
that existed before gitpkg was ever written, and repos that were
created without any knowledge that it did exist -- it also couldn't
rely on having tool specific config *in* the repo for the version that
you wanted to export.

In the same way that Debian source packages were designed to not
strictly require dpkg to be able to extract and build the source,
gitpkg was designed so that it would not be strictly required to
extract a source package from the repo. It was just a shortcut you
could use to make your life much easier if you did actually have it
available.

Since many of the packages that I wanted to move to git were things
that I was also upstream for, in one degree or another, it was also
vitally important that it didn't get in the way of using the power of
git to its fullest extent, to make the work I needed to do as
efficient and painless as possible.

I don't tend to think of 'packaging' as some task that's kind of
isolated from upstream (or worse, something upstream should keep their
sticky little fingers right out of) -- it's really just an ordinary
part of the normal software development process. It's a feature added
to the upstream source the same as any other, and can and should be
managed in exactly the same way. And git was designed from the ground
up to know how to do that, and do it really, really, well.

With the advent of distributed VCS, that really became something that
*everybody* could do. You didn't need commit access to the upstream
repo, you could just clone it and develop your feature branch of it.
With cvs-bp and svn-bp, unless you were upstream, you *had* to import
tarballs or similar.

With git, it was quite clear that working like that was going to be a
short term anomaly, not the long term rule - so a tool modelled on
those might have some short term familiarity advantages, and be an
easy thing to whip up fast, but it wasn't using the feathers git had
grown to actually take flight and soar into a new better way of
working.

To segue back to how this is relevant to your original point from that
little side trip down memory lane though (: gitpkg did originally
allow some configuration from files inside the repo (and still
actually does) but one of the things that next became quickly evident
about building packages out of a DVCS was the question of Trust.

Since anyone can now become a new upstream just by putting a clone of
their repo somewhere public, it's equally important to be able to
export a source package from it, that you can then debdiff or
otherwise audit, *without* letting the content of that repo execute or
influence things in any way that you haven't explicitly allowed it to.
A tool that executes code from a random repo just in the act of
getting the source out of it is a loose cannon in a truly distributed
world.

Which reinforced the importance of the decision to avoid needing some
special content in the repo for the tool to be able to work correctly.

Before format 3 got invented, this was actually really easy to avoid
and only the weirdest possible cases might have had any issue with it.
The documentation of the config explicitly warns that depending on it
to get a correct source package is probably a sign that something
about your workflow really isn't quite right and you should probably
think a bit harder about what you're really trying to do. Most of the
useful config options were just doing things with the package after it
was built, they didn't change the exported form or content in any way,
so they really were purely "local user preferences".

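For comparison, the layered lookup Simon describes for git-buildpackage
(debian/gbp.conf as the committed default, .git/gbp.conf as the local
override) can be sketched roughly as below. This is only an
illustration of the precedence rule, not gbp's actual code: the
function name is made up, only the two files named in his mail are
modelled, and the option values in the usage note are invented.

```python
import configparser
from pathlib import Path


def load_layered_config(repo_root):
    """Read debian/gbp.conf, then .git/gbp.conf; the later file wins.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    candidates = [
        Path(repo_root) / "debian" / "gbp.conf",  # committed, shared by all
        Path(repo_root) / ".git" / "gbp.conf",    # local only, never cloned
    ]
    # ConfigParser.read() skips files that do not exist and applies the
    # rest in list order, so a key set in a later file replaces the
    # value read from an earlier one.
    parser.read([str(p) for p in candidates])
    return parser
```

With "upstream-branch = upstream" committed in debian/gbp.conf and
"upstream-branch = local-upstream" in .git/gbp.conf, the loaded value is
"local-upstream", while any key set only in debian/gbp.conf is still
visible. Which is the trade-off in a nutshell: the committed file
documents intent, but the local one still decides what gets built.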
With the advent of format 3 (and to a slightly lesser extent p-t),
this did get more unfortunately complicated though. If you are using
it, then you have the problem you described above. We were fairly
careful about ensuring that you'll still get "a" correct package
regardless of your local configuration (it will contain exactly the
same source once patched and build exactly the same binaries), but you
may not get an identical source package to what someone else does in
that format [since format 3 allows the "same" source to be packed in a
potentially infinite number of different ways that all extract to the
same thing, this is basically just a subset of that problem inherent
in its design].

In one line of thinking, this is "a feature" (a local user can get
exactly the kind of source package *they* want, regardless of what you
prefer - if the maintainer doesn't care about patch series and just
uses the single patch mode, you can still export one with a properly
split up series if you want it), but I do agree that in another
dimension, the uncertainty of which style the "original maintainer"
used and prefers could be an issue, and at the very least that is
information that may currently be getting 'lost'.

There is a certain innate tension between "you can use gitpkg on any
repo because it doesn't need to be modified to do that" and "how do I
know if a repo was using gitpkg because nothing in it was modified to
tell me that" :) Which isn't a problem if you're a habitual gitpkg
user anyway, but doesn't give you many clues if you're coming from
some other school.

In the latest gitpkg release (which is in Jessie), we've added an
examples/README.debcherry-export as a template that people might like
to include in package repos where they are using this, but it's fairly
new, so it wouldn't have helped you back then, and there are still a
bunch of open questions about how the best way to handle this will
really be.

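If you want to check the "same source once patched" guarantee above for
yourself, one way is to extract both source packages and compare a
digest of the resulting trees. The sketch below is just that, a
sketch: the function name is made up, and it assumes that
debian/patches and quilt's .pc directory are the only places where a
single-patch export and a split-series export are expected to differ,
so it skips those. It also ignores file modes and empty directories.

```python
import hashlib
from pathlib import Path


def tree_digest(root, ignore=("debian/patches", ".pc")):
    """Return a SHA-256 over relative paths and file contents under root.

    Hypothetical helper; anything at or below a path listed in
    `ignore` (relative to root) is left out of the digest.
    """
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        if any(rel == skip or rel.startswith(skip + "/") for skip in ignore):
            continue
        data = path.read_bytes()
        # Feed the name and length before the content so file
        # boundaries cannot blur together in the hash input.
        digest.update(rel.encode("utf-8") + b"\0")
        digest.update(str(len(data)).encode("ascii") + b"\0")
        digest.update(data)
    return digest.hexdigest()
```

Two trees unpacked with dpkg-source -x from differently configured
exports should then give the same tree_digest() value, even though a
debdiff of the source packages themselves may be large.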
I'd still like to avoid *depending* on magic files in the repo for
things to be able to work correctly, but that is a separate question
from whether there is *something* we can add to the repo that makes
life for someone looking at a clone of it for the first time a bit
easier than it currently might be.

That does seem like something this document might be able to address.
However long it's been since we truly missed the boat for naming
conventions to be of much real use at all, I think there is scope for
having a standard place to look that summarises the things it would be
useful to know about what's in the repo, how to get it out and into a
source package, and what to be aware of if you're modifying it.

I'd have almost surely already added something like that to my
existing repos if it had ever been a FAQ from people trying to use
them, but since nobody has ever actually asked that before your
feedback here, it seemed like a "blind premature optimisation"
problem :) That would be less of a problem if there was a common
place we could reasonably expect that people from the future would
know to look for it, and what it ought to cover that isn't
"self-evident" or "common knowledge" that should be documented more
generally elsewhere.

Cheers,
Ron

Archive: https://lists.debian.org/20141116090332.gl10...@hex.shelbyville.oz