On Tue, 24 Mar 2009 00:43:48 -0700 Steve Langasek <vor...@debian.org> wrote:
> > I have been reading this discussion a bit and I've been wondering what
> > use-case you actually have for machine-readable debian/copyright files.
>
> This is quite different than having the *license terms* recorded in a
> machine-parseable format, which is potentially useful in lots of ways;

The format would still need to match source code licence terms with
compiled objects that could include a variety of source files, and it
would have to deal with linkages within a package that can change during
the lifetime of that package. There is no permanent or reliable link
between the licence of a source code file and the licence of a specific
compiled binary from the final package, only between the collection of
source code and the collection of binaries. Even within a single .deb it
might not be possible to identify exactly which licences apply if the
source package builds lots of variant binary packages.

Also, runtime information can be checked for some simple .deb packages on
the basis of everything that the package contains, but development
information about which licence applies to the source code someone copies
and pastes from the source package is entirely separate.

Other issues affect packages that build twice with different options to
./configure that may or may not omit certain source code files from one
or other build. If I have a modular source package that can enable or
disable various build options and various components, and some of those
components have differing licences, it becomes very hard to track subtle
changes between builds that may or may not result in source code under
licence A being compiled into binary bar in some circumstances but not in
others. Individual versions of a package must build the same way on each
architecture but subsequent versions can change, making it hard for the
maintainer to track what is going on.

If such a modular source package (foo) builds a number of different
binary packages, how is the checker to know whether binary package bar,
linking against libfoo-with-baz, is any different to linking against
libfoo-without-baz other than relying on the package names?

Licence incompatibilities between source packages are not the issue,
AFAICT, if only because the offending source code might not actually be
being compiled; incompatible licences between binaries linked at runtime
are the problem. I'm not sure that any proposed format of
debian/copyright would allow a checker to be at all certain that a
particular .so from package foo has a compatible licence with a
particular .so from package bar where both foo and bar include multiple
libraries, multiple binaries and multiple linkages at build time.
(debian/libfoo.copyright is a separate idea with different problems, see
later.)

Yes, the checker might be able to say that source package foo contains
code under licence A and source package bar contains code under licence B
and that a certain conflict might result, but whether that is a real
problem or not still depends on exactly how the relevant code is compiled
and linked - something that can change much more frequently than the
licences themselves. This is the problem with licensecheck - it relies on
the source and cannot hope to understand how the source becomes a binary.
I fear that such a checker would be very misleading and cause unnecessary
work dealing with the 'bugs' that could result.

(relocating from the end of Steve's message)

> Well, aside from the section header, nothing in Debian Policy actually says
> you need to have a per-source debian/copyright file; and you certainly can
> have separate per-binary copyright files in your package that get installed
> individually if you choose, there's nothing that prevents you from doing
> that even though it's clearly not common practice today.

Can't help thinking that the packages that would benefit from
debian/libfoo.copyright are the very ones where maintaining that file
will make the idea rather unappealing due to the issues above. However,
it is something I hadn't considered and there could be some mileage in
that for some packages, especially those with different licences for the
API documentation. It could make the main debian/copyright file much
cleaner and easier to read for a small number of packages.

I'm still not convinced that machine-parseable formats are genuinely
useful or maintainable and I feel that machine-parseable requirements
inevitably impair human readability of copyright files. That's not a win,
AFAICT.

> Please don't reply with arguments why this isn't enough reason to make
> maintainers do extra work. I'm not trying to make any maintainers do extra
> work; I'm pointing out reasons why having a consistent and machine-parseable
> copyright format is useful, which is the question that was asked. That
> benefit is there even if only a subset of maintainers opt to use a
> machine-parseable format; but given that there is interest in having such a
> format, it's important that we come to some agreement on what that format
> should be, so that we don't have a dozen incompatible formats running
> around.

Would you say that debian/libfoo.copyright is a prerequisite for such
checkers to be useful on all but the simplest of packages? How are
complex packages going to maintain such files?

Is it really useful to have only a subset of packages using the format?
Isn't it only going to be the small packages that have no particular
licence problems that would adopt it, because it's almost trivial to do
so? Unless maintainers of complex packages or packages where licence
problems are likely (those that need exceptions added to the GPL etc.)
can implement the format cleanly, is there really any benefit?

There are elements of the format that aid human readability but making
the format completely machine-parseable means making allowances for so
many ifs and buts that the copyright files become only readable by
machine.

> That's what we should be working on. This thread with people refusing to
> use a parseable format for debian/copyright, and arguing about whether using
> the format does or does not provide assurances about the copyright status of
> a work, is all an irrelevant (and irritating) distraction.

Actually, having per-binary-package copyright could help with a lot of
packages, merely by making each copyright file smaller - as long as the
package has clear licence divisions. e.g. a package that is all GPL with
a GFDL documentation package would have a much simpler copyright setup
this way.

I quite like that idea because it potentially means that individual
copyright files become smaller (easier to review) and the .deb only
contains copyright information that is relevant to that single binary
.deb, which would assist in making /usr/share/doc/ smaller for the vast
majority of users who don't install every binary package from a
particular source package. (That's always handy for those interested in
keeping installations small.)

There really isn't any need for the copyright details of libfoo-bar to be
installed alongside libfoo (or more likely, libfoo-doc alongside libfoo),
let alone having the same file installed for both libfoo-bar AND libfoo
so that users who do install more than one package from a particular
source package get multiple, identical, copies of debian/copyright. (gcc
tries to get around this with a -base package but that causes different
problems.) The format of the copyright files doesn't matter from that
perspective.

I might try it for one of my own (smaller) packages.

> Once there's a stable spec that has a measure of consensus surrounding it,
> instead of a wiki page that someone takes the liberty of rewriting every
> month or two, that's when I would expect to see adoption of the format by
> more folks writing tools.
>
> > BTW, the use-case where you don't want to install FDL content and have
> > some way for apt to warn you before doing so won't be solved by a new
> > format because debian/copyright is written at the source-level and not
> > on the binary package level (think -doc packages that have FDL stuff and
> > -bin packages that have other-licensed stuff). (not that I've given this
> > too much thought)

Which is why debian/libfoo-doc.copyright becomes relevant.
debian/libfoo1.copyright might not be as useful but where there is a
clear dividing line between the licence for the code and the licence for
the documentation generated from the code, a separate copyright file
could be good, regardless of the format. It is much more difficult to be
certain of where the dividing line exists between $(top_srcdir)/src and
$(top_srcdir)/lib, especially when that line can shift according to build
options or new versions.

-- 
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/