On Wed, 10 Aug 2016 at 11:12:55 +0800, Paul Wise wrote:
> The only possible way to solve this in general terms is, accurate
> document the copyright/license of the source package using the
> machine-readable format and during builds, track the transformation of
> input files in the source package to output files in the binary
> package and then generate the copyright/license information for the
> binary package based on which input files from which source/binary
> packages ended up in the new binary package.

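To make concrete what that proposal would involve, here is a minimal
sketch in Python. It assumes the build has somehow already recorded,
for each output file, which source files fed into it (the `provenance`
mapping below). Every path, license name and function in it is
hypothetical, and no existing Debian tool is implied.

```python
from typing import Dict, Set


def binary_licenses(
    provenance: Dict[str, Set[str]],
    source_licenses: Dict[str, str],
) -> Dict[str, Set[str]]:
    """Map each output file to the licenses of the inputs that fed it.

    provenance: output path -> set of source paths (assumed to be
        recorded by the build; this is the hard part in practice).
    source_licenses: source path -> license short name, as a
        machine-readable debian/copyright might provide.
    """
    result: Dict[str, Set[str]] = {}
    for output, inputs in provenance.items():
        result[output] = {source_licenses[path] for path in inputs}
    return result


# Hypothetical example data, not taken from any real package.
provenance = {
    "usr/bin/tool": {"src/main.c", "src/util.c"},
    "usr/share/doc/tool/api.js": {"doc/jquery.js"},
}
source_licenses = {
    "src/main.c": "GPL-2+",
    "src/util.c": "BSD-2-clause",
    "doc/jquery.js": "Expat",
}

report = binary_licenses(provenance, source_licenses)
```

Even in this toy form, the result is only a set of license names per
output file. It does not say whether the combination is distributable,
or who holds copyright in the result.
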
I'm sure this is a very interesting academic exercise, but
pragmatically, why do we want to require ourselves to go to all that
effort? For that matter, is everything we require *now* necessary or
desirable?

Broadly, we have two reasons (that I'm aware of) to do legal stuff:
because we want to (it meets some goal that we care about -
self-imposed policy), and because lawyers tell us we are at risk of
being sued if we don't (it meets our goal of being able to continue
making Debian - license compliance).

For software with a reasonably helpful upstream and a reasonably sane
build system, I've often found that jumping through the necessary hoops
to write debian/copyright takes about as long as the rest of the
packaging put together. This is demotivating: I didn't join this
project to copy copyright information around, I joined this project to
make an operating system. If I have to copy copyright information
around to meet our project's goals (including the goal of obeying
copyright law, both because we want to respect authors' rights and
because we don't want to get sued) then I'll put up with it - but I
think we should be clear about why we do this work, and only require
maintainers to do exactly enough of it to meet its actual goals.

In particular, if this thread comes to the conclusion that more needs
to be done than what maintainers currently do, then it should be
something actionable; and since it will likely create more work for a
very large number of people, it should be backed up by *why* that work
is needed. If the reason turns out to be an ftp-master saying "we have
received legal advice saying that you must do x, y and z, and we are
not allowed to explain further", then that would be unsatisfying but
better than nothing; and at least it would put a boundary on it.

---

Self-imposed policy of DFSG compliance:

One core value in Debian is that all of main is DFSG-compliant.
If we assume that maintainers (and ftp-masters) check the licenses of
our source packages as they are meant to do, and we build all of main's
binary packages from source code in main and other binaries from main,
then all of main that is distributable is trivially DFSG-compliant.
(Some of it might be non-distributable, for instance by being a derived
work of both OpenSSL and something GPL'd; but that's
license-compliance, and we frequently already detect it.)

---

Self-imposed policy of documenting copyright information:

<https://www.debian.org/doc/debian-policy/ch-docs.html#s-copyrightfile>
says "Every package must be accompanied by a verbatim copy of its
copyright information and distribution license", which contains an
implicit assumption that each package *has* a single distribution
license. This is clearly not actually true in practice.

The DEP-5 specification addresses this by allowing the copyright file
to specify multiple licenses which must be complied with
simultaneously. Optionally, it also lets the maintainer specify the
licenses of individual source files matched by filename or glob.

The ftp-masters appear to have interpreted "copyright information" to
require a verbatim quotation of the license grant, except in cases
where there are several similar license grants with trivial
differences. I'm still not clear on why this is, and whether it's
because we want to or because we'll get sued if we don't.

Some of our Policy-compliant copyright files are clearly absurd;
adwaita-icon-theme's is 88K and lists over 200 (potential) copyright
holders, mostly for l10n. I find it hard to believe that all of that is
actually necessary or achieving a desirable goal for us.

Meanwhile, linux's copyright file resorts to citing "Linus Torvalds and
many others" as copyright holders.
If the kernel were held to the same standard that is (anecdotally)
applied to most other packages, its copyright file would presumably be
impractically huge (or perhaps more likely, we would no longer have any
volunteers willing to either maintain a Linux kernel package or review
it in NEW).

DEP-5 notably omits any syntax for describing the copyright or licenses
of the contents of the *binary* package, which suggests that its
authors (even those who consider it most valuable to specify the
licenses of individual source files) did not consider this to be a
goal. Are we aiming to go further than this by documenting, for
instance, which specific DFSG-compliant license applies to
/usr/bin/dbus-daemon, which specific DFSG-compliant license applies to
/usr/share/doc/dbus-1-doc/html/api/jquery.js, and who their copyright
holders are? If so, why?

I'm not convinced that anyone in Debian has both the necessary legal
expertise (definition of a derivative work in arbitrary jurisdictions)
and the necessary technical expertise (tracing what goes into a binary)
to make a reliable statement about who has a copyright interest in, for
example, /usr/bin/dbus-daemon. I would hope that Debian does not aim to
set policies that mean it will only accept contributions from copyright
lawyers who also happen to be software engineering experts. At the
moment, the best we can do is to provide an incomplete list of people
who have claimed that they *might* have a copyright interest in that
binary; hopefully that's more than enough to achieve our goals.

---

License compliance in general:

One argument for quoting the copyright holders and license information
is that it's for license compliance, particularly compliance with the
GPL.
I'm not sure to what extent this actually holds water: we are willing
to say that we satisfy the GPL's requirement to provide copies of the
GPL, and the source code, by pointing to the nearby copies of
base-files.deb (for the GPL) and the source package (for the GPL and
the source code). From a devil's-advocate point of view: can't we apply
the same reasoning to the copyright information and the license grant?

It is perhaps interesting to observe that Fedora, which is backed by a
well-funded US corporation (i.e. an attractive target for lawsuits),
limits itself to saying this about (for example) dbus:

    # The effective license of the majority of the package, including the shared
    # library, is "GPL-2+ or AFL-2.1". Certain utilities are "GPL-2+" only.
    License: (GPLv2+ or AFL) and GPLv2+

whereas the corresponding Debian package has a 412-line copyright file.
Similarly, ikiwiki has:

    # ikiwiki is licensed under GPLv2+, the Python code in plugins/ under
    # BSD (2-clause)
    License: GPLv2+ and BSD

in Fedora, and 386 lines in Debian.

---

License compliance in Doxygen's jquery.js specifically:

In the case of Doxygen's "jquery.js", if you look at the file itself
(for instance dbus-1-doc has a copy), you'll notice that it contains
copyright and license information for the libraries that went into it
(which is specifically preserved by the minification process). We have
not followed Debian's self-imposed requirement to document copyright
information centrally, but we have obeyed its (permissive) license.

---

Regards,
    S