On Thu, Jan 27, 2022 at 12:39 AM Phil Morrell <deb...@emorrp1.name> wrote:
>
> TLDR: I think REUSE.software is a bad idea that is worse than what
> Debian already invented with Machine-readable debian/copyright file. I
> guess if upstream uses it, there's no reason not to ignore that as a
> source of copyright assertions.

I expected some concerns about the complexity of the SPDX document, but certainly not about standardized copyright information in source files. Yes, Debian may have invented the machine-readable copyright file, but not machine-readable copyright information in source files. This is what REUSE is all about, and it greatly reduces manual labor. I don't understand how this can be seen as bad.

> I *am* a big fan of SPDX-License-Identifier, but the above being
> straightforward is only true for the most trivial of examples. REUSE
> advocate for sprinkling .license files around your repo for e.g. logos
> and other binaries. Same story with multiple authors, they recommend
> using multiple FileCopyrightText's initially, then split it out to a
> separate AUTHORS file and use something like "Project X contributors".

No, it does not only work for trivial examples. Take any project with a significant amount of code, e.g. [1], and most of the time you will find that every source file has the copyright information in the header. The problem is that there has been no standardized way to parse it. That's why we have tools like licensecheck that try to work it out. With REUSE, it gets much, much easier.

Wrt the .license files: yes, they're ugly, but still better than no automation at all. With the new yaml spec, I suspect that these will go away.

Wrt multiple authors: this is not the fault of REUSE, but just how copyright works.

> Ultimately, when everything becomes too much, REUSE falls back to
> recommending Debian's copyright format anyway! So even if upstream sees
> the value in taking some copyright busywork off our hands, why not
> suggest they just use it in the first place in e.g. the LICENSE file.

Sigh, yes, because Debian's format is afaik the only standardized, easy to parse format out there.
But the reason why it is there is *not* for "when everything becomes too much", but for files that you cannot or don't want to alter. For example, if you regularly import 3rd-party code that does not follow REUSE and you don't want to edit the header all the time. Note that if everyone used REUSE, that would not be a problem. Another example is when you have tiny example code or configs that you want to present to a user without any distracting comments (think beginner tutorials).

However, they want to switch from DEP-5 to a more flexible (i.e. non-central, relocatable) spec [2]. And there is good reason to do so: for example, we as Debian can specify the copyright information for our packaging separately from the upstream code, without conflict. DEP-5 does not allow that.

> Firstly, I didn't think it was called DEP-5 anymore - it was accepted
> into policy in 2012 as "copyright-format" titled "Machine-readable
> debian/copyright file", so no longer a proposal for enhancement. This
> would be a minor pedantic point (a colloquialism) except for the fact
> that REUSE encourages it as part of their interface: `.reuse/dep5`.

Yes, it is called "Machine-readable debian/copyright file Version 1.0", but everybody knows it _is_ DEP-5; it is even in the spec, in the second sentence of the abstract. The spec _is_ still DEP-5; being accepted doesn't change that.

> I think this undermines your previous point about it being less prone to
> failure - if we could trust upstream assertions on copyright, the NEW
> review wouldn't be a problem in the first place.

I strongly disagree. First of all, upstream knows way better where they copy the code from than packagers do. And projects that use REUSE are more likely to write that down somewhere than your average NPM package that puts an "under MIT license" in the readme and copies minified code from everywhere.

And as a second point, if you write a debian/copyright, you will most likely trust what is in the header, and I suspect the copyright review in NEW is no different in this regard. I mean, how can one even know if the copyright information is wrong? Yes, there are cases where copyright information is missing and one can try to search for it (I've done this more than once), but if a project uses REUSE headers, this doesn't happen.

Regards,
Stephan

[1] https://gitlab.cern.ch/geant4/geant4
[2] https://github.com/fsfe/reuse-docs/issues/81
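
P.S. To illustrate the point about parsing: with REUSE-style headers, extracting the copyright and license information is little more than matching two tags. The following is only a minimal sketch, not how licensecheck or the reuse tool actually work. The function names (`parse_reuse_header`, `_clean`) and the sample file content are made up for illustration; only the tag names `SPDX-FileCopyrightText` and `SPDX-License-Identifier` come from REUSE/SPDX.

```python
import re

# Tag names used by REUSE/SPDX; everything else here is a hypothetical helper.
COPYRIGHT_RE = re.compile(r"SPDX-FileCopyrightText:[ \t]*(.+)")
LICENSE_RE = re.compile(r"SPDX-License-Identifier:[ \t]*(.+)")


def _clean(value):
    """Strip whitespace and a trailing comment terminator such as '*/' or '-->'."""
    return re.sub(r"\s*(\*/|-->)\s*$", "", value).strip()


def parse_reuse_header(text):
    """Return (copyright_statements, license_expressions) found in text."""
    copyrights = [_clean(m) for m in COPYRIGHT_RE.findall(text)]
    licenses = [_clean(m) for m in LICENSE_RE.findall(text)]
    return copyrights, licenses


# Made-up example file with two copyright holders and one license.
example = """\
// SPDX-FileCopyrightText: 2022 Jane Doe <jane@example.org>
// SPDX-FileCopyrightText: 2021 Project X contributors
// SPDX-License-Identifier: GPL-3.0-or-later

int main() { return 0; }
"""

copyrights, licenses = parse_reuse_header(example)
print(copyrights)
print(licenses)
```

Compare that with free-form headers, where a tool has to guess at dozens of wordings of "Copyright (c) ..." and license boilerplate.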