On Wed, 3 Aug 2016 18:28:38 +0200 David Craven <da...@craven.ch> wrote:
> How can I tell the difference between a lgpl2.1 and lgpl2.1+ license?

"or later"

> Is this a job that an automated tool could do? Detecting licenses
> included in a tarball?

I also wonder about that. Usually, the license text is just copied &
pasted anyway, so it should be quite regular.

If there isn't one, I could write one which would basically, per source
file,
- try to find an SPDX identifier; if that doesn't work:
- ignore newlines and "#", ";", "*" or "//" at the beginning of the line
- lex that into words, where a "word" is either [a-zA-Z0-9-]+ or [.,;]
- try to 1:1 match against all the licenses, similarly mapped
- if that didn't work, try to find signal words, guess the license
  and print the difference in a short form.

I could do that program in maybe 2 hours and find and extract all the
official license texts in a few more hours. But does such a thing
already exist? [Seems like something obvious to have and I'm writing
many other things already.]

A human would still have to review the non-1:1 things - there could
always be strange exceptions in the README or whatever - but the
majority of cases should work just fine.

See also <https://spdx.org/licenses/> (especially
<https://github.com/triplecheck/>),
<http://www.sciencedirect.com/science/article/pii/S0164121216300905>
(also lists several license checkers; Fossology seems to be a whole
webservice which does that).
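
To make the per-file procedure listed above a bit more concrete, here is
a rough Python sketch, not a finished tool. The function names
(tokenize, identify, ...), the shape of the "licenses" and "signals"
dictionaries, and the lowercasing step are all assumptions made for
illustration; the reference texts would have to be the real license
notices, which are not included here.

```python
import difflib
import re

# Single SPDX identifier only; compound expressions are not handled here.
SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+-]+)")
# Comment leaders "#", ";", "*" or "//" at the beginning of a line.
LEADER_RE = re.compile(r"^\s*(?:#|;|\*|//)+\s?")
# A "word" is either [a-zA-Z0-9-]+ or one of [.,;].
TOKEN_RE = re.compile(r"[a-zA-Z0-9-]+|[.,;]")


def tokenize(text):
    """Drop comment leaders and newlines, then lex into words."""
    lines = (LEADER_RE.sub("", line) for line in text.splitlines())
    # Lowercasing is an assumption, added to make matching less brittle.
    return TOKEN_RE.findall(" ".join(lines).lower())


def contains(haystack, needle):
    """True if the token list `needle` occurs contiguously in `haystack`."""
    if not needle:
        return False
    return f" {' '.join(needle)} " in f" {' '.join(haystack)} "


def short_diff(got, want):
    """List only the token runs that differ from the reference text."""
    sm = difflib.SequenceMatcher(a=want, b=got, autojunk=False)
    return [
        f"{tag}: {' '.join(want[i1:i2])!r} -> {' '.join(got[j1:j2])!r}"
        for tag, i1, i2, j1, j2 in sm.get_opcodes()
        if tag != "equal"
    ]


def identify(source, licenses, signals):
    """Return (license name or None, how it was found, short diff).

    licenses: hypothetical dict, name -> reference notice text
    signals:  hypothetical dict, name -> list of lowercase signal words
    """
    m = SPDX_RE.search(source)
    if m:
        return m.group(1), "spdx", []

    tokens = tokenize(source)
    for name, text in licenses.items():
        if contains(tokens, tokenize(text)):
            return name, "exact", []

    # Fallback: guess by counting signal words, then show the difference.
    present = set(tokens)
    scores = {n: sum(w in present for w in ws) for n, ws in signals.items()}
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0:
        return None, "unknown", []
    return best, "guess", short_diff(tokens, tokenize(licenses[best]))
```

With reference notices for both variants in "licenses", the lgpl2.1 vs.
lgpl2.1+ question comes down to which notice matches 1:1: only the
"or later" notice contains the "or (at your option) any later version"
tokens. Anything that comes back as "guess" or "unknown" is what a human
would still have to review.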