On Fri, Mar 20, 2009 at 09:09:53AM +0000, Roger Leigh <rle...@codelibre.net> wrote:
> Mike Hommey wrote:
>> On Thu, Mar 19, 2009 at 11:02:48PM -0700, Daniel Moerner wrote:
>>> On Thu, Mar 19, 2009 at 10:19 PM, Mike O'Connor <s...@debian.org> wrote:
>>>> To me, it seems like since one has to go through all of the source files
>>>> anyway, creating a list of copyright holders while you are doing it is a
>>>> trivial task. I don't see why making this list takes any time at all
>>>> really. Unless you are not actually looking at the code you upload,
>>>> which would worry me for other reasons as well.
>>> I agree. The thing that I like about creating packages with the
>>> wiki.d.o specification is that it forces you to actually examine the
>>> copyrights of all the parts of a new package, instead of just use a
>>> lazy link to /usr/share/common-licenses/foo. This is especially
>>> important for packages that have many different hidden scripts or
>>> architecture-independent libraries that might have different licenses.
>>> With the kind of copyright file generated by dh_make, it seems like
>>> new maintainers often ignore the risk of a package with a tainted,
>>> unredistributable license problem.
>>>
>>> In shorter words: I think something should be done about the copyright
>>> file to encourage developers to actually perform an audit of the
>>> license status of files in their packages before they upload. The
>>> current copyright template doesn't really encourage this; I like the
>>> machine-parseable system because it makes it easy to organize such an
>>> audit.
>>
>> Try doing that on iceweasel or xulrunner. Hint: there are about 30000
>> files and a real lot of copyright holders.
>>
>> It's already a PITA with webkit, which is about 3000 files and quite a
>> lot of copyright holders (the copyright file, which I'm pretty sure is
>> not accurate is 809 lines and growing at each new release).
>>
>> On top of listing copyright holders, I must say listing the individual
>> files for each license in the copyright file is also a major PITA.
>
> Given that copyrights are usually in a standard format, such as
>
> Copyright (\([cC]\)|©) Year[-Year] Name Email
>
> It shouldn't be too hard to write a tool to scan the whole source tree
> and spit out a completely generated summary of copyright holders. If
> this could be added to an existing tool, such as licensecheck, this
> would save everyone from reimplementing it in their package (I was
> considering doing this).

licensecheck already does that, though you have to pass it an option to
enable it. But it fails to catch anything that doesn't match such a
pattern, and I can tell you there are a lot of notices that don't.

I invite you to take a look at a few .cpp files from xulrunner or
iceweasel. You'll see that you won't get much with your pattern, and
that you can't reliably get these holders with a pattern.

Mike
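
To make the limitation concrete, here is a minimal Python sketch of the
kind of scanner being proposed. The regex is only my rough rendering of
the quoted pattern, find_holders() is a hypothetical helper, and the
sample notices are made up for illustration (they are not copied from
xulrunner, iceweasel or webkit). This is not how licensecheck works
internally.

```python
import re

# Rough rendering of "Copyright (\([cC]\)|©) Year[-Year] Name Email".
# Assumption: one notice per line, with the email (if any) in <...>.
COPYRIGHT_RE = re.compile(
    r"Copyright[ \t]+(?:\([cC]\)|©)[ \t]+"
    r"(?P<years>\d{4}(?:[ \t]*-[ \t]*\d{4})?)[ \t]+"
    r"(?P<holder>[^<\n]+?)[ \t]*"
    r"(?:<(?P<email>[^>\s]+)>)?[ \t]*$",
    re.MULTILINE,
)


def find_holders(text):
    """Return the sorted, unique (years, holder) pairs the pattern matches."""
    found = set()
    for match in COPYRIGHT_RE.finditer(text):
        found.add((match.group("years"), match.group("holder").strip()))
    return sorted(found)


# Hypothetical notices: the first two follow the "standard" format,
# the last three are plausible variants that the pattern silently skips.
SAMPLE = "\n".join([
    " * Copyright (C) 2008-2009 Jane Example <jane@example.org>",
    " * Copyright © 2007 Example Corp",
    "// (c) 2006, 2007 Some Contributor",
    "// Copyright 2005 Another Contributor",
    "// Contributor(s): Third Person <third@example.org>",
])

if __name__ == "__main__":
    for years, holder in find_holders(SAMPLE):
        print(years, holder)
```

Only the first two notices are reported. The other three name people
who may hold copyright, but nothing in the output tells you they were
skipped, which is why a generated list still needs a manual check.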