On Saturday 14 November 2015 16:02:18 Neil Williams wrote:
> scan-copyrights must get much better handling of non-text formats.
> I tried it with a package containing a lot of png files, the example at
> the top of the manpage failed because the output of scan-copyrights was
> a binary file. (It's a text-like file which contains binary snippets
> pretending to be copyright information.)

scan-copyrights uses licensecheck, which has recently had trouble handling
files that contain binary data. This issue is tracked by #803724.

> So no, detailed copyright files for non-trivial packages cannot be
> generated and the tools produce nothing close to the required result.
> Trivial packages don't need generation.
>
> It's not that neither tool is perfect, neither tool seems to have been
> tried with actual packages that may need the tool.

I tried 'cme update dpkg' on moarvm, libtommath, pan and sdl2. Here's a
result:

http://anonscm.debian.org/cgit/pkg-rakudo/pkg-moarvm.git/tree/debian/copyright

Unfortunately, the latest changes in licensecheck did indeed break
scan-copyrights.

> Even with a trivial package, scan-copyrights produces output which
> if used as debian/copyright would get rejected by lintian and ftpmaster.

Which trivial package? I can't fix bugs without details.

> Much more work needs to be done,

You're right, especially with licensecheck.

I've tried to improve licensecheck to better cope with encodings by using
`file` to detect the MIME type, but your mail shows that this approach
fails. It looks like `file` does not cope with ASCII files that embed
binary data, or with files that mix several encodings.

I need to rework licensecheck. I'll probably revert my changes and let
users deal with inconsistent encodings in owners' names.

All the best

-- 
https://github.com/dod38fr/ -o- http://search.cpan.org/~ddumont/
http://ddumont.wordpress.com/ -o- irc: dod at irc.debian.org