The licensecheck.Match type holds the start and end offsets in the
file. Can't you use that to extract the license portion and either
check its length against the length of the license or repeat the Check
with only that portion of the file?
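
A minimal sketch of that idea, under stated assumptions: `span` is a local
stand-in that carries only the start and end offsets mentioned above (it is
not the library's type), `licensePortion` and `portionPercent` are
hypothetical helper names, and the license string is placeholder text rather
than a real license. No licensecheck calls are made here; in real use the
offsets would come from the matches reported by Cover.

```go
package main

import "fmt"

// span is a hypothetical stand-in holding only the start and end byte
// offsets of a matched region; it is not licensecheck.Match itself.
type span struct {
	Start, End int
}

// licensePortion returns the bytes of input covered by s, or nil if the
// offsets do not describe a valid range within input.
func licensePortion(input []byte, s span) []byte {
	if s.Start < 0 || s.End > len(input) || s.Start > s.End {
		return nil
	}
	return input[s.Start:s.End]
}

// portionPercent reports the length of the matched region as a percentage
// of the reference license length. It compares raw byte counts only, so it
// is an approximation if the matcher normalizes text before comparing.
func portionPercent(s span, licenseLen int) float64 {
	if licenseLen <= 0 {
		return 0
	}
	return 100 * float64(s.End-s.Start) / float64(licenseLen)
}

func main() {
	// Placeholder license text followed by unrelated source code.
	license := "Permission is hereby granted, free of charge."
	input := []byte(license + "\n\npackage main\n")

	// Offsets as a matcher might report them for the license region.
	s := span{Start: 0, End: len(license)}

	fmt.Printf("portion: %q\n", licensePortion(input, s))
	fmt.Printf("percent of license: %.0f%%\n", portionPercent(s, len(license)))
}
```

The extracted portion could also be passed back to Cover on its own, so that
the reported percentage is no longer diluted by the rest of the file.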

On Thu, 2019-11-14 at 10:24 +0100, fge...@gmail.com wrote:
> Sorry if I was not clear: on walking the file system, that's clear, I
> did not intend to talk about that, only about matching and reporting
> on matching. The example I gave was just to put in context why I
> believe I'd need a different api.
> 
> Using the Options field is good enough in the first example. (That's
> how I used licensecheck first.)
> Although for the second example Cover() does not report what I'd
> need.
> 
> As far as I've seen currently using func Cover(INPUT []byte, opts
> Options) (Coverage, bool) reports 100% MIT if INPUT matches byte for
> byte 100% MIT. If INPUT has more text than the complete 100% matching
> text of the MIT license, for example the MIT license is only in the
> beginning of INPUT and the rest of INPUT is for example Go code, then
> Coverage will report len(INPUT)/len(MIT license) which is less than
> 100%.
> 
> In this case, the new api would report 100%, since input contains
> 100% MIT license text (and some programming code, which is not
> relevant here).
> 
> If I understand correctly the current api is for checking _already_
> identified license files, which contain _only_ the license text.
> I believe that to look for files containing (complete or possibly
> broken) license references, a different kind of matching is needed.
> 
> 
> On 11/14/19, Rob Pike <r...@golang.org> wrote:
> > As I understand what you're trying to do, you just need to write a
> > tree walker, perhaps using filepath.Walk, that opens each file and
> > calls Cover on it. You can set the Options field to control the
> > threshold for reporting, and use the result of that to choose which
> > licenses to report.
> > 
> > I don't believe an API change is called for.
> > 
> > -rob
> > 
> > 
> > On Thu, Nov 14, 2019 at 6:14 PM <fge...@gmail.com> wrote:
> > 
> > > func Cover(input []byte, opts Options) (Coverage, bool) in
> > > licensecheck currently reports len(input)/len(one of the licenses)
> > > for each known license. I'd need for all known licenses
> > > len(known license)/len(license reference in input).
> > > 
> > > I'd like to scan >100000 files (possibly a lot more), where some
> > > of them (<0.1%) contain full or partial known license texts.
> > > 
> > > An example scenario for an example /src, containing >100000 files:
> > > $ listlicenses /src     # to get an overview of 100% matching license references
> > > LGPL-2.1
> > > MIT
> > > $ listlicenses -details /src            # same tree, more detailed output, to see the details
> > > /src/license refers 100% MIT   # the bytes in /src/license correspond one for one for the MIT license
> > > /src/fonts/LICENSE refers 100% MIT   # the bytes in /src/fonts/LICENSE correspond one for one for the MIT license
> > > /src/a/Notice refers 100% LGPL-2.1   # same as above with LGPL-2.1
> > > /src/a/b/whatever.go refers 94% GPL2   # most probably a broken license reference in whatever.go, maybe someone inadvertently deleted the last word from the lines containing the GPL2 license text. Needs human inspection to check what's the license situation with whatever.go
> > > /src/c/ConfusingLicenseReferences.c refers 7% ZLIB   # ConfusingLicenseReferences.c has most probably a false positive report for reference to ZLIB
> > > /src/c/ConfusingLicenseReferences.c refers 65% MIT    # ConfusingLicenseReferences.c has only 65% of MIT, the author intended to refer to MIT, but some inadvertent edit later broke the license reference in ConfusingLicenseReferences.c
> > > 
> > > Command listlicenses iterates over all files in the subtree,
> > > gathering all full or partial (broken) license references. Command
> > > listlicenses uses the functionality similar to
> > > github.com/google/licensecheck to check the files in the file
> > > system.
> > > 
> > > 
> > > 
> > > thanks!
> > > 
> > > On 11/13/19, Rob Pike <r...@golang.org> wrote:
> > > > Can you please explain in more detail what you're asking for? I
> > > > don't understand the problem you have or why the current package
> > > > cannot handle it.
> > > > 
> > > > -rob
> > > > 
> > > > 
> > > > On Wed, Nov 13, 2019 at 7:05 PM <fge...@gmail.com> wrote:
> > > > 
> > > > > Hi,
> > > > > 
> > > > >  "licensecheck classifies license files and heuristically
> > > > > determines how well they correspond to known open source
> > > > > licenses."
> > > > > 
> > > > > I'd like to identify license references in the file system.
> > > > > If I understand correctly, package licensecheck in its current
> > > > > form is not useful to help with this.
> > > > > If it's still possible, could you please share a hint how to
> > > > > do that? (input: byte array, output: license references in the
> > > > > byte array)
> > > > > If I understand correctly and I can't use licensecheck in its
> > > > > current form, which one is preferred: extend the current api
> > > > > (maybe: func Refers(input []byte) (References, bool)) or
> > > > > fork+rename the package? (References{...} being similar to
> > > > > Coverage{...})
> > > > > 
> > > > > thanks,
> > > > > Gergely Födémesi
> > > > > 
> > > > > --
> > > > > You received this message because you are subscribed to the
> > > > > Google Groups "golang-nuts" group.
> > > > > To unsubscribe from this group and stop receiving emails from
> > > > > it, send an email to golang-nuts+unsubscr...@googlegroups.com.
> > > > > To view this discussion on the web visit
> > > > > https://groups.google.com/d/msgid/golang-nuts/CA%2BctqrqKKUPTHihMLhLTH5O-tBm1qENQV6y41Qwde4jHp1kNmA%40mail.gmail.com
> > > > > 
> 
> 

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/3d39f8d72e20876ff6a321259a95eb60f89b607a.camel%40kortschak.io.
