Hi,

I took a look at all the LICENSE, NOTICE and DISCLAIMER files in the
non-documentation / non-website GitHub repos of all incubating projects.

I was assisted by scripts and made a few assumptions for expediency, so I may
have missed a couple of repos or included a graduated or retired project.

Some data points:
- 10 repos are missing a LICENSE file
- There are some (very) minor variations of text in the LICENSE appendix
- 39 repos use a boilerplate LICENSE file
- 1 LICENSE file is missing the Apache boilerplate text
- 1 repo is missing the LICENSE appendix part
- 2 repos have a non-standard LICENSE appendix (filled in copyright line)
- 10 LICENSE files have the long form of MIT/BSD licenses where the short form 
is preferred
- 1 LICENSE file oddly / verbosely lists out the MIT/BSD license of all 
individual files
- at least 1 LICENSE file lists Apache-licensed ASF software
- at least 8 LICENSE files list non-ASF Apache-licensed software
- 14 repos are missing a NOTICE file
- in the NOTICE file 14 repos use the name "Apache XXXX (incubating)", 55 use
"Apache XXXX", and 3 use just "XXXX" (missing Apache)
- 29 repos have a NOTICE file copyright year before 2016
- 2 use the older “developed by” instead of “developed at” in the NOTICE file
- 2 have incorrect text in the NOTICE files
- at least 8 include licensing information in NOTICE that should be in
LICENSE (IMO from a quick look)
- at least 1 has excessive copyright lines which may be incorrect
- 21 repos are missing DISCLAIMER files
- There's some (minor) variation on the DISCLAIMER wording
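
A check along the following lines could reproduce the NOTICE naming and
copyright year counts above. This is only an illustrative Python sketch, not
the scripts actually used: the function name classify_notice and the regular
expressions are assumptions, and it only looks at the first non-blank line and
the first copyright line it finds.

```python
import re


def classify_notice(text):
    """Return (name_form, copyright_year) for the text of a NOTICE file.

    name_form is one of "Apache XXXX (incubating)", "Apache XXXX" or
    "missing Apache"; copyright_year is an int, or None if no year is found.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    first = lines[0] if lines else ""

    if not first.startswith("Apache "):
        name_form = "missing Apache"
    elif re.search(r"\(incubating\)", first, re.IGNORECASE):
        name_form = "Apache XXXX (incubating)"
    else:
        name_form = "Apache XXXX"

    # For a range such as "2014-2016", keep the later year.
    match = re.search(r"Copyright[^\d\n]*(?:\d{4}\s*-\s*)?(\d{4})", text)
    year = int(match.group(1)) if match else None
    return name_form, year
```

A repo would then count as having an old copyright year if the returned year
is not None and is less than 2016.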

Projects are works in progress: they may not have made a release yet, may not
have updated the files for the next release, or may keep the expected files
somewhere other than the half dozen places my scripts looked. Just take these
numbers as a rough indication. I really didn't want to spend too long on this.

A few NOTICE / LICENSE files have TODOs, which is nice to see. I would pass an
IPMC vote on a release if I saw this.

It looks like a few projects are confused about what goes in LICENSE and what
goes in NOTICE. The two issues seem to be adding MIT, BSD or Apache licenses to
NOTICE when it is not required, and adding extra copyright notices to NOTICE. I
think an update to the policy documentation to make it clearer what goes in
each file would help here, and that is already under way.

There also seems to be some confusion around what to do with bundled
Apache-licensed software. The existing documentation is not entirely clear on
how to handle non-ASF Apache-licensed software, and this has come up on the
list a few times with some differing opinions.

A few questions on incubator policy that may need to be clarified:
- A release must include a NOTICE file, but should a repo include one?
- Likewise should a DISCLAIMER file be present in the repo?
- I thought incubating projects should be named "Apache XXXX (incubating)", but
the majority are named "Apache XXXX" in the NOTICE file, missing the
"(incubating)". Which is correct?
- What is the correct way to handle non-ASF Apache-licensed software? Current
policy (AFAIK) is not to add it to LICENSE, but it is not an error if you do
so. What advice should we give to podlings here?

I think some of these issues are likely to come from copying and pasting other
projects' files. Would it make sense, when creating new source repos, to add
boilerplate LICENSE, NOTICE and DISCLAIMER files?

Anyone have any other views / opinions / insights based on the above data?

Now I don’t want to look at a LICENSE or NOTICE file for a week or so and need 
a stiff drink.

Thanks,
Justin

PS If anyone is interested in the simple scripts/process to get those numbers 
just ask offline. I used grep, wc and sort a fair bit to narrow down which 
files to look at.
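
For anyone who would rather not ask, here is a minimal sketch of the "missing
file" counts in Python. It is not the grep/wc/sort process described above,
just an illustration of the idea. It assumes every repo is already checked out
as a subdirectory of one root directory, and the SEARCH_DIRS and SUFFIXES
values are hypothetical examples of places to look, not the actual half dozen
locations that were checked.

```python
from collections import Counter
from pathlib import Path

EXPECTED = ("LICENSE", "NOTICE", "DISCLAIMER")
# Hypothetical locations, relative to each repo root (assumption).
SEARCH_DIRS = (".", "distribution", "dist", "src/main/resources/META-INF")
# Hypothetical file name suffixes to accept (assumption).
SUFFIXES = ("", ".txt", ".md")


def find_file(repo, name):
    """Return the first matching path for name inside repo, or None."""
    for sub in SEARCH_DIRS:
        for suffix in SUFFIXES:
            candidate = Path(repo) / sub / (name + suffix)
            if candidate.is_file():
                return candidate
    return None


def count_missing(repos_root):
    """Count, per expected file, how many repos under repos_root lack it."""
    missing = Counter()
    for repo in sorted(Path(repos_root).iterdir()):
        if not repo.is_dir():
            continue
        for name in EXPECTED:
            if find_file(repo, name) is None:
                missing[name] += 1
    return missing
```

Calling count_missing on the checkout root returns a Counter keyed by file
name. A file kept somewhere outside SEARCH_DIRS is reported as missing, which
is the same caveat that applies to the numbers above.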