Hi,

Recently a number of incubating projects have copied your LICENSE and NOTICE 
files because they see Spark as a popular project, and they are a little sad 
when the IPMC then votes -1 on their releases.

Now I'm not on your PMC, don’t know your project's history, and there may be 
valid reasons for the current LICENSE and NOTICE contents, so take this as 
friendly advice that you can choose to ignore or not act on. Looking at your 
latest source release (2.3.1), there seems to be too much information in 
LICENSE and especially NOTICE for a source release. It may be that the LICENSE 
and NOTICE are intended for the binary release? [1] But even if that is the 
case, they also seem to be missing a couple of licenses for bundled software.

But in general my alarm bells start ringing because:
- Category B licenses are listed (which shouldn't be in a source release)
- License information is listed in NOTICE when it should be in LICENSE
- Dependencies are listed rather than what is actually bundled

Taking a look at the release I can see this 3rd party code bundled:

MIT licensed (some is dual licensed):
        dagre-d3
        datatables
        jquery cookies
        SortTable
        Modernizr
        matchMedia polyfill*
        respond*
        dataTables bootstrap*
        jQuery
        jQuery datatables*
        graphlib-dot
        jquery block UI
        anchorJS
        jsonFormatter

Apache licensed:
        vis.js*
        bootstrap*
        bootstrap-tooltip*
        toposort.py*
        TimSort*
        LimitedInputStream.java*

BSD licensed:
        d3
        cloudpickle
        join*

Python licensed:
        heapq3

CC0 licensed:
        ./data/mllib/images/kittens/29.5.a_b_EGDP022204.jpg*

* Currently missing from LICENSE

So that would end up with a number of licenses in LICENSE but nothing added to 
a boilerplate NOTICE file. The ALv2 licensed items don’t have NOTICE files so 
there is no impact there. I could of course have missed something and could be 
wrong for a number of reasons, but I cannot see how the above makes the NOTICE 
file 667 lines long :-)

I also noticed some compiled code in the source release which probably 
shouldn’t be there. [2]
        spark-2.3.1/core/src/test/resources/TestUDTF.jar
        spark-2.3.1/sql/hive/src/test/resources/SPARK-21101-1.0.jar
        spark-2.3.1/sql/hive/src/test/resources/TestUDTF.jar
        spark-2.3.1/sql/hive/src/test/resources/hive-contrib-0.13.1.jar
        spark-2.3.1/sql/hive/src/test/resources/hive-hcatalog-core-0.13.1.jar
        spark-2.3.1/sql/hive/src/test/resources/data/files/TestSerDe.jar
        spark-2.3.1/sql/hive/src/test/resources/regression-test-SPARK-8489/test-2.10.jar
        spark-2.3.1/sql/hive/src/test/resources/regression-test-SPARK-8489/test-2.11.jar
        spark-2.3.1/sql/hive-thriftserver/src/test/resources/TestUDTF.jar
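
For anyone who wants to repeat this check, here is a minimal sketch in Python. 
It assumes the source release has been unpacked locally and that files ending 
in .jar or .class are what counts as compiled code; the function name and the 
suffix list are illustrative choices, not part of any Apache tooling.

```python
import os

# Assumption: these suffixes are treated as "compiled code" for this check.
COMPILED_SUFFIXES = (".jar", ".class")


def find_compiled_files(root):
    """Return sorted paths (relative to root) of files with a compiled suffix."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(COMPILED_SUFFIXES):
                full_path = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full_path, root))
    return sorted(hits)


# Example usage, assuming the release was unpacked into ./spark-2.3.1:
#     for path in find_compiled_files("spark-2.3.1"):
#         print(path)
```

Any file it reports still needs a human look, since the script only matches 
file names and says nothing about whether a file is allowed in the release.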

Thanks,
Justin

PS: Please cc me on replies, as I’m not subscribed to your mailing list.

1. http://www.apache.org/dev/licensing-howto.html#binary
