Here is what Apache Spark does for vendored code https://github.com/apache/spark/tree/master/licenses
e.g. py4j is vendored: https://github.com/apache/spark/tree/master/python/lib

On Thu, Nov 2, 2017 at 1:47 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> I don't think there are any issues with vendoring unmodified source code
> from the approved list of licenses, so long as the presence of the
> code is noted in the LICENSE.txt. Might be worth finding some of the
> Apache projects where this has been done to make sure we go by the
> book.
>
> I'm the one who originally brought up the prospect of vendoring
> jemalloc because we need to ship a patched, as-yet-unreleased version
> with a bugfix that Uwe made upstream. So individuals who are doing an
> offline build of Arrow would have to install that particular version
> of jemalloc instead of a released version. For all of the rest of our
> thirdparty libraries, a user creating an offline build could make
> packages for each of the thirdparty dependencies at a released
> version, but they would have to make a custom build for jemalloc,
> which might be a rough edge for some people.
>
> As a particular data point, I regularly build Arrow on hosts without
> outbound internet access (relying on binary packages for thirdparty).
>
> I'm open to ideas other than vendoring the source code as long as we
> provide reasonable support for the offline / behind-firewall use case.
>
> - Wes
>
> On Thu, Nov 2, 2017 at 1:38 PM, Robert Nishihara
> <robertnishih...@gmail.com> wrote:
>> Not an answer, but what's wrong with just cloning it and checking out the
>> relevant commit when building arrow?
>> On Thu, Nov 2, 2017 at 10:30 AM Uwe L. Korn <uw...@xhochy.com> wrote:
>>
>>> Hello,
>>>
>>> we would like to vendor the current stable-4.x branch of jemalloc in
>>> Arrow C++, as we rely on its current latest commit for working with it.
>>> As the performance benefits of jemalloc are quite large, this is a
>>> burden we would be ready to take on. As jemalloc is a non-Apache project,
>>> we would need to be careful of the IP. jemalloc itself is licensed under
>>> the 2-clause BSD license.
>>>
>>> Would it be ok to include the jemalloc sources in the Arrow
>>> tarball, with LICENSE amended accordingly? Do we need to also put Apache
>>> License headers in all of the jemalloc files? Is this even possible or would
>>> we need a code donation for the whole process?
>>>
>>> Uwe
>>>