Hello,

I downloaded the spark-2.4.5 source from
https://mirrors.ocf.berkeley.edu/apache/spark/spark-2.4.5/spark-2.4.5.tgz

After extracting it and running:
./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes

It creates a Spark binary distribution named:
spark-2.4.5-bin-custom-spark.tgz

So this file is supposedly a ready-to-distribute Spark binary, like the one you can download from
http://mirror.metrocast.net/apache/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz

However, one big difference between this custom build and the official build is that the custom build has no LICENSE file. I don't know much about the Apache license, but I would suppose a custom build distribution should have one.

The file is missing because of the following code in make-distribution.sh:

[image: image.png]

There is no LICENSE-binary file in the official spark-2.4.5.tgz, so there will be no LICENSE file in your custom build.

I am aware of two pull requests related to this:

https://github.com/apache/spark/pull/22436
which started to use LICENSE-binary instead of just LICENSE, and

https://github.com/apache/spark/pull/22840
which avoids a failure when there is no LICENSE-binary in the source directory, as is the case for spark-2.4.5.

I think we need to change make-distribution.sh to make sure that the LICENSE file is copied over to the corresponding custom build distribution. However, I am not ready to open a pull request, so hopefully we can discuss it here first.

--
Sincerely
Xiangyu Li
<yisky...@gmail.com>
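
For the make-distribution.sh change suggested above, one possible shape is sketched below. This is only an illustration for discussion, not the code that is in the script today: the function name copy_license and its two arguments (source directory, distribution directory) are hypothetical, and falling back to the source LICENSE is an assumption. Whether the source LICENSE is the right file to ship in a binary distribution is part of what needs discussing.

```shell
# Sketch only: copy_license is a hypothetical helper, not code taken from
# make-distribution.sh. Arguments: source directory, distribution directory.
copy_license() {
  local src_dir="$1"
  local dist_dir="$2"

  if [ -e "$src_dir/LICENSE-binary" ]; then
    # Prefer the binary-specific license when the source tree ships one.
    cp "$src_dir/LICENSE-binary" "$dist_dir/LICENSE"
  elif [ -e "$src_dir/LICENSE" ]; then
    # Assumed fallback: reuse the source LICENSE so the custom build
    # is not left without any LICENSE file.
    cp "$src_dir/LICENSE" "$dist_dir/LICENSE"
  else
    # Neither file exists: report it instead of silently skipping.
    echo "No LICENSE-binary or LICENSE found in $src_dir" >&2
    return 1
  fi
}
```

With something like this, a source tree that only has LICENSE (as spark-2.4.5.tgz does) would still produce a distribution containing a LICENSE file, while a tree that has LICENSE-binary would keep the behaviour introduced by the first pull request.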