Hi,

Our team has a 40-node Hortonworks Hadoop 2.2.4.2-2 cluster (36 data nodes) with Apache Spark 1.2 and 1.4 installed. Each node has 64 GB of RAM and 8 cores.
We are only able to use at most 72 executors with --executor-cores=2, so our PySpark programs only get 144 active tasks:

    [Stage 1:===============> (596 + 144) / 2042]

If we pass a larger value for --num-executors, the PySpark program exits with errors like:

    ERROR YarnScheduler: Lost executor 113 on hag017.example.com: remote Rpc client disassociated

I also tried Spark 1.4 with conf.set("dynamicAllocation.enabled", "true"), but it did not help increase the number of active tasks. Given the size of the cluster, I would expect many more concurrent tasks. Could anyone advise on this?

Thank you very much!
Shaun
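
P.S. For reference, this is roughly the kind of PySpark setup I have been experimenting with on Spark 1.4. It is only a minimal sketch: the executor counts and memory values are placeholders rather than our exact settings, and it assumes the documented spark.-prefixed key names for dynamic allocation.

    from pyspark import SparkConf, SparkContext

    # Placeholder values -- executor counts and memory sizes are illustrative only.
    conf = (SparkConf()
            .setAppName("task-scaling-test")
            .set("spark.dynamicAllocation.enabled", "true")    # documented key uses the spark. prefix
            .set("spark.shuffle.service.enabled", "true")      # dynamic allocation on YARN needs the external shuffle service
            .set("spark.dynamicAllocation.minExecutors", "36")
            .set("spark.dynamicAllocation.maxExecutors", "144")
            .set("spark.executor.cores", "2")
            .set("spark.executor.memory", "4g")
            .set("spark.yarn.executor.memoryOverhead", "768"))  # in MB
    sc = SparkContext(conf=conf)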