I did try creating more partitions by overriding the default number of partitions determined by the HDFS splits. The problem is that in that case the program runs forever. I have the same set of inputs for MapReduce and Spark. Where MapReduce takes 2 minutes, Spark takes 5 minutes to complete the job. I thought my Spark program was running slower than MapReduce because not all of the executors are being utilized properly. I can provide my code skeleton for your reference. Please help me with this.
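
To make the reasoning about idle executors concrete, here is a rough back-of-the-envelope sketch in plain Python (no Spark needed). It assumes one input partition per HDFS split, one task per partition, and one task per executor core at a time. The function names and all the numbers (file size, split size, executor count, cores) are made up for illustration and are not taken from my actual job.

```python
import math


def input_partitions(file_size_bytes, split_size_bytes):
    """Approximate number of input splits (and so default partitions) for one file."""
    if file_size_bytes <= 0:
        return 0
    return math.ceil(file_size_bytes / split_size_bytes)


def utilization(num_partitions, num_executors, cores_per_executor):
    """Return (task slots, task waves, fraction of slot time kept busy).

    Assumes every task takes about the same time, which is rarely exactly true.
    """
    slots = num_executors * cores_per_executor
    if num_partitions == 0:
        return slots, 0, 0.0
    waves = math.ceil(num_partitions / slots)
    busy_fraction = num_partitions / (waves * slots)
    return slots, waves, busy_fraction


if __name__ == "__main__":
    # Hypothetical input: one 1 GiB file with 128 MiB splits.
    parts = input_partitions(1024 ** 3, 128 * 1024 ** 2)
    # Hypothetical cluster: 10 executors with 4 cores each.
    slots, waves, busy = utilization(parts, 10, 4)
    print(f"partitions={parts} slots={slots} waves={waves} busy={busy:.0%}")
```

With these made-up numbers there are 8 partitions for 40 task slots, so most cores would sit idle during the stage. That is the effect I suspected, and the reason I tried raising the partition count.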
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Executors-not-utilized-properly-tp7744p7759.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
