Hi Vijay,

Thanks for the follow-up.

The reason we have 90 HDFS files (which gives the HDFS read stage a
parallelism of 90) is that we load the same HDFS data in several different
jobs, and those jobs have parallelisms (executors x cores) of 9, 18, and 30.
The uneven assignment problem we saw earlier could not be explained by a
modulo/remainder effect, because we sometimes had only 2 executors active out
of 9, while the remaining 7 stayed completely idle.
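
To make the setup concrete, here is a minimal sketch (Scala) of the read
side; the path, format and app name are placeholders, only the partition
count matters:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("hdfs-read-sketch").getOrCreate()

  // Hypothetical path/format; in our case the directory holds 90 files and
  // the read stage shows up with ~90 tasks (roughly one per file/block).
  val hdfsData = spark.read.parquet("hdfs:///path/to/reference/data")
  println(s"read partitions: ${hdfsData.rdd.getNumPartitions}")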

We tried repartitioning the Kafka stream to 90 partitions, but that led to
an even worse imbalance in the load. It seems that keeping the number of
partitions equal to executors x cores reduces the chance of uneven
assignment.
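
For reference, this is roughly what we do on the streaming side (sketch
only; the app name, batch interval, topic and Kafka settings are
placeholders):

  import org.apache.kafka.common.serialization.StringDeserializer
  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka010._

  val ssc = new StreamingContext(new SparkConf().setAppName("stream-sketch"), Seconds(30))

  val kafkaParams = Map[String, Object](
    "bootstrap.servers"  -> "broker:9092",
    "key.deserializer"   -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id"           -> "example-group")

  val stream = KafkaUtils.createDirectStream[String, String](
    ssc,
    LocationStrategies.PreferConsistent,
    ConsumerStrategies.Subscribe[String, String](Seq("example-topic"), kafkaParams))

  // Widening to 90 partitions made the load imbalance worse for us; matching
  // executors * cores (9 here) gave a more even assignment.
  val repartitioned = stream.repartition(9)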

We also tried repartitioning the HDFS data to 9 partitions, but it did not
help, because repartition takes the initial locality of the data into
account, so the 9 partitions may end up on 9 different cores. We also tried
setting spark.shuffle.reduceLocality.enabled=false, but that did not help
either. Last but not least, we want to avoid coalesce, because then the
partitions would follow the HDFS block distribution and would not be hash
partitioned (which we need for the join).
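
For completeness, a sketch of the HDFS-side variants described above
(reusing the hdfsData DataFrame from the earlier sketch; the join key "id"
is made up):

  // The locality flag we tried, passed at submit time:
  //   spark-submit --conf spark.shuffle.reduceLocality.enabled=false ...

  import org.apache.spark.sql.functions.col

  // repartition(9, col("id")) produces 9 hash-partitioned partitions, which
  // is what the join needs, but in our case the tasks still ended up on a
  // few executors.
  val byKey = hdfsData.repartition(9, col("id"))

  // coalesce(9) avoids the shuffle, so the partitions follow the HDFS block
  // layout and are not hash partitioned; the join would then shuffle anyway.
  val coalesced = hdfsData.coalesce(9)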

Please find below the relevant UI snapshots:

<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8940/jobOverview.png>
 
<http://apache-spark-user-list.1001560.n3.nabble.com/file/t8940/jobDetails.png> 

The snapshots refer to the batch in which the RDD is reloaded
(WholeStageCodegen 1147 is gray in all batches except the one at reload
time, which happens every 30 minutes).

Thanks a lot!


