My use case was to read 3000 files from 3000 different HDFS directories. I
was reading each file into its own RDD, adding it to an array of JavaRDDs,
and then calling union(rdd...) on the array. Because of this my program was
very slow (5 minutes). After I replaced this logic with a single textFile()
call that takes all the paths as one comma-separated string
("path1,path2,path3"), it runs much faster (56 seconds). So union() was the
overhead.
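
For anyone hitting the same thing, here is a minimal sketch of how the single
path argument can be built. The directory layout, the file name and the
PathJoiner class are made up for illustration, and the textFile() call is left
as a comment (assuming a JavaSparkContext named sc) so the snippet compiles
without Spark on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

class PathJoiner {
    // Joins individual file paths into the single comma-separated
    // string that textFile() accepts.
    static String joinPaths(List<String> paths) {
        return String.join(",", paths);
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        for (int i = 0; i < 3000; i++) {
            // Hypothetical layout: one file in each of 3000 directories.
            paths.add("hdfs:///data/dir" + i + "/part.txt");
        }
        String joined = joinPaths(paths);

        // Assumed usage, with sc being an existing JavaSparkContext:
        // JavaRDD<String> lines = sc.textFile(joined);
        System.out.println(joined.split(",").length + " paths in one argument");
    }
}
```

This way Spark gets one RDD over all the inputs instead of 3000 small RDDs
that have to be combined afterwards.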


