Hi,
I have set the number of partitions to 6000 and requested 100 nodes with 32
cores each node, with 32 executor cores per node:

spark-submit --master $SPARKURL --executor-cores 32 \
  --driver-memory 20G --executor-memory 80G single-file-test.py
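
For reference, a rough back-of-the-envelope check of how this configuration
maps to task waves (assuming one 32-core executor per node, which is what
--executor-cores 32 on a 32-core node implies):

    nodes = 100
    cores_per_executor = 32                   # --executor-cores 32; one executor per node assumed
    partitions = 6000

    total_slots = nodes * cores_per_executor  # 3200 tasks can run at the same time
    waves = -(-partitions // total_slots)     # ceil(6000 / 3200) = 2 waves of tasks
    print(total_slots, waves)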


And I'm reading a 2.2 TB file; the code has just two simple steps:

rdd = sc.textFile(...)   # read the input
rdd.count()
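
For completeness, here is a minimal sketch of what single-file-test.py does
end to end; the read call, the input path, and the use of minPartitions to
get the 6000 partitions are my assumptions, since the snippet above is only
pseudocode:

    from pyspark import SparkContext

    sc = SparkContext(appName="single-file-test")

    # Read the 2.2 TB input; minPartitions=6000 is an assumption about how
    # the 6000 partitions are set, and the path is a placeholder.
    rdd = sc.textFile("hdfs:///path/to/2.2tb-input", minPartitions=6000)

    print(rdd.count())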
Then I checked the log file and the history server, and they show that the
count stage has a very large spread in task launch times, e.g. the first
tasks launched at 16/03/19 22:30:56 and the last at 16/03/19 22:40:17,
a gap of about 10 minutes.
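
For reference, here is a rough sketch of how the launch-time spread per
stage can be computed from the event log (the file written when
spark.eventLog.enabled is true); the field names follow the standard JSON
event-log format, and the log path is a placeholder:

    import json
    from collections import defaultdict

    launch_times = defaultdict(list)

    # Each line of the event log is one JSON event.
    with open("app-event-log") as f:
        for line in f:
            event = json.loads(line)
            if event.get("Event") == "SparkListenerTaskStart":
                launch_times[event["Stage ID"]].append(event["Task Info"]["Launch Time"])

    # Launch Time is epoch milliseconds; report the min-to-max spread per stage.
    for stage, times in sorted(launch_times.items()):
        spread_s = (max(times) - min(times)) / 1000.0
        print("stage %d: %d tasks, launch spread %.1f s" % (stage, len(times), spread_s))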
Has anyone experienced this before?
Could you please explain the reason and the Spark internals behind this
behavior, and how to resolve it? Thanks very much.

Best,
Jialin
