Hi all,

RDD.toLocalIterator() launches one Spark job per partition, which clutters the Spark UI, especially when the method is called on an RDD with hundreds or thousands of partitions.
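For concreteness, here is a minimal sketch of the behavior (assuming a local setup; the element and partition counts are arbitrary):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("toLocalIterator-demo").setMaster("local[2]"))
    val rdd = sc.parallelize(1 to 100000, numSlices = 500)

    // toLocalIterator fetches one partition at a time, each with its own
    // job, so draining the iterator submits ~500 jobs to the Spark UI.
    rdd.toLocalIterator.foreach(_ => ())

    sc.stop()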
Does anyone have a workaround for this? And what would people think about introducing a SparkContext local property (analogous to "spark.scheduler.pool", which is set as a thread-local property) that determines whether a job's info is shown on the Spark UI? A rough sketch of what I have in mind is below.
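Note that "spark.ui.showJobs" is a made-up name for illustration, not an existing Spark property, and handle stands in for whatever per-element processing the caller does:

    // Hypothetical: hide jobs submitted from this thread from the UI.
    sc.setLocalProperty("spark.ui.showJobs", "false")
    try {
      rdd.toLocalIterator.foreach(handle) // per-partition jobs stay off the UI
    } finally {
      sc.setLocalProperty("spark.ui.showJobs", null) // restore the default
    }

Thanks,
Mingyu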