Do you have a lot of small files? Do you use S3 or a similar object store? It 
could be that Spark is still doing I/O-related work after the last job finishes.
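
To check the first question, a rough sketch like the following could count how 
many output files fall under a size threshold. It assumes the output directory 
is reachable as a local path (for S3 or HDFS you would need the matching client 
instead). The name small_file_stats and the 1 MiB default threshold are made up 
for illustration and do not come from Spark.

```python
import os


def small_file_stats(root, threshold_bytes=1024 * 1024):
    """Return (total_files, small_files) for regular files under root.

    A file counts as "small" when its size is strictly below
    threshold_bytes. The default of 1 MiB is an arbitrary assumption.
    """
    total = 0
    small = 0
    # Walk the whole tree, since Spark output is often partitioned into subdirectories.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and other non-regular entries
            total += 1
            if os.path.getsize(path) < threshold_bytes:
                small += 1
    return total, small
```

If most of the files turn out to be small, the time after the last job may well 
be going into per-file operations on the output.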

> On 25.12.2018 at 12:51, Akshay Mendole <akshaymend...@gmail.com> wrote:
> 
> Hi, 
>       As you can see in the picture below, the application's last job finished 
> at around 13:45, and I could see the output directory updated with the 
> results. Yet the application took another 20 minutes to change its 
> status. What could be the reason for this? Is this known behavior? The 
> application has 3 jobs with many stages inside, each having around 10K tasks. 
> Could the scale be the reason for this? What exactly is the Spark framework 
> doing during this time?
> 
> <Screen Shot 2018-12-25 at 5.14.26 PM.png>
> 
> Thanks,
> Akshay
> 
