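Do you have a lot of small files? Do you write to S3 or a similar object store? Spark may still be doing I/O-related work after the last job finishes, for example committing or moving the output files, which can be slow when there are many files or when the target is an object store.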
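To check the small-files question quickly, something like the following could help. It is only a rough sketch: it assumes the output directory is reachable as a local (or mounted) path, and the function name `count_small_files` and the 1 MiB default threshold are made up for illustration, not anything Spark provides. For S3 or HDFS you would need the corresponding client instead of `os.walk`.

```python
import os


def count_small_files(root, threshold_bytes=1024 * 1024):
    """Return (small, total) counts of data files under root.

    A file counts as "small" if it is below threshold_bytes.
    Both the name and the threshold are illustrative assumptions.
    """
    small = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip marker and hidden files such as _SUCCESS or .crc checksums.
            if name.startswith(("_", ".")):
                continue
            total += 1
            if os.path.getsize(os.path.join(dirpath, name)) < threshold_bytes:
                small += 1
    return small, total
```

Calling `count_small_files("/path/to/output")` on the job's output directory returns the number of files below the threshold and the total number of data files. If the total is in the tens of thousands and most of them are small, that would fit the I/O explanation above.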
> On 25.12.2018 at 12:51, Akshay Mendole <akshaymend...@gmail.com> wrote:
>
> Hi,
> As you can see in the picture below, the application's last job finished
> at around 13:45, and I could see the output directory updated with the
> results. Yet the application took a total of 20 more minutes to change
> its status. What could be the reason for this? Is this known behavior? The
> application has 3 jobs with many stages inside, each having around 10K tasks.
> Could the scale be the reason for this? What exactly is the Spark framework
> doing during this time?
>
> <Screen Shot 2018-12-25 at 5.14.26 PM.png>
>
> Thanks,
> Akshay