Hi Keith,

Could you try adding a clean-up step at the end of the job, before the driver stops the SparkContext, that deletes the temporary files matching some regex pattern on all nodes in your cluster? If the files don't exist on some of the nodes, that shouldn't be a problem, should it?
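For instance, something along these lines (a minimal sketch in Scala; the /tmp directory and the "myjob-.*\.tmp" pattern are placeholders, and oversubscribing partitions only raises the odds that every executor runs at least one cleanup task, it does not guarantee it):

import java.io.File

// Best-effort cleanup, submitted from the driver before sc.stop().
// Each task deletes matching files from the local node it lands on.
val tmpDir = "/tmp"
val pattern = "myjob-.*\\.tmp"
sc.parallelize(0 until 1000, 1000).foreachPartition { _ =>
  val files = Option(new File(tmpDir).listFiles()).getOrElse(Array.empty[File])
  files.filter(_.getName.matches(pattern)).foreach(_.delete())
}
sc.stop()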
On Sun, Jan 22, 2017 at 1:26 AM, Mark Hamstra <[email protected]> wrote:
> I wouldn't say that Executors are dumb, but there are some pretty clear
> divisions of concepts and responsibilities across the different pieces of
> the Spark architecture. A Job is a concept that is completely unknown to an
> Executor, which deals instead with just the Tasks that it is given. So you
> are correct, Jacek, that any notification of a Job end has to come from the
> Driver.
>
> On Sat, Jan 21, 2017 at 2:10 AM, Jacek Laskowski <[email protected]> wrote:
>
>> Executors are "dumb", i.e. they execute TaskRunners for tasks
>> and...that's it.
>>
>> Your logic should be on the driver that can intercept events
>> and...trigger cleanup.
>>
>> I don't think there's another way to do it.
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> ----
>> https://medium.com/@jaceklaskowski/
>> Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
>> Follow me at https://twitter.com/jaceklaskowski
>>
>>
>> On Fri, Jan 20, 2017 at 10:47 PM, Keith Chapman <[email protected]>
>> wrote:
>> > Hi Jacek,
>> >
>> > I've looked at SparkListener and tried it, I see it getting fired on the
>> > master but I don't see it getting fired on the workers in a cluster.
>> >
>> > Regards,
>> > Keith.
>> >
>> > http://keith-chapman.com
>> >
>> > On Fri, Jan 20, 2017 at 11:09 AM, Jacek Laskowski <[email protected]>
>> > wrote:
>> >>
>> >> Hi,
>> >>
>> >> (redirecting to users as it has nothing to do with Spark project
>> >> development)
>> >>
>> >> Monitor jobs and stages using SparkListener and submit cleanup jobs
>> >> where a condition holds.
>> >>
>> >> Jacek
>> >>
>> >> On 20 Jan 2017 3:57 a.m., "Keith Chapman" <[email protected]>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> Is it possible for an executor (or slave) to know when an actual job
>> >>> ends? I'm running spark on a cluster (with yarn) and my workers
>> >>> create some temporary files that I would like to clean up once the
>> >>> job ends. Is there a way for the worker to detect that a job has
>> >>> finished? I tried doing it in the JobProgressListener but it does
>> >>> not seem to work in a cluster. The event is not triggered in the
>> >>> worker.
>> >>>
>> >>> Regards,
>> >>> Keith.
>> >>>
>> >>> http://keith-chapman.com
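P.S. To make the SparkListener suggestion above concrete, here is a minimal, untested driver-side sketch. Assumptions: the "cleanup" job-group name and the cleanup body are placeholders, and the same caveat as above applies, a cleanup job is not guaranteed to land on every node. The job-group bookkeeping is needed because the cleanup job ends too; without it the listener would re-trigger on its own job and loop forever.

import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerJobStart}

// Runs on the driver only; executors never see job-level events.
class CleanupListener(sc: SparkContext) extends SparkListener {
  @volatile private var cleanupJobIds = Set.empty[Int]

  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    // Remember which job ids belong to our own cleanup jobs.
    val group = Option(jobStart.properties)
      .flatMap(p => Option(p.getProperty("spark.jobGroup.id")))
    if (group.contains("cleanup")) cleanupJobIds += jobStart.jobId
  }

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = {
    if (!cleanupJobIds.contains(jobEnd.jobId)) {
      // Submit the cleanup from its own thread so the listener-bus
      // thread is not blocked while the cleanup job runs.
      new Thread(new Runnable {
        override def run(): Unit = {
          sc.setJobGroup("cleanup", "delete executor temp files")
          sc.parallelize(0 until 1000, 1000).foreachPartition { _ =>
            // delete the node-local temp files here, as in the sketch above
          }
          sc.clearJobGroup()
        }
      }).start()
    }
  }
}

// Register once on the driver, right after creating the SparkContext:
//   sc.addSparkListener(new CleanupListener(sc))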
