OK, understood. Yes, I will have to submit those post-processing jobs from the main thread instead.
Thanks!
Sumona

On Wed, Feb 17, 2016 at 12:24 PM Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote:

> `onApplicationEnd` is posted when the SparkContext is stopping, and you
> cannot submit any job to a stopping SparkContext. In general, a
> SparkListener is used to monitor job progress and collect job information,
> and you should not submit jobs there. Why not submit your jobs in the main
> thread?
>
> On Wed, Feb 17, 2016 at 7:11 AM, Sumona Routh <sumos...@gmail.com> wrote:
>
>> Can anyone provide some insight into the flow of SparkListeners,
>> specifically onApplicationEnd? I'm having issues with the SparkContext
>> being stopped before my final processing can complete.
>>
>> Thanks!
>> Sumona
>>
>> On Mon, Feb 15, 2016 at 8:59 AM Sumona Routh <sumos...@gmail.com> wrote:
>>
>>> Hi there,
>>> I am trying to implement a listener that acts as a post-processor,
>>> storing data about what was processed or erred. For this I use an RDD
>>> that may or may not change during the course of the application.
>>>
>>> My thought was to use onApplicationEnd and then a saveToCassandra call
>>> to persist this.
>>>
>>> From what I've gathered in my experiments, onApplicationEnd doesn't get
>>> called until sparkContext.stop() is called. If I don't call stop in my
>>> code, the listener won't be called. This works fine in my local tests:
>>> stop gets called, the listener runs and persists to the db, and
>>> everything works. However, when I run this on our server, the code in
>>> onApplicationEnd throws the following exception:
>>>
>>> Task serialization failed: java.lang.IllegalStateException: Cannot call
>>> methods on a stopped SparkContext
>>>
>>> What's the best way to resolve this? I could create a new SparkContext
>>> in the listener (I think I'd have to allow multiple contexts, in case I
>>> try to create one before the other is stopped). It seems odd but might
>>> be doable. Alternatively, what if I simply structured my job as
>>> sequential steps - doJob, then doPostProcessing - does that guarantee
>>> the post-processing runs after the job?
>>>
>>> We are currently using Spark 1.2 standalone.
>>>
>>> Please let me know if you require more details. Thanks for the
>>> assistance!
>>> Sumona
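
A minimal sketch of the main-thread approach suggested above, assuming Scala
with the spark-cassandra-connector on the classpath and
spark.cassandra.connection.host already set in the SparkConf. JobResult,
runJob, and the keyspace/table names are illustrative placeholders, not taken
from the original job:

    import com.datastax.spark.connector._   // adds saveToCassandra to RDDs
    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical audit record; real fields depend on what the job tracks.
    case class JobResult(id: String, status: String, error: String)

    object JobWithPostProcessing {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("job-with-post-processing")
        val sc = new SparkContext(conf)
        try {
          // 1. Run the actual job and collect whatever needs auditing.
          //    runJob stands in for the existing processing logic.
          val results: Seq[JobResult] = runJob(sc)

          // 2. Post-process in the main thread, while the SparkContext is
          //    still alive, instead of inside onApplicationEnd.
          sc.parallelize(results)
            .saveToCassandra("my_keyspace", "job_results") // placeholder names
        } finally {
          // 3. Stop the context only after all work (including the audit
          //    save) has finished; onApplicationEnd fires during this call.
          sc.stop()
        }
      }

      // Placeholder for the real job; returns the audit records to persist.
      def runJob(sc: SparkContext): Seq[JobResult] = Seq.empty
    }

With this ordering there is no need for a second SparkContext or for any job
submission from inside a SparkListener; the listener can still be kept purely
for monitoring if desired.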