Re: Streaming job, catch exceptions

2019-05-21 Thread bsikander
Ok, great. I understand the reasoning now, thanks.

Re: Streaming job, catch exceptions

2019-05-21 Thread Jason Nerothin
Yes. If the job fails repeatedly (4 times in this case), Spark assumes that there is a problem with the job and notifies the user. In exchange, the engine can go on to serve other jobs with its available resources. I would try the following until things improve: 1. Figure out what's wrong
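
The retry limit referred to here is spark.task.maxFailures, which defaults to 4. A minimal sketch of raising it before building the streaming context; the app name and the value 8 are illustrative, not from the thread:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Allow each task up to 8 attempts (default is 4) before the stage,
    // and with it the streaming batch, is failed and reported to the driver.
    val conf = new SparkConf()
      .setAppName("streaming-retries")          // illustrative name
      .set("spark.task.maxFailures", "8")
    val ssc = new StreamingContext(conf, Seconds(1))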

Re: Streaming job, catch exceptions

2019-05-21 Thread bsikander
Umm, I am not sure if I got this fully. Is it a design decision not to have context.stop() right after awaitTermination throws an exception? So the idea is that if a task fails after n tries (default 4), Spark should fail fast and let the user know? Is this correct? As you mentioned there are
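
What that design implies for the caller is roughly the following sketch: once awaitTermination rethrows the failure, stopping (or rebuilding) the context is left to the application. The ssc value and the handling policy here are assumptions, not the poster's code:

    // ssc: a StreamingContext that has already been built and started.
    try {
      ssc.awaitTermination()   // rethrows the exception that aborted the job
    } catch {
      case e: Exception =>
        // Spark does not stop the context for us; the application decides
        // whether to shut down, alert, or rebuild the streaming context.
        ssc.stop(stopSparkContext = true, stopGracefully = false)
        throw e
    }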

Re: Streaming job, catch exceptions

2019-05-21 Thread Jason Nerothin
Correction: the Driver manages the Tasks; the resource manager serves up resources to the Driver or to the Tasks.

Re: Streaming job, catch exceptions

2019-05-21 Thread Jason Nerothin
The behavior is a deliberate design decision by the Spark team. If Spark were to "fail fast", it would prevent the system from recovering from many classes of errors that are in principle recoverable (for example if two otherwise unrelated jobs cause a garbage collection spike on the same node). C

Re: Streaming job, catch exceptions

2019-05-21 Thread bsikander
Ok, I found the reason. In my QueueStream example, I have a while(true) loop which keeps on adding the RDDs, and my awaitTermination call is after the while loop. Since the while loop never exits, awaitTermination never gets called and the exceptions never get reported. The above was just the problem wit
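
One way to restructure such a QueueStream example so that awaitTermination is actually reached is to feed the queue from a background thread instead of blocking the main thread. A sketch under that assumption, not the original code:

    import scala.collection.mutable
    import org.apache.spark.SparkConf
    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf  = new SparkConf().setAppName("queue-stream-demo")   // illustrative name
    val ssc   = new StreamingContext(conf, Seconds(1))
    val queue = new mutable.Queue[RDD[Int]]()

    // Division by (x % 2) fails for even numbers, just to force an executor-side error.
    ssc.queueStream(queue).map(x => x / (x % 2)).print()

    ssc.start()

    // Push RDDs from a daemon thread so the main thread is free to block
    // in awaitTermination() and surface any exception.
    val feeder = new Thread(new Runnable {
      override def run(): Unit = {
        while (true) {
          queue.synchronized { queue += ssc.sparkContext.makeRDD(1 to 10) }
          Thread.sleep(1000)
        }
      }
    })
    feeder.setDaemon(true)
    feeder.start()

    ssc.awaitTermination()   // now reachable; rethrows failures from the jobs above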

Re: Streaming job, catch exceptions

2019-05-21 Thread bsikander
Just to add to my previous message: I am using Spark 2.2.2 with the standalone cluster manager and deploying the jobs in cluster mode.

Re: Streaming job, catch exceptions

2019-05-21 Thread bsikander
I was able to reproduce the problem. In the below repository, I have 2 sample jobs. Both execute 1/0 (ArithmeticException) on the executor side, but in the case of the NetworkWordCount job, awaitTermination throws the same exception (Job aborted due to stage failure ...) that I can see in the
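
A reproduction of that shape, as a sketch rather than the actual repository code: the division runs on the executors, and once retries are exhausted the driver sees it rethrown by awaitTermination, typically as a SparkException whose message starts with "Job aborted due to stage failure". The socket source and the handling below are assumptions:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("word-count-failure-demo")   // illustrative name
    val ssc  = new StreamingContext(conf, Seconds(1))

    // The map body runs on the executors; an input line of "0" triggers
    // ArithmeticException there, not in the driver.
    ssc.socketTextStream("localhost", 9999)
       .map(line => 1 / line.trim.toInt)
       .print()

    ssc.start()
    try {
      ssc.awaitTermination()
    } catch {
      case e: Exception =>
        // Typically "Job aborted due to stage failure: ... ArithmeticException: / by zero"
        println(s"Streaming job failed: ${e.getMessage}")
        ssc.stop(stopSparkContext = true, stopGracefully = false)
    }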

Re: Streaming job, catch exceptions

2019-05-15 Thread bsikander
Any help would be much appreciated. The error and question are quite generic; I believe that most experienced users will be able to answer.

Re: Streaming job, catch exceptions

2019-05-12 Thread bsikander
>> Code would be very helpful
I will try to put together something to post here.
>> 1. Writing in Java
I am using Scala.
>> Wrapping the entire app in a try/catch
Once the SparkContext object is created, a Future is started in which the actions and transformations are defined and the streaming context is started
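
A rough sketch of that structure, and of one way to surface a failure from inside the Future back to the main thread; the names, source, and handlers are assumptions for illustration, not the actual application:

    import scala.concurrent.{Await, ExecutionContext, Future}
    import scala.concurrent.duration.Duration
    import scala.util.{Failure, Success}
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    implicit val ec: ExecutionContext = ExecutionContext.global

    val conf = new SparkConf().setAppName("future-started-streaming")   // illustrative name

    // The pipeline is defined and started inside a Future, so an exception
    // thrown by awaitTermination() stays inside the Future; a try/catch
    // around this enclosing code never sees it.
    val streamingJob: Future[Unit] = Future {
      val ssc = new StreamingContext(conf, Seconds(1))
      ssc.socketTextStream("localhost", 9999)
         .map(line => 1 / line.trim.toInt)   // fails on the executors for "0"
         .print()
      ssc.start()
      ssc.awaitTermination()                 // rethrows the failure inside the Future
    }

    // The failure has to be observed on the Future itself ...
    streamingJob.onComplete {
      case Success(_) => println("Streaming context terminated normally")
      case Failure(e) => println(s"Streaming job failed: ${e.getMessage}")
    }

    // ... and the driver kept alive; Await.result also re-raises the failure here.
    Await.result(streamingJob, Duration.Inf)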

Re: Streaming job, catch exceptions

2019-05-12 Thread Jason Nerothin
Code would be very helpful, but it *seems like* you are:
1. Writing in Java
2. Wrapping the *entire app* in a try/catch
3. Executing in local mode
The code that is throwing the exceptions is not executed locally in the driver process. Spark is executing the failing code on the cluster.
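
To make the distinction concrete, a hypothetical stand-in rather than the poster's code: the lambda passed to map runs on the executors, so a try/catch on the driver only sees the failure once the job is aborted and awaitTermination rethrows it.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("driver-vs-executor")   // illustrative name
    val ssc  = new StreamingContext(conf, Seconds(1))

    try {
      // Driver side: this only *defines* the pipeline.
      ssc.socketTextStream("localhost", 9999)
         .map(line => 1 / line.trim.toInt)   // executor side: where the division actually fails
         .print()

      ssc.start()
      ssc.awaitTermination()   // driver side: the aborted job is rethrown here
    } catch {
      case e: Exception =>
        println(s"Caught on the driver: ${e.getMessage}")
        ssc.stop(stopSparkContext = true, stopGracefully = false)
    }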

Re: Streaming job, catch exceptions

2019-05-12 Thread bsikander
Hi, anyone? This should be a straightforward one :)