Hello,

We have a Spark Streaming application (running Spark 1.6.1) that consumes
data from a message queue. The application runs in local[*] mode, so the
driver and the executors share a single JVM.
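
For context, a minimal sketch of how the context is created (the app name
and batch interval are placeholders, not our actual values):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Sketch of the setup described above: everything runs in one JVM
    // because the master is local[*]. App name and batch interval are
    // placeholders only.
    val conf = new SparkConf()
      .setAppName("queue-consumer")
      .setMaster("local[*]")
    val ssc = new StreamingContext(conf, Seconds(5))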

The issue we are dealing with these days is that if any of our lambda
functions throws an Exception, all processing hangs and the task keeps
getting retried ad infinitum. According to the documentation here:

http://spark.apache.org/docs/1.6.1/configuration.html

the default value of spark.task.maxFailures is 4, and we have not changed
it, so I don't quite get why the tasks keep getting retried forever.
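
For reference, this is how that property could be set explicitly (a sketch
only; we currently rely on the documented default of 4):

    import org.apache.spark.SparkConf

    // Sketch only: overriding spark.task.maxFailures explicitly.
    // We have not done this in our application.
    val conf = new SparkConf()
      .setMaster("local[*]")
      .set("spark.task.maxFailures", "4")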

How can we make sure that a single bad record causing a RuntimeException
does not stop all processing?

So far I have tried adding try/catch blocks around the lambda functions and
returning null when an exception is thrown, but that causes
NullPointerExceptions inside Spark, with the same net effect: processing of
the subsequent batches stops. Roughly what I tried is sketched below.
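
A minimal sketch of that guard (identifiers are placeholders, not our real
code):

    import org.apache.spark.streaming.dstream.DStream

    // Sketch of the try/catch guard described above. The catch block
    // returns null, and that null is what Spark later trips over with a
    // NullPointerException.
    def guardedParse(lines: DStream[String], parse: String => String): DStream[String] =
      lines.map { raw =>
        try parse(raw)                        // lambda that can throw on a bad record
        catch { case e: Exception => null }   // returning null only defers the failure
      }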

Thanks in advance,
NB
