The issue I am having is similar to the one described here:
http://stackoverflow.com/questions/36911442/how-to-stop-gracefully-a-spark-streaming-application-on-yarn
I am creating an RDD from the sequence 1 to 300 and building a DStream
out of it.
val rdd = ssc.sparkContext.parallelize(1 to 300)
val dstream = new ConstantInputDStream(ssc, rdd)
dstream.foreachRDD { rdd =>
  rdd.foreach { x =>
    log(x)
    Thread.sleep(50)
  }
}
When I kill this job, I expect elements 1 to 300 to be logged before it
shuts down. That is indeed the case when I run it locally: it waits for the
job to finish before shutting down.
But when I launch the job on the cluster in "yarn-cluster" mode, it shuts
down abruptly.
The executor prints the following log:
ERROR executor.CoarseGrainedExecutorBackend:
Driver xx.xx.xx.xxx:yyyyy disassociated! Shutting down.
and then it shuts down. This is not a graceful shutdown.
Does anybody know how to do a graceful shutdown on YARN?
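
For reference, here is a minimal sketch of the two usual ways to request a graceful stop in Spark Streaming. It assumes a Spark version that supports the spark.streaming.stopGracefullyOnShutdown setting; the app name and batch interval are placeholders and are not taken from the job above. Whether either of these is enough in "yarn-cluster" mode is exactly the open question.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Assumption: app name and batch interval are placeholders for illustration.
val conf = new SparkConf()
  .setAppName("graceful-shutdown-sketch")
  // Ask Spark's JVM shutdown hook to stop the StreamingContext gracefully,
  // so that batches already received are processed before exit.
  .set("spark.streaming.stopGracefullyOnShutdown", "true")

val ssc = new StreamingContext(conf, Seconds(1))

// ... define the DStream and the foreachRDD output operation here ...

ssc.start()
ssc.awaitTermination()

// Alternative: stop explicitly from driver code, e.g. in response to an
// external signal that the application itself watches for:
// ssc.stop(stopSparkContext = true, stopGracefully = true)
```

The setting only takes effect if the driver JVM actually gets to run its shutdown hooks, so the result may depend on how the YARN application is killed.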