You can use a workflow manager, which gives you tools to handle transient failures in data pipelines. I suggest either Luigi or Airflow. Both provide DSLs embedded in Python, so if the built-in retry primitives turn out to be insufficient, it is easy to add custom restart logic around your Spark tasks.
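For example, in Airflow the retry and timeout primitives can wrap the spark-submit call directly, which covers the "restart if it hangs" case. The sketch below is only illustrative: the DAG id, schedule, owner, jar path and timeout are placeholders of my own, and it assumes an Airflow 1.x-style BashOperator. Luigi has equivalent retry hooks on its tasks.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "data-eng",                  # placeholder owner
    "retries": 3,                         # re-submit the job up to 3 times
    "retry_delay": timedelta(minutes=5),  # wait between attempts
}

dag = DAG(
    dag_id="nightly_spark_job",           # placeholder name
    default_args=default_args,
    schedule_interval="@daily",
    start_date=datetime(2016, 7, 1),
)

spark_job = BashOperator(
    task_id="run_spark_job",
    bash_command=(
        "spark-submit --master yarn --deploy-mode cluster "
        "--class com.example.MyJob /path/to/job.jar"  # placeholder class/jar
    ),
    # Kill the attempt if it runs too long; Airflow then retries it
    # from scratch according to default_args.
    execution_timeout=timedelta(minutes=30),
    dag=dag,
)

The nice property is that the scheduler process owns the restart loop, so the Spark driver never has to re-submit itself; the hang-detection logic lives outside the job that might be hanging.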
Regards,

Lars Albertsson
Data engineering consultant
www.mapflat.com
+46 70 7687109
Calendar: https://goo.gl/tV2hWF

On Jul 20, 2016 17:12, "unk1102" <umesh.ka...@gmail.com> wrote:

> Hi, I have multiple long-running Spark jobs which often hang because of the
> multi-tenant Hadoop cluster and resource scarcity. I am thinking of
> restarting a Spark job from within the driver itself. For example, if a
> Spark job does not write output files for, say, 30 minutes, then I want it
> to restart itself, i.e. submit itself from scratch to the YARN cluster. Is
> that possible? Are there any best practices for restarting Spark jobs?
> Please guide. Thanks in advance.