Hi Pralabh,

As I understand it, this is more of a K8s question than a Spark one. K8s provides a set of workload objects that control the state of your pods and that, among other things, re-create pods that have failed. The same would apply to the pod running the driver container.

https://kubernetes.io/docs/concepts/workloads/
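To make that a bit more concrete, here is a rough sketch of one such workload object: a K8s Job that runs the driver and re-creates its pod on failure, up to `backoffLimit` times. This is only an illustration under assumptions, not something I have tested on your cluster. The function name, the job name, the image and the application path are all hypothetical placeholders, and running spark-submit in client mode inside the pod is my assumption about how you would wrap the driver (client mode also needs the driver to be reachable from the executors, which is left out here). Note too that the scheduler decides where the replacement pod lands, so a different machine is likely but not guaranteed.

```python
import json


def driver_job_manifest(name, image, command, max_retries=3):
    """Build a batch/v1 Job manifest whose pod is re-created after a failure.

    All argument values are supplied by the caller; nothing here is
    Spark-specific apart from the container name.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Number of retries before the Job is marked as failed.
            "backoffLimit": max_retries,
            "template": {
                "spec": {
                    # "Never" makes the Job controller create a fresh pod
                    # on failure instead of restarting the container in place.
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "spark-driver",
                            "image": image,
                            "command": list(command),
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = driver_job_manifest(
        name="spark-driver-job",          # hypothetical job name
        image="example.com/spark:latest",  # hypothetical image
        command=[
            "/opt/spark/bin/spark-submit",
            "--master", "k8s://https://kubernetes.default.svc",
            "--deploy-mode", "client",
            "local:///opt/app/main.py",    # hypothetical application path
        ],
        max_retries=3,
    )
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(manifest, indent=2))
```

If you saved this as, say, job.py, you could pipe the output into `kubectl apply -f -`. The `backoffLimit` field is the closest thing I know of to a maximum number of application attempts on the K8s side.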
I hope it helps you!

Regards

On Fri, Feb 4, 2022 at 7:31 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Not as far as I know. If your driver pod fails, then you need to resubmit
> the job. I cannot see what else can be done?
>
> HTH
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Fri, 4 Feb 2022 at 10:22, Pralabh Kumar <pralabhku...@gmail.com> wrote:
>
>> Hi Spark Team
>>
>> I am running spark on K8s and looking for a
>> property/mechanism similar to yarn.max.application.attempt . I know this
>> is not really a spark question , but i thought if anyone have faced the
>> similar issue,
>>
>> Basically I want if my driver pod fails , it should be retried on a
>> different machine . Is there a way to do the same .
>>
>> Regards
>> Pralabh Kumar
>>