Thank you. I used the default restart strategy. I'll change that.

On Tue, May 18, 2021 at 11:02 PM Yun Gao <yungao...@aliyun.com> wrote:
> Hi Marco,
>
> Have you configured the restart strategy? If the restart-strategy [1] is
> configured to a strategy other than none, Flink should be able to restart
> the job automatically on failover. The restart strategy can also be
> configured via StreamExecutionEnvironment#setRestartStrategy.
>
> If no restart strategy is configured (the default behavior), the job would
> fail and we would need to re-submit the job to execute it from scratch.
>
> Best,
> Yun
>
> ------------------Original Mail ------------------
> *Sender:*Marco Villalobos <mvillalo...@kineteque.com>
> *Send Date:*Wed May 19 11:27:37 2021
> *Recipients:*user <user@flink.apache.org>
> *Subject:*DataStream API Batch Execution Mode restarting...
>
>> I have a DataStream running in Batch Execution mode within YARN on EMR.
>> My job failed an hour into the job two times in a row because the task
>> manager heartbeat timed out.
>>
>> Can somebody point me to how to restart a job in this situation? I can't
>> find that section of the documentation.
>>
>> Thank you.
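
For anyone finding this thread later, here is a minimal sketch of what a non-default restart strategy might look like in flink-conf.yaml. It assumes the fixed-delay strategy and the configuration keys used by Flink releases from around the time of this thread; the attempt count and delay are illustrative values I picked, not something recommended in the thread.

```yaml
# Assumed example values; tune them for your own job.
# Retry the job up to 3 times, waiting 10 seconds between attempts,
# instead of failing permanently on the first task failure.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```

The same kind of strategy can be set per job through StreamExecutionEnvironment#setRestartStrategy, as Yun mentions, which is probably the more convenient route when you cannot edit the cluster configuration on EMR.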