Hi Marco,

Have you configured a restart strategy? If restart-strategy [1] is set to a strategy other than none, Flink should be able to restart the job automatically on failure. The restart strategy can also be configured via StreamExecutionEnvironment#setRestartStrategy.
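
For illustration, here is a minimal sketch of what the cluster-level setting might look like in flink-conf.yaml, assuming the fixed-delay strategy. The attempt count and delay are placeholder values I picked for the example, not recommendations for your job:

```yaml
# Assumed example values; tune attempts and delay for your workload.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```

With a setting along these lines, Flink would retry the job up to the configured number of attempts, waiting the configured delay between attempts, before declaring it failed.
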
If no restart strategy is configured (the default behavior), the job will fail and you will need to re-submit it to run it again from scratch.

Best,
Yun

------------------ Original Mail ------------------
Sender: Marco Villalobos <mvillalo...@kineteque.com>
Send Date: Wed May 19 11:27:37 2021
Recipients: user <user@flink.apache.org>
Subject: DataStream API Batch Execution Mode restarting...

I have a DataStream job running in Batch Execution mode within YARN on EMR. My job failed an hour into the run two times in a row because the task manager heartbeat timed out.

Can somebody point out how to restart a job in this situation? I can't find that section of the documentation.

Thank you.