Hi,

Yarn won't resubmit the job. In case of a process failure where Yarn
restarts the Flink Master, the Master will recover the submitted jobs from
a persistent storage system.
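The persistent storage Till refers to is Flink's high-availability setup. As a
sketch (the ZooKeeper quorum host and storage path below are placeholders, not
taken from this thread), a ZooKeeper-based HA configuration in flink-conf.yaml
might look like:

```yaml
# flink-conf.yaml -- illustrative ZooKeeper-based HA setup (placeholder hosts/paths)
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-host:2181
high-availability.storageDir: hdfs:///flink/ha/
# On Yarn, also allow the ApplicationMaster to be restarted after a process failure:
yarn.application-attempts: 2
```

With this in place, a restarted Flink Master can recover submitted job graphs
and checkpoint metadata from the storageDir rather than requiring a resubmission.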
Cheers,
Till

On Thu, May 28, 2020 at 4:05 PM M Singh <mans2si...@yahoo.com> wrote:

> Hi Till/Zhu/Yang: Thanks for your replies.
>
> So just to clarify - the job id remains the same if the job restarts have
> not been exhausted. Does Yarn also resubmit the job in case of failures,
> and if so, is the job id different?
>
> Thanks
>
> On Wednesday, May 27, 2020, 10:05:40 AM EDT, Till Rohrmann
> <trohrm...@apache.org> wrote:
>
> Hi,
>
> If you submit the same job multiple times, it will get a different JobID
> each time. For Flink, different job submissions are considered to be
> different jobs. Once a job has been submitted, it will keep the same
> JobID, which is important in order to retrieve the checkpoints associated
> with this job.
>
> Cheers,
> Till
>
> On Tue, May 26, 2020 at 12:42 PM M Singh <mans2si...@yahoo.com> wrote:
>
> Hi Zhu Zhu:
>
> I have another clarification - it looks like if I run the same app
> multiple times, its job id changes. So even though the graph is the same,
> the job id does not depend on the job graph alone, since with different
> runs of the same app it is not the same.
>
> Please let me know if I've missed anything.
>
> Thanks
>
> On Monday, May 25, 2020, 05:32:39 PM EDT, M Singh <mans2si...@yahoo.com>
> wrote:
>
> Hi Zhu Zhu:
>
> Just to clarify - from what I understand, EMR also has a default number
> of restarts (I think it is 3). So if EMR restarts the job, the job id is
> the same since the job graph is the same.
>
> Thanks for the clarification.
>
> On Monday, May 25, 2020, 04:01:17 AM EDT, Yang Wang
> <danrtsey...@gmail.com> wrote:
>
> Just to share some additional information.
>
> When deploying a Flink application on Yarn, once it has exhausted its
> restart policy the whole application will fail. If you then start another
> instance (a new Yarn application), even with high availability
> configured, it cannot recover from the latest checkpoint because the
> clusterId (i.e. the applicationId) has changed.
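As Yang describes, a new Yarn application will not automatically pick up the
previous application's checkpoints. The usual workaround is to resume
explicitly from the last retained checkpoint when resubmitting. A sketch (the
checkpoint path, job id, and jar name below are placeholders):

```
# Resubmit as a new Yarn application, resuming from the last retained
# checkpoint of the failed job (all paths are placeholders):
flink run -m yarn-cluster \
  -s hdfs:///flink/checkpoints/<old-job-id>/chk-42 \
  my-streaming-job.jar
```

This relies on externalized (retained) checkpoints being enabled, so that the
checkpoint data survives the failed application and can be passed to `-s` the
same way a savepoint path would be.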
> Best,
> Yang
>
> On Mon, May 25, 2020 at 11:17 AM, Zhu Zhu <reed...@gmail.com> wrote:
>
> Hi M,
>
> Regarding your questions:
> 1. Yes. The id is fixed once the job graph is generated.
> 2. Yes.
>
> Regarding yarn mode:
> 1. The job id stays the same because the job graph is generated once at
> the client side and persisted in DFS for reuse.
> 2. Yes, if high availability is enabled.
>
> Thanks,
> Zhu Zhu
>
> On Sat, May 23, 2020 at 4:06 AM, M Singh <mans2si...@yahoo.com> wrote:
>
> Hi Flink Folks:
>
> If I have a Flink application with 10 restarts, and it fails and
> restarts, then:
>
> 1. Does the job have the same id?
> 2. Does the automatically restarting application pick up from the last
> checkpoint? I am assuming it does but just want to confirm.
>
> Also, if it is running on AWS EMR, I believe EMR/Yarn is configured to
> restart the job 3 times (after it has exhausted its restart policy). If
> that is the case:
>
> 1. Does the job get a new id? I believe it does, but just want to confirm.
> 2. Does the Yarn restart honor the last checkpoint? I believe it does
> not, but is there a way to make it restart from the last checkpoint of
> the failed job (after it has exhausted its restart policy)?
>
> Thanks
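The "10 restarts" in the original question corresponds to Flink's restart
strategy, which can be set in flink-conf.yaml (or per job via the execution
environment). A sketch of the fixed-delay variant, with illustrative values
rather than anything specified in this thread:

```yaml
# flink-conf.yaml -- illustrative values
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 10
restart-strategy.fixed-delay.delay: 10 s
```

These restarts happen inside the same Flink cluster, so the job keeps its
JobID and resumes from the latest checkpoint; only once the attempts are
exhausted does the job (and on Yarn, the whole application) fail.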