Re: Apache Flink - Question about application restart

M Singh Thu, 28 May 2020 07:06:11 -0700

Hi Till/Zhu/Yang: Thanks for your replies.
So just to clarify - the job id remains the same as long as the job's restart
attempts have not been exhausted. Does Yarn also resubmit the job in case of
failures, and if so, is the job id different?
Thanks

On Wednesday, May 27, 2020, 10:05:40 AM EDT, Till Rohrmann
<trohrm...@apache.org> wrote:
 
Hi,
if you submit the same job multiple times, it will be assigned a different
JobID each time. For Flink, different job submissions are considered to be
different jobs. Once a job has been submitted, it will keep the same JobID,
which is important in order to retrieve the checkpoints associated with this
job.
Cheers,
Till
On Tue, May 26, 2020 at 12:42 PM M Singh <mans2si...@yahoo.com> wrote:


Hi Zhu Zhu:
I have another clarification - it looks like if I run the same app multiple
times, its job id changes. So even though the graph is the same, the job id
does not depend on the job graph alone, since different runs of the same app
get different ids.
Please let me know if I've missed anything.
Thanks

On Monday, May 25, 2020, 05:32:39 PM EDT, M Singh <mans2si...@yahoo.com>
wrote:
 
Hi Zhu Zhu:
Just to clarify - from what I understand, EMR also has a default number of
restarts (I think it is 3). So if EMR restarts the job, the job id is the same
since the job graph is the same.
Thanks for the clarification.

On Monday, May 25, 2020, 04:01:17 AM EDT, Yang Wang <danrtsey...@gmail.com>
wrote:
 
Just to share some additional information.
When a Flink application is deployed on Yarn and it exhausts its restart
policy, the whole application will fail. If you start another instance (Yarn
application), even if high availability is configured, we cannot recover from
the latest checkpoint because the clusterId (i.e. applicationId) has changed.

Best,
Yang
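
One way to keep a checkpoint usable after the Yarn application itself has failed is to retain checkpoints outside the cluster. The fragment below is a minimal sketch of that idea for flink-conf.yaml, assuming a Flink release around 1.10 where these keys are available. The checkpoint directory and the interval are placeholder values, not settings taken from this thread.

```yaml
# Sketch only: the path and the interval are assumptions, adjust to your setup.
# Durable location for checkpoint data and metadata.
state.checkpoints.dir: hdfs:///flink/checkpoints
# Take a checkpoint periodically.
execution.checkpointing.interval: 60s
# Keep the latest checkpoint when the job is cancelled or fails terminally.
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
```

A fresh submission could then be pointed at the retained checkpoint with the CLI's -s option, for example `bin/flink run -s <checkpoint-path> <job-jar>`, where both arguments are placeholders. That submission is still a new job and gets a new JobID.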
On Mon, May 25, 2020 at 11:17 AM, Zhu Zhu <reed...@gmail.com> wrote:

Hi M,
Regarding your questions:
1. yes. The id is fixed once the job graph is generated.
2. yes
Regarding yarn mode:
1. the job id stays the same because the job graph is generated once on the
   client side and persisted in DFS for reuse
2. yes, if high availability is enabled

Thanks,
Zhu Zhu
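
The settings this reply refers to can be sketched as a flink-conf.yaml fragment. This is a minimal sketch rather than a verified setup: the key names are those documented for Flink releases around 1.10, while the ZooKeeper quorum, the storage path, the delay, and the attempt counts are placeholder values picked to match the numbers mentioned in this thread.

```yaml
# Sketch only: hosts, paths, and counts below are assumptions.
# Flink-level restarts: the job keeps its JobID across these attempts.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 10
restart-strategy.fixed-delay.delay: 10 s

# High availability, so that a restarted JobManager can find the persisted
# job graph and the latest checkpoint.
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-host-1:2181,zk-host-2:2181,zk-host-3:2181
high-availability.storageDir: hdfs:///flink/ha/

# Yarn-level restarts of the application master; the effective value is
# capped by yarn.resourcemanager.am.max-attempts on the Yarn side.
yarn.application-attempts: 3
```

Keeping the two restart layers in separate keys mirrors the two parts of the answer above: the restart strategy covers failures inside a running application, and the Yarn attempts together with high availability cover a failure of the application master.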
On Sat, May 23, 2020 at 4:06 AM, M Singh <mans2si...@yahoo.com> wrote:

Hi Flink Folks:
If I have a Flink application with 10 restarts, and it fails and restarts, then:
1. Does the job have the same id?
2. Does the automatically restarting application pick up from the last
   checkpoint? I am assuming it does but just want to confirm.
Also, if it is running on AWS EMR, I believe EMR/Yarn is configured to restart
the job 3 times (after it has exhausted its restart policy). If that is the
case:
1. Does the job get a new id? I believe it does, but just want to confirm.
2. Does the Yarn restart honor the last checkpoint? I believe it does not, but
   is there a way to make it restart from the last checkpoint of the failed
   job (after it has exhausted its restart policy)?
Thanks
