Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-08 Thread Yun Tang
@flink.apache.org ; Vora, Jainik ; Deshpande, Omkar Subject: Re: Flink not restoring from checkpoint when job manager fails even with HA Hi Yun, I'll put my question another way: 1) The first time I deployed my job, I got an ID from Flink, let's say "abcdef" (Somehow I remem

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-08 Thread Vijay Bhaskar
nd to resume from retained checkpoint. [1] > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint > > > Best > Yun Tang > ---------- > *From:* Kathula, Sandeep > *Sent:* Sunday, June 7, 2020 4:27 > *To:

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-07 Thread Yun Tang
, 2020 12:42 To: Yun Tang Cc: Kathula, Sandeep ; user@flink.apache.org ; Vora, Jainik ; Deshpande, Omkar Subject: Re: Flink not restoring from checkpoint when job manager fails even with HA Hi Yun, if we start using the special Job ID and redeploy the job, then after deployment, is it going

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-07 Thread Vijay Bhaskar
-- > *From:* Kathula, Sandeep > *Sent:* Sunday, June 7, 2020 4:27 > *To:* user@flink.apache.org > *Cc:* Vora, Jainik ; Deshpande, Omkar < > omkar_deshpa...@intuit.com> > *Subject:* Flink not restoring from checkpoint when job manager fails > even with HA > >

Re: Flink not restoring from checkpoint when job manager fails even with HA

2020-06-07 Thread Yun Tang
om: Kathula, Sandeep Sent: Sunday, June 7, 2020 4:27 To: user@flink.apache.org Cc: Vora, Jainik ; Deshpande, Omkar Subject: Flink not restoring from checkpoint when job manager fails even with HA Hi, We are running Flink in K8S. We used https://ci.apache.org/projects/flink/flink-d

Flink not restoring from checkpoint when job manager fails even with HA

2020-06-06 Thread Kathula, Sandeep
Hi, We are running Flink in K8S. We used https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/jobmanager_high_availability.html to set up high availability. We set the maximum number of retries for a task to 2. After a task fails twice, the job manager fails. This is expected. But