ernal job monitoring
system to manually recover it.
Best,
Zhanghao Chen
From: Jean-Marc Paulin
Sent: Tuesday, June 11, 2024 16:04
To: Zhanghao Chen ; user@flink.apache.org
Subject: Re: Failed to resume from HA when the checkpoint has been deleted.
Thanks for you
scenario.
But maybe there isn't any.
Best regards
JM
From: Zhanghao Chen
Sent: Tuesday, June 11, 2024 03:56
To: Jean-Marc Paulin ; user@flink.apache.org
Subject: [EXTERNAL] Re: Failed to resume from HA when the checkpoint has been
deleted.
Hi, In this case
resume from HA when the checkpoint has been deleted.
Hi,
We have a Flink 1.19 streaming job with HA enabled (ZooKeeper) and
checkpoints/savepoints in S3. We had an outage and now the jobmanager keeps
restarting. We think this is because it reads the job ID to be restarted from
ZooKeeper, but because we lost our S3 storage as part of the outage it cannot
f