KevinyhZou created FLINK-28604:
----------------------------------
Summary: Job fails over but does not restore from checkpoint in ZooKeeper
HA mode
Key: FLINK-28604
URL: https://issues.apache.org/jira/browse/FLINK-28604
Project: Flink
Issue Type: Bug
Components: Runtime / Checkpointing
Affects Versions: 1.14.2
Reporter: KevinyhZou
Attachments: image-2022-07-19-14-30-27-198.png
I ran a job on Flink 1.14.2 with ZooKeeper HA configured as follows:
{code:java}
high-availability.storageDir: hdfs://testcluster/app/flink/ha
high-availability: zookeeper
high-availability.zookeeper.quorum: *****
high-availability.zookeeper.path.root: /flink{code}
When the ZooKeeper node restarted, the JM failed over and logged "Close and
clean up all data for ZookeeperHaServices", so the HA data was cleaned up when
the first JM shut down.
When the second JM started, it logged "No checkpoint found during
restore", and no checkpoint was restored.
While debugging, I found that when the job fails over, execution reaches
`ClusterEntryPoint.java` line 285
!image-2022-07-19-14-30-27-198.png!
and sets `cleanupHaData` to true.
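
To make the suspected sequence concrete, here is a minimal sketch of the behaviour described above. It is not Flink code: `HaCleanupSketch`, `FakeHaStore`, `shutDown` and `restore` are hypothetical names, and the in-memory map only stands in for the HA data kept in ZooKeeper and `high-availability.storageDir`. It assumes that a shutdown with `cleanupHaData` set to true deletes the checkpoint pointers, which would explain why the second JM finds nothing to restore.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class HaCleanupSketch {

    /** Hypothetical stand-in for the HA data (not a Flink class). */
    public static final class FakeHaStore {
        private final Map<String, Long> latestCheckpointByJob = new HashMap<>();

        public void recordCheckpoint(String jobId, long checkpointId) {
            latestCheckpointByJob.put(jobId, checkpointId);
        }

        public Optional<Long> latestCheckpoint(String jobId) {
            return Optional.ofNullable(latestCheckpointByJob.get(jobId));
        }

        /** Releases the services but leaves the stored data in place. */
        public void close() {
            // Nothing to delete in this model.
        }

        /** Releases the services and deletes all stored data. */
        public void closeAndCleanupAllData() {
            latestCheckpointByJob.clear();
        }
    }

    /** Models the first JM shutting down with the given cleanup flag. */
    public static void shutDown(FakeHaStore store, boolean cleanupHaData) {
        if (cleanupHaData) {
            store.closeAndCleanupAllData();
        } else {
            store.close();
        }
    }

    /** Models the second JM looking for a checkpoint to restore from. */
    public static String restore(FakeHaStore store, String jobId) {
        return store.latestCheckpoint(jobId)
                .map(id -> "Restoring from checkpoint " + id)
                .orElse("No checkpoint found during restore");
    }

    public static void main(String[] args) {
        // Failover without cleanup: the checkpoint pointer survives.
        FakeHaStore kept = new FakeHaStore();
        kept.recordCheckpoint("job-1", 7L);
        shutDown(kept, false);
        System.out.println(restore(kept, "job-1"));

        // Failover with cleanupHaData = true: nothing is left to restore.
        FakeHaStore wiped = new FakeHaStore();
        wiped.recordCheckpoint("job-1", 7L);
        shutDown(wiped, true);
        System.out.println(restore(wiped, "job-1"));
    }
}
```

If this model matches what happens, I would expect a failover caused by a ZooKeeper restart to take the non-cleanup path so that the next JM can still find the latest checkpoint.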
--
This message was sent by Atlassian Jira
(v8.20.10#820010)