[ https://issues.apache.org/jira/browse/FLINK-37354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anton Solovev updated FLINK-37354:
----------------------------------
    Description: 
The Kubernetes Operator health check is not aligned with the checkpoint interval when the interval is set via the Java API.
{code:java}
var checkpointConfig = env.getCheckpointConfig();
checkpointConfig.setCheckpointInterval(Duration.ofHours(2).toMillis());
{code}
This leads to exceptions and, as a result, to a restart of the job manager:
{noformat}
2025-01-28 10:15:32,435 o.a.f.k.o.l.AuditUtils [INFO ][flink-jobs/job-1] >>> Event[Job] | Warning | RESTARTUNHEALTHYJOB | Restarting unhealthy job
{noformat}
There are nevertheless ways to mitigate this:
# disable *kubernetes.operator.cluster.health-check.checkpoint-progress.enabled*
# set *kubernetes.operator.cluster.health-check.checkpoint-progress.window* to two hours as well
# never use the Java API to set the checkpoint interval

> Kubernetes Operator HealthCheck compatibility
> ---------------------------------------------
>
>                 Key: FLINK-37354
>                 URL: https://issues.apache.org/jira/browse/FLINK-37354
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: 1.10.0
>            Reporter: Anton Solovev
>            Priority: Minor
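
To illustrate why a two-hour checkpoint interval can trip the checkpoint-progress check, here is a minimal sketch. It is a simplified model and not the operator's actual implementation: the class name, both method names, and the five-minute window are assumptions made only for this example. It assumes the check flags a job when no new checkpoint completes within the configured window, which is consistent with the second mitigation in the report (setting the window to two hours as well).

```java
import java.time.Duration;

/**
 * Hypothetical, simplified model of a checkpoint-progress health check.
 * This is NOT the Kubernetes Operator's code; names and logic are assumptions.
 */
public class HealthCheckWindowSketch {

    /**
     * Returns true if a job that completes a checkpoint every
     * {@code checkpointInterval} can still go a full {@code window}
     * without a newly completed checkpoint.
     */
    static boolean wouldFlagUnhealthy(Duration window, Duration checkpointInterval) {
        return window.compareTo(checkpointInterval) < 0;
    }

    /** Returns the larger of the two durations, i.e. a window that covers the interval. */
    static Duration alignedWindow(Duration configuredWindow, Duration checkpointInterval) {
        return configuredWindow.compareTo(checkpointInterval) >= 0
                ? configuredWindow
                : checkpointInterval;
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofHours(2);        // interval from the report
        Duration assumedWindow = Duration.ofMinutes(5); // assumed window, for illustration only

        // Window shorter than the interval: the modelled check would flag the job.
        System.out.println(wouldFlagUnhealthy(assumedWindow, interval));

        // Window raised to cover the interval: the modelled check would pass.
        System.out.println(wouldFlagUnhealthy(alignedWindow(assumedWindow, interval), interval));
    }
}
```

In this model the operator can only align the window if it knows the interval. An interval set in job code through {{setCheckpointInterval}} may not be visible to it, which would explain the mismatch described above. The real check may also apply extra margins, for example for checkpoint timeouts.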