[ https://issues.apache.org/jira/browse/FLINK-30437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gyula Fora updated FLINK-30437:
-------------------------------
Description:
Even though we set:

execution.shutdown-on-application-finish: false
execution.submit-failed-job-on-application-error: true

if there is a state incompatibility, the jobmanager marks the job FAILED, cleans up the HA metadata and restarts itself. This is very concerning behaviour, but we have to fix this on the operator side to at least guarantee no state loss.

The solution is to harden the HA metadata check properly.

was:
Even though we set:

execution.shutdown-on-application-finish: false
execution.submit-failed-job-on-application-error: true

if there is a state incompatibility, the jobmanager marks the job FAILED, cleans up the HA metadata and restarts itself. This is very concerning behaviour, but we have to fix this on the operator side to at least guarantee no state loss.

The solution is to harden the HA metadata check properly (like we tried but failed in the past :) )

> State incompatibility issue might cause state loss
> --------------------------------------------------
>
>                 Key: FLINK-30437
>                 URL: https://issues.apache.org/jira/browse/FLINK-30437
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.2.0, kubernetes-operator-1.3.0
>            Reporter: Gyula Fora
>            Assignee: Gyula Fora
>            Priority: Blocker
>
> Even though we set:
> execution.shutdown-on-application-finish: false
> execution.submit-failed-job-on-application-error: true
> if there is a state incompatibility, the jobmanager marks the job FAILED,
> cleans up the HA metadata and restarts itself. This is very concerning
> behaviour, but we have to fix this on the operator side to at least
> guarantee no state loss.
>
> The solution is to harden the HA metadata check properly.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
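For context, the two options mentioned in the description are regular Flink configuration options and would typically be set in flink-conf.yaml (or via the operator's flinkConfiguration field). A minimal sketch of the configuration the reporter describes, assuming a Kubernetes HA setup; the high-availability keys are included only for illustration and are not quoted in the ticket:

```yaml
# Keep the cluster running after the application finishes,
# instead of tearing it down (and its HA metadata) immediately.
execution.shutdown-on-application-finish: false

# If the application main() fails (e.g. state incompatibility on restore),
# submit a placeholder FAILED job instead of shutting down.
execution.submit-failed-job-on-application-error: true

# Illustrative HA settings (not from the ticket): the metadata that the
# ticket warns may be cleaned up lives in ConfigMaps under this setup.
high-availability: kubernetes
high-availability.storageDir: s3://my-bucket/flink-ha   # hypothetical path
```

Per the ticket, even with the first two options set, a state incompatibility on startup can still lead the jobmanager to clean up the HA metadata, which is why the operator-side HA metadata check needs hardening.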