Re: Checkpointing and savepoints can never complete after inconsistency

2023-07-10 Thread Alexis Sarda-Espinosa
I found out that someone else reported this and found a workaround: https://issues.apache.org/jira/browse/FLINK-32241

On Mon, 10 July 2023 at 16:45, Alexis Sarda-Espinosa <sarda.espin...@gmail.com> wrote:
> Hi again,
>
> I have found out that this issue occurred in 3 different clusters, and 2

Re: Checkpointing and savepoints can never complete after inconsistency

2023-07-10 Thread Alexis Sarda-Espinosa
Hi again, I have found out that this issue occurred in 3 different clusters, and 2 of them could not recover after restarting pods; it seems the state was completely corrupted afterwards and was thus lost. I had never seen this before 1.17.1, so it might be a newly introduced problem. Regards, Alexis

Checkpointing and savepoints can never complete after inconsistency

2023-07-10 Thread Alexis Sarda-Espinosa
Hello, we have just experienced a weird issue in one of our Flink clusters which might be difficult to reproduce, but I figured I would document it in case some of you know what could have gone wrong. This cluster had been running with Flink 1.16.1 for a long time and was recently updated to 1.17.