Hi Jason,

you got it perfectly right. Everything that is not in explicit state (or checkpointed in CheckpointedFunction#snapshotState) is lost on recovery. However, Flink applications always go through the complete life-cycle, so open() is called again on the restarted operators when the job restores from a checkpoint or savepoint.
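To make the recovery behaviour concrete, here is a toy model in plain Java. It is not the Flink API and does not depend on Flink: `ToyOperator`, `loadSideInfo`, and `recover` are hypothetical names invented for illustration, and the S3 fetch is replaced by a hard-coded map. The sketch only mirrors the ordering described above, under the assumption that explicit state is restored first and open() runs afterwards on a fresh instance.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, NOT the Flink API: shows which fields survive a restore. */
final class LifecycleSketch {

    /** Hypothetical stand-in for a rich function with one piece of explicit state. */
    static final class ToyOperator {
        // Plain in-memory data: never part of a checkpoint.
        Map<String, String> sideInfo;
        // Explicit state: the only thing the toy checkpoint captures.
        long processed;

        // Mirrors open(): runs on every start and every restart.
        void open() {
            sideInfo = loadSideInfo();
        }

        // Processes one record using the in-memory side information.
        String map(String key) {
            processed++;
            return key + "=" + sideInfo.getOrDefault(key, "?");
        }

        // Mirrors CheckpointedFunction#snapshotState: copies out explicit state only.
        long snapshotState() {
            return processed;
        }

        // Mirrors state initialization: puts the checkpointed value back.
        void initializeState(long restored) {
            processed = restored;
        }
    }

    // Hypothetical replacement for the S3 fetch; returns a fresh mutable map.
    static Map<String, String> loadSideInfo() {
        Map<String, String> data = new HashMap<>();
        data.put("a", "1");
        data.put("b", "2");
        return data;
    }

    // Simulates recovery: a new instance, state restored, then open() again.
    static ToyOperator recover(long checkpoint) {
        ToyOperator fresh = new ToyOperator();
        fresh.initializeState(checkpoint);
        fresh.open();
        return fresh;
    }

    public static void main(String[] args) {
        ToyOperator op = new ToyOperator();
        op.initializeState(0L);
        op.open();
        op.map("a");
        op.map("b");
        long checkpoint = op.snapshotState();

        // An in-memory change made outside explicit state.
        op.sideInfo.put("c", "3");

        ToyOperator restored = recover(checkpoint);
        System.out.println(restored.processed);                 // prints 2
        System.out.println(restored.sideInfo.containsKey("c")); // prints false
    }
}
```

The counter comes back because it went through the snapshot. The extra map entry does not, because the map is rebuilt from scratch in open().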
Note that I'd look into CheckpointedFunction if the side-information that you fetch from S3 is not changing and rather small.

Best,

Arvid

On Tue, Feb 2, 2021 at 5:42 AM Raghavendar T S <raghav280...@gmail.com> wrote:

> Flink is aware of all the tasks running in the cluster. If any of the
> tasks fails, the failed task is restored using the checkpoint (only if the
> task uses Flink Operator State). This scenario will not use savepoints.
> Savepoints are the same as checkpoints; the difference is that
> savepoints are created manually or when we manually cancel/stop a job. We
> can then start the same job again by pointing to the savepoint. If we start
> a job without a savepoint, the job will start with an empty operator state.
>
> Correct me if I am wrong.
>
> Other references:
>
> https://stackoverflow.com/questions/62935269/apache-flink-how-checkpoint-savepoint-works-if-we-run-duplicate-jobs-multi-te
>
> https://stackoverflow.com/questions/64605940/apache-flink-fsstatebackend-how-state-is-recovered-in-case-of-ta+sk-manager-f
>
> https://stackoverflow.com/questions/55613112/is-it-possible-to-recover-after-losing-the-checkpoint-coordinator/55615858#55615858
>
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#retained-checkpoints
>
> Thank you
>
> On Tue, Feb 2, 2021 at 4:07 AM Jason Liu <jasonli...@ucla.edu> wrote:
>
>> We currently have some logic to load data from S3 into memory in our
>> Flink/Kinesis Analytics app. This happens before the RichFunction.open()
>> function.
>>
>> We have two questions here and I can't find much information on the
>> apache.org website:
>>
>> 1. (More of a clarification) When Flink does checkpointing/savepointing,
>>    only the state and the stream positions are saved, right? The memory
>>    content won't be saved and restored.
>> 2. When Flink restores from a checkpoint/savepoint, does it still go
>>    through the application initialization phase? Basically, will the code
>>    before the RichFunction's *open()* be run? If not, would the
>>    operators' open() functions run when Flink restores from a
>>    checkpoint/savepoint?
>>
>> Thanks,
>> Jason
>>
>
> --
> Raghavendar T S
> www.teknosrc.com