I would check the past Flink Forward conference talks and blog posts. A
couple of companies have developed connectors, or modified existing
connectors, to make this work. Usually the switch is based on event
timestamps or some external control stream (i.e. a DataStream API program
wrapped around the actual SQL pipeline to handle this).
There is also FLIP-150, which goes in this direction:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-150%3A+Introduce+Hybrid+Source
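For illustration, here is a rough sketch of how such a switch could look
with the FLIP-150 idea (a sketch only: HybridSource and the unified
file/Kafka sources below follow the proposal rather than anything released
for 1.11/1.12, and the paths, topic, servers, and timestamp are made-up
placeholders):

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.source.hybrid.HybridSource;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HybridSourceSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical point in time where the historic data ends and the
        // live Kafka topic takes over.
        long switchTimestamp = 1_610_000_000_000L;

        // Bounded source that replays the historic data first.
        FileSource<String> historic = FileSource
                .forRecordStreamFormat(
                        new TextLineInputFormat(),
                        new Path("/data/events-archive"))
                .build();

        // Unbounded source that continues from the switch timestamp onwards.
        KafkaSource<String> live = KafkaSource.<String>builder()
                .setBootstrapServers("kafka:9092")
                .setTopics("events")
                .setStartingOffsets(OffsetsInitializer.timestamp(switchTimestamp))
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();

        // FLIP-150: one logical source that reads the files to completion
        // and then switches over to Kafka.
        HybridSource<String> hybrid = HybridSource.builder(historic)
                .addSource(live)
                .build();

        env.fromSource(hybrid, WatermarkStrategy.noWatermarks(), "hybrid-source")
                .print();

        env.execute("warm up from history, then go live");
    }
}

The resulting DataStream could then be bridged into the SQL pipeline via
StreamTableEnvironment#fromDataStream.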
Regards,
Timo
On 18.01.21 10:40, Dan Hill wrote:
Thanks Timo!
The reason makes sense.
Do any of the techniques make it easy to support exactly-once?
I'm inferring what is meant by "dry out". Are there any documented
patterns for it? E.g. sending data to new Kafka topics between releases?
On Mon, Jan 18, 2021, 01:04 Timo Walther <twal...@apache.org> wrote:
Hi Dan,
currently, we cannot provide any savepoint guarantees between releases.
Because SQL abstracts away the runtime operators, a future execution plan
might look completely different, and then we can no longer map the old
state onto it. This is unavoidable because the optimizer may get smarter
as new optimizer rules are added.
For such cases, we recommend draining ("drying out") the old pipeline
and/or warming up a new pipeline with historic data when upgrading Flink.
A change in columns sometimes works, but even this depends on the
operators used.
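As a small illustration of why the plan matters more than the SQL text,
you can compare the optimized plans of the old and the new query with
EXPLAIN before deciding whether a restore has any chance. A minimal
sketch (the table, the datagen connector, and the two queries are just
placeholders I made up):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class PlanComparisonSketch {

    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Placeholder table; in a real job this would be your Kafka-backed table.
        tEnv.executeSql(
                "CREATE TABLE clicks (" +
                "  user_id STRING," +
                "  url STRING," +
                "  ts TIMESTAMP(3)" +
                ") WITH (" +
                "  'connector' = 'datagen'" +
                ")");

        // Version 1 of the pipeline: a plain aggregation.
        String v1 =
                "SELECT user_id, COUNT(*) AS clicks FROM clicks GROUP BY user_id";

        // Version 2: one additional aggregate column. The SQL change looks
        // small, but the optimizer may choose a different operator layout, so
        // the old savepoint state may no longer map onto the new plan.
        String v2 =
                "SELECT user_id, COUNT(*) AS clicks, COUNT(DISTINCT url) AS urls "
                        + "FROM clicks GROUP BY user_id";

        System.out.println(tEnv.explainSql(v1));
        System.out.println(tEnv.explainSql(v2));
    }
}

If the two plans differ structurally, the old savepoint is unlikely to
restore cleanly into the new job.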
Regards,
Timo
On 18.01.21 04:46, Dan Hill wrote:
> How well does Flink SQL work with checkpoints and savepoints? I tried
> to find documentation for it in v1.11 but couldn't find it.
>
> E.g. what happens if the Flink SQL is modified between releases? New
> columns? Changed columns? Adding joins?