Any ideas, guys?

On Mon, May 2, 2022 at 6:11 PM Hemanga Borah <borah.hema...@gmail.com>
wrote:

> Hello,
>  We are attempting to port our Flink applications from one cloud provider
> to another.
>
>  These Flink applications consume data from Kafka topics and write to
> various destinations (Kafka or databases). The applications are stateful.
> Some of the state holds aggregations: at times we store hours (or days)
> worth of data to aggregate over time. Other applications cache data for
> enrichment: we keep data in Flink state for days so that we can join it
> with newly arrived data. The input topics hold a large volume of data,
> so reprocessing them from the beginning of each topic would be
> expensive.
>
>  As such, we want to retain the state of the application when we move to a
> different cloud provider so that we can retain the aggregations and cache,
> and do not have to start from the beginning of the input topics.
>
>  We are replicating the Kafka topics using MirrorMaker 2. This is our
> procedure:
>
>    - Replicate the input topics of each Flink application from source
>    cloud to destination cloud.
>    - Take a savepoint of the Flink application on the source cloud
>    provider.
>    - Start the Flink application on the destination cloud provider using
>    the savepoint from the source cloud provider.
>
>
> However, this does not work as we want, because the offsets in the
> replicated topics on the new cloud provider differ from those in the
> source topics (because of how MirrorMaker is implemented). The offsets
> of the new topics do not match the ones stored in the Flink savepoint,
> so Flink cannot map its saved offsets to the new topics during startup.
>
> Has anyone tried to move clouds while retaining the Flink state?
>
> Thanks,
> Hemanga
>
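
To make the offset mismatch concrete, here is a minimal Python sketch of
the kind of translation that would be needed. It assumes we can somehow
obtain, per partition, a sorted list of (source offset, destination offset)
pairs that are known to point at the same record. MirrorMaker 2 keeps
offset-sync information of roughly this shape, but reading it and applying
the result to a Flink job are not shown here. The function name and the
sample numbers are made up for illustration.

```python
from bisect import bisect_right

def translate_offset(sync_points, source_offset):
    """Map a source-cluster offset to a safe destination-cluster offset.

    sync_points: list of (source_offset, destination_offset) pairs, sorted by
    source offset, each pair identifying the same record on both clusters
    (hypothetical input, see the text above).

    Returns the destination offset of the latest sync point at or before
    source_offset, or None if there is none. Resuming from the returned
    offset may re-read some records but should not skip any.
    """
    source_side = [src for src, _ in sync_points]
    # Index of the last sync point whose source offset is <= source_offset.
    i = bisect_right(source_side, source_offset) - 1
    if i < 0:
        return None
    return sync_points[i][1]

# Made-up example: the mirrored topic has smaller offsets than the source.
sync_points = [(100, 40), (250, 190), (400, 338)]
saved_offset = 300  # offset stored in the savepoint (illustrative)
print(translate_offset(sync_points, saved_offset))
```

The sketch deliberately rounds down to the nearest known sync point rather
than extrapolating, so it trades some reprocessing for not losing records.
Since Flink restores Kafka offsets from the savepoint itself, the translated
offsets would still have to be applied to the job in some way, which this
sketch does not address.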
