Hello Hemanga,

MirrorMaker can cause havoc in many respects; for one, it does not have strict exactly-once semantics …
The way I would tackle this problem (and have done in similar situations):

* For the source topics that need to have exactly-once semantics and that are not intrinsically idempotent:
  * Add one extra operator after the source that deduplicates events by unique id over a rolling time range (on the source cloud provider) (a sketch follows at the end of this message)
  * Take a savepoint after the rolling time range has passed (at least once completely)
  * Move your job to the target cloud provider
  * Reconfigure the respective source with a new Kafka consumer group.id
  * Change the uid() of the respective Kafka source
  * Configure start-by-timestamp for the respective source, with a timestamp that lies within the rolling time range from above (see the source sketch at the end of this message)
  * Configure the job to ignore recovery for state that no longer has a corresponding operator in the job (the previous Kafka source uid()s) (see the restore command at the end of this message)
  * Start the job on the new cloud provider and wait for it to pick up/back-fill
  * Take a savepoint
  * Remove the deduplication operator if it causes too much load/latency/whatever

This scheme sounds more complicated than it really is … and has saved my sanity quite a number of times 😊

Good luck, and I am ready to answer more details.

Thias

From: Hemanga Borah <borah.hema...@gmail.com>
Sent: Tuesday, May 3, 2022 3:12 AM
To: user@flink.apache.org
Subject: Migrating Flink apps across cloud with state

Hello,

We are attempting to port our Flink applications from one cloud provider to another. These Flink applications consume data from Kafka topics and output to various destinations (Kafka or databases). The applications have state stored in them. Some of this stored state consists of aggregations; for example, at times we store hours (or days) worth of data to aggregate over time. Other applications cache information for data enrichment; for example, we keep data in Flink state for days so that we can join it with newly arrived data.

The amount of data on the input topics is large, and it would be expensive to reprocess the data from the beginning of the topic. As such, we want to retain the state of the applications when we move to a different cloud provider, so that we keep the aggregations and caches and do not have to start from the beginning of the input topics.

We are replicating the Kafka topics using MirrorMaker 2. This is our procedure:

* Replicate the input topics of each Flink application from the source cloud to the destination cloud.
* Take a savepoint of the Flink application on the source cloud provider.
* Start the Flink application on the destination cloud provider using the savepoint from the source cloud provider.

However, this does not work as intended, because the offsets in the new topics on the new cloud provider differ from the originals (a consequence of the MirrorMaker implementation). The offsets of the new topics do not match the ones stored in the Flink savepoint; hence, Flink cannot map to the offsets of the new topics during startup.

Has anyone tried to move clouds while retaining the Flink state?

Thanks,
Hemanga
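PS: a few sketches to make the steps above concrete. First, the deduplication operator: a minimal, untested sketch that assumes events carry a unique id and that event-time timestamps are assigned upstream; Event, getId() and the 24h window are hypothetical placeholders for your own types and settings.

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Keyed by event id; forwards the first occurrence of each id and drops
// duplicates until the rolling time range has passed.
public class DeduplicateById extends KeyedProcessFunction<String, Event, Event> {

    // rolling time range over which duplicates are suppressed (placeholder: 24h)
    private static final long DEDUP_WINDOW_MS = 24 * 60 * 60 * 1000L;

    private transient ValueState<Boolean> seen;

    @Override
    public void open(Configuration parameters) {
        seen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("seen", Boolean.class));
    }

    @Override
    public void processElement(Event event, Context ctx, Collector<Event> out) throws Exception {
        if (seen.value() == null) {
            seen.update(true);
            out.collect(event); // first occurrence: forward downstream
            // expire the dedup entry once the rolling time range has passed
            ctx.timerService().registerEventTimeTimer(ctx.timestamp() + DEDUP_WINDOW_MS);
        }
        // duplicates arriving within the window are silently dropped
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Event> out) {
        seen.clear(); // free the state so it does not grow without bound
    }
}

Wired in right after the source, with a stable uid() so its state survives the savepoint/restore:

stream.keyBy(Event::getId)
      .process(new DeduplicateById())
      .uid("dedup-by-id");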
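Second, the reconfigured source on the target cloud: a sketch assuming the KafkaSource API (Flink 1.14+); the broker address, topic, group.id, uid() and timestamp are placeholders you would adapt to your setup.

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

// a timestamp that lies within the dedup rolling time range, e.g. shortly
// before the savepoint was taken (placeholder value)
long startTimestampMs = 1651536000000L;

KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("kafka.target-cloud:9092")   // new cluster
        .setTopics("input-topic")
        .setGroupId("my-app-target-v2")                   // NEW consumer group.id
        .setStartingOffsets(OffsetsInitializer.timestamp(startTimestampMs))
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();

// env is the StreamExecutionEnvironment; the NEW uid() ensures the old
// source state in the savepoint is not matched to this operator
env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
   .uid("kafka-source-target-v2");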
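Finally, to make the job ignore the state of the old Kafka source uid()s, allow non-restored state when submitting from the savepoint, e.g. on the CLI (path and jar are placeholders):

flink run -s <savepointPath> --allowNonRestoredState your-job.jar

Without that flag the restore fails, because the savepoint contains state for the old source uid() that no operator in the new job claims.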