Avoid duplicate messages while restarting a job for an application upgrade

Antoine Philippot Mon, 02 Oct 2017 06:35:57 -0700

Hi,

I'm working on a flink streaming app with a kafka09 to kafka09 use case
which handles around 100k messages per seconds.


To upgrade our application we used to run a flink cancel with savepoint
command followed by a flink run with the previous saved savepoint and the
new application fat jar as parameter. We notice that we can have more than
50k of duplicated messages in the kafka sink wich is not idempotent.

This behaviour is actually problematic for this project and I try to find a
solution / workaround to avoid these duplicated messages.

The JobManager indicates clearly that the cancel call is triggered once the
savepoint is finished, but during the savepoint execution, kafka source
continue to poll new messages which will not be part of the savepoint and
will be replayed on the next application start.

I try to find a solution with the stop command line argument but the kafka
source doesn't implement StoppableFunction (
https://issues.apache.org/jira/browse/FLINK-3404) and the savepoint
generation is not available with stop in contrary to cancel.

Is there an other solution to not process duplicated messages for each
application upgrade or rescaling ?

If no, has someone planned to implement it? Otherwise, I can propose a pull
request after some architecture advices.

The final goal is to stop polling source and trigger a savepoint once
polling stopped.

Thanks

Avoid duplicate messages while restarting a job for an application upgrade

Reply via email to