Hi Everyone,

Since Kafka 0.11.x supports exactly-once semantics, I would like to know
whether it is possible to achieve exactly-once delivery across Kafka
clusters using MirrorMaker.

We have two locations, each with its own "primary" cluster, and each
location also has an "aggregation" cluster which mirrors data from all of
the primary clusters.

Currently we deduplicate messages with a separate YARN application after
copying the data from the aggregation Kafka to HDFS, but the duplicates
remain in the aggregation cluster itself. So I want to ensure that there
are no duplicates and no data loss within Kafka as well; in that case our
deduplication YARN application would no longer be needed.

If it is possible, how should MirrorMaker be configured to achieve
exactly-once delivery between the primary and aggregation clusters?
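To make the question concrete, this is the kind of producer config I had in
mind for MirrorMaker (just a sketch; these are standard 0.11 producer
properties that I assume MirrorMaker passes through via --producer.config,
and I am not sure they are sufficient for end-to-end exactly-once):

```properties
# producer.properties, passed to MirrorMaker with --producer.config
# (bootstrap.servers points at our aggregation cluster; hostname is an example)
bootstrap.servers=aggregation-kafka:9092

# Idempotent producer settings introduced in 0.11 -- these guard against
# duplicates from producer retries, but I do not know whether the
# consume-then-produce handoff inside MirrorMaker is covered as well.
enable.idempotence=true
acks=all
max.in.flight.requests.per.connection=1
```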


Thanks and have a nice day, Jiri Humpolicek 
