Purging can never guarantee that data is not replicated back. There will always be some case (for example, the purge fails) in which the data is still replicated. You can reduce the probability, but you can never make it impossible.
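Since duplicates cannot be ruled out, the consuming side can be made tolerant of them. Below is a minimal sketch of one way to do that, not anything Kafka or MirrorMaker provides. It assumes every message carries a unique ID (for example a producer-assigned key or header), which the original setup may or may not have. The names Deduplicator, is_new and process are hypothetical, and the seen-ID store is in memory only, so a real deployment would probably need a persistent or shared store.

```python
from collections import OrderedDict


class Deduplicator:
    """Remembers recently seen message IDs, bounded in size.

    Hypothetical helper: assumes each message has a unique ID.
    """

    def __init__(self, max_ids=100_000):
        self.max_ids = max_ids
        self._seen = OrderedDict()  # message_id -> True, oldest first

    def is_new(self, message_id):
        """Return True the first time an ID is seen, False for a repeat."""
        if message_id in self._seen:
            # Refresh the entry so recently repeated IDs are evicted last.
            self._seen.move_to_end(message_id)
            return False
        self._seen[message_id] = True
        if len(self._seen) > self.max_ids:
            # Drop the least recently seen ID to keep memory bounded.
            self._seen.popitem(last=False)
        return True


def process(messages, dedup, handler):
    """Call handler(payload) once per unique ID; return how many were handled.

    messages is an iterable of (message_id, payload) pairs.
    """
    handled = 0
    for message_id, payload in messages:
        if dedup.is_new(message_id):
            handler(payload)
            handled += 1
    return handled
```

With this in place, a record that arrives a second time through reverse replication is skipped as long as its ID is still within the remembered window. IDs older than the window are forgotten, so max_ids has to be sized for the longest delay after which a duplicate can still show up.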
Your application should be able to handle duplicate messages.

> On 25. May 2018, at 08:54, Shantanu Deshmukh <shantanu...@gmail.com> wrote:
>
> Hello,
>
> We have cross data center replication. Using Kafka mirror maker we are
> replicating data from our primary cluster to our backup cluster. The problem
> arises when we start operating from the backup cluster, in case of a drill or
> an actual outage. Data gathered at the backup cluster needs to be
> reverse-replicated to primary. To do that I can only think of two options:
> 1) Use a different CG every time for mirror maker
> 2) Purge topics so that data sent by primary doesn't get replicated back to
> primary again due to reverse replication.
>
> We have opted for purging the Kafka topics which are under replication. I use
> the kafka-topics.sh --alter command to set the retention of the topic to
> 5 seconds to purge data. But this doesn't seem to be a foolproof mechanism.
> The thread responsible for this runs every minute, and even when it runs it's
> not sure to work, as there are multiple conditions: the segment should be
> full, or a certain time should have passed to roll a new segment. It so
> happened that during one such drill to move to the backup cluster, the purge
> command was issued and we waited for 5 minutes. Still the data wasn't purged.
> Due to this we faced data duplication when reverse replication started.
>
> Is there a better way to achieve this?