Not sure if transactional messaging will help in this case, as at least for now it is still targeted within a single DC, i.e. a "transaction" is only defined within a Kafka cluster, not across clusters.
Guozhang

On Fri, Mar 20, 2015 at 10:08 AM, Jon Bringhurst <jbringhu...@linkedin.com.invalid> wrote:

> Hey Kane,
>
> When mirrormakers lose offsets on catastrophic failure, you generally
> have two options. You can keep auto.offset.reset set to "latest" and
> handle the loss of messages, or you can have it set to "earliest" and
> handle the duplication of messages.
>
> Although we try to avoid duplicate messages overall, when failure happens,
> we (mostly) take the "earliest" path and deal with the duplication of
> messages.
>
> If your application doesn't treat messages as idempotent, you might be
> able to get away with something like couchbase or memcached with a TTL
> slightly higher than your Kafka retention time and use that to filter
> duplicates. Another pattern may be to deduplicate messages in Hadoop
> before taking action on them.
>
> -Jon
>
> P.S. An option in the future might be
> https://cwiki.apache.org/confluence/display/KAFKA/Transactional+Messaging+in+Kafka
>
> On Mar 19, 2015, at 5:32 PM, Kane Kim <kane.ist...@gmail.com> wrote:
>
> > Hello,
> >
> > What's the best strategy for failover when using mirror-maker to
> > replicate across datacenters? As I understand it, offsets in both
> > datacenters will be different, so how should consumers be reconfigured
> > to continue reading from the same point where they stopped, without
> > data loss and/or duplication?
> >
> > Thanks.

--
Guozhang
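The TTL-based duplicate filter Jon describes can be sketched as follows. This is only an illustration: it uses a small in-memory class standing in for couchbase/memcached, and the class name, message keys, and TTL value are all hypothetical, not anything from Kafka or MirrorMaker itself.

```python
import time

class TTLDedup:
    """In-memory stand-in for a TTL store (couchbase/memcached in practice),
    with the TTL set slightly longer than the Kafka retention period so a
    replayed message is always still remembered as already seen."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.seen = {}  # message key -> expiry timestamp

    def is_duplicate(self, key):
        now = time.time()
        # Evict expired entries so the store does not grow without bound.
        self.seen = {k: exp for k, exp in self.seen.items() if exp > now}
        if key in self.seen:
            return True
        self.seen[key] = now + self.ttl
        return False

# After a failover with auto.offset.reset=earliest, "msg-1" is replayed.
dedup = TTLDedup(ttl_seconds=8 * 24 * 3600)  # retention (e.g. 7 days) + margin
stream = ["msg-1", "msg-2", "msg-1"]
delivered = [m for m in stream if not dedup.is_duplicate(m)]
print(delivered)  # ['msg-1', 'msg-2']
```

The key requirement is that messages carry a stable, unique key (or one derivable from their content) so replays map to the same entry in the store.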