Not sure if transactional messaging will help in this case, as at least for now it is still targeted within a single DC, i.e. a "transaction" is only defined within a Kafka cluster, not across clusters.
Guozhang

On Fri, Mar 20, 2015 at 10:08 AM, Jon Bringhurst <jbringhu...@linkedin.com.invalid> wrote:

> Hey Kane,
>
> When mirrormakers lose offsets on catastrophic failure, you generally
> have two options. You can keep auto.offset.reset set to "latest" and
> handle the loss of messages, or you can have it set to "earliest" and
> handle the duplication of messages.
>
> Although we try to avoid duplicate messages overall, when failure happens,
> we (mostly) take the "earliest" path and deal with the duplication of
> messages.
>
> If your application doesn't treat messages as idempotent, you might be
> able to get away with something like couchbase or memcached with a TTL
> slightly higher than your Kafka retention time and use that to filter
> duplicates. Another pattern may be to deduplicate messages in Hadoop
> before taking action on them.
>
> -Jon
>
> P.S. An option in the future might be
> https://cwiki.apache.org/confluence/display/KAFKA/Transactional+Messaging+in+Kafka
>
> On Mar 19, 2015, at 5:32 PM, Kane Kim <kane.ist...@gmail.com> wrote:
>
> > Hello,
> >
> > What's the best strategy for failover when using mirror-maker to
> > replicate across datacenters? As I understand it, offsets in both
> > datacenters will be different, so how should consumers be reconfigured
> > to continue reading from the same point where they stopped, without
> > data loss and/or duplication?
> >
> > Thanks.

--
Guozhang
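The TTL-based duplicate filter Jon describes can be sketched as follows. This is only an illustration: it uses a small in-memory class standing in for couchbase/memcached, and the class name, message keys, and TTL value are all hypothetical, not anything from Kafka or MirrorMaker itself.

```python
import time

class TTLDedup:
    """In-memory stand-in for a TTL store (couchbase/memcached in practice),
    with the TTL set slightly longer than the Kafka retention period so a
    replayed message is always still remembered as already seen."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.seen = {}  # message key -> expiry timestamp

    def is_duplicate(self, key):
        now = time.time()
        # Evict expired entries so the store does not grow without bound.
        self.seen = {k: exp for k, exp in self.seen.items() if exp > now}
        if key in self.seen:
            return True
        self.seen[key] = now + self.ttl
        return False

# After a failover with auto.offset.reset=earliest, "msg-1" is replayed.
dedup = TTLDedup(ttl_seconds=8 * 24 * 3600)  # retention (e.g. 7 days) + margin
stream = ["msg-1", "msg-2", "msg-1"]
delivered = [m for m in stream if not dedup.is_duplicate(m)]
print(delivered)  # ['msg-1', 'msg-2']
```

The key requirement is that messages carry a stable, unique key (or one derivable from their content) so replays map to the same entry in the store.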