Offset-preserving mirroring would be a great addition, allowing for offsite
backups which closely match production.  It would be much cleaner than
running rsync repeatedly.

Regarding the broader discussion of maximizing availability while
minimizing operational complexity, I've been considering the following
(please feel free to share your thoughts):

- multi-datacenter is ideal since a whole range of outage problems can
occur at rack-level or datacenter-level (power, network, natural disaster)

- consider avoiding or augmenting replication since it's intended for
same-datacenter deployment

- deploy Kafka in two datacenters with identical brokers and replication
factor 1

- producers will send to one broker; if a connection exception is thrown,
they will send to the other

- consumers will use the SimpleConsumer API and read from both broker pairs
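
To make the producer side concrete, here is a rough sketch of the
send-then-fall-back idea.  It does not use a real Kafka client: the
FailoverProducer class and the send_dc1/send_dc2 functions are hypothetical
stand-ins for whatever producer wrapper the application would have, and the
"brokers" are just in-memory lists.

```python
class FailoverProducer:
    """Try each datacenter in order; fall back on connection errors.

    Hypothetical wrapper, not part of any Kafka client library.
    """

    def __init__(self, senders):
        # senders: ordered list of (datacenter_name, send_function) pairs.
        self.senders = list(senders)

    def send(self, topic, message):
        last_error = None
        for name, send_fn in self.senders:
            try:
                send_fn(topic, message)
                return name  # report which datacenter accepted the message
            except ConnectionError as err:
                last_error = err  # remember the failure, try the next one
        raise ConnectionError("all datacenters unavailable") from last_error


# In-memory stand-ins for the two brokers (assumptions for illustration).
dc2_log = []


def send_dc1(topic, message):
    # Simulate the first datacenter being unreachable.
    raise ConnectionError("dc1 unreachable")


def send_dc2(topic, message):
    dc2_log.append((topic, message))


producer = FailoverProducer([("dc1", send_dc1), ("dc2", send_dc2)])
used = producer.send("events", b"hello")
```

One caveat with this approach: if the first send fails only after the
broker has already written the message (a timeout, say), the retry writes
it to the other datacenter as well, so consumers may need to tolerate
duplicates.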

If the application layer can deal with producing and consuming from Kafka
pairs in this way, it seems to me you get multi-region-backed availability
with fewer instances/moving parts?
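
On the consumer side, the bookkeeping I have in mind looks roughly like the
sketch below: keep a separate next-offset per (datacenter, topic, partition)
and poll both clusters.  This is not the SimpleConsumer API; the
DualClusterReader class and the make_fetcher helper are hypothetical, the
logs are in-memory lists, and it assumes offsets are sequential per
partition.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str, int]  # (datacenter, topic, partition)
Fetch = Callable[[str, int, int], List[Tuple[int, bytes]]]


class DualClusterReader:
    """Read the same topic/partition from several independent clusters.

    Hypothetical helper; each cluster keeps its own offset sequence, so
    positions are tracked per (datacenter, topic, partition).
    """

    def __init__(self, fetchers: Dict[str, Fetch]):
        self.fetchers = fetchers
        self.next_offset: Dict[Key, int] = {}

    def poll(self, topic: str, partition: int):
        out = []
        for dc, fetch in self.fetchers.items():
            key = (dc, topic, partition)
            start = self.next_offset.get(key, 0)
            for offset, payload in fetch(topic, partition, start):
                out.append((dc, offset, payload))
                # Assumes sequential offsets: resume at the next one.
                self.next_offset[key] = offset + 1
        return out


def make_fetcher(log: List[bytes]) -> Fetch:
    # In-memory stand-in for a fetch request against one cluster.
    def fetch(topic: str, partition: int, start: int):
        return [(i, m) for i, m in enumerate(log) if i >= start]
    return fetch


dc1 = [b"a", b"b"]
dc2 = [b"c"]
reader = DualClusterReader({"dc1": make_fetcher(dc1),
                            "dc2": make_fetcher(dc2)})
first = reader.poll("events", 0)
dc2.append(b"d")                   # a producer failed over to dc2
second = reader.poll("events", 0)  # only the new message comes back
```

Because the two clusters are independent, the "already processed" marker
has to include the datacenter; an offset alone is not unique across the
pair.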


Thanks!




On Tue, Jun 18, 2013 at 4:22 AM, Jun Rao <jun...@gmail.com> wrote:

> We can look into offset-preserving mirroring in the future. Note that even
> with this approach, the offsets in the target cluster will be slightly
> behind those in the source cluster since the mirroring will be async, so
> not all offsets will be preserved.
>
> Thanks,
>
> Jun
>
>
> On Sun, Jun 16, 2013 at 3:02 PM, Ran RanUser <ranuse...@gmail.com> wrote:
>
> > I've been researching Kafka for our requirements and am trying to figure
> > out the best way to implement multi-region failover (lowest complexity).
> >
> > One requirement we have is that the offsets of the backup must match the
> > primary.  As I understand it, MirrorMaker does not (currently) guarantee
> > that the target Kafka instance will have the same log offsets as the
> > source Kafka instance.  Our message processing pipeline will be strictly
> > relying on topic-broker-partition-offset to avoid re-processing messages.
> >
> > Here's what I'm leaning towards; please share any criticism or thoughts:
> >
> > Assuming:
> >
> > - Two regions, Region1 (primary) and Region2 (backup)
> >
> > - Region2 must have the same offsets per topic-broker-partition-offset
> > state
> >
> > - A few minutes of lost messages can be tolerated if Region1 is ever
> > lost.
> >
> > - That it would be a mistake to attempt Kafka replication across regions
> > and maintain a Zookeeper cluster across regions (because they weren't
> > designed for the higher latency and link-loss issues and that there could
> > be operational edge case bugs we won't catch/understand, etc)
> >
> > - That Region1 has multiple topics, brokers, partitions, replicas and a
> > Zookeeper cluster.  Only Region1 is in use operationally (gets all
> > producer and consumer traffic).
> >
> > - That Region2 has the same configuration but receives no operational
> > traffic (no producers, no consumers) but gets periodic rsync from Region1
> >
> > - If Region1 is lost, we will start Kafka in Region2; it should start up
> > at the appropriate offset (from the last rsync backup).  Producers will
> > be instructed to use Region2.
> >
> > - Region2 is now the new primary Kafka instance until we decide to switch
> > back to Region1.
> >
> > This is quite simple and there is more data loss than I'd like, but the
> > loss would be acceptable for our use case, considering the loss of
> > Region1 should be a rare event (if ever).
> >
> > Questions:
> >
> > 1. Do you see any pitfalls or better ways to proceed?  It seems this
> > Kafka feature request would be a better solution (adding a MirrorMaker
> > mode to maintain offsets https://issues.apache.org/jira/browse/KAFKA-658 )
> > one day.
> >
> > 2. What if the rsync backup is interrupted when Region1 is lost?  Is
> > there the possibility the 2nd Kafka instance could be left in an
> > unworkable state?  For example, if a .log file is copied, but the
> > corresponding .index is not completed.  Can the .index file be
> > re-created?  It appears it can in 0.8.1:
> >
> > https://issues.apache.org/jira/browse/KAFKA-561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> >
> >
> > Thank you!
> >
>
