To make better use of the long-RTT link across data centers, you can
increase the TCP window size by configuring a larger
socket.receive.buffer.bytes in the consumer.
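For example, with the current high-level consumer (the numbers here are
only illustrative; the buffer needs to cover the link's bandwidth-delay
product, e.g. 100 ms RTT x 1 Gbps is roughly 12.5 MB in flight, and an OS
cap such as net.core.rmem_max has to be raised to match):

    import java.util.Properties;
    import kafka.consumer.ConsumerConfig;

    public class MirrorConsumerConfig {
        public static ConsumerConfig largeWindowConfig() {
            Properties props = new Properties();
            props.put("zookeeper.connect", "remote-zk:2181"); // placeholder address
            props.put("group.id", "mirror-maker-group");      // placeholder group
            // The default is 64KB; a larger socket buffer lets the TCP window
            // grow enough to keep a single long-haul connection full.
            props.put("socket.receive.buffer.bytes",
                      Integer.toString(16 * 1024 * 1024));
            return new ConsumerConfig(props);
        }
    }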
For the last one, it seems that you want identical mirroring. The tricky
part is figuring out how to avoid duplicates when there is a failure. We
had some related discussion in the context of transactional messaging (
https://cwiki.apache.org/confluence/display/KAFKA/Transactional+Messaging+in+Kafka
).

Thanks,

Jun

On Tue, Mar 24, 2015 at 11:44 AM, vlad...@gmail.com <vlad...@gmail.com>
wrote:

> Dear all,
>
> I had a short discussion with Jay yesterday at the ACM meetup and he
> suggested writing an email regarding a few possible MirrorMaker
> improvements.
>
> At Turn, we have been using MirrorMaker for a few months now to
> asynchronously replicate our key/value store data between our
> datacenters. In a way, our system is similar to LinkedIn's Databus, but
> it uses Kafka clusters and MirrorMaker as its building blocks. Our
> overall message rate peaks at about 650K messages/sec and, when pushing
> data over links with a high bandwidth-delay product, we have found some
> minor bottlenecks.
>
> The MirrorMaker process uses a standard consumer to pull data from a
> remote datacenter. This implies that it opens a single TCP connection to
> each of the remote brokers and muxes requests for different topics and
> partitions over this connection. While this helps keep the congestion
> window open, over long-RTT lines with a rather high loss rate the
> congestion window will cap out, in our case at just a few Mbps. Since the
> overall line bandwidth is much higher, we have to start multiple
> MirrorMaker processes (somewhere in the hundreds) to use the line
> capacity fully. Being able to pool multiple TCP connections from a single
> consumer to a broker would remove this complication.
>
> The standard consumer also uses the remote ZooKeeper to manage the
> consumer group. While consumer group management is moving closer to the
> brokers, it might make sense to move the group management to the local
> datacenter, since that would keep this traffic off the long-distance
> connection.
>
> Another possible improvement assumes a further constraint, namely that a
> topic has the same number of partitions in both datacenters. In my
> opinion, this is a sane constraint, since it preserves Kafka's
> per-partition ordering guarantee instead of a weaker per-key guarantee.
> Such a guarantee can be useful, for example, in a system that compares
> partition contents to reach eventual consistency using Merkle trees. If
> the number of partitions is equal, then an offset has the same meaning
> for the same partition in both clusters, since the data in both
> partitions is identical up to that offset. This allows a simple consumer
> to query the local broker and the remote broker for their current offsets
> and, if the remote broker is ahead, copy the extra data to the local
> cluster. Since the consumer offsets are no longer bound to the specific
> partitioning of a single remote cluster, the consumer could pull from any
> number of remote clusters, BitTorrent-style, whenever their offsets are
> ahead of the local offset. The group management problem would then reduce
> to assigning local partitions to different MirrorMaker processes, so
> group management could also be done locally.
>
> Regards,
> Vlad
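For what it's worth, here is a rough sketch of the offset-comparison idea
from the last point, using the 0.8 SimpleConsumer API. The broker hosts,
topic, and partition are placeholders, leader lookup via metadata requests
is skipped, and the actual fetch-and-produce copy step is left out:

    import java.util.HashMap;
    import java.util.Map;

    import kafka.api.PartitionOffsetRequestInfo;
    import kafka.common.TopicAndPartition;
    import kafka.javaapi.OffsetRequest;
    import kafka.javaapi.OffsetResponse;
    import kafka.javaapi.consumer.SimpleConsumer;

    public class OffsetDelta {
        private static final String CLIENT = "offset-check";

        // Ask one broker for the log-end offset of one partition.
        static long logEndOffset(SimpleConsumer consumer, String topic, int partition) {
            TopicAndPartition tp = new TopicAndPartition(topic, partition);
            Map<TopicAndPartition, PartitionOffsetRequestInfo> info =
                new HashMap<TopicAndPartition, PartitionOffsetRequestInfo>();
            info.put(tp, new PartitionOffsetRequestInfo(kafka.api.OffsetRequest.LatestTime(), 1));
            OffsetResponse resp = consumer.getOffsetsBefore(
                new OffsetRequest(info, kafka.api.OffsetRequest.CurrentVersion(), CLIENT));
            return resp.offsets(topic, partition)[0];
        }

        public static void main(String[] args) {
            SimpleConsumer local = new SimpleConsumer("local-broker", 9092, 100000, 64 * 1024, CLIENT);
            SimpleConsumer remote = new SimpleConsumer("remote-broker", 9092, 100000, 64 * 1024, CLIENT);
            try {
                long localEnd = logEndOffset(local, "events", 0);
                long remoteEnd = logEndOffset(remote, "events", 0);
                if (remoteEnd > localEnd) {
                    // With equal partition counts the logs are identical up to
                    // localEnd, so only [localEnd, remoteEnd) needs to be fetched
                    // from the remote cluster and produced locally.
                    System.out.printf("partition 0 is behind by %d messages%n",
                                      remoteEnd - localEnd);
                }
            } finally {
                local.close();
                remote.close();
            }
        }
    }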