Ah, good question; we really should add this to the documentation. We run one cluster per data center, and all writes go to the cluster local to that data center. Replication to the aggregate clusters that provide the "world wide" view is done with mirror maker.
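
To make the difference in layout concrete, here is a rough Python sketch that just counts mirror maker pipelines for three regions. It is not Kafka code: the function names and the "-local" / "-aggregate" labels are made up for illustration, and how many aggregate clusters you run (one, or one per region) is an assumption you would choose for yourself.

```python
from itertools import permutations


def full_mesh_flows(regions):
    """Pairwise layout: every regional cluster mirrors directly to every other one."""
    return list(permutations(regions, 2))


def aggregate_flows(regions, aggregate_sites):
    """Local/aggregate layout: each local cluster is mirrored into each aggregate cluster.

    Producers write only to the "-local" clusters; the "-aggregate" clusters
    are fed only by mirroring. Both labels are hypothetical names.
    """
    return [
        (f"{region}-local", f"{site}-aggregate")
        for site in aggregate_sites
        for region in regions
    ]


regions = ["region1", "region2", "region3"]

mesh = full_mesh_flows(regions)              # region1 <--> region2, etc.
agg_one = aggregate_flows(regions, ["region1"])  # a single aggregate cluster
agg_all = aggregate_flows(regions, regions)      # one aggregate cluster per region

print(len(mesh), len(agg_one), len(agg_all))  # 6 3 9
```

In this sketch no flow ever ends at a "-local" cluster, so mirrored data never lands back in a cluster that producers write to. The pairwise mesh has no such separation: each cluster is both a source and a destination.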
It is also fine to write to or read from a kafka cluster in a remote colo, though obviously you have to think about the case where the cluster is not reachable because of network problems.

Kafka is not designed to run a single cluster spread across geographically disparate colos, and you would see a few problems in that scenario. The first is that, as you noted, the latency will be terrible, as it will block on the slowest response from all datacenters. This could be avoided if you lowered request.required.acks to 1, but that would impact durability guarantees. The second problem is that Kafka will not remain available in the presence of network partitions, so if the inter-datacenter link failed, one datacenter would lose its cluster. Finally, we have not done anything to attempt to optimize partition placement by colo, so you would not actually have redundancy between colos, because we would often place all replicas in a single colo.

-Jay

On Tue, Jul 9, 2013 at 9:34 PM, Calvin Lei <ckp...@gmail.com> wrote:

> Folks,
> Our application has multiple producers globally (region1, region2,
> region3). If we group all the brokers together into one cluster, we notice
> an obvious network latency if a broker replicates regionally with the
> request.required.acks = -1.
>
> Is there any best practice for combating the network latency in the
> deployment topology? Should we segregate the brokers regionally (one kafka
> cluster per region) and set up MirrorMaker between the regions (region1
> <--> region2, region2 <--> region3, region1 <--> region3), total of 6
> mirror makes?
>
>
> Thanks.
>