To publish to a remote data center, just configure the producers with the host/port of the cluster in that data center. To get good throughput over the high-latency link, you may want to tune the socket send and receive buffers on the client and server so that they are large enough to keep the link full rather than stalling on many small round trips: http://en.wikipedia.org/wiki/Bandwidth-delay_product
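As a rough sketch of the sizing arithmetic, the bandwidth-delay product is the link bandwidth multiplied by the round-trip time. The link figures below (1 Gbit/s, 100 ms) are made-up illustrative values, not measurements from any real deployment, and the function name is hypothetical:

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_sec: int, rtt_ms: int) -> int:
    """Return the number of bytes that can be in flight on a link.

    BDP = bandwidth (bits/s) * round-trip time (s), converted to bytes.
    Integer arithmetic: bits/s * ms / 1000 gives bits, / 8 gives bytes.
    """
    return bandwidth_bits_per_sec * rtt_ms // 8000


# Assumed example link between two data centers (illustrative numbers only).
link_bps = 1_000_000_000  # 1 Gbit/s
rtt_ms = 100              # 100 ms round trip

bdp = bandwidth_delay_product_bytes(link_bps, rtt_ms)
print(f"Socket buffers should be at least about {bdp} bytes")
```

Socket buffers smaller than this value mean the sender has to pause and wait for acknowledgements before the link is full. Note that the operating system may cap socket buffer sizes, so the OS limits may need raising as well.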

-Jay

On Wed, Jul 10, 2013 at 6:57 AM, Calvin Lei <ckp...@gmail.com> wrote:

> Thanks Jay. I thought of using the worldview architecture you suggested.
> But since our consumers are also globally deployed, which means any new
> messages arrive the worldview needs to be replicated back to the local DCs,
> making the topology a bit complicated.
>
> Would you please elaborate on the remote write? How do I achieve it?
>
> On Jul 10, 2013 1:08 AM, "Jay Kreps" <jay.kr...@gmail.com> wrote:
>
> > Ah, good question we really should add this to the documentation.
> >
> > We run a cluster per data center. All writes always go to the data-center
> > local cluster. Replication to aggregate clusters that provide the "world
> > wide" view is done with mirror maker.
> >
> > It is also fine to write to or read from a kafka cluster in a remote colo,
> > though obviously you have to think about the case where the cluster is not
> > accessible due to network access.
> >
> > Kafka is not designed to run a single cluster spread across geographically
> > disparate colos and you would see a few problems in that scenario. The
> > first is that, as you noted, the latency will be terrible as it will block
> > on the slowest response from all datacenters. This could be avoided if you
> > lowered the request.required.acks to 1, but that would impact durability
> > guarantees. The second problem is that Kafka will not remain available in
> > the presence of network partitions so if the inter-datacenter link failed
> > one datacenter would lose its cluster. Finally we have not done anything to
> > attempt to optimize partition placement by colo so you would not actually
> > have redundancy between colos because we would often place all replicas in
> > a single colo.
> >
> > -Jay
> >
> > On Tue, Jul 9, 2013 at 9:34 PM, Calvin Lei <ckp...@gmail.com> wrote:
> >
> > > Folks,
> > > Our application has multiple producers globally (region1, region2,
> > > region3). If we group all the brokers together into one cluster, we notice
> > > an obvious network latency if a broker replicates regionally with the
> > > request.required.acks = -1.
> > >
> > > Is there any best practice for combating the network latency in the
> > > deployment topology? Should we segregate the brokers regionally (one kafka
> > > cluster per region) and set up MirrorMaker between the regions (region1
> > > <--> region2, region2 <--> region3, region1 <--> region3), total of 6
> > > mirror makes?
> > >
> > > Thanks.