We do something like A (though I'm not sure I understand B): http://kafka.apache.org/documentation.html#datacenters
Essentially, what we wanted was for each datacenter to stand alone, so that we would not lose data if the datacenters became disconnected. Network partitions within our datacenters are extremely rare, but partitions between datacenters are relatively common.

-Jay

On Tue, Aug 20, 2013 at 10:35 AM, Andrew Otto <o...@wikimedia.org> wrote:
> Hi all!
>
> Wikimedia is investigating how best to set up Broker clusters in multiple
> data centers. Our main analytics Broker cluster is currently in our main
> datacenter. It is possible for all of the main DC's frontend producers to
> produce directly to our analytics cluster, but we're not sure if this is a
> best practice. So! What does LinkedIn recommend?
>
> Option A: N + 1 clusters.
> - N production Broker Clusters (1 for each DC).
> - +1 aggregator/analytics Broker cluster that mirrors all of the
>   production clusters.
>
> Option B: N total Broker clusters.
> - Frontend producers in the main cluster produce directly to the
>   aggregator/analytics cluster.
> - Other DC's clusters are mirrored to the aggregator/analytics cluster.
>
> Thanks!
> -Andrew