I don't think replication is ideal for creating single clusters spanning DCs, for at least a couple of reasons. First, the replica assignment strategy is currently not rack- or DC-aware, although that can be addressed by manually creating topics with an explicit replica assignment. Second, network glitches and latencies, which are more likely over a cross-DC link, could result in more frequent and prolonged periods of under-replication, higher latencies in the controller-broker RPCs, etc. A better approach would be to set up a mirror cluster - see https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring+%28MirrorMaker%29 (needs a few updates for 0.8).
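To make the mirroring suggestion concrete, a MirrorMaker setup for 0.8 might look roughly like the sketch below. It assumes a source cluster in one DC and a mirror cluster in another; all host names, ports, and file names are hypothetical placeholders, not something from this thread:

```shell
# Sketch only - hosts, ports, and property file names are assumptions.

# consumer.properties: MirrorMaker consumes from the *source* DC's cluster
#   zookeeper.connect=zk.dc1.example.com:2181
#   group.id=dc1-to-dc2-mirror

# producer.properties: and produces to the *target* (mirror) DC's brokers
#   metadata.broker.list=broker1.dc2.example.com:9092,broker2.dc2.example.com:9092

# Run MirrorMaker, mirroring all topics:
bin/kafka-run-class.sh kafka.tools.MirrorMaker \
  --consumer.config consumer.properties \
  --producer.config producer.properties \
  --whitelist='.*'
```

It generally makes sense to run MirrorMaker in (or near) the target DC, so the potentially flaky cross-DC hop is on the consuming side; on failover, your consumers would switch to reading from the mirror cluster.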
Also, for your other question on the producer: a producer can only deliver messages to the leader of a partition.

Thanks,

Joel

On Sun, Jun 23, 2013 at 6:10 PM, Mark Farnan <devm...@petrolink.com> wrote:
> Howdy,
>
> Is the replication factor system in Kafka 0.8 suitable for creating
> single clusters which span across data centers? (up to 3)
>
> I am looking for a system where I don't lose messages, and can
> effectively 'fail over' to a different datacenter for processing if/when
> the primary goes down. If I read correctly, any message delivered to a
> Kafka broker will be copied to its replica(s). And a producer could
> deliver messages to any broker in the same replica set.
>
> Is that correct?
>
> I am aware there are several zookeeper issues around multi-DC support
> which I need to sort out, so this question is specific to the Kafka
> portion.
>
> Note: My main consumer from Kafka will be STORM.
>
> Regards,
>
> Mark.