Thanks Jun, I see your point. I will update this thread if I find anything interesting.
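
For anyone who finds this thread later, here is a minimal sketch of the larger ZK session timeout Jun suggests below. It assumes the 0.8 high-level consumer's property names (zookeeper.connect, group.id, zookeeper.session.timeout.ms, zookeeper.connection.timeout.ms). The RemoteConsumerConfig helper, the host name, the group name, and the 30-second value are all illustrative placeholders of mine, not tuned recommendations:

```java
import java.util.Properties;

// Hypothetical helper (not part of Kafka): builds properties for a consumer
// whose ZooKeeper ensemble lives in another datacenter.
class RemoteConsumerConfig {
    // Illustrative value only; pick one that comfortably exceeds the
    // cross-DC round-trip time and any expected network hiccups.
    static final String SESSION_TIMEOUT_MS = "30000";

    static Properties build(String zkConnect, String groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zkConnect);
        props.put("group.id", groupId);
        // A longer session timeout makes session expiry (and the rebalance
        // that follows it) less likely on a high-latency link.
        props.put("zookeeper.session.timeout.ms", SESSION_TIMEOUT_MS);
        // Assumption: allow the initial connection the same amount of time.
        props.put("zookeeper.connection.timeout.ms", SESSION_TIMEOUT_MS);
        return props;
    }

    public static void main(String[] args) {
        // Placeholder host and group names.
        Properties props = build("zk1.remote-dc.example.com:2181", "remote-dc-group");
        props.list(System.out);
    }
}
```

The resulting Properties would then be handed to kafka.consumer.ConsumerConfig when creating the consumer connector. The trade-off is that a longer timeout also delays detection of a consumer that has really died.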

On Tue, Mar 4, 2014 at 8:30 PM, Jun Rao <jun...@gmail.com> wrote:

> ZK observers only help the latency on the reads, but not writes. Kafka
> consumers need to write to ZK for owning partitions and committing offsets.
>
> Consuming from a remote DC is actually fine though. Kafka consumer only
> uses ZK during rebalances, which should be rare. You probably want to set a
> larger ZK session timeout. The real issue is on cross DC bandwidth. If each
> topic is only going to be consumed once, running remote consumers is fine.
> However, if a topic is going to be consumed multiple times, it would be
> better to run a MirrorMaker to move the remote data to a local Kafka
> cluster and let all consumers consume locally.
>
> Thanks,
>
> Jun
>
>
> On Tue, Mar 4, 2014 at 7:54 PM, Arup Malakar <amala...@gmail.com> wrote:
>
> > Thanks for the reply Guozhang. See inline comments:
> >
> >
> > On Tue, Mar 4, 2014 at 5:54 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > > 1) is hard to do since ZK stores both broker and consumer metadata, and
> > > consumers use ZK to also discover brokers.
> > >
> >
> > Thanks for the confirmation.
> >
> > Not sure if I understand 2),
> >
> >
> > The concern with connecting to zookeeper in a different datacenter is that
> > zookeeper is very sensitive to network latencies.
> > If heartbeats are not sent by the client within the timeout period, the
> > session expires and the client has to reconnect. This was my main
> > concern about consuming from a kafka cluster across data centers. Zookeeper
> > has a notion of observer node, which acts as a bridge between datacenters
> > for zookeeper. In the use case mentioned I could have a zookeeper observer
> > node in the same datacenter as the consumers.
> >
> > More: http://zookeeper.apache.org/doc/trunk/zookeeperObservers.html
> >
> > I was wondering if anyone has used a zookeeper observer node for kafka
> > consumers and if there is any downside to it.
> >
> >
> > > 3) is essentially equivalent to having a
> > > remote consumer since the KafkaMirror actually has a consumer consuming
> > > from the source (mostly remote) brokers and a producer producing to dest
> > > (mostly local) brokers. I would recommend you first try having remote
> > > consumers as is and see if the latency is acceptable.
> > >
> >
> > Does MirrorMaker have the same level of interaction with zookeeper as consumer
> > groups? If yes then you are right, using MirrorMaker would be the same as
> > having remote consumers. Yes, I will check whether the latency of consuming
> > from a different datacenter is acceptable for our use case.
> >
> > Thanks,
> > Arup
> >
>

--
Arup Malakar