Hi Padma, a lot of this depends on what your existing systems are doing now: which of them are producing, which will be consuming, and what any new producers and consumers will do.
One approach is to use Apache Mesos http://mesos.apache.org/ to manage the producer and consumer applications. The Kafka brokers could get their own machines, or you could enforce some data-locality setup for them. Some folks also (or instead) use YARN. Again, it depends on how you are consuming and producing.

If you're not already running ZooKeeper, then whether to run it locally on each broker comes down to an IOPS calculation for your hardware and usage pattern. Co-locating them also gives you more chance of multi-mode data loss: if you lose a disk on one node, and one service comes under load and knocks the other out, you might cause a problem. So you may want to run ZooKeeper isolated (LXC is an option, but so is KVM).

Check out:

https://cwiki.apache.org/confluence/display/KAFKA/Performance+testing
http://zookeeper.apache.org/doc/r3.4.5/zookeeperAdmin.html
https://github.com/mesosphere/marathon

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop
********************************************/

On Dec 28, 2013, at 6:15 AM, padma priya chitturi <padmapriy...@gmail.com> wrote:

> Hi All,
>
> I have a question on kafka setup. Suppose i have 3 node cluster and kafka
> brokers running on all the nodes (one on each node), on which node should i
> run the zookeeper ? Is it on one of the 3 nodes or on the node outside the
> 3 nodes as the zookeeper coordinates brokers.
>
> Also where the producer and consumer processes need to be started? On the
> client machines other than the 3 node cluster or on one of the node which
> has kafka broker running ?
> What is the appropriate combination of configuring kafka brokers,
> producer, consumer and zookeeper on 3 node cluster ?
>
> Brief explanation from anyone is always appreciated.
>
> Thanks,
> Padma Ch