Hi, I am new to Kafka and am using Kafka 0.8 to build a distributed queuing system on an Amazon Web Services cluster.
I have 4 machines: Z1, B1, B2 and B3. One ZooKeeper instance is running on Z1, and 3 brokers are running on B1, B2 and B3 respectively. I am running 3 producers, one on each broker machine (B1, B2, B3), and similarly 3 consumers, one on each broker machine.

I created a topic, let's say 'test', with 12 partitions (test-0, test-1 ... test-11), 4 partitions on each broker machine:

  B1 - test-0, test-1, test-2, test-3
  B2 - test-4, test-5, test-6, test-7
  B3 - test-8, test-9, test-10, test-11

The broker on each machine was assigned as leader for the partitions present on that same machine:

  Partition - Leader
  test-0    - B1
  test-1    - B1
  test-2    - B1
  test-3    - B1
  test-4    - B2
  test-5    - B2
  test-6    - B2
  test-7    - B2
  test-8    - B3
  test-9    - B3
  test-10   - B3
  test-11   - B3

All 3 producers are producing messages to the topic 'test', and all 3 consumers are trying to consume from the same topic.

What I am trying to achieve is this: whenever a producer sends a message to the topic, it should use the broker on the same machine as the producer, and therefore only the partitions on that machine:

  Producer 1 ---> B1 ---> (test-0, test-1, test-2, test-3)   ---> Consumer 1
  Producer 2 ---> B2 ---> (test-4, test-5, test-6, test-7)   ---> Consumer 2
  Producer 3 ---> B3 ---> (test-8, test-9, test-10, test-11) ---> Consumer 3

I am assuming this will reduce inter-machine message transfer and improve performance.

My questions are:

1) Does it really improve performance when a message is produced and consumed on the same machine in a distributed environment?

2) I read that a producer can fetch metadata from a broker about the leader-partition mapping for a topic. That would help it pick the partitions whose leader is on the same machine as the producer. How can a producer fetch this metadata? I could not find any implementation.

Thanks in advance,
Abhijeet
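
To make the intended routing concrete, here is a minimal sketch of the selection step only. It assumes the partition-to-leader map has already been obtained somehow (how to fetch it from Kafka is exactly question 2), so the map is hard-coded from the assignment listed above. The names `LEADERS`, `local_partitions` and `pick_partition` are hypothetical and not part of any Kafka API.

```python
# Partition id -> leader broker, copied from the assignment in this post.
# Assumption: in a real setup this map would come from broker metadata;
# it is hard-coded here purely for illustration.
LEADERS = {p: "B1" for p in range(0, 4)}
LEADERS.update({p: "B2" for p in range(4, 8)})
LEADERS.update({p: "B3" for p in range(8, 12)})


def local_partitions(leaders, local_broker):
    """Return the sorted partition ids whose leader is local_broker."""
    return sorted(p for p, b in leaders.items() if b == local_broker)


def pick_partition(leaders, local_broker, message_count):
    """Round-robin over the partitions led by the local broker.

    Falls back to all partitions if the local broker leads none,
    for example after leadership has moved to another machine.
    """
    candidates = local_partitions(leaders, local_broker) or sorted(leaders)
    return candidates[message_count % len(candidates)]


if __name__ == "__main__":
    # A producer on B2 should cycle through test-4 .. test-7.
    for n in range(5):
        print("message", n, "-> test-%d" % pick_partition(LEADERS, "B2", n))
```

The fallback branch matters because leadership is not fixed: if a broker fails and its partitions get new leaders, a producer that only ever targets "its own" partitions would otherwise have nowhere to send.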