Hi Emmanuel,

You can first run a Kafka producer perf test (bin/kafka-producer-perf-test.sh) against your Storm consumers, and a Kafka consumer perf test (bin/kafka-consumer-perf-test.sh) against your own producers, to see whether the bottleneck is really in Kafka.
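Once you have numbers from the perf tools, a rough comparison against the ~7000 events/sec you see in the topology could look like the sketch below. The function names, the 1.5x margin, and the sample perf figures are only assumptions for illustration; the 7000 events/sec and 5 partitions come from your description.

```python
def events_per_sec(events, seconds):
    """Average throughput for a run that moved `events` messages in `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return events / seconds


def compare(perf_rate, app_rate, margin=1.5):
    """Rough hint only (the margin is an arbitrary assumption).

    If the standalone perf tool is clearly faster than the application,
    the brokers are probably not the limiting factor.
    """
    if perf_rate >= app_rate * margin:
        return "kafka-not-limiting"
    return "kafka-suspect"


APP_RATE = 7000.0   # events/sec reported for the Storm topology
PARTITIONS = 5      # partitions on the topic

# Average load each partition carries at the observed rate.
per_partition = APP_RATE / PARTITIONS

# Hypothetical perf-tool result: 500,000 messages in 10 seconds.
perf_rate = events_per_sec(500_000, 10)
hint = compare(perf_rate, APP_RATE)
```

If the perf tools are also stuck near 7000 events/sec, the broker or network side deserves a closer look; if they run much faster, the spout/bolt configuration is the more likely place to dig.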
Thanks,
Manu Zhang

On Mon, Mar 23, 2015 at 6:31 AM Harsha <harsh...@fastmail.fm> wrote:

> Hi Emmanuel,
> Can you post your Kafka server.properties? Also, in your producer, are
> you distributing your messages across all Kafka topic partitions?
>
> --
> Harsha
>
>
> On March 20, 2015 at 12:33:02 PM, Emmanuel (ele...@msn.com) wrote:
>
> Kafka on test cluster:
> 2 Kafka nodes, 2GB, 2 CPUs
> 3 Zookeeper nodes, 2GB, 2 CPUs
>
> Storm:
> 3 nodes, 3 CPUs each, on the same Zookeeper cluster as Kafka.
>
> 1 topic, 5 partitions, replication x2
>
> Whether I use 1 slot for the Kafka Spout or 5 slots (= #partitions), the
> throughput seems about the same.
>
> I can't seem to read much more than 7000 events/sec.
>
> Same on writing: I set up a generator spout and write to Kafka on 1
> topic / 5 partitions with a KafkaBolt with parallelism of 5, and I can't
> seem to write much more than 7000 events/sec.
>
> Meanwhile, none of CPU, IO or MEM seems to be a bottleneck:
> In the Storm UI the bolts all show capacities <50%, sometimes much less
> (in the single-digit %).
> Top shows CPUs being used at ~30% max.
>
> We have another process moving data from Kafka to Cassandra and it gives
> similar throughput, so it seems related to Kafka more than Storm.
>
>
> What could be wrong?
> Sorry for the generic question, but I would appreciate any hint on where
> to start troubleshooting.
>
> Thanks