Hey, Guys, I am using the high level consumer. I have a daemon process that checks the lag for a topic. Suppose I have a topic with 5 partitions, and partition 0, 1 has lag of 0, while the other 3 all have lags. In this case, should I best start 3 threads, or 5 threads to read from this topic again to achieve best performance? I am currently using 3 threads in this case, but it seems that each thread still first try to get hold of partition0, 1 first.(which seems unnecessary in my case)
Another question is that I am currently using a signal thread to spawn different thread to read from kafka. So if a topic has 5 partitions, 5 signals will be sent, and 5 different threads will start polling the topic in the following manner: Map<String, Integer> topicCountMap = new HashMap<String, Integer>(); topicCountMap.put(kafkaTopic, new* Integer(1))*; Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer .createMessageStreams(topicCountMap); List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(kafkaTopic); KafkaStream<byte[], byte[]> stream = streams.*get(0);* ConsumerIterator<byte[], byte[]> it = stream.iterator(); As you can see, I specify *Integer(1) to ensure there is only one stream in the polling thread.* But in my testing, I am using: Map<String, Integer> topicCountMap = new HashMap<String, Integer>(); topicCountMap.put(topic, *new Integer(numOfPartitions));* Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer .createMessageStreams(topicCountMap); List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic); executor = Executors.newFixedThreadPool(*numOfPartitions*); for (final KafkaStream stream : streams) { executor.submit(new ConsumerTest(stream, threadNumber,this.targetTopic)); threadNumber++; } Are these two methods fundamentally the same, or the later one is preferred? Thanks much! Chen