I was wondering if the new Kafka Consumer introduced in 0.9.0 (http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client) allows for a higher number of partitions in a given cluster, since it removes the ZooKeeper dependency. I understand the file descriptor and availability concerns discussed here: http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/.
The reason I ask is that we'd like to use partitioning to limit the impact of a message flood on our downstream consumers. If we partition by a particular ID, floods from a given source are isolated into a single partition, which lets us allocate a dedicated consumer to process that flood without degrading quality of service for the rest of the system. Unfortunately, partitioning this way could create millions of partitions, most producing only a few messages per minute, while a few would be sending thousands of messages per minute. I'm also open to suggestions on how others have solved the flooding / noisy-neighbor problem in Kafka.

Thanks,
Dave Shepherd
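
P.S. To make the per-ID routing I have in mind concrete, here's a rough sketch. It is not Kafka's actual default partitioner (which hashes keys with murmur2); it just uses `String.hashCode` to show how every message from one source ends up in one partition. The class and method names are mine, purely for illustration:

```java
// Sketch: map a source ID to a partition index so that all messages
// from the same source land in the same partition. A dedicated
// consumer on that partition can then absorb a flood from one source
// without affecting the others.
public class IdPartitioner {
    static int partitionFor(String id, int numPartitions) {
        // Mask off the sign bit so the modulo result is non-negative
        // even when hashCode() returns a negative value.
        return (id.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 64;
        // The same ID always maps to the same partition.
        System.out.println(partitionFor("source-123", numPartitions));
        System.out.println(partitionFor("source-123", numPartitions));
    }
}
```

The catch, as described above, is that doing this per unique ID implies one partition per ID, which is where the "millions of partitions" concern comes from.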