I was wondering if the new Kafka consumer introduced in 0.9.0 (
http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client)
allows for a higher number of partitions in a given cluster, since it
removes the ZooKeeper dependency. I understand the file descriptor and
availability concerns discussed here:
http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/.


The reason I ask is that we'd like to use partitioning to limit the
impact of a message flood on our downstream consumers. If we partition
by a particular ID, a message flood from a given source will be isolated
in a single partition, which lets us dedicate a single consumer to
processing that flood without degrading quality of service for the rest
of the system. Unfortunately, partitioning this way could create millions
of partitions, most producing only a few messages per minute, while a few
of the partitions would be sending thousands of messages per minute.
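For context, the isolation we're after is just deterministic keyed partitioning. Here's a minimal sketch of the idea — it uses String.hashCode() for illustration, not Kafka's actual murmur2-based default partitioner, and the class and method names are made up:

```java
// Hypothetical sketch: map a source ID to a partition deterministically,
// so every message from one source lands in the same partition. A flood
// from "noisy-source" then stays confined to one partition, which a
// dedicated consumer can drain.
public class SourcePartitioner {

    // Deterministic key -> partition mapping (illustration only; Kafka's
    // default partitioner hashes the key bytes with murmur2 instead).
    public static int partitionFor(String sourceId, int numPartitions) {
        // Mask off the sign bit so the result is always non-negative.
        return (sourceId.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int partitions = 50;
        int p1 = partitionFor("noisy-source", partitions);
        int p2 = partitionFor("noisy-source", partitions);
        // Same source ID, same partition, every time.
        System.out.println(p1 == p2 && p1 >= 0 && p1 < partitions);
    }
}
```

In practice you'd get the same effect by just setting the source ID as the record key on the producer and letting Kafka's default partitioner do the hashing.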

I'm also open to suggestions on how others have solved the flooding / noisy
neighbor problem in Kafka.

Thanks,
Dave Shepherd
