From my (beginner's) knowledge, each partition still requires at least one file descriptor on the Kafka brokers. With the new consumer design, consumers no longer store their data in ZooKeeper, but topic and partition metadata still lives there.
What I would do is key by your ID and place a rate-limiting stream processor in front of your heavier processors. This could be a windowed task that counts how many messages each ID has sent in the last few seconds or minutes. IDs under the limit are sent to a high-priority topic; IDs over the limit go to a lower-priority topic.

Ben

On Tuesday, May 3, 2016, David Shepherd <dtsheph...@gmail.com> wrote:
> I was wondering if the new Kafka consumer introduced in 0.9.0 (
> http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client
> ) allows for a higher number of partitions in a given cluster, since it
> removes the ZooKeeper dependency. I understand the file descriptor and
> availability concerns discussed here:
> http://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
>
> The reason I ask is that we'd like to use partitioning to limit the
> impact of a message flood on our downstream consumers. If we can partition
> by a particular ID, it will isolate message floods from a given source into
> a single partition, which allows us to allocate a single consumer to process
> that flood without affecting quality of service for the rest of the system.
> Unfortunately, partitioning this way could create millions of partitions,
> each producing only a few messages per minute, with the exception of a few
> partitions that will be sending thousands of messages per minute.
>
> I'm also open to suggestions on how others have solved the flooding /
> noisy-neighbor problem in Kafka.
>
> Thanks,
> Dave Shepherd

--
Benjamin Manns
benma...@gmail.com
(434) 321-8324
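The windowed rate-limiting idea above could be sketched roughly like this. This is a minimal, Kafka-free illustration of the routing decision only: the topic names, the per-minute limit, and the sliding-window approach are my assumptions, and in a real deployment the `route` result would feed a producer send to the chosen topic.

```python
import time
from collections import defaultdict, deque

class RateLimitRouter:
    """Routes each message to a high- or low-priority topic based on a
    per-ID sliding-window message count.

    The topic names and threshold below are illustrative assumptions,
    not anything specified in the thread.
    """

    def __init__(self, limit=100, window_secs=60,
                 fast_topic="events-fast", slow_topic="events-slow"):
        self.limit = limit
        self.window_secs = window_secs
        self.fast_topic = fast_topic
        self.slow_topic = slow_topic
        # Per-ID timestamps of messages seen inside the window.
        self.seen = defaultdict(deque)

    def route(self, msg_id, now=None):
        now = time.time() if now is None else now
        window = self.seen[msg_id]
        # Evict timestamps that have aged out of the window.
        while window and window[0] <= now - self.window_secs:
            window.popleft()
        window.append(now)
        # Under-limit IDs keep the high-priority topic; floods are
        # diverted to the lower-priority topic.
        return self.fast_topic if len(window) <= self.limit else self.slow_topic
```

A separate pool of consumers can then drain the low-priority topic at its own pace, so a flooding ID degrades only its own latency rather than everyone else's.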