Hi Tom,
Thank you for your help. I have only one broker. I used the Kafka production
server configuration listed on Kafka's documentation page:
http://kafka.apache.org/documentation.html#prodconfig . I increased the
flush interval and the number of messages per flush to prevent the disk from
becoming the bottleneck. For the consumers, I used the following configuration:
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

Properties props = new Properties();
props.put("enable.auto.commit", "true");
props.put("request.timeout.ms", "500000000");  // very large timeouts, to rule
props.put("session.timeout.ms", "50000000");   // out rebalances during the test
props.put("connections.max.idle.ms", "50000000");
props.put("fetch.min.bytes", "1");             // return as soon as any data is available
props.put("fetch.max.wait.ms", "500");
props.put("group.id", "gid");
props.put("key.deserializer", StringDeserializer.class.getName());
props.put("value.deserializer", StringDeserializer.class.getName());
props.put("max.partition.fetch.bytes", "128"); // cap each fetch to roughly one record
consumer = new KafkaConsumer<String, String>(props);

I am setting max.partition.fetch.bytes to 128 because I only want to
process one record per poll.
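For completeness, the read loop each consumer runs looks roughly like the
following, continuing from the config above. This is only a sketch of what I
described, not the exact benchmark code; the topic name, partition number,
and the RECORDS_PER_PARTITION constant are placeholders:

```java
// Pin this consumer to one partition, as in the experiment.
TopicPartition tp = new TopicPartition("test", 0); // placeholder topic/partition
consumer.assign(Collections.singletonList(tp));

long nextOffset = 0;
long printed = 0;
while (printed < RECORDS_PER_PARTITION) {          // e.g. 10M / number of partitions
    consumer.seek(tp, nextOffset);                 // re-poll from the printed record's offset
    ConsumerRecords<String, String> records = consumer.poll(500);
    for (ConsumerRecord<String, String> record : records) {
        System.out.println(record.value());        // print only the first record,
        nextOffset = record.offset() + 1;          // then poll again from the next offset
        printed++;
        break;
    }
}
```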

Thanks a lot for your help. I really appreciate it.

On Tue, May 24, 2016 at 7:51 AM, Tom Crayford <tcrayf...@heroku.com> wrote:

> What's your server setup for the brokers and consumers? Generally I'd
> expect something to be exhausted here and that to end up being the
> bottleneck.
>
> Thanks
>
> Tom Crayford
> Heroku Kafka
>
> On Mon, May 23, 2016 at 7:32 PM, Yazeed Alabdulkarim <
> y.alabdulka...@gmail.com> wrote:
>
> > Hi,
> > I am running simple experiments to evaluate the scalability of Kafka
> > consumers with respect to the number of partitions. I assign every
> > consumer to a specific partition. Each consumer polls the records in its
> > assigned partition and prints the first one, then polls again from the
> > offset of the printed record until all records are printed. Prior to
> > running the test, I produce 10 million records evenly among partitions.
> > After running the test, I measure the time it took for the consumers to
> > print all the records. I was expecting Kafka to scale as I increase the
> > number of consumers/partitions. However, the scalability diminishes as I
> > increase the number of partitions/consumers beyond a certain number.
> > Going from 1, 2, 4, 8 consumers, the scalability is great, as the
> > duration of the test is reduced by the factor of increase in the number
> > of partitions/consumers. However, beyond 8 consumers/partitions, the
> > duration of the test reaches a steady state. I am monitoring the
> > resources of my server and didn't see any bottleneck. Am I missing
> > something here? Shouldn't Kafka consumers scale with the number of
> > partitions?
> > --
> > Best Regards,
> > Yazeed Alabdulkarim
> >
>



-- 
Best Regards,
Yazeed Alabdulkarim
