Re: parallel processing of records in a Kafka consumer

2017-11-24 Thread Matthias J. Sax
Your understanding is correct. The simplest way to get more parallelism is to increase the number of partitions. There is some overhead for this, but it is not too much. >> what you're writing is in sharp contrast with what I know... I guess this targets other messaging systems: Kafka has a differen

Re: parallel processing of records in a Kafka consumer

2017-11-24 Thread Vincenzo D'Amore
Hi Matthias, what you're writing is in sharp contrast with what I know... I read that: "Kafka consumers are typically part of a consumer group. When multiple consumers are subscribed to a topic and belong to the same consumer group, each consumer in the group will receive messages from a differen

Re: parallel processing of records in a Kafka consumer

2017-11-23 Thread Matthias J. Sax
If you use "consumer groups", it is ensured that a single partition is processed by exactly one consumer in the group (and one consumer can get multiple partitions assigned). Thus, this works out of the box and is easier to manage than parallelizing record processing in the consumer. Also, this does not work if you ne

Re: parallel processing of records in a Kafka consumer

2017-11-23 Thread cours.syst...@gmail.com
On 2017-11-22 23:15, "Matthias J. Sax" wrote: > A KafkaConsumer itself should be used single-threaded. If you want to > parallelize processing, each thread should have its own KafkaConsumer > instance and all consumers should use the same `group.id` in their > configuration. Load will be shared

Re: parallel processing of records in a Kafka consumer

2017-11-22 Thread Matthias J. Sax
A KafkaConsumer itself should be used single-threaded. If you want to parallelize processing, each thread should have its own KafkaConsumer instance, and all consumers should use the same `group.id` in their configuration. Load will then be shared automatically over all running consumers.
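
To make the load sharing concrete, here is a toy sketch in plain Java. It does not use the Kafka client library: `GroupAssignmentSketch` and `assign` are hypothetical names, and the round-robin rule is an assumption chosen for illustration (Kafka's actual assignment strategy is configurable and may distribute partitions differently). It only shows the two properties that matter for parallelism: each partition has exactly one owner within a group, and a consumer without a partition stays idle.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: no Kafka client calls, and not a reproduction of Kafka's real assignors.
final class GroupAssignmentSketch {

    // Spread partitions 0..numPartitions-1 over numConsumers consumers, round-robin.
    // In the result, the index is the consumer and the value is the list of
    // partitions that consumer owns.
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        if (numPartitions < 0 || numConsumers <= 0) {
            throw new IllegalArgumentException(
                "need numPartitions >= 0 and numConsumers > 0");
        }
        List<List<Integer>> owned = new ArrayList<>();
        for (int c = 0; c < numConsumers; c++) {
            owned.add(new ArrayList<>());
        }
        // Every partition is handed to exactly one consumer.
        for (int p = 0; p < numPartitions; p++) {
            owned.get(p % numConsumers).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        // 4 partitions, 2 consumers: each consumer owns two partitions.
        System.out.println(assign(4, 2)); // [[0, 2], [1, 3]]
        // 2 partitions, 3 consumers: the third consumer has nothing to do.
        System.out.println(assign(2, 3)); // [[0], [1], []]
    }
}
```

In this model, adding a third consumer to a two-partition topic gains nothing, which is why the number of partitions bounds the parallelism of a single group.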

parallel processing of records in a Kafka consumer

2017-11-22 Thread cours.syst...@gmail.com
I am testing a KafkaConsumer. How can I modify it to process records in parallel?