Hemant,
This behavior might be the result of the version of AK (Apache Kafka)
that you are using. Before AK 2.4, the default behavior of the
DefaultPartitioner was to load-balance data production across the
partitions, as you described. But it was found that this behavior hurt
the batching strategy that each producer uses, so AK 2.4 introduced a
new behavior into the DefaultPartitioner called sticky partitioning.
You can read more about this change in the KIP that proposed it: *KIP-480
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner>*.
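For illustration, here is a rough sketch of the sticky behavior
(assuming a local broker at localhost:9092 and a topic named my-topic,
both just placeholders): records sent with a null key tend to land on
the same partition until the current batch is flushed, instead of
rotating on every record:
```
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;

public class StickyPartitionerDemo {
    public static void main(String[] args) throws Exception {
        Properties configs = new Properties();
        configs.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        configs.put("key.serializer", StringSerializer.class.getName());
        configs.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(configs)) {
            for (int i = 0; i < 10; i++) {
                // Null key: on AK >= 2.4 the DefaultPartitioner "sticks" to one
                // partition until the batch is sent, so consecutive records
                // often report the same partition here.
                RecordMetadata md = producer.send(
                        new ProducerRecord<>("my-topic", (String) null, "message-" + i)).get();
                System.out.println("record " + i + " -> partition " + md.partition());
            }
        }
    }
}
```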
The only downside that I see in your workaround is that you are
handling the partition assignment programmatically. That makes your
code fragile: if the number of partitions for the topic changes, your
code would not know about it. Instead, just use the
RoundRobinPartitioner
<https://kafka.apache.org/25/javadoc/org/apache/kafka/clients/producer/RoundRobinPartitioner.html>
explicitly in your producer:
```
configs.put("partitioner.class",
    "org.apache.kafka.clients.producer.RoundRobinPartitioner");
```
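For completeness, a minimal end-to-end sketch of a producer configured
this way (again assuming a local broker at localhost:9092 and a
placeholder topic my-topic) could look like:
```
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RoundRobinPartitioner;
import org.apache.kafka.common.serialization.StringSerializer;

public class RoundRobinProducerExample {
    public static void main(String[] args) {
        Properties configs = new Properties();
        configs.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        configs.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        configs.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Spread records across all partitions, one after another,
        // regardless of the key.
        configs.put(ProducerConfig.PARTITIONER_CLASS_CONFIG,
                RoundRobinPartitioner.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(configs)) {
            for (int i = 0; i < 12; i++) {
                // No partition or key needs to be set; the partitioner decides.
                producer.send(new ProducerRecord<>("my-topic", "message-" + i));
            }
        }
    }
}
```
With this in place you do not need to track partition numbers yourself;
the partitioner cycles through whatever partitions the topic currently
has.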
Thanks,
-- Ricardo
On 6/18/20 12:38 AM, Hemant Bairwa wrote:
Hello All
I have a single producer service which is queuing messages into a topic
with, let's say, 12 partitions. I want to evenly distribute the messages
across all the partitions in a round robin fashion.
Even after using default partitioning and keeping the key NULL, the
messages are not getting distributed evenly. Rather, some partitions are
getting none of the messages while some are getting multiple.
One reason I found for this behaviour, somewhere, is that if there are
fewer producers than partitions, the messages are distributed to fewer
partitions in order to limit the number of open sockets.
However, I have achieved even distribution through code by first getting
the total number of partitions and then passing the partition number, in
incremental order, along with the message into the producer record. Once
the partition number reaches the last partition, it is reset back to
zero.
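Roughly, the approach looks like this (a sketch with illustrative names;
the real service differs in details):
```
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ManualRoundRobinProducer {
    private final KafkaProducer<String, String> producer;
    private final String topic;
    private final int partitionCount;
    private final AtomicInteger nextPartition = new AtomicInteger(0);

    public ManualRoundRobinProducer(Properties configs, String topic) {
        this.producer = new KafkaProducer<>(configs);
        this.topic = topic;
        // Partition count is read once at startup; if the topic is
        // repartitioned later, this value becomes stale.
        this.partitionCount = producer.partitionsFor(topic).size();
    }

    public void send(String value) {
        // Cycle the target partition explicitly: 0, 1, ..., partitionCount - 1, 0, ...
        int partition = nextPartition.getAndUpdate(p -> (p + 1) % partitionCount);
        producer.send(new ProducerRecord<>(topic, partition, null, value));
    }
}
```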
Query:
1. Can there be any downside to the above approach?
2. If yes, how can even distribution of messages be achieved in an
optimized way?