[ https://issues.apache.org/jira/browse/KAFKA-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335142#comment-14335142 ]
Jun Rao commented on KAFKA-1984:
--------------------------------

There are actually two problems.

1. The way that we select an available partition has a bug. We return the index of the partition. However, since partitions are not sorted, the index may not match the actual partition.

        for (int i = 0; i < numPartitions; i++) {
            int partition = Utils.abs(counter.getAndIncrement()) % numPartitions;
            if (partitions.get(partition).leader() != null) {
                return partition;    --> should be changed to return the actual

2. The way that we use the counter in Partitioner may cause an unavailable partition to be selected when there are concurrent threads producing data to the same producer instance.

Attaching a patch.

> java producer may miss an available partition
> ---------------------------------------------
>
>                 Key: KAFKA-1984
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1984
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer
>    Affects Versions: 0.8.2.0
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>         Attachments: kafka-1984.patch
>
>
> In Partitioner, we cycle through each partition to find one whose leader is
> available. However, since the counter is shared among different caller
> threads, the logic may not iterate through every partition. The impact is
> that we could return an unavailable partition to the caller when there are
> partitions available. If the partition is unavailable for a long time, the
> producer may block due to bufferpool being full.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
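For illustration, the two problems described in the comment could be addressed along the following lines. This is only a sketch under stated assumptions, not necessarily what kafka-1984.patch does: `RoundRobinSketch`, `Part`, `choose` and `toPositive` are hypothetical names, and `Part` is a stand-in for Kafka's partition metadata (a partition id plus a leader that is null when unavailable). The sketch reads the shared counter once per call, so concurrent callers cannot make a single probe skip slots, and it returns the partition id stored in the list entry rather than the list index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinSketch {
    // Hypothetical stand-in for partition metadata: leader == null means unavailable.
    static final class Part {
        final int id;
        final String leader;
        Part(int id, String leader) { this.id = id; this.leader = leader; }
    }

    private final AtomicInteger counter = new AtomicInteger(0);

    // Clears the sign bit so the result is never negative, even for Integer.MIN_VALUE.
    static int toPositive(int x) { return x & 0x7fffffff; }

    int choose(List<Part> partitions) {
        int n = partitions.size();
        // Problem 2: take one snapshot of the shared counter per call; the loop
        // below then visits every slot exactly once, regardless of other threads.
        int start = toPositive(counter.getAndIncrement()) % n;
        for (int i = 0; i < n; i++) {
            Part p = partitions.get((start + i) % n);
            if (p.leader != null) {
                // Problem 1: return the partition id, not the index into the unsorted list.
                return p.id;
            }
        }
        // Assumption: if no partition has a leader, fall back to the starting slot.
        return partitions.get(start).id;
    }

    public static void main(String[] args) {
        // Unsorted list: only partition 0 (at list index 1) has a leader.
        List<Part> parts = Arrays.asList(
            new Part(2, null), new Part(0, "broker-1"), new Part(1, null));
        System.out.println(new RoundRobinSketch().choose(parts));
    }
}
```

With the original loop, the same list would yield the list index 1 rather than partition id 0, and a second thread incrementing the counter between iterations could cause index 1 to be skipped entirely.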