[ https://issues.apache.org/jira/browse/KAFKA-3896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15831285#comment-15831285 ]

Guozhang Wang edited comment on KAFKA-3896 at 1/20/17 6:37 AM:
---------------------------------------------------------------

The root cause of this issue is that before 
https://github.com/apache/kafka/pull/2389 we used delete-and-recreate with 
{{StreamsKafkaClient}} and, due to a bad design pattern, created and deleted 
topics one at a time. This test needs 31 topics to be created, so it is 
possible for the consumer to time out during the assignment phase of the 
rebalance, and the next leader then has to do the same work again because the 
"makeReady" calls are made one at a time.

PR https://github.com/apache/kafka/pull/2389 remedies this problem because we 
no longer call delete, which is why we have not seen this issue since that PR. 
However, we still need to make the {{InternalTopicManager}} more efficient.
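
To make the cost of the one-at-a-time pattern concrete, here is a minimal sketch that only counts requests. It does not use the real {{StreamsKafkaClient}} or {{InternalTopicManager}} API: {{CountingClient}}, its methods, and the topic names are hypothetical, and batching all creates into a single request is just one possible direction, assumed here for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TopicBatchingSketch {

    /** Hypothetical stand-in for a topic admin client; it only counts requests. */
    static class CountingClient {
        int requests = 0;

        // Request contents are not modelled; each call counts as one round trip.
        void deleteTopics(List<String> topics) { requests++; }

        void createTopics(List<String> topics) { requests++; }
    }

    /** Delete-and-recreate, one topic at a time: two requests per topic. */
    static int oneAtATime(List<String> topics) {
        CountingClient client = new CountingClient();
        for (String topic : topics) {
            client.deleteTopics(Collections.singletonList(topic));
            client.createTopics(Collections.singletonList(topic));
        }
        return client.requests;
    }

    /** Assumed alternative: no delete, and one create request for all topics. */
    static int batched(List<String> topics) {
        CountingClient client = new CountingClient();
        client.createTopics(topics);
        return client.requests;
    }

    /** Builds placeholder topic names; the names are illustrative only. */
    static List<String> topicNames(int count) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            names.add("internal-topic-" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> topics = topicNames(31);
        System.out.println("one-at-a-time requests: " + oneAtATime(topics)); // 62
        System.out.println("batched requests: " + batched(topics));          // 1
    }
}
```

With the 31 topics of this test, the one-at-a-time path issues 62 sequential requests inside the assignment, while the batched path issues one, which is why the former is more likely to exceed the consumer's timeout.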


was (Author: guozhang):
The root cause of this issue is that before 
https://github.com/apache/kafka/pull/2389 we used delete-and-recreate with 
{{StreamsKafkaClient}} and, due to a bad design pattern, created and deleted 
topics one at a time. This test needs 12 topics to be created, and each create 
call is always coupled with a delete call that carries an empty topic list, so 
it is possible for the consumer to time out during the assignment phase of the 
rebalance, and the next leader then has to do the same work again because the 
"makeReady" calls are made one at a time.

PR https://github.com/apache/kafka/pull/2389 remedies this problem because we 
no longer call delete, which is why we have not seen this issue since that PR. 
However, we still need to make the {{InternalTopicManager}} more efficient.

> Unstable test 
> KStreamRepartitionJoinTest.shouldCorrectlyRepartitionOnJoinOperations
> -----------------------------------------------------------------------------------
>
>                 Key: KAFKA-3896
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3896
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: unit tests
>            Reporter: Ashish K Singh
>            Assignee: Guozhang Wang
>             Fix For: 0.10.1.0
>
>
> {{KStreamRepartitionJoinTest.shouldCorrectlyRepartitionOnJoinOperations}} 
> seems to be unstable. A failure can be found 
> [here|https://builds.apache.org/job/kafka-trunk-git-pr-jdk7/4363/]. Could not 
> reproduce the test failure locally though.



