[ https://issues.apache.org/jira/browse/KAFKA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15616983#comment-15616983 ]
ASF GitHub Bot commented on KAFKA-3559:
---------------------------------------

GitHub user enothereska reopened a pull request:

    https://github.com/apache/kafka/pull/2032

    KAFKA-3559: Recycle old tasks when possible

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/enothereska/kafka KAFKA-3559-onPartitionAssigned

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2032.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2032

----
commit 28a50430e4136ba75f0d9b957a67f22e7b1e86a0
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-17T10:46:45Z

    Recycle old tasks when possible

commit b3dc438bf1665b9364b19f5efa908dd35d2b7af3
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-19T15:13:36Z

    Adjusted based on Damian's comments

commit f8cfe74d85e0a8cd5efacca87eced236319c83b9
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-19T17:44:39Z

    Refactor

commit 62bb3fd4a90dd28bc7bb58bf077b7ecb60207c7e
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-24T14:24:48Z

    Merge remote-tracking branch 'origin/trunk' into KAFKA-3559-onPartitionAssigned

commit 841caa3721172d2d89ec16ef6dfd149f25498649
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-24T17:32:05Z

    Addressed Guozhang's comments

commit c4498564907243c35df832407933b8a9cf32f4ef
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-25T11:07:28Z

    Refactor

commit 4ba24c1ecb8c6293adce426a92b6021e86c9e8b7
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-25T12:20:04Z

    Merge remote-tracking branch 'origin/trunk' into KAFKA-3559-onPartitionAssigned

commit 0fe12633b8593eda3b5b7b75bc87244276c95ce2
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-28T20:46:18Z

    Minor reshuffle

commit 7bf5d96cd66ab77130cad39fbff821fccd83aa06
Author: Eno Thereska <eno.there...@gmail.com>
Date:   2016-10-28T21:44:48Z

    Guozhang's suggestion to clear queue

----

> Task creation time taking too long in rebalance callback
> --------------------------------------------------------
>
>                 Key: KAFKA-3559
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3559
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Eno Thereska
>              Labels: architecture
>             Fix For: 0.10.2.0
>
>
> Currently in Kafka Streams, we create stream tasks upon getting newly
> assigned partitions in the rebalance callback function {code} onPartitionAssigned
> {code}, which also involves initializing the processor state stores
> (opening RocksDB, restoring each store from its changelog, etc., all of
> which takes time).
> With a large number of state stores, the initialization alone can take
> tens of seconds, which is usually longer than the consumer session
> timeout. As a result, by the time the callback completes, the coordinator
> has already treated the consumer as failed and triggers another rebalance.
> We need to consider whether we can optimize the initialization process, or
> move it out of the callback function and, while initializing the stores
> one by one, use poll calls to send heartbeats so that the consumer is not
> kicked out by the coordinator.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)