lucasbru opened a new pull request, #12904:
URL: https://github.com/apache/kafka/pull/12904

   The original implementation of the state updater could not
   handle double rebalances within one poll phase correctly,
   because it could create tasks more than once if they hadn't
   finished initialization yet. 
   
   In a55071a99fabc9a706afa0e9acddf898c7cd05c4, we
   moved initialization to the state updater to fix this. However,
   with more testing, I found that this implementation has
   its own problems. First, it interferes with locking the
   state directory: the state updater acquires the lock on the
   state directory, so the main thread cannot clear the
   state directory when closing the task. Second, benchmarks
   show that it can lead to wasted work: tasks are
   initialized even though a follow-up rebalance takes them
   away from the thread soon after.
   
   In this PR, I propose to revert that change and fix
   the original problem in a much simpler way: when we
   receive an assignment, we simply clear out the
   list of tasks pending initialization. This way, no double
   task instantiations can happen.
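
   As a rough illustration of the idea (not the actual code in this
   PR), the approach could be sketched as follows. The class
   `PendingInitSketch`, its methods, and the use of plain strings as
   task ids are all hypothetical and only model the bookkeeping.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the bookkeeping described above;
// this is NOT the real Kafka Streams task manager.
final class PendingInitSketch {
    // Ids of tasks created from an assignment but not yet initialized.
    private final List<String> pendingInit = new ArrayList<>();

    // Called for every assignment, including a second one that
    // arrives within the same poll phase.
    void handleAssignment(Collection<String> assignedTaskIds) {
        // Forget everything still waiting for initialization, so a task
        // from the earlier assignment cannot end up queued twice.
        // (Assumption: releasing resources of dropped tasks is omitted.)
        pendingInit.clear();
        pendingInit.addAll(assignedTaskIds);
    }

    // Snapshot of the tasks currently waiting for initialization.
    List<String> pendingTaskIds() {
        return new ArrayList<>(pendingInit);
    }
}
```

   In this sketch, handling the assignment `0_0, 0_1` followed by
   `0_1, 0_2` leaves only `0_1` and `0_2` pending, each exactly once.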
   
   The change was tested in benchmarks, system tests,
   and the existing unit and integration tests. We also add
   the state updater to the smoke integration test, which
   previously triggered the double task instantiations.
   
   ### Committer Checklist (excluded from commit message)
   - [x] Verify design and implementation 
   - [x] Verify test coverage and CI build status
   - [x] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
