dajac opened a new pull request #9140: URL: https://github.com/apache/kafka/pull/9140
https://github.com/apache/kafka/pull/8672 introduced a bug that crashes the replica fetcher threads. The issue is that https://github.com/apache/kafka/pull/8672 deletes the Partitions before stopping the replica fetchers. As the replica fetchers rely on accessing the Partition in the ReplicaManager, they crash with a NotLeaderOrFollowerException that is not handled. This PR restores the original ordering to avoid this issue.

The regression was caught by our system test: `kafkatest.tests.core.reassign_partitions_test`. I have not managed to reproduce the issue in a unit test without reimplementing the entire system test in Java. I am not sure that makes sense as we already have it in Python.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
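
To illustrate why the ordering matters, here is a minimal single-threaded sketch. It is a toy model, not the actual ReplicaManager code: `ToyReplicaManager`, `PartitionGoneException`, `fetcherTick` and the other names are hypothetical stand-ins, and the exception is only a placeholder for the unhandled NotLeaderOrFollowerException described above. It shows that a fetcher iteration running between the two steps fails only when the partition is deleted first.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: every class and method here is hypothetical, not Kafka's API.
class StopReplicaOrdering {

    // Placeholder for the exception a fetcher hits when its partition is gone.
    static class PartitionGoneException extends RuntimeException {
        PartitionGoneException(String tp) { super("partition not hosted: " + tp); }
    }

    static class ToyReplicaManager {
        // partition name -> log end offset
        final Map<String, Long> partitions = new ConcurrentHashMap<>();
        // partitions that still have an active fetcher
        final Set<String> fetched = ConcurrentHashMap.newKeySet();

        void addPartition(String tp) { partitions.put(tp, 0L); fetched.add(tp); }
        void removeFetcher(String tp) { fetched.remove(tp); }
        void deletePartition(String tp) { partitions.remove(tp); }

        // One fetcher iteration: looks up every partition it still fetches.
        void fetcherTick() {
            for (String tp : fetched) {
                Long logEndOffset = partitions.get(tp);
                if (logEndOffset == null) throw new PartitionGoneException(tp);
                partitions.put(tp, logEndOffset + 1);
            }
        }
    }

    // Problematic order: the partition is deleted while its fetcher is still registered.
    static boolean crashesWhenDeletingFirst() {
        ToyReplicaManager rm = new ToyReplicaManager();
        rm.addPartition("topic-0");
        rm.deletePartition("topic-0");
        try {
            rm.fetcherTick(); // runs between the two steps
            return false;
        } catch (PartitionGoneException e) {
            return true;
        } finally {
            rm.removeFetcher("topic-0");
        }
    }

    // Original order: the fetcher is removed before the partition is deleted.
    static boolean crashesWhenStoppingFetcherFirst() {
        ToyReplicaManager rm = new ToyReplicaManager();
        rm.addPartition("topic-0");
        rm.removeFetcher("topic-0");
        try {
            rm.fetcherTick(); // runs between the two steps
            return false;
        } catch (PartitionGoneException e) {
            return true;
        } finally {
            rm.deletePartition("topic-0");
        }
    }

    public static void main(String[] args) {
        System.out.println("delete first crashes: " + crashesWhenDeletingFirst());
        System.out.println("stop fetcher first crashes: " + crashesWhenStoppingFetcherFirst());
    }
}
```

In the real broker the fetchers run on their own threads, so the failure depends on timing. That is presumably part of why it is hard to trigger outside the system test.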