chenyulin0719 opened a new pull request, #17187: URL: https://github.com/apache/kafka/pull/17187
This PR addresses [KAFKA-17515](https://issues.apache.org/jira/browse/KAFKA-17515). Log analysis of the flaky tests (posted in the Jira comments) turned up two issues:

1. The local state is purged immediately after `kafkaStreams.close()` times out (the current timeout is 5 seconds). However, the asynchronous close thread still triggers `RocksDBStore.flush()` on the local state directory.
2. A race condition: tasks to be restored in `ks-1` are rebalanced to `ks-2` before they enter the active restoring state, so `onRestoreSuspend()` is never triggered.

To fix these issues, this PR:

1. Extends the `kafkaStreams.close()` timeout from 5 seconds to 60 seconds.
2. Ensures that all tasks in `ks-1` are actively restoring before the second KafkaStreams instance (`ks-2`) is started.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)