mjd95 commented on PR #15241: URL: https://github.com/apache/kafka/pull/15241#issuecomment-2376460752
We were the ones discussing this with @jeqo. The "caching closed channels" issue was happening regularly for us on 3.8 in production: the thread doing a remote read was interrupted while iterating through the transaction index, we got a `ClosedByInterruptException` on some transaction index file channel, and the closed channel remained in the cache. The only way to mitigate was to restart the broker.

We were able to reproduce it by setting a low `remote.fetch.max.wait.ms` and a small segment size in order to generate many transaction index files. We tested that this PR fixes our repro and cherry-picked it onto our production release; we haven't seen the issue since.

We haven't seen the "race during channel close" issue (which is now also handled in this PR) in production.
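For readers unfamiliar with the failure mode, here is a minimal JDK-only sketch of how a cached `FileChannel` can end up permanently closed after an interrupt. It is not the code from this PR and does not use Kafka's index classes: `ClosedChannelCacheDemo`, `CACHE`, `naiveGet`, `checkedGet`, and `tryRead` are hypothetical names, and the "replace the entry if the channel is closed" lookup is only one possible mitigation, assumed here for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ClosedChannelCacheDemo {
    // Hypothetical stand-in for an index cache: file path -> open channel.
    static final Map<Path, FileChannel> CACHE = new ConcurrentHashMap<>();

    static FileChannel open(Path p) {
        try {
            return FileChannel.open(p, StandardOpenOption.READ);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Naive lookup: returns whatever is cached, even if it was closed meanwhile.
    static FileChannel naiveGet(Path p) {
        return CACHE.computeIfAbsent(p, ClosedChannelCacheDemo::open);
    }

    // Defensive lookup (an assumption, not the PR's fix): reopen stale entries.
    static FileChannel checkedGet(Path p) {
        return CACHE.compute(p, (k, ch) -> (ch != null && ch.isOpen()) ? ch : open(k));
    }

    // Attempts one read and reports the outcome as a string.
    // With interruptFirst=true the interrupt flag is set before the read, which
    // makes the JDK close the channel and throw ClosedByInterruptException.
    static String tryRead(FileChannel ch, boolean interruptFirst) {
        if (interruptFirst) {
            Thread.currentThread().interrupt();
        }
        try {
            ch.read(ByteBuffer.allocate(16));
            return "ok";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        } finally {
            Thread.interrupted(); // clear the flag so later reads are not affected
        }
    }

    static List<String> demo() {
        try {
            Path p = Files.createTempFile("txnindex", ".tmp");
            Files.write(p, new byte[] {1, 2, 3});
            String first = tryRead(naiveGet(p), true);     // interrupt closes the channel
            String second = tryRead(naiveGet(p), false);   // cache still hands out the closed channel
            String third = tryRead(checkedGet(p), false);  // stale entry replaced, read works again
            CACHE.remove(p).close();
            Files.deleteIfExists(p);
            return List.of(first, second, third);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints [ClosedByInterruptException, ClosedChannelException, ok]
        System.out.println(demo());
    }
}
```

The second result is the part that matches what we saw: once the channel has been closed by an interrupt, every later read through the naive lookup fails with `ClosedChannelException` until the entry is evicted or the process restarts.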