mjd95 commented on PR #15241:
URL: https://github.com/apache/kafka/pull/15241#issuecomment-2376460752

   We were the ones discussing this with @jeqo. The "caching closed channels" 
issue was happening regularly for us on 3.8 in production: the thread doing a 
remote read was interrupted while iterating through the transaction index, we 
got a `ClosedByInterruptException` on a transaction index file channel, and the 
closed channel remained in the cache. The only mitigation was restarting the 
broker.
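
   For readers unfamiliar with the failure mode, here is a minimal standalone 
sketch of it using only the JDK. It is not Kafka code and not the change made 
in this PR: the class `ClosedChannelCacheSketch`, the `CACHE` map, and the 
helpers `cached` and `cachedOrReopen` are hypothetical names for illustration. 
It relies on the documented `InterruptibleChannel` behaviour: if a thread's 
interrupt status is set when it starts a read on a `FileChannel`, the channel 
is closed and `ClosedByInterruptException` is thrown. A cache that never 
checks `isOpen()` then keeps handing out the dead channel.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

public class ClosedChannelCacheSketch {
    // Hypothetical stand-in for an index cache: file path -> channel.
    static final Map<Path, FileChannel> CACHE = new HashMap<>();

    // Naive lookup: returns whatever is cached, even if it was closed.
    static FileChannel cached(Path p) throws IOException {
        FileChannel ch = CACHE.get(p);
        if (ch == null) {
            ch = FileChannel.open(p, StandardOpenOption.READ);
            CACHE.put(p, ch);
        }
        return ch;
    }

    // One possible mitigation (an assumption, not necessarily what the PR
    // does): treat a closed channel as a cache miss and reopen the file.
    static FileChannel cachedOrReopen(Path p) throws IOException {
        FileChannel ch = CACHE.get(p);
        if (ch == null || !ch.isOpen()) {
            ch = FileChannel.open(p, StandardOpenOption.READ);
            CACHE.put(p, ch);
        }
        return ch;
    }

    static String run() throws IOException {
        Path p = Files.createTempFile("txnindex-sketch", ".bin");
        Files.write(p, new byte[] {1, 2, 3, 4});
        StringBuilder log = new StringBuilder();
        try {
            // Simulate the reader thread being interrupted mid-iteration.
            Thread.currentThread().interrupt();
            try {
                cached(p).read(ByteBuffer.allocate(4));
                log.append("read-ok;");
            } catch (ClosedByInterruptException e) {
                log.append("interrupted;");
            } finally {
                Thread.interrupted(); // clear the flag for the next steps
            }

            // A later, uninterrupted read still fails: the cache holds the
            // channel that the interrupt closed.
            try {
                cached(p).read(ByteBuffer.allocate(4));
                log.append("read-ok;");
            } catch (ClosedChannelException e) {
                log.append("stale;");
            }

            // Checking isOpen() on lookup recovers without a restart.
            int n = cachedOrReopen(p).read(ByteBuffer.allocate(4));
            log.append("reopened:").append(n);
        } finally {
            for (FileChannel ch : CACHE.values()) {
                ch.close();
            }
            CACHE.clear();
            Files.deleteIfExists(p);
        }
        return log.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(run()); // prints "interrupted;stale;reopened:4"
    }
}
```

   In the sketch the stale entry persists until something evicts it, which 
matches what we saw: nothing short of a broker restart cleared the closed 
channel from the cache.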
   
   We were able to reproduce it by setting a low `remote.fetch.max.wait.ms` 
and a small segment size in order to generate many transaction index files.
   
   We tested that this PR fixes our repro and cherry-picked it onto our 
production release. We haven't seen the issue since then.
   
   We haven't seen the "race during channel close" issue (which is now also 
handled in this PR) in production.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
