[ https://issues.apache.org/jira/browse/KAFKA-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013235#comment-16013235 ]
ASF GitHub Bot commented on KAFKA-5036:
---------------------------------------

GitHub user junrao opened a pull request:

    https://github.com/apache/kafka/pull/3074

    KAFKA-5036: hold onto the leader lock in Partition while serving an O…

    …ffsetForLeaderEpoch request

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/junrao/kafka kafka-5036

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/3074.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3074

----
commit 8aa552372b4bae0cdf381d0ceeff829ea14119ee
Author: Jun Rao <jun...@gmail.com>
Date:   2017-05-16T22:56:34Z

    KAFKA-5036: hold onto the leader lock in Partition while serving an OffsetForLeaderEpoch request

----

> Followups from KIP-101
> ----------------------
>
>                 Key: KAFKA-5036
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5036
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.11.0.0
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>             Fix For: 0.11.0.0
>
>
> 1. It would be safer to hold onto the leader lock in Partition while serving
> an OffsetForLeaderEpoch request.
> 2. Currently, we update the leader epoch in epochCache after log append in
> the follower but before log append in the leader. It would be more consistent
> to always do this after log append. This also avoids issues related to
> failure in log append.
> 3. OffsetsForLeaderEpochRequest/OffsetsForLeaderEpochResponse:
> The code that does grouping can probably be replaced by calling
> CollectionUtils.groupDataByTopic(). Done:
> https://github.com/apache/kafka/commit/359a68510801a22630a7af275c9935fb2d4c8dbf
> 4. The following line in LeaderEpochFileCache is hit several times when
> LogTest is executed:
> {code}
>     if (cachedLatestEpoch == None) error("Attempt to assign log end offset to epoch before epoch has been set. This should never happen.")
> {code}
> This should be an assert (with the tests fixed up)
> 5. The constructor of LeaderEpochFileCache has the following:
> {code}
>     lock synchronized { ListBuffer(checkpoint.read(): _*) }
> {code}
> But everywhere else uses a read or write lock. We should use consistent
> locking.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
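
For illustration only, item 5 in the quoted description (consistent locking) could look roughly like the sketch below. It is not the actual patch: `EpochCacheSketch`, `EpochEntry`, `assign`, `latestEpoch` and the local `inLock` helper are hypothetical names made up for this example, and the initial entries are passed in directly instead of being read from a checkpoint file. The only point it shows is that the initial load takes the same `ReentrantReadWriteLock` as every later read and write, instead of `synchronized` on the lock object.

```scala
import java.util.concurrent.locks.{Lock, ReentrantReadWriteLock}
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for a cached epoch entry; not Kafka's actual class.
final case class EpochEntry(epoch: Int, startOffset: Long)

// Hypothetical cache, loosely modeled on the description in item 5.
class EpochCacheSketch(initial: Seq[EpochEntry]) {
  private val lock = new ReentrantReadWriteLock()

  // Run `body` while holding `l`, always releasing it afterwards.
  private def inLock[T](l: Lock)(body: => T): T = {
    l.lock()
    try body finally l.unlock()
  }

  // The initial load uses the write lock, the same lock as later mutations,
  // rather than `lock synchronized { ... }`.
  private val epochs: ListBuffer[EpochEntry] =
    inLock(lock.writeLock()) { ListBuffer(initial: _*) }

  // Mutations take the write lock; only strictly newer epochs are appended.
  def assign(epoch: Int, startOffset: Long): Unit = inLock(lock.writeLock()) {
    if (epochs.lastOption.forall(_.epoch < epoch)) {
      epochs += EpochEntry(epoch, startOffset)
    }
    ()
  }

  // Lookups take the read lock, so concurrent readers do not block each other.
  def latestEpoch: Option[Int] = inLock(lock.readLock()) {
    epochs.lastOption.map(_.epoch)
  }
}
```

With this shape, `new EpochCacheSketch(Seq(EpochEntry(0, 0L)))` followed by `assign(2, 10L)` leaves `latestEpoch` at `Some(2)`, and a later `assign(1, 20L)` is ignored because epoch 1 is not newer than epoch 2.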