[ 
https://issues.apache.org/jira/browse/KAFKA-5036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962819#comment-15962819
 ] 

ASF GitHub Bot commented on KAFKA-5036:
---------------------------------------

GitHub user benstopford opened a pull request:

    https://github.com/apache/kafka/pull/2831

    MINOR: KAFKA-5036 (points 2, 5): Refactor caching of Latest Epoch

    This PR covers point (2) and point (5) from KAFKA-5036:
    2. Currently, we update the leader epoch in epochCache after log append in
    the follower but before log append in the leader. It would be more
    consistent to always do this after log append. This also avoids issues
    related to failure in log append.
    5. The constructor of LeaderEpochFileCache has the following:
        lock synchronized { ListBuffer(checkpoint.read(): _*) }
    But everywhere else uses a read or write lock. We should use consistent
    locking.
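
As a rough illustration of point (5), the sketch below loads the initial entries under the same read/write lock that guards every other access, instead of a separate `synchronized` monitor. `LockedEpochCache`, `inReadLock`, `inWriteLock`, `assign` and `latestEpoch` are hypothetical names defined here for the example, and the `initial` sequence stands in for `checkpoint.read()`; this is not the actual LeaderEpochFileCache code.

```scala
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable.ListBuffer

// Hypothetical stand-in for an epoch cache; entries are (epoch, startOffset).
final class LockedEpochCache(initial: Seq[(Int, Long)]) {
  private val lock = new ReentrantReadWriteLock()

  // Local helpers (assumed for this sketch) that run `body` under one lock.
  private def inReadLock[T](body: => T): T = {
    lock.readLock().lock()
    try body finally lock.readLock().unlock()
  }

  private def inWriteLock[T](body: => T): T = {
    lock.writeLock().lock()
    try body finally lock.writeLock().unlock()
  }

  // The initial load takes the write lock, like every other mutation,
  // rather than a separate `synchronized` monitor.
  private val epochs: ListBuffer[(Int, Long)] =
    inWriteLock { ListBuffer(initial: _*) }

  // Mutations take the write lock; only a strictly newer epoch is recorded.
  def assign(epoch: Int, startOffset: Long): Unit = inWriteLock {
    if (epochs.lastOption.forall(_._1 < epoch)) epochs += ((epoch, startOffset))
  }

  // Reads take the read lock.
  def latestEpoch: Option[Int] = inReadLock { epochs.lastOption.map(_._1) }
}
```

With a single lock object, a reader cannot observe the buffer while it is being populated or modified under a different monitor.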
    
    This refactors how epochs are cached: rather than caching the latest epoch
    separately in LeaderEpochFileCache, the code reuses the value already
    cached in Partition. There is no functional change.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/benstopford/kafka KAFKA-5036-part2-second-try

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2831.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2831
    
----
commit 3e9c130672824070968173b2991a43eb9fa139b6
Author: Ben Stopford <benstopf...@gmail.com>
Date:   2017-04-10T12:56:48Z

    KAFKA-5036: Refactor the caching of the latest epoch. Workflow is simpler
    if we reuse the value cached in partition.

----


> Followups from KIP-101
> ----------------------
>
>                 Key: KAFKA-5036
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5036
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.11.0.0
>            Reporter: Jun Rao
>            Assignee: Jun Rao
>             Fix For: 0.11.0.0
>
>
> 1. It would be safer to hold onto the leader lock in Partition while serving 
> an OffsetForLeaderEpoch request.
> 2. Currently, we update the leader epoch in epochCache after log append in 
> the follower but before log append in the leader. It would be more consistent 
> to always do this after log append. This also avoids issues related to 
> failure in log append.
> 3. OffsetsForLeaderEpochRequest/OffsetsForLeaderEpochResponse:
> The code that does grouping can probably be replaced by calling 
> CollectionUtils.groupDataByTopic(). Done: 
> https://github.com/apache/kafka/commit/359a68510801a22630a7af275c9935fb2d4c8dbf
> 4. The following line in LeaderEpochFileCache is hit several times when 
> LogTest is executed:
> {code}
>        if (cachedLatestEpoch == None) error("Attempt to assign log end offset 
> to epoch before epoch has been set. This should never happen.")
> {code}
> 5. The constructor of LeaderEpochFileCache has the following:
> {code}
> lock synchronized { ListBuffer(checkpoint.read(): _*) }
> {code}
> But everywhere else uses a read or write lock. We should use consistent 
> locking.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
