[ https://issues.apache.org/jira/browse/KAFKA-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805561#comment-13805561 ]
Guozhang Wang commented on KAFKA-1001:
--------------------------------------

Updated reviewboard https://reviews.apache.org/r/14730/

> Handle follower transition in batch
> -----------------------------------
>
>                 Key: KAFKA-1001
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1001
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jay Kreps
>            Assignee: Guozhang Wang
>             Fix For: 0.8.1
>
>         Attachments: KAFKA-1001_2013-10-21_13:35:41.patch, KAFKA-1001_2013-10-25_11:27:24.patch, KAFKA-1001.patch
>
> In KAFKA-615 we made changes to avoid fsync'ing the active segment of the log on log roll while maintaining recovery semantics.
> One downside of that fix was that it checkpointed the recovery point for the log many times, once for each partition that transitioned to follower state.
> In this ticket I aim to fix that issue by making the following changes:
> 1. Add a new API LogManager.truncateTo(m: Map[TopicAndPartition, Long]). This method will first checkpoint the recovery point, then truncate each of the given logs to the given offset, and it must ensure these two steps happen atomically.
> 2. Change ReplicaManager to first stop fetching for all partitions changing to follower state, then call LogManager.truncateTo, and then complete the existing logic.
> We think this will be a good thing overall. The fetcher thread currently does something like (a) acquire the lock, (b) fetch partitions, (c) write data to logs, (d) release the lock. Since we currently remove fetchers one at a time, each change requires acquiring the fetcher lock and hence generally blocks for about half of the read/write cycle per partition. By doing this in bulk we avoid reacquiring the lock over and over for each change.
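For illustration, a rough Scala sketch of the two proposed changes is below. Apart from the LogManager.truncateTo signature quoted above, the names used here (the stand-in Log, ReplicaFetcherManager, ReplicaManager.makeFollowers, checkpointRecoveryPointOffsets) are simplified placeholders, not the actual Kafka 0.8 internals; the sketch only shows the batched checkpoint-then-truncate step done under one lock and the bulk removal of fetchers.

{code:scala}
import scala.collection.mutable

// Simplified stand-ins for the real Kafka classes, for illustration only.
case class TopicAndPartition(topic: String, partition: Int)

class Log {
  @volatile var logEndOffset: Long = 0L
  @volatile var recoveryPoint: Long = 0L

  // Simplified truncation: shrink the log end offset and pull the recovery point back if needed.
  def truncateTo(offset: Long): Unit = {
    logEndOffset = math.min(logEndOffset, offset)
    recoveryPoint = math.min(recoveryPoint, offset)
  }
}

class LogManager(val logs: mutable.Map[TopicAndPartition, Log]) {
  private val lock = new Object

  // Stand-in for persisting the per-partition recovery points to the checkpoint file.
  private def checkpointRecoveryPointOffsets(): Unit = { /* write recovery points to disk */ }

  // Proposed batch API: checkpoint the recovery points once, then truncate all of the
  // given logs, holding one lock so the two steps appear atomic to other threads.
  def truncateTo(partitionOffsets: Map[TopicAndPartition, Long]): Unit = lock.synchronized {
    checkpointRecoveryPointOffsets()
    for ((tp, offset) <- partitionOffsets; log <- logs.get(tp))
      log.truncateTo(offset)
  }
}

// Stand-in fetcher manager: removes fetchers for many partitions under a single
// lock acquisition instead of once per partition.
class ReplicaFetcherManager {
  private val lock = new Object
  private val fetchedPartitions = mutable.Set.empty[TopicAndPartition]

  def removeFetchers(partitions: Set[TopicAndPartition]): Unit = lock.synchronized {
    fetchedPartitions --= partitions
  }
}

class ReplicaManager(logManager: LogManager, fetcherManager: ReplicaFetcherManager) {
  // Sketch of the bulk follower transition:
  //   1. stop fetching for every partition becoming a follower (one lock acquisition),
  //   2. truncate their logs in one batched call,
  //   3. then continue with the existing per-partition logic.
  def makeFollowers(partitionOffsets: Map[TopicAndPartition, Long]): Unit = {
    fetcherManager.removeFetchers(partitionOffsets.keySet)
    logManager.truncateTo(partitionOffsets)
    // ... existing logic: update partition state, add fetchers pointing at the new leaders, etc.
  }
}
{code}

The intended difference from the current per-partition path is that the fetcher lock and the log-manager lock are each taken once per state change rather than once per partition, and the recovery point is checkpointed once for the whole batch.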