[
https://issues.apache.org/jira/browse/KAFKA-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guozhang Wang updated KAFKA-1001:
---------------------------------
Attachment: KAFKA-1001_2013-10-28_15:13:42.patch
> Handle follower transition in batch
> -----------------------------------
>
> Key: KAFKA-1001
> URL: https://issues.apache.org/jira/browse/KAFKA-1001
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jay Kreps
> Assignee: Guozhang Wang
> Fix For: 0.8.1
>
> Attachments: KAFKA-1001_2013-10-21_13:35:41.patch,
> KAFKA-1001_2013-10-25_11:27:24.patch, KAFKA-1001_2013-10-28_11:19:47.patch,
> KAFKA-1001_2013-10-28_15:13:42.patch, KAFKA-1001.patch
>
>
> In KAFKA-615 we made changes to avoid fsync'ing the active segment of the log
> on log roll while still maintaining the recovery semantics.
> One downside of the fix for that issue was that it required checkpointing the
> recovery point for the log many times, once for each partition that
> transitioned to follower state.
> In this ticket I aim to fix that issue by making the following changes:
> 1. Add a new API LogManager.truncateTo(m: Map[TopicAndPartition, Long]). This
> method will first checkpoint the recovery point, then truncate each of the
> given logs to the given offset. This method will have to ensure these two
> things happen atomically.
> 2. Change ReplicaManager to first stop fetching for all partitions changing
> to follower state, then call LogManager.truncateTo, and then complete the
> existing logic (a rough sketch of both changes follows this list).
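>
> As a rough, hypothetical sketch only (simplified types and names, not the
> actual Kafka classes), the two changes above could look something like this:
> {code:scala}
> // Hypothetical, simplified sketch -- not the real LogManager/ReplicaManager code.
> import java.util.concurrent.locks.ReentrantLock
> import scala.collection.mutable
>
> case class TopicAndPartition(topic: String, partition: Int)
>
> trait Fetcher { def removePartitions(tps: Set[TopicAndPartition]): Unit }
>
> class Log(var recoveryPoint: Long) {
>   def truncateTo(offset: Long): Unit = {
>     // drop segments/messages beyond `offset` (elided in this sketch)
>     recoveryPoint = math.min(recoveryPoint, offset)
>   }
> }
>
> class LogManager(val logs: mutable.Map[TopicAndPartition, Log]) {
>   private val lock = new ReentrantLock()
>
>   // Proposed API: checkpoint the recovery points once, then truncate all of
>   // the given logs, under one lock so the two steps look atomic to callers.
>   def truncateTo(offsets: Map[TopicAndPartition, Long]): Unit = {
>     lock.lock()
>     try {
>       checkpointRecoveryPoints()  // one checkpoint write for the whole batch
>       for ((tp, offset) <- offsets; log <- logs.get(tp))
>         log.truncateTo(offset)
>     } finally lock.unlock()
>   }
>
>   private def checkpointRecoveryPoints(): Unit = {
>     // write every log's recovery point to the checkpoint file in one pass (elided)
>   }
> }
>
> class ReplicaManager(logManager: LogManager, fetcher: Fetcher) {
>   // Batched follower transition: stop fetching for all affected partitions
>   // first, truncate them in one call, then run the rest of the existing logic.
>   def makeFollowers(truncationOffsets: Map[TopicAndPartition, Long]): Unit = {
>     fetcher.removePartitions(truncationOffsets.keySet)
>     logManager.truncateTo(truncationOffsets)
>     // ... re-add fetchers pointing at the new leaders, etc. (elided)
>   }
> }
> {code}
>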
> We think this will, overall, be a good thing. The reason is that the fetching
> thread currently does something like (a) acquire the lock, (b) fetch
> partitions, (c) write data to logs, (d) release the lock. Since we currently
> remove fetchers one partition at a time, this requires acquiring the fetcher
> lock once per partition, and hence generally blocking for half of the
> read/write cycle for each partition. By doing this in bulk we avoid
> reacquiring the lock over and over for each change.
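>
> As a minimal, hypothetical sketch of that cycle (reusing TopicAndPartition
> and Fetcher from the sketch above; names are illustrative, not the actual
> AbstractFetcherThread code):
> {code:scala}
> // Illustrative only: the (a)-(d) cycle described above, and why a bulk removal
> // takes the lock once per transition instead of once per partition.
> import java.util.concurrent.locks.ReentrantLock
>
> class FetcherThread extends Fetcher {
>   private val partitionMapLock = new ReentrantLock()
>   private var partitions = Set.empty[TopicAndPartition]
>
>   def doWork(): Unit = {
>     partitionMapLock.lock()                // (a) acquire lock
>     try {
>       val toFetch = partitions             // (b) fetch these partitions
>       // ... issue the fetch and (c) write the returned data to the logs ...
>     } finally partitionMapLock.unlock()    // (d) release lock
>   }
>
>   // One lock acquisition for the whole batch, rather than one per partition,
>   // so a follower transition blocks for at most one read/write cycle.
>   def removePartitions(toRemove: Set[TopicAndPartition]): Unit = {
>     partitionMapLock.lock()
>     try partitions --= toRemove
>     finally partitionMapLock.unlock()
>   }
> }
> {code}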
--
This message was sent by Atlassian JIRA
(v6.1#6144)