[
https://issues.apache.org/jira/browse/KAFKA-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17042167#comment-17042167
]
ASF GitHub Bot commented on KAFKA-9594:
---------------------------------------
omkreddy commented on pull request #8153: KAFKA-9594: Add a separate lock to
pause the follower log append while checking if the log dir could be replaced.
URL: https://github.com/apache/kafka/pull/8153
This PR adds a new lock that prevents the follower replica from being
updated while ReplicaAlterDirThread is executing
maybeReplaceCurrentWithFutureReplica() to replace the follower replica with
the future replica.
Now doAppendRecordsToFollowerOrFutureReplica() doesn't need to hold the lock
on leaderIsrUpdateLock for local replica updates, so ongoing log appends on
the follower will not delay the makeFollower() call.
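The locking change described above can be illustrated with a minimal Java sketch. This is not the code from the PR: the class `PartitionLockSketch`, the list-based logs, the `createFutureLog()` helper, the size-based "caught up" check, and the choice to take both locks during the swap are all assumptions made for illustration. Only the lock names `leaderIsrUpdateLock` and `futureLogLock` and the method names come from the issue text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the Partition class; logs are modelled as lists of strings.
final class PartitionLockSketch {
    // Guards leader/ISR state.
    private final ReentrantReadWriteLock leaderIsrUpdateLock = new ReentrantReadWriteLock();
    // Separate lock shared only by follower appends and the log-dir swap.
    private final Object futureLogLock = new Object();

    private List<String> currentLog = new ArrayList<>();
    private List<String> futureLog = null;
    private int leaderEpoch = 0;

    // Follower append: takes only futureLogLock, not leaderIsrUpdateLock.
    void appendToFollower(String record) {
        synchronized (futureLogLock) {
            currentLog.add(record);
        }
    }

    // Leadership change: needs the write lock, which appends no longer contend for.
    void makeFollower(int newLeaderEpoch) {
        leaderIsrUpdateLock.writeLock().lock();
        try {
            leaderEpoch = newLeaderEpoch;
        } finally {
            leaderIsrUpdateLock.writeLock().unlock();
        }
    }

    // Illustrative helper (assumption): registers a future log in a new log dir.
    void createFutureLog(List<String> copiedRecords) {
        leaderIsrUpdateLock.writeLock().lock();
        try {
            futureLog = new ArrayList<>(copiedRecords);
        } finally {
            leaderIsrUpdateLock.writeLock().unlock();
        }
    }

    // Swap: holds futureLogLock so follower appends pause during the check and replace.
    boolean maybeReplaceCurrentWithFutureReplica() {
        leaderIsrUpdateLock.writeLock().lock();
        try {
            synchronized (futureLogLock) {
                // Assumed "caught up" check: the future log has as many records as the current one.
                if (futureLog == null || futureLog.size() != currentLog.size()) {
                    return false;
                }
                currentLog = futureLog;
                futureLog = null;
                return true;
            }
        } finally {
            leaderIsrUpdateLock.writeLock().unlock();
        }
    }

    int currentLogSize() {
        synchronized (futureLogLock) {
            return currentLog.size();
        }
    }

    int leaderEpoch() {
        leaderIsrUpdateLock.readLock().lock();
        try {
            return leaderEpoch;
        } finally {
            leaderIsrUpdateLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        PartitionLockSketch partition = new PartitionLockSketch();
        partition.appendToFollower("r1");
        partition.makeFollower(5);
        partition.createFutureLog(List.of("r1"));
        System.out.println("replaced: " + partition.maybeReplaceCurrentWithFutureReplica());
    }
}
```

In this sketch the append path never touches leaderIsrUpdateLock, so a writer waiting in makeFollower() is not queued behind in-flight appends. The swap still excludes appends, because both paths synchronize on futureLogLock.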
**Benchmark results for Partition.makeFollower**
Old:
```
Benchmark                                        Mode  Cnt     Score    Error  Units
PartitionMakeFollowerBenchmark.testMakeFollower  avgt   15  2046.967 ± 22.842  ns/op
```
New:
```
Benchmark                                        Mode  Cnt     Score    Error  Units
PartitionMakeFollowerBenchmark.testMakeFollower  avgt   15  1278.525 ±  5.354  ns/op
```
### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
> speed up the processing of LeaderAndIsrRequest
> ----------------------------------------------
>
> Key: KAFKA-9594
> URL: https://issues.apache.org/jira/browse/KAFKA-9594
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jun Rao
> Assignee: Manikumar
> Priority: Minor
> Fix For: 2.6.0
>
>
> Observations from [~junrao]
> Currently, Partition.makeFollower() holds a write lock on
> leaderIsrUpdateLock. Partition.doAppendRecordsToFollowerOrFutureReplica()
> holds a read lock on leaderIsrUpdateLock. So, if there is an ongoing log
> append on the follower, the makeFollower() call will be delayed. This path is
> a bit different when serving the Partition.makeLeader() call. Before we make
> a call on Partition.makeLeader(), we first remove the follower from the
> replicaFetcherThread. So, the makeLeader() call won't be delayed because of
> log append. This means that when we change one follower to become leader and
> another follower to follow the new leader during a controlled shutdown, the
> makeLeader() call typically completes faster than the makeFollower() call,
> which can delay the follower fetching from the new leader and cause the ISR
> to shrink.
> The only reason that Partition.doAppendRecordsToFollowerOrFutureReplica()
> needs to hold a read lock on leaderIsrUpdateLock is for
> Partition.maybeReplaceCurrentWithFutureReplica() to pause the log append
> while checking if the log dir could be replaced. We could potentially add a
> separate lock (something like futureLogLock) that's synced between
> maybeReplaceCurrentWithFutureReplica() and
> doAppendRecordsToFollowerOrFutureReplica(). Then,
> doAppendRecordsToFollowerOrFutureReplica() doesn't need to hold the lock on
> leaderIsrUpdateLock.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)