Brian Byrne created KAFKA-9395:
----------------------------------

             Summary: Improve Kafka scheduler's periodic maybeShrinkIsr()
                 Key: KAFKA-9395
                 URL: https://issues.apache.org/jira/browse/KAFKA-9395
             Project: Kafka
          Issue Type: Improvement
            Reporter: Brian Byrne
            Assignee: Brian Byrne


The ReplicaManager schedules a periodic call to maybeShrinkIsr() on the
KafkaScheduler with a period of replica.lag.time.max.ms / 2. While
replica.lag.time.max.ms defaults to 30s, my setup used 45s, which means
maybeShrinkIsr() was being called every 22.5 seconds. Normally this is not a
problem.
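
As a quick check of the arithmetic above, here is a minimal sketch. The class
and method names are hypothetical and are not taken from the Kafka code base;
only the "divide by two" rule comes from the description.

```java
import java.time.Duration;

public class ShrinkIsrPeriod {
    // Period of the ISR check as described above: replica.lag.time.max.ms / 2.
    // Hypothetical helper, not a Kafka API.
    static Duration checkPeriod(Duration replicaLagTimeMax) {
        return replicaLagTimeMax.dividedBy(2);
    }

    public static void main(String[] args) {
        System.out.println(checkPeriod(Duration.ofSeconds(30)).toMillis()); // prints 15000
        System.out.println(checkPeriod(Duration.ofSeconds(45)).toMillis()); // prints 22500
    }
}
```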

Fetch/produce requests hold a partition's leaderIsrUpdateLock in read mode
while they are running. When a partition is asked to check whether it should
shrink its ISR, it acquires the write lock. So there's potential for
contention here, and if the fetch/produce requests are long-running, they may
block maybeShrinkIsr() for hundreds of milliseconds.
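
The contention described above can be reproduced in isolation with a plain
ReentrantReadWriteLock. This is only a sketch under assumptions: the lock is a
stand-in for leaderIsrUpdateLock, the reader thread stands in for a
long-running fetch/produce request, and the 200 ms hold time is an
illustrative value, not a measurement from a broker.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class IsrLockContention {
    /**
     * Returns how long (ms) a writer waits for the write lock while a single
     * reader holds the read lock for holdMs. The lock is a hypothetical
     * stand-in for a partition's leaderIsrUpdateLock.
     */
    static long writerWaitMillis(long holdMs) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        CountDownLatch readerHoldsLock = new CountDownLatch(1);

        // Simulates a long-running fetch/produce request holding the read lock.
        Thread reader = new Thread(() -> {
            lock.readLock().lock();
            try {
                readerHoldsLock.countDown();
                Thread.sleep(holdMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });
        reader.start();

        try {
            readerHoldsLock.await(); // wait until the reader really owns the lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        // Simulates the ISR shrink check, which needs the write lock.
        long start = System.nanoTime();
        lock.writeLock().lock();
        try {
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("writer blocked for ~" + writerWaitMillis(200) + " ms");
    }
}
```

The writer cannot proceed until every reader has released the lock, so its
wait time tracks the longest in-flight request on that partition.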

This becomes a problem due to the way the scheduler runnable is set up: it
calls maybeShrinkIsr() for every partition in a single scheduler invocation.
If there are a lot of partitions, this could take many seconds, even minutes.
However, the runnable is scheduled via
ScheduledThreadPoolExecutor#scheduleAtFixedRate, which means that if a run
exceeds its period, the next run starts immediately after it finishes. So it
backs up enough that the scheduler is always executing this function.
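
The back-to-back behaviour can be seen with a small standalone program. This
is a sketch under assumptions: the 60 ms period and 120 ms task duration are
illustrative values, and the class is hypothetical. scheduleWithFixedDelay is
included only for contrast; this issue does not state which fix is intended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class FixedRateBacklog {
    static final long PERIOD_MS = 60;  // illustrative scheduling period
    static final long TASK_MS = 120;   // illustrative task duration, longer than the period
    static final int RUNS = 4;         // number of runs to sample

    /** Average idle time (ms) between the end of one run and the start of the next. */
    static long averageIdleGapMillis(boolean fixedRate) {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        List<long[]> spans = new ArrayList<>(); // {startNanos, endNanos} per run
        CountDownLatch done = new CountDownLatch(RUNS);

        Runnable slowTask = () -> {
            if (done.getCount() == 0) {
                return; // enough samples collected
            }
            long start = System.nanoTime();
            try {
                Thread.sleep(TASK_MS); // stands in for checking every partition
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            spans.add(new long[] {start, System.nanoTime()});
            done.countDown();
        };

        if (fixedRate) {
            scheduler.scheduleAtFixedRate(slowTask, 0, PERIOD_MS, TimeUnit.MILLISECONDS);
        } else {
            scheduler.scheduleWithFixedDelay(slowTask, 0, PERIOD_MS, TimeUnit.MILLISECONDS);
        }

        try {
            done.await();
            scheduler.shutdownNow();
            scheduler.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        long idleNanos = 0;
        for (int i = 1; i < spans.size(); i++) {
            idleNanos += spans.get(i)[0] - spans.get(i - 1)[1];
        }
        return TimeUnit.NANOSECONDS.toMillis(idleNanos / (spans.size() - 1));
    }

    public static void main(String[] args) {
        System.out.println("fixed rate, avg idle gap (ms):  " + averageIdleGapMillis(true));
        System.out.println("fixed delay, avg idle gap (ms): " + averageIdleGapMillis(false));
    }
}
```

With a fixed rate, each run is already overdue when the previous one ends, so
the idle gap should be close to zero and the scheduler thread stays busy. With
a fixed delay, the thread is idle for at least the configured delay between
runs.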

This may cause partitions to check their ISR a lot less frequently than
intended. It also adds a significant source of contention in cases where the
produce/fetch requests are long-running.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
