[ 
https://issues.apache.org/jira/browse/KAFKA-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537667#comment-16537667
 ] 

ASF GitHub Bot commented on KAFKA-7098:
---------------------------------------

hzxa21 opened a new pull request #5350: KAFKA-7098: Improve accuracy of 
throttling by avoiding under-estimating actual rate in Throttler
URL: https://github.com/apache/kafka/pull/5350
 
 
   This PR modifies Throttler.scala to set `periodStartNs` to the current 
time, read after any `sleep` call, instead of the time recorded before the 
`sleep` when throttling is needed. If `periodStartNs` is reset to the time 
before `sleep`, the sleep duration is counted as part of the next period, 
which widens the time window used in the next rate calculation. The rate is 
then underestimated, so the Throttler may skip a needed throttle or sleep for 
too short a time. A unit test is added to cover the fix.
   
   For example, if we use Throttler to limit the per-second rate to 10 with a 
checkInterval of 1s, then in the original implementation:
   1. 15 events happen during [t0, t0+1s]
   2. Throttler will sleep the thread until t0+1.5s, then reset period start 
time to t0+1s
   3. 10 events happen during [t0+1.5s, t0+2s], Throttler will not throttle 
this time because the estimated rate is `10 / [(t0+2s) - (t0+1s)] = 10`
   
   But the actual rate during [t0, t0+2s] is `(10+15) / 2 = 12.5 > 10`
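
The arithmetic in this example can be checked with a short sketch. This is 
an illustration in Python, not the Scala code from Throttler.scala: the 
function `estimated_rate` and the variable names are hypothetical, times are 
in seconds relative to t0, and the check-interval condition is left out so 
that only the choice of window start is compared.

```python
def estimated_rate(events, period_start_s, now_s):
    """Rate computed over the window [period_start_s, now_s], in events/sec."""
    return events / (now_s - period_start_s)


DESIRED_RATE = 10.0      # events per second, as in the example above

# Timeline from the example, in seconds relative to t0 (hypothetical names).
sleep_began = 1.0        # time read before sleep(); old value of the period start
sleep_ended = 1.5        # time after sleep() returns; period start with the fix
check_time = 2.0         # when the next rate check happens
events_after_sleep = 10  # events observed during [t0+1.5s, t0+2s]

# Original behaviour: the window includes the 0.5s spent sleeping.
old_estimate = estimated_rate(events_after_sleep, sleep_began, check_time)

# Behaviour described in this PR: the window starts once the sleep is over.
new_estimate = estimated_rate(events_after_sleep, sleep_ended, check_time)

# Actual rate over the whole interval [t0, t0+2s].
actual_rate = (15 + events_after_sleep) / check_time

print(old_estimate, old_estimate > DESIRED_RATE)  # 10.0 False: no throttling
print(new_estimate, new_estimate > DESIRED_RATE)  # 20.0 True: throttling applies
print(actual_rate)                                # 12.5
```

With the period start taken before the sleep, the estimate equals the limit 
and no throttling happens, even though the actual rate over the two seconds is 
above the limit.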
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve accuracy of the log cleaner throttle rate
> -------------------------------------------------
>
>                 Key: KAFKA-7098
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7098
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>            Priority: Major
>
> LogCleaner uses the Throttler class to throttle the log cleaning rate to the 
> user-specified limit, i.e. log.cleaner.io.max.bytes.per.second. However, in 
> Throttler.maybeThrottle(), after sleep() returns, periodStartNs is set to the 
> time recorded before the sleep. This artificially increases the next window 
> size and underestimates the actual log cleaning rate, so the log cleaning IO 
> can be higher than the user-specified limit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
