[
https://issues.apache.org/jira/browse/KAFKA-10189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Kim updated KAFKA-10189:
-----------------------------
Description:
The metric
[EventQueueTimeMs|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerEventManager.scala#L81]
is never reset, so it misrepresents the controller event queue time in
these two scenarios:
1. Upon losing leader election: `EventQueueTimeMs` reports the last event
queue time of the previous controller, not that of the current controller.
2. No controller events are added to the queue: `EventQueueTimeMs` reports
the most recent event queue time, not the current queue time (which is 0).
For both cases, we should reset the controller event queue time to 0.
Implementation:
Instead of using `LinkedBlockingQueue.take()` [here|#L118], we can use
`LinkedBlockingQueue.poll(long timeout, TimeUnit unit)` and reset
`EventQueueTimeMs` if the poll times out because the queue is empty.
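A minimal Java sketch of the proposed approach, not the actual `ControllerEventManager` code. The class `EventQueueTimeSketch`, the `QueuedEvent` type, and the `AtomicLong` gauge are hypothetical stand-ins for illustration; the real metric type and event classes may differ. Only `LinkedBlockingQueue.poll(long, TimeUnit)` is taken from the description above.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class EventQueueTimeSketch {
    // Hypothetical stand-in for a queued controller event; records its enqueue time.
    static final class QueuedEvent {
        final String name;
        final long enqueueTimeMs;

        QueuedEvent(String name, long enqueueTimeMs) {
            this.name = name;
            this.enqueueTimeMs = enqueueTimeMs;
        }
    }

    private final LinkedBlockingQueue<QueuedEvent> queue = new LinkedBlockingQueue<>();

    // Hypothetical gauge standing in for the EventQueueTimeMs metric.
    private final AtomicLong eventQueueTimeMs = new AtomicLong(0L);

    // Enqueue an event with an explicit enqueue timestamp (milliseconds since epoch).
    void put(String name, long enqueueTimeMs) {
        queue.add(new QueuedEvent(name, enqueueTimeMs));
    }

    long eventQueueTimeMs() {
        return eventQueueTimeMs.get();
    }

    // One iteration of the processing loop: wait up to timeoutMs for an event.
    // Returns the dequeued event, or null if the wait timed out or was interrupted.
    QueuedEvent pollOnce(long timeoutMs) {
        QueuedEvent event;
        try {
            // Unlike take(), poll(timeout, unit) returns null once the timeout elapses.
            event = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        if (event == null) {
            // Queue stayed empty for the whole timeout: reset the metric to 0.
            eventQueueTimeMs.set(0L);
        } else {
            // Record how long this event waited in the queue.
            eventQueueTimeMs.set(System.currentTimeMillis() - event.enqueueTimeMs);
        }
        return event;
    }
}
```

With `take()`, the processing thread blocks indefinitely on an empty queue and never gets the chance to reset the value; the timed `poll` gives it a periodic opportunity to do so. The timeout length is a tuning choice that this sketch leaves to the caller.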
was:
The metric
[EventQueueTimeMs|[https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerEventManager.scala#L81]]
does not reset and therefore misrepresents the controller event queue time in
these two scenarios:
1. upon losing leader election - `EventQueueTimeMs` portrays the last event
queue time of the previous controller and not the current controller
2. no controller events are added to the queue - `EventQueueTimeMs` portrays
the most recent event queue time, not the current queue time (which is 0)
For both cases, we should reset the controller event queue time to 0.
Implementation:
Instead of using `LinkedBlockingQueue.take()` [here|#L118]], we can use
`LinkedBlockingQueue.poll(long timeout, TimeUnit unit)` and reset
`EventQueueTimeMs` if the queue is empty.
> Reset metric EventQueueTimeMs
> ------------------------------
>
> Key: KAFKA-10189
> URL: https://issues.apache.org/jira/browse/KAFKA-10189
> Project: Kafka
> Issue Type: Bug
> Components: controller, metrics
> Reporter: Jeff Kim
> Assignee: Jeff Kim
> Priority: Minor