[ https://issues.apache.org/jira/browse/KAFKA-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15804755#comment-15804755 ]

ASF GitHub Bot commented on KAFKA-4441:
---------------------------------------

GitHub user edoardocomar opened a pull request:

    https://github.com/apache/kafka/pull/2325

    KAFKA-4441 Monitoring incorrect during topic creation and deletion

    OfflinePartitionsCount PreferredReplicaImbalanceCount metrics check for
    topic being deleted
    
    Added integration test which polls the metrics while topics are being
    created and deleted
    
    Developed with @mimaison

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/edoardocomar/kafka KAFKA-4441

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2325.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2325
    
----
commit a793d249e2653255eb62a0d1b9c4a2b99c917b11
Author: Mickael Maison <mickael.mai...@gmail.com>
Date:   2016-12-15T13:26:51Z

    KAFKA-4441 Monitoring incorrect during topic creation and deletion
    
    OfflinePartitionsCount PreferredReplicaImbalanceCount metrics check for
    topic being deleted
    
    Added integration test which polls the metrics while topics are being
    created and deleted
    
    Developed with @mimaison

----


> Kafka Monitoring is incorrect during rapid topic creation and deletion
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-4441
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4441
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.0.0, 0.10.0.1
>            Reporter: Tom Crayford
>            Assignee: Edoardo Comar
>
> Kafka reports several metrics based on the state of partitions:
> UnderReplicatedPartitions
> PreferredReplicaImbalanceCount
> OfflinePartitionsCount
> All of these metrics fire when topics are rapidly created and deleted in a 
> tight loop, even though the cluster is otherwise stable: the only partitions 
> responsible belong to topics that are undergoing creation or deletion.
> Looking through the source code, topic deletion goes through an asynchronous 
> state machine: 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/TopicDeletionManager.scala#L35.
> However, the metrics do not know about the progress of this state machine: 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/KafkaController.scala#L185
>  
> I believe the fix is relatively simple: the metrics need to know that a 
> topic is currently undergoing deletion or creation, and include only 
> topics that are "stable".
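
The proposed fix can be illustrated with a minimal, self-contained sketch. This is not the code from the pull request, and it is written in plain Java rather than the controller's Scala. The class StableTopicMetrics, the PartitionState snapshot, the NO_LEADER marker and the topicsBeingDeleted set are all hypothetical names chosen for illustration. The sketch only covers topics being deleted, which is the case the pull request description mentions; topics still being created would need a similar filter.

```java
import java.util.Collection;
import java.util.Set;

// Hypothetical sketch: these names are illustrative, not Kafka's controller API.
final class StableTopicMetrics {

    // Assumed marker for a partition that currently has no leader.
    static final int NO_LEADER = -1;

    // Assumed snapshot of one partition's state as seen by the controller.
    static final class PartitionState {
        final String topic;
        final int partition;
        final int leader;          // current leader broker id, or NO_LEADER
        final int preferredLeader; // broker that should lead when balanced

        PartitionState(String topic, int partition, int leader, int preferredLeader) {
            this.topic = topic;
            this.partition = partition;
            this.leader = leader;
            this.preferredLeader = preferredLeader;
        }
    }

    // Counts partitions whose leader is not a live broker,
    // skipping every partition of a topic that is being deleted.
    static long offlinePartitionsCount(Collection<PartitionState> partitions,
                                       Set<Integer> liveBrokers,
                                       Set<String> topicsBeingDeleted) {
        return partitions.stream()
                .filter(p -> !topicsBeingDeleted.contains(p.topic))
                .filter(p -> !liveBrokers.contains(p.leader))
                .count();
    }

    // Counts partitions led by a broker other than the preferred one,
    // again skipping topics that are being deleted.
    static long preferredReplicaImbalanceCount(Collection<PartitionState> partitions,
                                               Set<String> topicsBeingDeleted) {
        return partitions.stream()
                .filter(p -> !topicsBeingDeleted.contains(p.topic))
                .filter(p -> p.leader != p.preferredLeader)
                .count();
    }
}
```

In this sketch, a leaderless partition of a topic that is queued for deletion is counted as offline when topicsBeingDeleted is empty, and is ignored once the topic name is added to that set, so an otherwise healthy cluster reports zero offline partitions.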



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
