Todd Lipcon created KUDU-2287:
---------------------------------

             Summary: Add replica metric tracking time since there was a valid leader
                 Key: KUDU-2287
                 URL: https://issues.apache.org/jira/browse/KUDU-2287
             Project: Kudu
          Issue Type: New Feature
          Components: metrics, supportability
    Affects Versions: 1.7.0
            Reporter: Todd Lipcon


Currently, monitoring systems can report that a Kudu cluster is perfectly 
healthy when in fact some tablet is "stuck" with no leader (e.g., due to a 
network connectivity problem or a bug). If each tablet replica exposed a 
numeric metric, such as the time since it last had a valid leader or the 
number of failed election attempts, we could easily monitor for this case.
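
To make the proposal concrete, here is a rough sketch of the bookkeeping such a 
metric might need. It is illustrative only and is not Kudu code: the class 
LeaderHealthTracker, its method names, the is_stuck helper, and the 60-second 
threshold are all hypothetical, and the hooks are assumed to be called from 
wherever the consensus implementation observes a valid leader or a failed 
election.

```python
import time


class LeaderHealthTracker:
    """Hypothetical per-replica tracker; not part of Kudu's actual code."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the logic can be tested deterministically.
        self._clock = clock
        # Assumption: a freshly created replica counts as healthy.
        self._last_valid_leader = clock()
        self.failed_elections = 0

    def on_valid_leader(self):
        # Assumed hook: the replica heard from a valid leader or became one.
        self._last_valid_leader = self._clock()
        self.failed_elections = 0

    def on_failed_election(self):
        # Assumed hook: an election started by this replica did not succeed.
        self.failed_elections += 1

    def seconds_since_valid_leader(self):
        # The numeric value a monitoring system would scrape.
        return self._clock() - self._last_valid_leader


def is_stuck(tracker, threshold_seconds=60.0):
    # Example alerting rule; the threshold is an arbitrary placeholder.
    return tracker.seconds_since_valid_leader() > threshold_seconds
```

A monitoring system could then alert on any replica for which the elapsed time 
exceeds a threshold, rather than relying only on cluster-level health.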



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
