> There is a disagreement about this, since there are no examples that I 
> could find that have a third optional value.

The closest example I can think of is Nagios plugins (0=ok, 1=warning, 
2=critical, 3=unknown). See nrpe_exporter:
https://github.com/canonical/nrpe_exporter
https://www.robustperception.io/nagios-nrpe-prometheus-exporter

And, I guess, things like ifOperStatus from snmp_exporter, where the 
interface state is reported as a number.

I'd say having a 0/1/2 status isn't necessarily "wrong", and in Grafana you 
can map these numbers to strings and/or colours.

However, you're also right to say this isn't the normally recommended 
practice. Typically you'd have a set of timeseries, one per possible state, 
and set one to 1 and the others to 0. Client libraries tend to call this 
group of metrics an "enum", e.g.
https://github.com/prometheus/client_python#enum
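
To make the "one series per state" idea concrete, here is a rough 
stdlib-only sketch of what such a group of samples looks like in the text 
exposition format. The state names, the "status" label name and the 
"example_" prefix are my own assumptions for illustration (the thread only 
defines the codes 0/1/2), and in practice a client library's enum type 
would generate this for you.

```python
# Hypothetical state names; index i corresponds to the numeric code i (0/1/2).
STATES = ["healthy", "degraded", "unhealthy"]


def enum_samples(metric, label, states, current):
    """Return one exposition line per state: 1 for the current state, 0 for the rest."""
    if current not in states:
        raise ValueError(f"unknown state: {current!r}")
    lines = []
    for state in states:
        value = 1 if state == current else 0
        lines.append(f'{metric}{{{label}="{state}"}} {value}')
    return lines


def state_from_code(code, states=STATES):
    """Map a 0/1/2-style numeric status onto its state name."""
    if not 0 <= code < len(states):
        raise ValueError(f"unknown status code: {code}")
    return states[code]


if __name__ == "__main__":
    # An operator that would have reported "1" in the single-gauge scheme.
    current = state_from_code(1)
    for line in enum_samples("example_operator_health_status", "status", STATES, current):
        print(line)
```

With this shape, a query for operators in a given state is just a label 
match on the series whose value is 1, rather than a comparison against a 
magic number.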

I wouldn't worry about efficiency. Prometheus timeseries are very cheap, 
especially when the metric values are mostly constant.

On Sunday, 26 March 2023 at 07:24:53 UTC+1 Shirly Radco wrote:

> Hi, 
>
> *Short summary:*
> Can we have a metric that reports 3 values (0/1/2), to indicate status 
> instead of using labels or adding the status to the metric name?
>
> *Full story:*
> I'm working on creating a general recommendation for reporting a 
> Kubernetes operator health metric.
>
> The full proposal is here: 
> https://github.com/operator-framework/operator-sdk/pull/6315/files.
>
> I proposed to recommend operators to add a new health metric that would 
> have the following naming:
> *<operator-name-prefix>_operator_health_status* [1]
>
> I proposed that the values of this metric would indicate the health status:
>   * `0` - Indicates that the operator is healthy and working as expected.
>   * `1` - Indicates that the operator has some issues that need to be 
> addressed and can potentially lead to loss of functionality.
>   * `2` - Indicates that the operator is unhealthy and there is a loss of 
> functionality that should be addressed.
>
> There is a disagreement about this, since there are no examples that I 
> could find that have a third optional value.
> Usually these metrics are represented as Boolean (Healthy/Unhealthy) or 
> the status is stated in the metric name.
>
> The reviewers believe it's not recommended to have more than 2 possible 
> values (Boolean).
> I see a few issues with this:
> 1. The metric is sent from different operators and it would be problematic 
> to have a label to indicate the level of health in a consistent way.
> 2. I don't see an issue with querying Prometheus with more than 2 values. 
> It might be more efficient than filtering with labels.
>
> I would appreciate your insights on this, considering that the metric is 
> sent from multiple sources that are all developed separately. 
>
> Thank you,
> Shirly Radco
>
> [1] I proposed a different prefix and same suffix since I know there is an 
> issue with sending the same metric name to Prometheus with a different help 
> text.
> Since we can't enforce the help text to be exactly the same, the suffix 
> should be enough to be able to display all the operators' health metrics in 
> the same panel.
> Also, it would be easier to identify the origin of metrics that have an 
> issue.
