> There is a disagreement about this, since there are no examples that I could find that have a third optional value.
The closest example I can think of is Nagios plugins (0=ok, 1=warning, 2=critical, 3=unknown). See nrpe_exporter:
https://github.com/canonical/nrpe_exporter
https://www.robustperception.io/nagios-nrpe-prometheus-exporter

And, I guess, things like ifOperStatus from snmp_exporter.

I'd say having a 0/1/2 status isn't necessarily "wrong", and in Grafana you can map these numbers to strings and/or colours.

However, you're also right to say this isn't normal recommended practice. Typically you'd have a set of timeseries and set one to 1 and the others to 0. Client libraries tend to call this group of metrics an "enum", e.g.
https://github.com/prometheus/client_python#enum

I wouldn't worry about efficiency. Prometheus timeseries are very cheap, especially when the metric values are mostly constant.

On Sunday, 26 March 2023 at 07:24:53 UTC+1 Shirly Radco wrote:

> Hi,
>
> *Short summary:*
> Can we have a metric that reports 3 values (0/1/2) to indicate status,
> instead of using labels or adding the status to the metric name?
>
> *Full story:*
> I'm working on creating a general recommendation for reporting a
> Kubernetes operator health metric.
>
> The full proposal is here:
> https://github.com/operator-framework/operator-sdk/pull/6315/files.
>
> I proposed to recommend that operators add a new health metric that would
> have the following naming:
> *<operator-name-prefix>_operator_health_status* [1]
>
> I proposed that the values of this metric would indicate the health status:
> * `0` - Indicates that the operator is healthy and working as expected.
> * `1` - Indicates that the operator has some issues that need to be
> addressed and can potentially lead to loss of functionality.
> * `2` - Indicates that the operator is unhealthy and there is a loss of
> functionality that should be addressed.
>
> There is a disagreement about this, since there are no examples that I
> could find that have a third optional value.
> Usually these metrics are represented as Boolean (Healthy/Unhealthy) or
> the status is stated in the metric name.
>
> The reviewers believe it's not recommended to have more than 2 possible
> values (Boolean).
> I see a few issues with this:
> 1. The metric is sent from different operators and it would be problematic
> to have a label to indicate the level of health in a consistent way.
> 2. I don't see an issue with querying Prometheus with more than 2 values.
> It might be more efficient than filtering with labels.
>
> I would appreciate your insights on this, considering that the metric is
> sent from multiple sources that are all developed separately.
>
> Thank you,
> Shirly Radco
>
> [1] I proposed a different prefix and the same suffix since I know there is
> an issue with sending the same metric name to Prometheus with a different
> help text.
> Since we can't enforce the help text to be exactly the same, the suffix
> should be enough to be able to display all the operators' health metrics in
> the same panel.
> Also, it would be easier to identify the origin of metrics that have an
> issue.
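
To make the "enum" pattern from the reply concrete, here is a minimal sketch in plain Python of what such a group of timeseries could look like in the text exposition format: one series per state, with the current state set to 1 and the others to 0. The metric name, the `state` label and the three state names are assumptions made up for illustration; they are not part of the proposal, and a client library's enum type would normally generate these series for you.

```python
# Hypothetical names throughout: the "state" label and the three state
# names are illustrative only and do not come from the proposal.
STATES = ("healthy", "degraded", "unhealthy")


def render_health_enum(metric_name, current_state, states=STATES):
    """Return exposition lines: 1 for the current state, 0 for all others."""
    if current_state not in states:
        raise ValueError(f"unknown state: {current_state!r}")
    lines = [f"# TYPE {metric_name} gauge"]
    for state in states:
        value = 1 if state == current_state else 0
        # Doubled braces emit literal { and } around the label set.
        lines.append(f'{metric_name}{{state="{state}"}} {value}')
    return lines


if __name__ == "__main__":
    # Example: an operator currently in the (assumed) "degraded" state.
    print("\n".join(render_health_enum("example_operator_health_status", "degraded")))
```

With this layout exactly one series in the group is 1 at any time, so a dashboard or alert can select on the `state` label rather than interpret a numeric code.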

