Thank you, Brian, for your help with this. I really appreciate it.

Shirly

On Sunday, March 26, 2023 at 5:44:25 PM UTC+3 Brian Candler wrote:
>
> There is a disagreement about this, since there are no examples that I
> could find that have a third optional value.
>
> The closest example I can think of is Nagios plugins (0=ok, 1=warning,
> 2=critical, 3=unknown). See nrpe_exporter:
> https://github.com/canonical/nrpe_exporter
> https://www.robustperception.io/nagios-nrpe-prometheus-exporter
>
> And, I guess things like ifOperStatus from snmp_exporter.
>
> I'd say having a 0/1/2 status isn't necessarily "wrong", and in Grafana
> you can map these numbers to strings and/or colours.
>
> However, you're also right to say this isn't normal recommended practice.
> Typically you'd have a set of timeseries and set one to 1 and the others
> to 0. Client libraries tend to call this group of metrics an "enum", e.g.
> https://github.com/prometheus/client_python#enum
>
> I wouldn't worry about efficiency. Prometheus timeseries are very cheap,
> especially when the metric values are mostly constant.
>
> On Sunday, 26 March 2023 at 07:24:53 UTC+1 Shirly Radco wrote:
>
>> Hi,
>>
>> *Short summary:*
>> Can we have a metric that reports 3 values (0/1/2), to indicate status
>> instead of using labels or adding the status to the metric name?
>>
>> *Full story:*
>> I'm working on creating a general recommendation for reporting a
>> Kubernetes operator health metric.
>>
>> The full proposal is here:
>> https://github.com/operator-framework/operator-sdk/pull/6315/files
>>
>> I proposed recommending that operators add a new health metric that would
>> have the following naming:
>> *<operator-name-prefix>_operator_health_status* [1]
>>
>> I proposed that the values of this metric would indicate the health
>> status:
>> * `0` - Indicates that the operator is healthy and working as expected.
>> * `1` - Indicates that the operator has some issues that need to be
>> addressed and can potentially lead to loss of functionality.
>> * `2` - Indicates that the operator is unhealthy and there is a loss of
>> functionality that should be addressed.
>>
>> There is a disagreement about this, since there are no examples that I
>> could find that have a third optional value.
>> Usually these metrics are represented as Boolean (Healthy/Unhealthy) or
>> the status is stated in the metric name.
>>
>> The reviewers believe it's not recommended to have more than 2 possible
>> values (Boolean).
>> I see a few issues with this:
>> 1. The metric is sent from different operators and it would be
>> problematic to have a label to indicate the level of health in a consistent
>> way.
>> 2. I don't see an issue with querying Prometheus with more than 2 values.
>> It might be more efficient than filtering with labels.
>>
>> I would appreciate your insights on this, considering that the metric is
>> sent from multiple sources that are all developed separately.
>>
>> Thank you,
>> Shirly Radco
>>
>> [1] I proposed a different prefix and the same suffix since I know there is
>> an issue with sending the same metric name to Prometheus with a different
>> help text.
>> Since we can't enforce the help text to be exactly the same, the suffix
>> should be enough to be able to display all the operators' health metrics in
>> the same panel.
>> Also, it would be easier to identify the origin of metrics that have an
>> issue.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/db7fac2a-c58c-42eb-9797-b114e635d3b1n%40googlegroups.com.
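
As a rough illustration of the "enum" layout described in Brian's reply above (one series per state, with the current state set to 1 and the others to 0), here is a minimal sketch in plain Python. It does not use client_python; it only renders text in the Prometheus exposition format. The `state` label, the state names `healthy`/`degraded`/`unhealthy`, the `example` prefix and the function name `render_health_enum` are all assumptions made for illustration and do not come from the thread or the proposal.

```python
# Hypothetical state names: the thread only defines the meaning of 0, 1 and 2.
STATES = ("healthy", "degraded", "unhealthy")


def render_health_enum(metric_name, current_state, states=STATES):
    """Return exposition-format lines: one series per state, 1 for the current one."""
    if current_state not in states:
        raise ValueError(f"unknown state: {current_state!r}")
    lines = [f"# TYPE {metric_name} gauge"]
    for state in states:
        # Exactly one series carries 1; every other state is reported as 0.
        value = 1 if state == current_state else 0
        lines.append(f'{metric_name}{{state="{state}"}} {value}')
    return lines


if __name__ == "__main__":
    # "example" stands in for <operator-name-prefix>.
    print("\n".join(render_health_enum("example_operator_health_status", "degraded")))
```

With this layout, a query along the lines of `example_operator_health_status{state="unhealthy"} == 1` would select operators currently reporting the unhealthy state, instead of comparing a single gauge against 0/1/2.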

