Hey.

On Sat, 2024-07-20 at 10:26 -0700, 'Brian Candler' via Prometheus Users
wrote:
> 
> If the label stays constant, then the amount of extra space required
> is tiny.  There is an internal mapping between a bag of labels and a
> timeseries ID.

Is it the same if one uses a metric (like the RPM value from below)
whose value never changes? I mean, is that equally efficient?


> But if any label changes, that generates a completely new timeseries.
> This is not something you want to happen too often (a.k.a "timeseries
> churn"), but moderate amounts are OK.

Why exactly wouldn't one want this? I mean especially with respect to
such _info metrics.

Graphing _info time series doesn't make sense anyway... so it's not
as if one would get some usable time series/graph (like a temperature)
interrupted if, e.g., the state changes for a while from OK to
degraded.
It's of course clear why one wouldn't want that for "normal" time
series.


> For example, if a drive changes from "OK" to "degraded" then that
> would be reasonable, but putting the drive temperature in a label
> would not.

The latter is clear anyway. :-)


> Some people prefer to enumerate a separate timeseries per state,
> which can make certain types of query easier since you don't have to
> worry about staleness or timeseries appearing and disappearing. e.g.
> 
> foo{state="OK"} 1
> foo{state="degraded"} 0
> foo{state="absent"} 0

I guess by appearing/disappearing you mean that one has to take into
account that e.g. pd_info{state="OK",pd_name="foo"} won't exist while
"foo" is failed... and thus, e.g. when graphing the OK-times of a
device, it would by default show nothing during that time rather than
a value of zero?
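
(For the graphing case I could probably fill those gaps with something
like

  count by (pd_name) (pd_info{state="OK"})
    or
  count by (pd_name) (pd_info) * 0

i.e. 1 for every PD that is currently OK and 0 for every PD that still
reports any state at all... just a sketch, assuming pd_info carries
exactly the pd_name and state labels as above.)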

But other than that... are there any bigger consequences?
I mean, I wonder which kinds of queries this would cause trouble for,
because for simply counting time series with some special value (like
state="OK") there should still always be exactly one series per (e.g.)
PD (respectively per combination of the "primary key" labels).

What are the differences with respect to staleness?


> It's much easier to alert on foo{state="OK"}  == 0, than on the
> absence of timeseries foo{state="OK"}.

Why? Can't I just do something like
  count( foo{state!="OK"} ) > 0
?
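
(Though I guess count() would drop the identifying labels from the
alert... so perhaps rather per drive:

  foo{state!="OK"} == 1

And if a PD's series vanished entirely (drive removed), neither query
would fire... which I suppose is exactly the absence problem you mean.)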

> However, as you observe, you need to know in advance what all the
> possible states are.
> 
> The other option, if the state values are integer enumerations at
> source (e.g. as from SNMP), is to store the raw numeric value:
> 
> foo 3
> 
> That means the querier has to know the meaning of these values.
> (Grafana can map specific values to textual labels and/or colours
> though).

But that would also require me to use a label like in
enum_metric{value="3"}, or I'd have to construct metric names
dynamically (which I could also have done for the symbolic names),
which however seems discouraged (and I'd say for good reasons)?

Anyway, in my case there's no numeric value ;-)


> > 2) Metrics like:
> > - smartraid_physical_drive_size_bytes
> > - smartraid_physical_drive_rotational_speed_rpm
> > can in principle not change (again: unless the PD is replaced with
> > another one of the same name).
> >  
> > So should they rather be labels, despite being numbers?
> >  
> > OTOH, labels like:
> > - smartraid_logical_drive_size_bytes
> > - smartraid_logical_drive_chunk_size_bytes
> > - smartraid_logical_drive_data_stripe_size_bytes
> > *can* in principle change (namely if the RAID is converted).
> 
> IMO those are all fine as labels. Creation or modification of a
> logical volume is a rare thing to do, and arguably changing such
> fundamental parameters is making a new logical volume.

I'm still unsure what to do ;-)

I mean if both label and metric are equally efficient (in terms of
storage)... then using a metric would still have the advantage of
being able to do things like:
   smartraid_logical_drive_chunk_size_bytes > (256*1024)
i.e. select those LDs that use a chunk size > 256 KiB... which I
cannot (as easily) do if it's in a label.


> If you ever wanted to do *arithmetic* on those values - like divide
> the physical drive size by the sum of logical drive sizes - then
> you'd want them as metrics.

Ah... here you go ^^
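
(So e.g. something like

  sum by (instance) (smartraid_logical_drive_size_bytes)
    /
  sum by (instance) (smartraid_physical_drive_size_bytes)

for the allocated fraction per host... just a sketch, assuming both
metrics share an "instance" label.)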

> Also, filtering on labels can be awkward (e.g. "show me all drives
> with speed greater than 7200rpm" requires a bit of regexp magic,
> although "show me all drives with speed not 7200rpm" is easy).
> 
> But I don't think those are common use cases.  Rather it's just about
> collecting secondary identifying information and characteristics.

Well... yes and no.

What I would actually like is to use my _info data for selecting
groups (not really sure whether that's possible the way I want it):
taking, for example, some metric from node_exporter about device IO
rates, I'd like to select all those devices that have e.g. a chunk
size of 1 MiB in one graph, and those with, say, 256 KiB in another.

I had hoped one could do that with some "… and on (…) …" magic.
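
Something along these lines, perhaps (label names hypothetical: this
assumes the _info metric carries the chunk size as a label and shares
a "device" label with node_exporter):

  rate(node_disk_written_bytes_total[5m])
    and on (device)
  smartraid_logical_drive_info{chunk_size_bytes="1048576"}

i.e. the IO rate restricted to those devices whose LD uses a 1 MiB
chunk size... and the same with "262144" for the 256 KiB graph.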



> 
> > I went now for the approach to have a dedicated metric for those
> > where there's a dedicated property in the RAID tool output, like:
> > - smartraid_controller_temperature_celsius
> 
> Yes: something that's continuously variable (and likely to vary
> frequently), and/or that you might want to draw a graph of or alert
> on, is definitely its own metric value, not a label.

My point here was rather:

Should I have made e.g. only one
  smartraid_temperature{type="bla"}
metric (or perhaps with a bit more than just "type"), with "bla" being
e.g. controller, capacitor, cache_module or "sensor"?

I.e. putting all temperatures into one metric, rather than the 4
different ones I have now (which have no "type" label).
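
Roughly like this, I mean (values made up, keeping the _celsius
suffix):

  smartraid_temperature_celsius{type="controller"} 45
  smartraid_temperature_celsius{type="capacitor"} 31
  smartraid_temperature_celsius{type="cache_module"} 38
  smartraid_temperature_celsius{type="sensor"} 29

One nice side effect would be that a single expression like

  max(smartraid_temperature_celsius) > 70

could cover all of them at once.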


Thanks,
Chris.
