[prometheus-users] Re: Need to capture metrics using Prometheus

Brian Candler Thu, 24 Aug 2023 01:54:25 -0700

You will need to make a PromQL query which performs the same calculation 
that "top" is doing to calculate that value.  I don't know what time period 
top calculates that over, nor what scrape interval you are using for your 
node_exporter metrics.


As I said before, if you want some queries to copy for any node_exporter 
variables, there are Grafana dashboards available. Just open them up and 
copy the queries they are making.

If you are scraping at 1 minute intervals, then something like this should 
do the trick:
avg by (instance) (rate(node_cpu_seconds_total{mode="steal"}[2m])) * 100

You can use 'sum' instead of 'avg', but then the percentages will reflect 
multiple CPUs (e.g. host has 8 CPUs => values will be out of 800%)

How this works:

- the metric node_cpu_seconds_total{mode="steal"} accumulates all the time 
that each CPU has spent in the "steal" state

- taking a rate(...) of this metric will tell you the fraction of time in 
this state, i.e. the number of seconds in "steal" state, per second of real 
time

- there will be separate values of this metric for each host (instance) and 
each cpu on that host

- avg by (instance) will group together all the metrics for each unique 
host, i.e. all CPUs on that host, and average them - giving one metric per 
host
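
To make the steps above concrete, here is a small Python sketch of the same arithmetic. It is only an illustration: the sample values, the host names and the helper functions are invented, and the real rate() also handles counter resets and extrapolation, which this ignores.

```python
# Hypothetical counter samples for node_cpu_seconds_total{mode="steal"},
# taken 120 s apart (the [2m] window). All numbers are invented.
WINDOW_SECONDS = 120.0

# (instance, cpu) -> (counter at window start, counter at window end)
samples = {
    ("host-a", "0"): (100.0, 101.2),
    ("host-a", "1"): (200.0, 203.6),
    ("host-b", "0"): (50.0, 50.0),
}

def per_cpu_rate(samples, window):
    """Seconds spent in steal per second of real time, per (instance, cpu)."""
    return {key: (end - start) / window for key, (start, end) in samples.items()}

def group_by_instance(rates):
    """Collect the per-CPU rates of each instance into a list."""
    grouped = {}
    for (instance, _cpu), value in rates.items():
        grouped.setdefault(instance, []).append(value)
    return grouped

rates = per_cpu_rate(samples, WINDOW_SECONDS)
grouped = group_by_instance(rates)

# Simplified analogue of avg by (instance) (...) * 100: one value per host, 0..100
avg_percent = {inst: sum(vals) / len(vals) * 100 for inst, vals in grouped.items()}

# Simplified analogue of sum by (instance) (...) * 100: scales with the CPU count
sum_percent = {inst: sum(vals) * 100 for inst, vals in grouped.items()}

for inst in sorted(avg_percent):
    print(f"{inst}: avg={avg_percent[inst]:.2f}% sum={sum_percent[inst]:.2f}%")
```

With these made-up samples, host-a averages about 2% steal across its two CPUs, while the 'sum' variant reports about 4% because it adds the two CPUs together.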

The values for all CPU states *should* add up to 100%.  In practice they 
don't add up exactly, as this query shows:
sum by (instance,cpu)(rate(node_cpu_seconds_total[2m])) * 100

If this matters, you can make a more complex query to normalize the results.
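
One possible way to normalize, sketched in Python rather than PromQL, is to divide each mode's rate by the total across all modes for that CPU. This is my assumption about what normalizing would mean here, and the per-mode rates below are invented for illustration.

```python
# Hypothetical per-mode rates for a single CPU, in seconds per second.
# They sum to 0.98 rather than 1.0, mimicking the small shortfall seen in practice.
mode_rates = {"user": 0.015, "system": 0.012, "idle": 0.950, "steal": 0.003}

def normalize_to_percent(mode_rates):
    """Rescale the rates so that all modes together add up to 100%."""
    total = sum(mode_rates.values())
    if total == 0:
        # No CPU time recorded in the window: avoid dividing by zero
        return {mode: 0.0 for mode in mode_rates}
    return {mode: rate / total * 100 for mode, rate in mode_rates.items()}

normalized = normalize_to_percent(mode_rates)

for mode in sorted(normalized):
    print(f"{mode}: {normalized[mode]:.3f}%")
```

The same idea in PromQL would be to divide the steal rate by the sum of the rates of all modes for the same instance and cpu.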

On Thursday, 24 August 2023 at 08:03:11 UTC+1 Monica wrote:

> Hi Brian,
>
> Thank you for the update. The node_cpu_seconds_total metric is already 
> present in the system. However, the 'st' value highlighted below is not 
> being reflected, and I want to capture only this specific parameter. 
> Please suggest.
>
> %Cpu(s):  1.5 us,  1.2 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  0.0 si, *0.0 st*
>
> On Wednesday, August 23, 2023 at 7:14:40 PM UTC+5:30 Brian Candler wrote:
>
>> The CPU steal time is already available as a metric from node_exporter, 
>> as node_cpu_seconds_total{instance="XXX",cpu="N",mode="steal"}
>>
>> Since this is an accumulated number of seconds, you'd use rate() to find 
>> out how fast it is growing.
>>
>> If you want to view this in Grafana there are existing dashboards you can 
>> use, e.g. https://grafana.com/grafana/dashboards/1860-node-exporter-full/
>>
>> On Wednesday, 23 August 2023 at 11:55:02 UTC+1 Monica wrote:
>>
>>> Hi All,
>>>
>>> I need to capture the 'st' parameter (which represents the time stolen 
>>> from this virtual machine by the hypervisor) from the Linux 'top' command 
>>> using Prometheus for monitoring purposes.
>>>
>>> %Cpu(s):  1.5 us,  1.2 sy,  0.0 ni, 97.3 id,  0.0 wa,  0.0 hi,  0.0 si, *0.0 st*
>>>
>>> Could anyone please suggest whether this is achievable using Prometheus? 
>>> If so, could you also explain how?
>>>
>>
