You will need to write a PromQL query which performs the same calculation
that "top" does to arrive at that value. I don't know what time period
top calculates it over, nor what scrape interval you are using for your
node_exporter metrics.

As I said before, if you want some queries to copy for any node_exporter
metrics, there are Grafana dashboards available. Just open them up and
copy the queries they make.

If you are scraping at 1 minute intervals, then something like this should
do the trick (the range has to cover at least two scrapes, hence [2m]):

avg by (instance) (rate(node_cpu_seconds_total{mode="steal"}[2m])) * 100

You can use 'sum' instead of 'avg', but then the percentages will add up
across CPUs (e.g. host has 8 CPUs => values will be out of 800%).

How this works:

- the metric node_cpu_seconds_total{mode="steal"} accumulates all the time
  that each CPU has spent in the "steal" state
- taking a rate(...) of this metric will tell you the fraction of time in
  this state, i.e. the number of seconds in "steal" state, per second of
  real time
- there will be separate values of this metric for each host (instance) and
  each cpu on that host
- avg by (instance) will group together all the metrics for each unique
  host, i.e. all CPUs on that host, and average them - giving one metric
  per host
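
If it helps, the query can be built up in stages to watch each step happen.
This is only a sketch: run each expression on its own in the Prometheus
expression browser, and change the [2m] window if your scrape interval is
not 1 minute.

```promql
# Stage 1: raw counters, one series per instance and cpu
# (ever-increasing number of seconds spent in "steal")
node_cpu_seconds_total{mode="steal"}

# Stage 2: per-CPU fraction of time in "steal" (0 to 1),
# averaged over the last 2 minutes
rate(node_cpu_seconds_total{mode="steal"}[2m])

# Stage 3: one series per host, expressed as a percentage
avg by (instance) (rate(node_cpu_seconds_total{mode="steal"}[2m])) * 100
```
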

The values for all CPU states *should* add up to 100%. In practice, they
don't add up exactly: see

sum by (instance,cpu)(rate(node_cpu_seconds_total[2m])) * 100

If this matters, you can make a more complex query to normalize the results.
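
For example, one possible way to normalize is to divide the steal rate by the
total rate over all modes on the same host. Treat this as a sketch rather
than a tested recipe; it assumes the same 1 minute scrape interval and [2m]
window as the query further up.

```promql
# Steal time as a share of all accounted CPU time on each host.
# Both sides are aggregated down to the "instance" label only,
# so the series match one-to-one in the division.
  sum by (instance) (rate(node_cpu_seconds_total{mode="steal"}[2m]))
/
  sum by (instance) (rate(node_cpu_seconds_total[2m]))
* 100
```

Because every mode would be divided by the same per-host total, percentages
computed this way add up to exactly 100% for each host.
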
On Thursday, 24 August 2023 at 08:03:11 UTC+1 Monica wrote:
> Hi Brian,
>
> Thank you for the update. The node_cpu_seconds_total metric is already
> present in the system. However, the 'st' value highlighted below is not
> being reflected, and I want to capture only this specific parameter.
> Please suggest.
>
> %Cpu(s): 1.5 us, 1.2 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, 0.0 si, *0.0 st*
>
> On Wednesday, August 23, 2023 at 7:14:40 PM UTC+5:30 Brian Candler wrote:
>
>> The CPU steal time is already available as a metric from node_exporter,
>> as node_cpu_seconds_total{instance="XXX",cpu="N",mode="steal"}
>>
>> Since this is an accumulated number of seconds, you'd use rate() to find
>> out how fast it is growing.
>>
>> If you want to view this in Grafana there are existing dashboards you can
>> use, e.g. https://grafana.com/grafana/dashboards/1860-node-exporter-full/
>>
>> On Wednesday, 23 August 2023 at 11:55:02 UTC+1 Monica wrote:
>>
>>> Hi All,
>>>
>>> I need to capture the 'st' parameter (which represents the time stolen
>>> from this virtual machine by the hypervisor) from the Linux 'top' command
>>> using Prometheus for monitoring purposes.
>>>
>>> %Cpu(s): 1.5 us, 1.2 sy, 0.0 ni, 97.3 id, 0.0 wa, 0.0 hi, 0.0 si, *0.0 st*
>>>
>>> Could anyone please suggest whether this is achievable using Prometheus?
>>> If so, could you also explain how?
>>>
>>