Oops, the @ modifier has to be attached to the range selector, not to the 
function result:

    topk(3, max_over_time(foo[31d] @ 1725145200) )
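
For anyone wondering where 1725145200 comes from: it is 2024-08-31 23:00:00 UTC, i.e. midnight at the start of 1 September in UTC+1. Below is a rough Python sketch for building the same query for other months. The helper names (month_end_epoch, topk_month_query) and the fixed-offset handling are my own assumptions for illustration, not anything Prometheus provides.

```python
import calendar
from datetime import datetime, timedelta, timezone


def month_end_epoch(year, month, utc_offset_hours=0):
    """Unix time of 00:00 on the first day of the following month.

    utc_offset_hours is a fixed offset (assumption: no DST handling).
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    if month == 12:
        first_of_next = datetime(year + 1, 1, 1, tzinfo=tz)
    else:
        first_of_next = datetime(year, month + 1, 1, tzinfo=tz)
    return int(first_of_next.timestamp())


def topk_month_query(metric, year, month, k=3, utc_offset_hours=0):
    """Build a topk/max_over_time query covering one calendar month."""
    days = calendar.monthrange(year, month)[1]  # number of days in the month
    end = month_end_epoch(year, month, utc_offset_hours)
    return f"topk({k}, max_over_time({metric}[{days}d] @ {end}))"


# August and July 2024, with month boundaries taken in UTC+1
print(topk_month_query("foo", 2024, 8, utc_offset_hours=1))
print(topk_month_query("foo", 2024, 7, utc_offset_hours=1))
```

The first line printed should match the expression above; the second is the July equivalent, again untested against a real server.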

On Tuesday 3 September 2024 at 09:41:23 UTC+1 Brian Candler wrote:

> > If I run "max_over_time{}[1m:15s]" it will show me the peak of every 1m 
> evaluating every 15s sample. That's ok.
>
> That expression is almost certainly wrong; it is querying a metric called 
> "max_over_time" (which probably doesn't exist), rather than calling the 
> function max_over_time(...) on an expression.
>
> Again, why are you not using "max_over_time(foo[1m])" ??  The subquery 
> foo[1m:15s] is just causing more work.
>
> > second step would be to get the top3 max_over_time values of the 
> selected time_range. If I run "max_over_time{}[1m:15s]" for the last 3 days 
> I will get 3d x 24h x 60m "max_over_time" values.
>
> It sounds like you are trying to do stuff with Grafana, and I can't help 
> you with that. If you have an issue with Grafana, please take it to the 
> Grafana discussion community; this mailing list is for Prometheus only.
>
> > My goal was, and that is why I used "topk(3,)" (wrong), to get the top3 
> values of the last 24hrs evaluated by "max_over_time{}[1m:15s]". But 
> topk(3,) only shows me the top3 values at the same timestamp
>
> If you want the maximum values of each timeseries over the last 24 hours, 
> then you want max_over_time(foo[24h]) - try it in the PromQL web interface.
>
> This expression will return an instant vector. The values are timestamped 
> with the time the query was evaluated at, which is the end time of the 
> 24 hour window.  If you don't select an evaluation time, then the time is 
> "now" and the window is "now - 24h to now".
>
> However, the *values* returned will be the maximum value for each 
> timeseries over that 24 hour period.  And topk(3, max_over_time(foo[24h])) 
> will then give you the three timeseries which have the highest value of 
> that maximum.
>
> > third step would be to get the time of these top3 values calculated 
> earlier.
>
> As far as I know, Prometheus can't do that for you.  You'd have to use the 
> plain range vector query "foo[24h]" - pass it to the Prometheus HTTP API as 
> an instant query 
> <https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries>. 
> This will return all the data points in the 24 hour period, each with its 
> own raw timestamp. Then write your own code to identify the maximum 
> value(s) you are interested in, and pick the associated timestamps.
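
As a rough illustration of the "write your own code" step, here is a Python sketch that takes the already-parsed JSON response of such an instant query (resultType "matrix") and picks the highest samples together with their timestamps. The function name top_samples and the example data are made up for illustration; fetching the response from /api/v1/query is left out.

```python
def top_samples(response, n=3):
    """Return the n highest (value, timestamp, labels) samples across all series.

    `response` is the decoded JSON body of an instant query for a range
    vector such as foo[24h]; sample values arrive as strings.
    """
    samples = []
    for series in response["data"]["result"]:
        labels = series["metric"]
        for ts, raw_value in series["values"]:
            samples.append((float(raw_value), ts, labels))
    # Highest value first; on ties, the earliest timestamp wins.
    return sorted(samples, key=lambda s: (-s[0], s[1]))[:n]


# Made-up response in the shape the HTTP API returns for a range vector
example = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {"metric": {"__name__": "foo", "instance": "a"},
             "values": [[1725100000, "5"], [1725100015, "22"], [1725100030, "7"]]},
            {"metric": {"__name__": "foo", "instance": "b"},
             "values": [[1725100000, "22"], [1725100015, "3"]]},
        ],
    },
}

for value, ts, labels in top_samples(example):
    print(value, ts, labels["instance"])
```

Note that this treats every sample independently, so the same timeseries can appear more than once in the result, and how ties are broken is a choice you have to make yourself.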
>
> It would be an interesting feature request for max_over_time(...) to 
> return values timestamped with the time they actually occurred at, but it 
> would make max_over_time work differently to other range vector aggregation 
> functions 
> <https://prometheus.io/docs/prometheus/latest/querying/functions/#aggregation_over_time>.
>   
> And there are some edge cases to be ironed out, e.g. what happens if the 
> same maximum value occurs multiple times.
>
> > the fourth step would be to get the top3 values of the month of August 
> and the top3 values of the month of July.
>
> You can evaluate PromQL expressions at a given instant. There are two ways:
> - call the instant query API 
> <https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries> 
> and pass the "time" parameter
> - on more recent versions of Prometheus, send a PromQL query with the @ 
> modifier 
> <https://prometheus.io/docs/prometheus/latest/querying/basics/#modifier>.
>
> For example, for the maxima in August 2024, it would be something like 
> (untested):
>
>     topk(3, max_over_time(foo[31d]) @ 1725145200) 
>
>
> On Monday 2 September 2024 at 20:52:33 UTC+1 Alexander Wilke wrote:
>
>> Hello Brian,
>>
>> thanks for the clarification. I investigated the issue further and found 
>> that Grafana Dashboards is manipulating the data, and for that reason 
>> "max_over_time" for the last 1h showed the correct peak while 
>> max_over_time for the last 24hrs did not show that peak, because the 
>> number of data points was set too low. I increased it to 11,000 and then 
>> I was able to see the peak value again as expected. As "min Step" I set 
>> 15s in Grafana.
>>
>>
>> However this only solved the first part of my main problem. Now I can 
>> reliably query the peaks of a time range and do not miss the peak.
>> If I run "max_over_time{}[1m:15s]" it will show me the peak of every 1m 
>> evaluating every 15s sample. That's ok.
>>
>> second step would be to get the top3 max_over_time values of the selected 
>> time_range. If I run "max_over_time{}[1m:15s]" for the last 3 days I will 
>> get 3d x 24h x 60m "max_over_time" values. My goal was, and that is why I 
>> used "topk(3,)" (wrong), to get the top3 values of the last 24hrs evaluated 
>> by "max_over_time{}[1m:15s]". But topk(3,) only shows me the top3 values at 
>> the same timestamp
>>
>> third step would be to get the time of these top3 values calculated 
>> earlier.
>>
>> the fourth step would be to get the top3 values of the month of August 
>> and the top3 values of the month of July.
>>
>> Maybe you can push me in the right direction.
>>
>> My two Prometheus servers are versions 2.54.1 and 2.53.1
>>
>> On Saturday 31 August 2024 at 10:16:01 UTC+2 Brian Candler wrote:
>>
>>> Why are you doing a subquery there?  max_over_time(metric[1h]) should 
>>> give you the largest value at any time over that 1h period. The range 
>>> vector includes all the points in that time period, without resampling.
>>>
>>> A subquery could be used if you needed to take an instant vector 
>>> expression and turn it into a range vector by evaluating it at multiple 
>>> time instants, e.g.
>>>
>>> max_over_time( (metric > 10 < 100)[1h:15s] )
>>>
>>> But for a simple vector expression, the range vector is better than the 
>>> subquery as you get all the data points without resampling.
>>>
>>> You said before:
>>>
>>> > However first problem if the max value is e.g. 22 and it appears 
>>> several times within the timerange I see this displayed several times.
>>>
>>> That makes no sense. The result of max_over_time() is an *instant 
>>> vector*. By definition, it only has one value for each unique set of 
>>> labels. If you see multiple values of 22, then they are for separate 
>>> timeseries, and each will be identified by its unique sets of labels.
>>>
>>> That's what max_over_time does: it works on a range vector of 
>>> timeseries, and gives you the *single* maximum for *each* timeseries. If 
>>> you pass it a range vector with 10 timeseries, you will get an instant 
>>> vector with 10 timeseries.
>>>
>>> > I would ideally like to see the most recent ones
>>>
>>> That also makes no sense. For each timeseries, you will get the maximum 
>>> value of that timeseries across the whole time range, regardless of at what 
>>> time it occurred, and regardless of the values of any other timeseries.
>>>
>>> topk(3, ...) then just picks whichever three timeseries have the highest 
>>> maxima over the time period.
>>>
>>> > Why do I see a correct peak using
>>> >        max_over_time(metric{}[1h:15s])
>>> > 
>>> > but if I run this command the peak is lower than with the other 
>>> command before?
>>> >         max_over_time(metric{}[24h:15s])
>>>
>>> I'm not sure, but first, try comparing the range vector forms:
>>>
>>> max_over_time(metric{}[1h])
>>> max_over_time(metric{}[24h])
>>>
>>> If those work as expected, then there may be some issue with subqueries. 
>>> That can be drilled down into by looking at the raw data. Try these queries 
>>> in the PromQL browser, set to "table" rather than "graph" mode:
>>>
>>> metric{}[24h]
>>> metric{}[24h:15s]
>>>
>>> It will show the actual data that max_over_time() is working across. It 
>>> might be some issue around resampling of the data, but I can't think off 
>>> the top of my head what it could be.
>>>
>>> What version of Prometheus are you running? It could be a bug with 
>>> subqueries, which may or may not be fixed in later versions.
>>>
>>> Also, please remove Grafana from the equation. Enter your PromQL queries 
>>> directly into the PromQL browser in Prometheus.  There are lots of ways you 
>>> can misconfigure Grafana or otherwise confuse matters, e.g. by asking it to 
>>> sweep an instant vector query over a time range to form a graph.
>>>
>>

