Hey Brian

On Tue, 2023-02-28 at 00:27 -0800, Brian Candler wrote:
> I can offer a couple more options:
>
> (1) Use two servers with federation.
> - server 1 does the scraping and keeps the detailed data for 2 weeks
> - server 2 scrapes server 1 at lower interval, using the federation
>   endpoint
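For my own notes, I guess server 2's scrape config for option (1) would look roughly like this (just a sketch from the docs; the job name, interval and match[] selector are my guesses):

```yaml
# Hypothetical scrape config on server 2 (the long-retention server).
# It pulls already-scraped series from server 1's /federate endpoint
# at a lower resolution (5m instead of 15s).
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 5m
    honor_labels: true          # keep the original instance/job labels
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="node"}'        # which series to pull over; a guess
    static_configs:
      - targets: ['server1:9090']
```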
I had thought about that as well. Though it feels a bit "ugly".

> (2) Use recording rules to generate lower-resolution copies of the
> primary timeseries - but then you'd still have to remote-write them
> to a second server to get the longer retention, since this can't be
> set at timeseries level.

I had (very briefly) read about the recording rules (merely that they
exist ^^) ... but wouldn't these give me a new name for the metric?

If so, I'd need to adapt e.g.
https://grafana.com/grafana/dashboards/1860-node-exporter-full/
to use the metrics generated by the recording rules, ... which again
seems like quite some maintenance effort.

Plus, as you wrote below, AFAIU I'd need users to use different
dashboards: one where the detailed data is used, one where the
downsampled data is used.

Sure, that would work as a workaround, but it's of course not really a
good solution, as one would rather want to "seamlessly" move from the
detailed to the less-detailed data.

> Either case makes the querying more awkward. If you don't want
> separate dashboards for near-term and long-term data, then it might
> work to stick promxy in front of them.

Which would however make the setup more complex again.

> Apart from saving disk space (and disks are really, really cheap
> these days), I suspect the main benefit you're looking for is to get
> faster queries when running over long time periods. Indeed, I
> believe Thanos creates downsampled timeseries for exactly this
> reason, whilst still continuing to retain all the full-resolution
> data as well.

I guess I may have to look into that, and how complex its setup would be.

> That depends. What PromQL query does your graph use? How many
> timeseries does it touch? What's your scrape interval?

So far I've just been playing with the ones from:
https://grafana.com/grafana/dashboards/1860-node-exporter-full/
So: all the queries in that dashboard, and all the time series they use.
The scrape interval is 15s.

> Is your VM backed by SSDs?
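(As an aside, coming back to option (2): from my brief reading, such a recording rule would look roughly like this. The `level:metric:operations` naming convention is what the docs suggest; the concrete metric is just my example. The new `record:` name is exactly why the dashboard queries would have to be adapted.)

```yaml
# Hypothetical recording rule: a 5m-averaged copy of a node_exporter
# series, evaluated at a lower resolution than the raw 15s scrapes.
groups:
  - name: downsampling
    interval: 5m
    rules:
      - record: instance:node_cpu_utilisation:rate5m
        expr: |
          1 - avg by (instance) (
            rate(node_cpu_seconds_total{mode="idle"}[5m])
          )
```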
I think it's a Ceph cluster that the supercomputing centre uses for
that, but I have no idea what that runs on. Probably HDDs.

> Another suggestion: running netdata within the VM will give you
> performance metrics at 1 second intervals, which can help identify
> what's happening during those 10-15 seconds: e.g. are you
> bottlenecked on CPU, or disk I/O, or something else.

Good idea, thanks.

Thanks,
Chris.

-- 
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/e35d617dbaab44de43da049414103ff1e9102e61.camel%40gmail.com.

