[prometheus-users] Re: storage.tsdb.max-block-duration to a lower value completely stops compaction

Sukhada Sankpal Thu, 25 Jan 2024 11:50:27 -0800

Thanks Brian.
I have enclosed a screenshot of the TSDB head stats.
I have set GOGC to 60%, based on the recommendation by Bryan Boreham for this 
setup.
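
As a rough illustration of the rule of thumb quoted further down in this 
thread (by default the heap may roughly double before a GC cycle starts), 
the sketch below estimates the heap size that triggers the next collection. 
It is a simplification, not the Go runtime's exact formula: it ignores GC 
roots and GOMEMLIMIT, the function name is made up for illustration, and the 
50 GB figure is the steady-state number mentioned in the thread.

```python
def heap_target_gb(live_heap_gb: float, gogc_percent: float) -> float:
    """Approximate heap size (GB) at which the next GC cycle starts.

    Assumption: target = live heap * (1 + GOGC/100). This ignores GC roots
    (stacks, globals) and any GOMEMLIMIT setting.
    """
    return live_heap_gb * (1 + gogc_percent / 100)


LIVE_HEAP_GB = 50  # steady-state figure used as an example in this thread

for gogc in (100, 60):
    target = heap_target_gb(LIVE_HEAP_GB, gogc)
    print(f"GOGC={gogc}: next GC at roughly {target:.0f} GB")
```

Under these assumptions, GOGC=100 lets a 50 GB live heap grow to about 
100 GB, while GOGC=60 caps it at about 80 GB.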


However, what exactly does storage.tsdb.max-block-duration do? Let's say my 
data retention is 30 days, so this parameter defaults to 3 days. Does that 
mean that every 3 days compaction will be triggered for all 30 days of data?
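
A minimal sketch of how the compaction levels appear to be derived from the 
two block-duration flags. It is based on a reading of the Prometheus TSDB 
source, where `ExponentialBlockRanges` starts at the minimum block duration 
and multiplies by 3 for up to 10 steps, and ranges larger than the maximum 
block duration are dropped. The step count, the factor, and the Python 
function names here are assumptions for illustration; check them against 
the version you run. The third case also assumes min-block-duration stayed 
at 30m during the 1d test, which the thread does not state.

```python
def exponential_block_ranges(min_size, steps, step_size):
    """Return `steps` block ranges, each `step_size` times the previous one."""
    ranges = []
    current = min_size
    for _ in range(steps):
        ranges.append(current)
        current *= step_size
    return ranges


def compaction_ranges(min_block_min, max_block_min, steps=10, step_size=3):
    """Block ranges (in minutes) that survive the max-block-duration cap."""
    candidates = exponential_block_ranges(min_block_min, steps, step_size)
    return [r for r in candidates if r <= max_block_min]


DAY = 24 * 60  # minutes

for label, lo, hi in [
    ("2h min, 3d max (defaults at 30d retention)", 120, 3 * DAY),
    ("30m min, 1h max", 30, 60),
    ("30m min, 1d max", 30, DAY),
]:
    levels = compaction_ranges(lo, hi)
    note = "" if len(levels) > 1 else "  <- only one level, nothing to compact into"
    print(f"{label}: {levels}{note}")
```

If this reading is right, the maximum block duration caps the time span of 
any single compacted block rather than scheduling a pass over the whole 
retention period, and the 30m/1h combination leaves a single level, which 
would be consistent with the no-compaction state described in the thread.
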
On Wednesday, January 24, 2024 at 11:15:09 PM UTC-8 Brian Candler wrote:

> Since regular blocks are 2h, setting the maximum size of compacted blocks to 
> 1h sounds unlikely to work, so testing with 1d seems reasonable.
>
> Can you provide more details about the scale of your environment, in 
> particular the "head stats" from Status > TSDB Stats in the Prometheus web 
> interface?
>
> However, I think what you're seeing could be simply an artefact of how 
> Go's garbage collection works, and you can make it more aggressive by 
> tuning GOGC and/or GOMEMLIMIT. See
> https://tip.golang.org/doc/gc-guide#GOGC
> for more details.
>
> Roughly speaking, the default garbage collector behaviour in Go is to 
> allow memory usage to expand to double the current usage, before triggering 
> a garbage collector cycle. So if the steady-state heap is 50GB, it would be 
> normal for it to grow to 100GB if you don't tune it.
>
> If this is the case, setting smaller compacted blocks is unlikely to make 
> any difference to memory usage - and it could degrade query performance.
>
> On Wednesday 24 January 2024 at 21:45:50 UTC Sukhada Sankpal wrote:
>
>> Background on why I wanted to experiment with this parameter:
>> I am testing with the LTS version, 2.45.2.
>> During compaction, i.e. every 3 days, the resident memory of Prometheus 
>> spikes to a very high value. For example, the average of 
>> process_resident_memory_bytes is around 50 GB, but at the time of compaction 
>> it spikes to 120 to 160 GB. Given the usual 50 GB, I would like to allocate 
>> around 128 GB of memory to the host. But with the memory spike during 
>> compaction this does not seem workable, and keeping the allocation low may 
>> lead to OOM during compaction. Sizing hosts for the spike also adds to the 
>> cost of cloud-based VMs.
>> On Wednesday, January 24, 2024 at 1:35:16 PM UTC-8 Sukhada Sankpal wrote:
>>
>>> The default value of storage.tsdb.max-block-duration is 10% of the 
>>> retention time. I am currently using a setup with 30 days of retention, so 
>>> this flag defaults to 3 days.
>>> Based on suggestions posted here: 
>>> https://github.com/prometheus/prometheus/issues/6934#issuecomment-1610921555
>>> I changed storage.tsdb.min-block-duration to 30m and 
>>> storage.tsdb.max-block-duration to 1h. This resulted in a no-compaction 
>>> state, and local storage usage increased quickly.
>>>
>>> To re-enable compaction and have a safe test, I changed 
>>> storage.tsdb.max-block-duration to 1 day.
>>>
>>> I would like some guidance on what a safe lower value for this parameter 
>>> is, and on whether keeping it low helps with the increased memory usage.
>>>
>>

