The issue started again. 

629G    chunks_head
0       lock
4.0K    queries.active
9.3G    wal
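
In case it helps anyone tracking the same thing, here is a minimal sketch of how these sizes could be sampled from a script instead of running du by hand. The data directory path is an assumption (adjust it to your own --storage.tsdb.path), and the helper names dir_size_bytes and human are made up for this example. It sums apparent file sizes, so the numbers can differ slightly from du, which reports disk blocks.

```python
import os


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                # Skip symlinks; ignore files that vanish while we walk,
                # since segment files can be deleted during the scan.
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass
    return total


def human(n):
    """Format a byte count with binary units, roughly like du -h."""
    for unit in ("B", "K", "M", "G", "T"):
        if n < 1024 or unit == "T":
            return f"{n:.1f}{unit}"
        n /= 1024


if __name__ == "__main__":
    # Assumption: hypothetical data directory, not taken from my setup.
    data_dir = "/var/lib/prometheus/data"
    for sub in ("chunks_head", "wal"):
        print(sub, human(dir_size_bytes(os.path.join(data_dir, sub))))
```

Run from cron or a similar scheduler, this would show when chunks_head starts to grow rather than only after it has reached hundreds of GB.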

There are numerous restarts of Prometheus:
Feb 17 09:02:02 kernel: Out of memory: Kill process 36580 (prometheus) score 844 or sacrifice child
Feb 17 09:08:36 kernel: Out of memory: Kill process 39001 (prometheus) score 846 or sacrifice child
Feb 17 09:16:02 kernel: Out of memory: Kill process 41074 (prometheus) score 845 or sacrifice child
Feb 17 09:22:17 kernel: Out of memory: Kill process 44665 (prometheus) score 844 or sacrifice child
Feb 17 09:29:25 kernel: Out of memory: Kill process 47234 (prometheus) score 844 or sacrifice child
Feb 17 09:36:06 kernel: Out of memory: Kill process 48970 (prometheus) score 846 or sacrifice child
Feb 17 09:43:21 kernel: Out of memory: Kill process 50661 (prometheus) score 844 or sacrifice child
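
To make the restart pattern easier to see, here is a rough sketch that pulls the OOM kill events out of kernel log lines like the ones above and computes the time between consecutive kills. The function names (oom_kills, gaps_seconds) are made up for this example, the year is an assumption because syslog timestamps do not include one, and the month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Matches the start of a kernel OOM-killer line in the format shown above.
OOM_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}).*"
    r"Out of memory: Kill process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def oom_kills(lines, year=2022):
    """Return (timestamp, pid, process name) for each OOM kill line.

    Assumption: year defaults to 2022 because the log lines carry no year.
    Lines that do not match (e.g. wrapped "score ..." continuations) are skipped.
    """
    events = []
    for line in lines:
        m = OOM_RE.search(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            events.append((ts, int(m.group("pid")), m.group("name")))
    return events


def gaps_seconds(events):
    """Seconds elapsed between each pair of consecutive kill events."""
    times = [e[0] for e in events]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    sample = [
        "Feb 17 09:02:02 kernel: Out of memory: Kill process 36580 (prometheus) ",
        "score 844 or sacrifice child",
        "Feb 17 09:08:36 kernel: Out of memory: Kill process 39001 (prometheus) ",
        "score 846 or sacrifice child",
    ]
    print(gaps_seconds(oom_kills(sample)))
```

On the seven entries above, the kills are roughly six to seven and a half minutes apart, each with a new PID, which looks like a steady restart loop.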

But there is plenty of memory available on the servers:

              total        used        free      shared  buff/cache   available
Mem:             47           5          31           0          10          40
Swap:             5           1           3
Total:           52           7          35

On Tuesday, February 1, 2022 at 5:21:32 PM UTC-5 Brian Candler wrote:

> On Tuesday, 1 February 2022 at 21:52:30 UTC Senthil wrote:
>
>> I started on Jan 31, so it's a day.
>>
>> # du -sck chunks_head/*
>> 54140   chunks_head/024326
>> 4       chunks_head/024327
>> 54144   total
>>
>
> That's perfectly reasonable: it's only 54MB (which is a long way from 
> 689GB!)
>
> Here's what I see on a moderately busy system:
>
> root@ldex-prometheus:~# du -sck /var/lib/prometheus/data/chunks_head/*
> 81004        /var/lib/prometheus/data/chunks_head/006831
> 77824        /var/lib/prometheus/data/chunks_head/006832
> 158828        total
>
> That's comparable to yours.
>
> Therefore, I think you need to keep an eye on this periodically.  If only 
> you had a monitoring system which could do this for you :-)
>
> If it does start to rise, that's when you'll need to check prometheus log 
> output and find out what's happening.  But this is very strange, and it 
> does seem to be something specific to your system.
>

