@Martin Just a ping about this issue: how did you identify which services were causing you trouble with too many metrics? I'm asking because I'm facing a similar problem at the moment.
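From re-reading your reply, I understand the approach to be roughly: scrape Prometheus itself, then watch a few of its own metrics while adding ServiceMonitors. The sketch below is what I have pieced together so far (these are the standard metric names exposed by recent Prometheus 2.x versions, but please correct me if this is not what you actually used):

    # prometheus.yml -- a job that scrapes the Prometheus instance itself
    scrape_configs:
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']

    # Dashboard / ad-hoc queries:
    rate(prometheus_tsdb_head_samples_appended_total[5m])   # samples appended per second
    process_resident_memory_bytes{job="prometheus"}         # Prometheus memory consumption
    topk(10, sum by (job) (scrape_samples_scraped))         # jobs exposing the most samples per scrape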
Thank you.

On Monday, 13 April 2020 at 06:53:13 UTC-3, Martin Man wrote:
> Hi Nishant,
>
> I'm also new to Prometheus and faced a similar scenario recently.
>
> What helped me was to add a job to monitor the Prometheus instance itself,
> then import a Prometheus 2.0 Grafana dashboard and watch Prometheus memory
> consumption and samples appended per second while defining new
> ServiceMonitors. In the end this helped me stabilise the memory usage as
> well as identify the services that generated way too many metrics and were
> responsible for the huge memory consumption.
>
> HTH,
> Martin
>
> > On 13 Apr 2020, at 11:03, Nishant Ketu <nishan...@atmecs.com> wrote:
> >
> > We have deployed Prometheus through helm, and after around 2 months of use
> > we get an OOM error and the pods fail to restart. We have to manually clean
> > up /data to get the pod running again. I have used the retention flag, but
> > it doesn't seem to work on the wal folder under /data. Any help with this
> > would be nice. Thanks
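On the retention question in the quoted thread: as far as I understand, the retention flags only govern the compacted blocks under /data, and the wal/ directory is truncated only when the head block is compacted (roughly every couple of hours), so setting retention will not shrink wal/ immediately. For reference, a sketch of the flags (names as of Prometheus 2.x; with the helm chart they would be set through the chart's values rather than passed directly):

    --storage.tsdb.retention.time=15d     # keep at most 15 days of blocks
    --storage.tsdb.retention.size=50GB    # and/or cap the total size of blocks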