[prometheus-users] Re: Historical Data

2022-08-24 Thread Adso Castro
Oh, another person of culture I see. Nice to see a fellow VM user :) We're running the cluster version of VM too, with 6-month retention by the way. This is a PoC we're running to understand if we can put those historical series to some use. Thanks Brian! On Wednesday, August 24, 2022 at 12:

[prometheus-users] Historical Data

2022-08-24 Thread Adso Castro
Hey, What's your opinion about sending Prometheus metrics to a Data Lake or something similar? My team has a PoC about it and I'm trying to find out whether this is feasible or a pure waste of time. The goal is to serve historical data to other teams, basically. By the way, I'm not talking about a prometh

[prometheus-users] AlertManager Dropping Messages

2021-03-16 Thread Adso Castro
Hey, one of my 2 alertmanager pods is having this issue: level=warn ts=2021-03-15T13:16:47.553Z caller=delegate.go:272 component=cluster msg="dropping messages because too many are queued" current=4112 limit=4096 After doing some research, it looks like there is no way for me to fix that by incre

[prometheus-users] Relabel configuration for replace action requires target_label

2021-01-04 Thread Adso Castro
Hi all, My test Prometheus Operator stack is having a bad time: the prometheus pods are crashing nonstop, error below: {"caller":"main.go:289","err":"parsing YAML file /etc/prometheus/config_out/prometheus.env.yaml: relabel configuration for replace action requires 'target_label' value"

[prometheus-users] [Thanos] Not deduplicating metrics?

2020-07-24 Thread Adso Castro
Hi guys, I have an issue with Thanos that I would like to drop here first before taking it somewhere else. I’m running prometheus-operator inside my k8s cluster along with Thanos. All good, working fine, until I noticed some metrics are doubled in Grafana, like I query a random http metric for t

Re: [prometheus-users]

2020-07-20 Thread Adso Castro
Hey, Well, you can search for guides on YouTube, there are plenty of them; just search "Kubernetes + Prometheus" to learn the basics of AlertManager and Prometheus Rules as well. Also, don't forget the official docs: AlertManager: https://prometheus.io/docs/alerting/latest/alertmanager/

[prometheus-users] Heavy Queries and Grafana Dashboards

2020-07-20 Thread Adso Castro
Hey all, Does anyone know how I can identify heavy queries and heavy (Grafana) dashboards so I can create recording rules to take that load off my Prometheus stack? Maybe a query or something else? Couldn't find anything close to it. Thank you. -- You received this message because you are subscribed

Re: [prometheus-users] OOM error for Prometheus

2020-07-16 Thread Adso Castro
@Martin Just a ping about this issue, how did you identify which services were causing you trouble with too many metrics? I'm asking because I'm facing a similar problem at the moment. Thank you. On Monday, April 13, 2020 at 06:53:13 UTC-3, Martin Man wrote: > Hi Nishant, > >

Re: [prometheus-users] OOM error for Prometheus

2020-07-16 Thread Adso Castro
@Martin Just a ping about this issue, how did you identify which services were causing you trouble with too many metrics? I'm asking because I'm facing a similar problem at the moment. Thank you. On Monday, April 13, 2020 at 06:53:13 UTC-3, Martin Man wrote: > Hi Nishant, > >

Re: [prometheus-users] Is there any limits for prometheus monitoring?

2020-07-15 Thread Adso Castro
rator/pull/3241). On Wednesday, July 15, 2020 17:48:35 UTC-3, Stuart Clark wrote: > > On 15/07/2020 21:40, Adso Castro wrote: > > Seconded. Clark's right. My current scenario is kinda the same (pod > > memory keep floating between 12~16GB) and I'm currentl

Re: [prometheus-users] Is there any limits for prometheus monitoring?

2020-07-15 Thread Adso Castro
Seconded. Clark's right. My current scenario is kinda the same (pod memory keeps fluctuating between 12~16GB) and I'm currently working on identifying why I have so many targets (17k at the moment) and creating all the recording rules I can to relieve stress from dashboards and rules. On Wednesday, 24

[prometheus-users] Re: Metrics Deduplication

2020-06-18 Thread Adso Castro
That sure helps a lot. Thank you. On Thursday, June 18, 2020 11:37:09 UTC-3, Mat Arye wrote: > > > > On Wednesday, June 17, 2020 at 11:18:47 AM UTC-4, Adso Castro wrote: >> >> Hey all, >> >> I have a curious question here: >> >>

[prometheus-users] Metrics Deduplication

2020-06-17 Thread Adso Castro
Hey all, I have a curious question here: I have 3 Prometheus replicas running under my Prometheus-Operator (plus Thanos) inside a Kubernetes Cluster. When I query something within the range of 3 or 6+ hours, I get the same metric x 3. Is that correct or should I get a single metric already refi

Re: [prometheus-users] Out of Order Samples

2020-05-29 Thread Adso Castro
Thanks, I'll check it out. Adso Castro cho140...@naver.com On Fri, May 29, 2020 at 6:26 PM, Brian Brazil < brian.bra...@robustperception.io> wrote: > On Fri, 29 May 2020 at 21:47, Adso Castro wrote: > >> Sup all, >> >> I'm having a problem

[prometheus-users] Out of Order Samples

2020-05-29 Thread Adso Castro
Sup all, I'm having a problem with *out of order samples* that started yesterday on my prometheus operator cluster and I'm finding it quite hard to identify the root cause of it. I've been reading some threads about the problem being one of the targets or maybe the time from the servers, but my

[prometheus-users] Kubernetes known labels that can be dropped

2020-04-20 Thread Adso Castro
Hi all, Do you know of any well-known labels from a Kubernetes Prometheus installation that can be dropped? Obviously, not considering anyone's particular metrics, but some like: - container_memory_failures_total - container_tasks_state - go_* Any ideas? Thanks.

Re: [prometheus-users] [Promql] How do I know if a value hasn't changed in a while?

2020-04-12 Thread Adso Castro
That did the magic, thank you very much Christian! Have a nice one! On Sun, Apr 12, 2020 at 03:53, Christian Hoffmann < m...@hoffmann-christian.info> wrote: > Hi, > > On 4/12/20 1:05 AM, Adso Castro wrote: > > That's the question, there's a metric: *job

[prometheus-users] [Promql] How do I know if a value hasn't changed in a while?

2020-04-11 Thread Adso Castro
That's the question, there's a metric: *jobs_sent 1728* I want to know if that value hasn't changed in 1h for example. How do I do that? Thank you
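
For context, PromQL's `changes()` function counts how often a series' value changed within a range, so an expression along the lines of `changes(jobs_sent[1h]) == 0` is one common way to ask this question; the snippet does not show whether that is what the thread settled on. Below is a minimal Python sketch of the same idea. The helper name `unchanged_for` and the sample data are hypothetical, and this is an illustration of the logic, not Prometheus code.

```python
from datetime import datetime, timedelta


def unchanged_for(samples, window, now):
    """Return True if every sample inside the last `window` has the same value.

    `samples` is a list of (timestamp, value) pairs. This mirrors the idea
    behind changes(metric[window]) == 0; the helper itself is hypothetical.
    """
    recent = [value for ts, value in samples if now - window <= ts <= now]
    if not recent:
        return False  # no data in the window, so nothing can be concluded
    return all(value == recent[0] for value in recent)


now = datetime(2020, 4, 11, 12, 0)
# Made-up scrapes of jobs_sent, one every 15 minutes over the last hour.
stuck = [(now - timedelta(minutes=m), 1728) for m in (60, 45, 30, 15, 0)]
moving = [(now - timedelta(minutes=m), 1728 + (60 - m)) for m in (60, 45, 30, 15, 0)]

print(unchanged_for(stuck, timedelta(hours=1), now))
print(unchanged_for(moving, timedelta(hours=1), now))
```

The first series never moves off 1728 within the hour, so it is reported as unchanged; the second grows between scrapes and is not.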

[prometheus-users] AlertManager Silence Regex

2020-04-04 Thread Adso Castro
Hi all, I'm trying to create a silence using regex from the AlertManager UI and I want to silence alarms from deployments (from my Kubernetes cluster) like below: *new-batch-stream-g1343* *super-batch-pipeline-g1273* The match would be the word "batch". I've tried a few times using the UI,
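
A frequent cause of this kind of trouble is that Alertmanager regex matchers are anchored, so the pattern has to match the whole label value and not just a substring. The Python sketch below imitates that with `re.fullmatch`. It assumes that anchoring is the issue here (the snippet does not confirm it), the third deployment name is made up, and Python's `re` only approximates the RE2 syntax Alertmanager uses.

```python
import re


def silence_matches(pattern, label_value):
    """Full-string match, imitating an anchored regex matcher."""
    return re.fullmatch(pattern, label_value) is not None


deployments = [
    "new-batch-stream-g1343",
    "super-batch-pipeline-g1273",
    "web-frontend-g1001",  # hypothetical deployment that should not be silenced
]

for pattern in ("batch", ".*batch.*"):
    matched = [d for d in deployments if silence_matches(pattern, d)]
    print(pattern, "->", matched)
```

Under that assumption, a bare `batch` matches nothing because no label value is exactly "batch", while `.*batch.*` matches both batch deployments and leaves the other one alone.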

[prometheus-users] Replacing a "/" using regex

2020-02-28 Thread Adso Castro
Hey guys, I'm in the middle of something here, I'm trying to replace the "/" from the metric below but this isn't working at all. I'm using a beanstalkd exporter, and the tubes are being listed with a "/" and it's messing up some automations that we're using because of that forward slash,
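
One thing worth knowing for this kind of rewrite: a Prometheus relabel `replace` action matches its regex against the whole value and builds the new value from capture groups (written `$1` or `${1}` in Prometheus), so a single rule rewrites a fixed number of slashes, not every occurrence. The Python sketch below imitates that behaviour. The helper `relabel_replace` and the tube names are hypothetical, and Python's group syntax (`\g<1>`) differs from Prometheus's.

```python
import re


def relabel_replace(value, regex, replacement):
    """Imitate an anchored 'replace': rewrite only if the whole value matches."""
    match = re.fullmatch(regex, value)
    if match is None:
        return value  # no match: the value is left untouched
    return match.expand(replacement)


# Hypothetical tube names; the real ones are not shown in the snippet.
print(relabel_replace("orders/high", r"(.*)/(.*)", r"\g<1>_\g<2>"))
print(relabel_replace("a/b/c", r"(.*)/(.*)", r"\g<1>_\g<2>"))
```

With two capture groups only one slash is rewritten, and because the first group is greedy it is the last slash in the value. Values with more slashes would need a longer pattern or several rules.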