Two alerts suggest that the two instances aren't talking to each other. How have you configured them? Does the UI show the "other" instance?
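For reference, here is a sketch of what the cluster flags typically look like when the two instances peer with each other (am0.example and am1.example are placeholder hostnames; on each VM, --cluster.peer should name the *other* instance):

    # command section of the compose file on the first VM (alertmanager0):
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
      - '--storage.path=/data/alert0'
      - '--cluster.listen-address=0.0.0.0:6783'
      # the other VM, reachable across the network:
      - '--cluster.peer=am1.example:6783'

You can verify that the cluster actually formed by querying /api/v2/status on each instance (e.g. curl -s http://localhost:9093/api/v2/status): the "cluster" section should report status "ready" and list both peers. Note also that the gossip protocol uses both TCP and UDP on the cluster port, so a compose port mapping of "6783:6783" (TCP only) would likely also need a "6783:6783/udp" entry.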
On 5 July 2022 08:34:45 BST, Venkatraman Natarajan <[email protected]> wrote:
>Thanks Brian. I have used a last_over_time query in our expression instead
>of turning off resolved notifications.
>
>Also, we have two Alertmanagers in our environment. Both are up and
>running, but now we are getting two alerts, one from each Alertmanager.
>Could you please help me sort out this issue as well?
>
>Please find the Alertmanager configuration below.
>
>  alertmanager0:
>    image: prom/alertmanager
>    container_name: alertmanager0
>    user: rootuser
>    volumes:
>      - ../data:/data
>      - ../config/alertmanager.yml:/etc/alertmanager/alertmanager.yml
>    command:
>      - '--config.file=/etc/alertmanager/alertmanager.yml'
>      - '--storage.path=/data/alert0'
>      - '--cluster.listen-address=0.0.0.0:6783'
>      - '--cluster.peer={{ IP Address }}:6783'
>      - '--cluster.peer={{ IP Address }}:6783'
>    restart: unless-stopped
>    logging:
>      driver: "json-file"
>      options:
>        max-size: "10m"
>        max-file: "2"
>    ports:
>      - 9093:9093
>      - 6783:6783
>    networks:
>      - network
>
>Regards,
>Venkatraman N
>
>On Sat, Jun 25, 2022 at 9:05 PM Brian Candler <[email protected]> wrote:
>
>> If probe_success becomes non-zero, even for a single evaluation
>> interval, then the alert will be immediately resolved. There is no delay
>> on resolving, like there is for pending->firing ("for: 5m").
>>
>> I suggest you enter the alerting expression, e.g. "probe_success == 0",
>> into the PromQL web interface (query browser), switch to Graph view, and
>> zoom in. If you see any gaps in the graph, then the alert was resolved
>> at that instant.
>>
>> Conversely, use the query
>>   probe_success{instance="xxx"} != 0
>> to look at a particular timeseries, as identified by the label(s), and
>> see if there are any dots shown where the value is non-zero.
>>
>> To make your alerts more robust you may need to use queries with range
>> vectors, e.g. min_over_time(foo[5m]) or max_over_time(foo[5m]) or
>> whatever.
>>
>> As a general rule though: you should consider carefully whether you want
>> to send *any* notification for resolved alerts. Personally, I have
>> switched to send_resolved = false. There are some good explanations here:
>>
>> https://www.robustperception.io/running-into-burning-buildings-because-the-fire-alarm-stopped
>> https://docs.google.com/document/d/199PqyG3UsyXlwieHaqbGiWVa8eMWi8zzAn0YfcApr8Q/
>>
>> You don't want to build a culture where people ignore alerts because the
>> alert cleared itself - or is expected to clear itself.
>>
>> You want the alert condition to trigger a *process*: an investigation of
>> *why* the alert happened, *what* caused it, whether the underlying cause
>> has been fixed, and whether the alerting rule itself was wrong. When all
>> that has been investigated, manually close the ticket. The fact that the
>> alert has gone back below the threshold doesn't mean that this work no
>> longer needs to be done.
>>
>> On Saturday, 25 June 2022 at 13:27:22 UTC+1 [email protected] wrote:
>>
>>> Hi Team,
>>>
>>> We have two Prometheus servers and two Alertmanagers running as
>>> containers in separate VMs.
>>>
>>> Alerts are getting auto-resolved even though the underlying issue is
>>> still present as per the threshold.
>>>
>>> For example, we have an alert rule with the expression
>>> probe_success == 0. It triggers an alert, but after some time the
>>> alert gets auto-resolved because we have enabled send_resolved = true.
>>> But probe_success == 0 is still true, so we don't want the alerts to
>>> auto-resolve.
>>>
>>> Could you please help us with this?
>>>
>>> Thanks,
>>> Venkatraman N
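To make the rule itself robust, here is a minimal sketch of the range-vector approach suggested above (the group name, alert name, severity label and the 5m windows are placeholders to tune):

    groups:
      - name: blackbox
        rules:
          - alert: ProbeFailed
            # == 0 only when there was no successful probe at all in the
            # last 5 minutes; a missed scrape doesn't empty the range
            # vector, so a short scrape gap alone won't resolve the alert
            expr: max_over_time(probe_success[5m]) == 0
            for: 5m
            labels:
              severity: critical
            annotations:
              summary: 'Probe to {{ $labels.instance }} has been failing'

This still resolves on the first genuinely successful probe; if you want the alert to keep firing until the probe has been healthy for a full window, min_over_time(probe_success[5m]) == 0 is the variant to reach for.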
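And if you do take the send_resolved = false route, it is a per-notifier setting in alertmanager.yml; a minimal sketch (the receiver name and address are made up):

    receivers:
      - name: 'team-email'
        email_configs:
          - to: '[email protected]'
            # never send a follow-up notification when the alert clears
            send_resolved: false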

