I'm trying to use the group_wait parameter so that Alertmanager waits for 
all the alerts received from Prometheus, groups them, and sends a single 
notification. 

I have the following configuration: 

route:
  receiver: default-receiver
  group_by:
  - alertname
  - environment
  continue: false
  group_wait: 5m
  group_interval: 20m
  repeat_interval: 1d


receivers:
- name: default-receiver
  email_configs:
  - send_resolved: true
    to: [email protected]
    from: [email protected]
    hello: localhost
    smarthost: smptserver:25



Although the group_wait parameter is set to 5 minutes, as soon as 
Alertmanager receives the alerts from Prometheus, it flushes them and 
sends a notification to the configured receiver. I would expect 
Alertmanager to delay the notification and send it only after 5 minutes 
(the value of the group_wait parameter). 
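
To make the expectation concrete, the arithmetic can be sketched as below. 
This is only a model of "time of first alert in a new group + group_wait", 
not Alertmanager's actual scheduling code; the helper name 
expected_first_flush is hypothetical, and the timestamp is the first 
"Received alert" entry from the log in this post.

```python
from datetime import datetime, timedelta, timezone

# group_wait as configured in the route above (5m).
GROUP_WAIT = timedelta(minutes=5)


def expected_first_flush(first_alert_ts: str,
                         group_wait: timedelta = GROUP_WAIT) -> datetime:
    """Return when the first notification for a new group would be expected.

    Hypothetical helper: it only adds group_wait to the time the first
    alert of the group was received.
    """
    received = datetime.strptime(
        first_alert_ts, "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    return received + group_wait


if __name__ == "__main__":
    # First "Received alert" timestamp taken from the debug log.
    print(expected_first_flush("2022-11-22T12:37:42.811Z").isoformat())
```

Under this model the first notification for that group would go out at 
about 12:42:42 UTC, not at 12:37:42.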

ts=2022-11-22T12:37:19.367Z caller=cluster.go:705 level=info component=cluster msg="gossip not settled" polls=0 before=0 now=1 elapsed=2.000781422s
ts=2022-11-22T12:37:21.368Z caller=cluster.go:702 level=debug component=cluster msg="gossip looks settled" elapsed=4.001197371s
ts=2022-11-22T12:37:23.368Z caller=cluster.go:702 level=debug component=cluster msg="gossip looks settled" elapsed=6.001883916s
ts=2022-11-22T12:37:25.369Z caller=cluster.go:702 level=debug component=cluster msg="gossip looks settled" elapsed=8.00222292s
ts=2022-11-22T12:37:27.369Z caller=cluster.go:697 level=info component=cluster msg="gossip settled; proceeding" elapsed=10.002782746s
ts=2022-11-22T12:37:42.811Z caller=dispatch.go:165 level=debug component=dispatcher msg="Received alert" alert=file_not_processed[c0e2772][active]
ts=2022-11-22T12:37:42.812Z caller=dispatch.go:165 level=debug component=dispatcher msg="Received alert" alert=file_not_processed[64605a5][active]
ts=2022-11-22T12:37:42.812Z caller=dispatch.go:165 level=debug component=dispatcher msg="Received alert" alert=file_not_processed[e70ae18][active]
ts=2022-11-22T12:37:42.812Z caller=dispatch.go:165 level=debug component=dispatcher msg="Received alert" alert=file_not_processed[7325965][active]
ts=2022-11-22T12:37:42.812Z caller=dispatch.go:517 level=debug component=dispatcher aggrGroup="{}:{alertname=\"file_not_processed\", environment=\"ACC\"}" msg=flushing alerts=[file_not_processed[c0e2772][active]]
ts=2022-11-22T12:37:42.812Z caller=dispatch.go:517 level=debug component=dispatcher aggrGroup="{}:{alertname=\"file_not_processed\", environment=\"DEV\"}" msg=flushing alerts="[file_not_processed[64605a5][active] file_not_processed[e70ae18][active] file_not_processed[7325965][active]]"
ts=2022-11-22T12:37:42.883Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=webhook[0] msg="Notify success" attempts=1
ts=2022-11-22T12:37:42.914Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=webhook[0] msg="Notify success" attempts=1
ts=2022-11-22T12:37:43.031Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=email[0] msg="Notify success" attempts=1
ts=2022-11-22T12:37:43.031Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=email[0] msg="Notify success" attempts=1
ts=2022-11-22T12:37:43.660Z caller=dispatch.go:165 level=debug component=dispatcher msg="Received alert" alert=locked_oracle_accounts[bcc49ac][active]
ts=2022-11-22T12:37:43.660Z caller=dispatch.go:517 level=debug component=dispatcher aggrGroup="{}:{alertname=\"locked_oracle_accounts\", environment=\"DEV\"}" msg=flushing alerts=[locked_oracle_accounts[bcc49ac][active]]
ts=2022-11-22T12:37:43.704Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=webhook[0] msg="Notify success" attempts=1
ts=2022-11-22T12:37:43.840Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=email[0] msg="Notify success" attempts=1
ts=2022-11-22T12:37:58.355Z caller=dispatch.go:165 level=debug component=dispatcher msg="Received alert" alert=sdl_critical_services_down[7b9c988][active]
ts=2022-11-22T12:37:58.355Z caller=dispatch.go:517 level=debug component=dispatcher aggrGroup="{}:{alertname=\"sdl_critical_services_down\", environment=\"TST\"}" msg=flushing alerts=[sdl_critical_services_down[7b9c988][active]]
ts=2022-11-22T12:37:58.398Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=webhook[0] msg="Notify success" attempts=1
ts=2022-11-22T12:37:58.416Z caller=dispatch.go:165 level=debug component=dispatcher msg="Received alert" alert=sdl_critical_services_down[7b9c988][active]
ts=2022-11-22T12:37:58.494Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=email[0] msg="Notify success" attempts=1
ts=2022-11-22T12:38:02.724Z caller=dispatch.go:165 level=debug component=dispatcher msg="Received alert" alert=edl_instance_down[49003d1][active]
ts=2022-11-22T12:38:02.724Z caller=dispatch.go:517 level=debug component=dispatcher aggrGroup="{}:{alertname=\"edl_instance_down\", environment=\"ACC\"}" msg=flushing alerts=[edl_instance_down[49003d1][active]]
ts=2022-11-22T12:38:02.765Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=webhook[0] msg="Notify success" attempts=1
ts=2022-11-22T12:38:02.876Z caller=notify.go:743 level=debug component=dispatcher receiver=default-receiver integration=email[0] msg="Notify success" attempts=1
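
The gap between an alert being received and its group being flushed can be 
read off the debug log mechanically. The following is a minimal sketch, 
assuming each log entry is on a single line; the function names 
(parse_ts, flush_delays) are hypothetical, and the sample entries are four 
lines copied from the log in this post.

```python
import re
from datetime import datetime

TS = re.compile(r"ts=(\S+)")
# Matches e.g. "file_not_processed[c0e2772][active]" and captures
# "file_not_processed[c0e2772]" (alert name plus fingerprint).
ALERT_ID = re.compile(r"(\w+\[[0-9a-f]+\])\[active\]")

# Four entries copied from the debug log, one entry per line.
SAMPLE = [
    r'ts=2022-11-22T12:37:42.811Z caller=dispatch.go:165 level=debug '
    r'component=dispatcher msg="Received alert" '
    r'alert=file_not_processed[c0e2772][active]',
    r'ts=2022-11-22T12:37:42.812Z caller=dispatch.go:517 level=debug '
    r'component=dispatcher aggrGroup="{}:{alertname=\"file_not_processed\", '
    r'environment=\"ACC\"}" msg=flushing '
    r'alerts=[file_not_processed[c0e2772][active]]',
    r'ts=2022-11-22T12:37:43.660Z caller=dispatch.go:165 level=debug '
    r'component=dispatcher msg="Received alert" '
    r'alert=locked_oracle_accounts[bcc49ac][active]',
    r'ts=2022-11-22T12:37:43.660Z caller=dispatch.go:517 level=debug '
    r'component=dispatcher aggrGroup="{}:{alertname=\"locked_oracle_accounts\", '
    r'environment=\"DEV\"}" msg=flushing '
    r'alerts=[locked_oracle_accounts[bcc49ac][active]]',
]


def parse_ts(line):
    """Parse the ts=... field of a log entry into a datetime."""
    return datetime.strptime(TS.search(line).group(1), "%Y-%m-%dT%H:%M:%S.%fZ")


def flush_delays(lines):
    """Return (alert ids, seconds from first receipt to flush) per flush entry."""
    first_seen = {}
    delays = []
    for line in lines:
        ids = ALERT_ID.findall(line)
        if 'msg="Received alert"' in line:
            for alert_id in ids:
                # Keep only the first time each alert was received.
                first_seen.setdefault(alert_id, parse_ts(line))
        elif "msg=flushing" in line:
            known = [first_seen[a] for a in ids if a in first_seen]
            if known:
                delay = (parse_ts(line) - min(known)).total_seconds()
                delays.append((ids, delay))
    return delays


if __name__ == "__main__":
    for ids, delay in flush_delays(SAMPLE):
        print(f"{', '.join(ids)}: flushed {delay:.3f}s after first receipt")
```

For the entries shown, the flush happens within about a millisecond of the 
alert being received, where a group_wait of 5m would suggest roughly 300 
seconds.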


I expect Alertmanager to group the alerts from Prometheus and, after 5 
minutes (the group_wait value), send one notification that contains all 
the grouped alerts. In my case the group_wait parameter seems to be 
ignored: as soon as an alert is received from Prometheus, a notification 
is sent to the receiver. Because of this, Alertmanager has no time to 
group all the alerts of the same type (based on my group_by labels), and I 
will get multiple notifications for the same alerts at the next 
evaluation interval (group_interval). 
