> And always alert is getting triggered even if I keep input series value
as 0.
No, the alert is *not* being triggered. That's what "got: []" means: "I got
an empty list of alerts". Your test is written to *expect* a non-empty list
of alerts, so reality doesn't match the expectation and the test fails.
> Any help on this please?
Well, to start with:
    100000000/1024/1024/1024 *100 = 9.31322574
which is well below the threshold of 90. So you need to add a zero to your
test values:
    1000000000/1024/1024/1024 *100 = 93.1322574
Secondly, your alerting rule has "for: 5m". If you keep this, then the
alert condition has to have been true for 5 minutes, and the test has to
evaluate the alert at the 5 minute mark. With samples one minute apart (the
default interval), that means six values covering 0m to 5m. So you need:
    values: '1000000000 1000000000 1000000000 1000000000 1000000000 1000000000'
and uncomment "eval_time: 5m"
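Put together, the relevant part of the test might look roughly like the
fragment below. This is only a sketch: it assumes you leave the test
interval at its default of 1m, and exp_alerts stays whatever you already
have, so it is only indicated by a comment here.

```yaml
tests:
  - input_series:
      - series: 'container_memory_usage_bytes{job="cadvisor",container_label_io_rancher_stack_name="network-service",image="rancher.test"}'
        # six samples at 0m..5m (assuming the default 1m interval),
        # each about 93 after the /1024/1024/1024 *100 scaling
        values: '1000000000 1000000000 1000000000 1000000000 1000000000 1000000000'

    alert_rule_test:
      - alertname: ServiceMemoryUsage90Percent
        eval_time: 5m   # no longer commented out
        # exp_alerts: as in your existing test
```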
Alternatively, you can comment out "for: 5m" in your alerting rule, in
which case you only need
values: '1000000000'
After this, your alert triggers (it "got" an alert), but you'll find the
labels and annotations don't match exactly what you'd put in the
expectations:
  FAILED:
    alertname: ServiceMemoryUsage90Percent, time: 5m,
        exp:[
            0:
              Labels:{alert_owner="alerts-infra", alertname="ServiceMemoryUsage90Percent", severity="Critical"}
              Annotations:{description="Service memory has been at over 90% for 1 minute. Service: (), DnsName: ()."}
            ],
        got:[
            0:
              Labels:{alert_owner="alerts-infra", alertname="ServiceMemoryUsage90Percent", container_label_io_rancher_stack_name="network-service", severity="Critical"}
              Annotations:{description="Service memory has been at over 90% for 1 minute. Service: network-service, DnsName: . \n"}
            ]
You can then modify your expectations to make them match what the alerting
rule actually generates - or you can modify the alerting rule itself to
generate a different set of labels and annotations.
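If you go the first route, the expectations would look something like the
fragment below. It is a sketch copied from the "got:" output above and I
have not re-run it; it assumes the input series stays as in your test, with
no instance or dns_name labels, which is why DnsName renders as empty.

```yaml
exp_alerts:
  - exp_labels:
      alertname: ServiceMemoryUsage90Percent
      severity: Critical
      alert_owner: alerts-infra
      # label kept by the "by (...)" clause of the rule expression
      container_label_io_rancher_stack_name: network-service
    exp_annotations:
      # must match the rendered template exactly, including the
      # trailing space and newline
      description: "Service memory has been at over 90% for 1 minute. Service: network-service, DnsName: . \n"
```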
On Wednesday, 12 April 2023 at 18:47:10 UTC+1 [email protected] wrote:
> Hi,
> I need help in this alert test.
> This is the alert rule I have,
>
>
> - alert: ServiceMemoryUsage90Percent
>   expr: sum(container_memory_usage_bytes{job="cadvisor",container_label_io_rancher_stack_name!="",image=~"rancher.*"}) by (container_label_io_rancher_stack_name, instance, dns_name) /1024/1024/1024 *100 > 90
>   for: 5m
>   labels:
>     severity: Critical
>     alert_owner: alerts-infra
>   annotations:
>     description: "Service memory has been at over 90% for 1 minute. Service: {{ $labels.container_label_io_rancher_stack_name }}, DnsName: {{ $labels.dns_name }}. \n"
>
> I am trying to write a test for this,
> tests:
>   # Infra Alert Tests
>   - input_series:
>       - series: 'container_memory_usage_bytes{job="cadvisor",container_label_io_rancher_stack_name="network-service",image="rancher.test"}'
>         values: '0 100000000 0 100000000 100000000'
>
>     alert_rule_test:
>       - alertname: ServiceMemoryUsage90Percent
>         # eval_time: 5m
>         exp_alerts:
>           - exp_labels:
>               alertname: ServiceMemoryUsage90Percent
>               severity: Critical
>               alert_owner: alerts-infra
>             exp_annotations:
>               description: "Service memory has been at over 90% for 1 minute. Service: (), DnsName: ()."
>
> But I am always getting:
>
> Unit Testing: tests/infra_db_alerts-tests.yml
>   FAILED:
>     alertname: ServiceMemoryUsage90Percent, time: 0s,
>         exp:[
>             0:
>               Labels:{alert_owner="alerts-infra", alertname="ServiceMemoryUsage90Percent", severity="Critical"}
>               Annotations:{description="Service memory has been at over 90% for 1 minute. Service: (), DnsName: ()."}
>             ],
>         got:[]
>
>
> I tried many options in alert rule testing. But nothing helped.
> Any help on this please?
> not sure why I am getting got as empty. And always alert is getting
> triggered even if I keep input series value as 0.
>