Re: [prometheus-users] Re: same recording rules on both remote write sender and receiver

Brian Candler Fri, 04 Feb 2022 08:28:24 -0800

Have you checked your Prometheus version at both ends?  It's possible that 
bugs have been fixed in newer releases. The remote write receiver was only 
officially promoted to "stable" in v2.33.
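
If the versions turn out not to matter, one thing that might be worth 
trying (untested, and only a sketch) is to stop the senders from forwarding 
the recorded series at all, so that the receiver is the only place they get 
written. Something like this in each sender's prometheus.yml, where the URL 
is a placeholder and the regex just lists the two rule names from your file:

```yaml
remote_write:
  # Placeholder URL; substitute the address of your "global" server
  - url: http://global-prometheus.example:9090/api/v1/write
    write_relabel_configs:
      # Drop the output of the recording rules before it is sent;
      # the raw node_cpu_seconds_total series are still forwarded
      - source_labels: [__name__]
        regex: 'instance:node_cpus:count|instance_cpu:node_cpu_seconds_not_idle:rate5m'
        action: drop
```

That would leave the rules in place locally but keep their output out of 
remote write.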


Other than that, I'm afraid I don't have any ideas.

On Friday, 4 February 2022 at 16:13:15 UTC Bogdan L wrote:

> There are external_labels, yes. "instance" is also unique; there is no 
> overlap.
>
> On 4 Feb 2022, at 17:28, Brian Candler <[email protected]> wrote:
>
> Have you given each of your "local" prometheus servers unique labels, 
> using the global external_labels 
> <https://prometheus.io/docs/prometheus/latest/configuration> setting 
> (recommended), or some other way?  This is to ensure all timeseries have a 
> unique label set.
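>
> For example, a minimal sketch for one of the "local" servers (the label 
> name "site" and its value are placeholders I made up, not anything from 
> your setup):
>
> ```yaml
> global:
>   external_labels:
>     # Placeholder label; pick any name/value that differs per server
>     site: local-1
> ```
>
> Each sender would get a different value, so series that are otherwise 
> identical still end up with distinct label sets on the receiver.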
>
>
> On Friday, 4 February 2022 at 13:19:22 UTC Bogdan L wrote:
>
>> Hi,
>>
>> I have a situation where I have a few "local" Prometheus servers sending 
>> data to a "global" server using the remote write API. I get errors that 
>> look like this on the remote write receiver:
>>
>> ts=2022-02-03T12:41:11.244Z caller=write_handler.go:57 level=error 
>> component=web msg="Out of order sample from remote write" err="duplicate 
>> sample for timestamp"
>>
>> The senders get the same error from the receiver, with an HTTP 400 
>> status code.
>>
>> After much trial and error I figured out that it happens because I have 
>> the same recording rules on all servers, on both senders and receiver. 
>> recording-rules.yaml looks like this:
>> ```
>> groups:
>>   - name: node-exporter
>>     rules:
>>       # CPU cores per node
>>       - record: instance:node_cpus:count
>>         expr: count(node_cpu_seconds_total{mode="idle"}) without (cpu,mode)
>>
>>       # CPU in use by CPU
>>       - record: instance_cpu:node_cpu_seconds_not_idle:rate5m
>>         expr: sum(rate(node_cpu_seconds_total{mode!="idle"}[5m])) without (mode)
>> ```
>>
>> However, if I delete the second rule, the errors are gone. That is, they 
>> stop when I change recording-rules.yaml on all servers to:
>> ```
>> groups:
>>   - name: node-exporter
>>     rules:
>>       # CPU cores per node
>>       - record: instance:node_cpus:count
>>         expr: count(node_cpu_seconds_total{mode="idle"}) without (cpu,mode)
>> ```
>>
>> Why?
>>
>> 1. Why are there duplicates in the first case? Does the remote write 
>> receiver also run the rules when it receives data?
>> 2. Why aren't there errors any more when the only rule is the CPU count? 
>> Shouldn't there be duplicates in that case too?
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/54b3e87d-d6fc-49f8-9ac3-a41f0111573fn%40googlegroups.com.
