Hi Brian,
Thanks for your input; I will try to work with your suggestions.

I put in the honor_timestamps only because it was done in the example 
config provided in the Confluent Cloud Metrics API documentation.
The reason I am fetching all the metrics in one call is that Confluent 
imposes a limit of 60 requests per hour, and we found that we often hit that 
limit and received an HTTP 429 Too Many Requests. After that we were 
"locked out" for 15-20 minutes, which was not optimal.

A quick query in Prometheus, for example, gives me this:
    confluent_kafka_server_retained_bytes{instance="api.telemetry.confluent.cloud:443", job="Confluent-Cloud", kafka_id="lkc-0x3v22", topic="confluent-kafka-connect-qa.confluent-kafka_configs"}

Does that mean that I have a label simply called kafka_id?
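
If it does, I guess your metric_relabel_configs suggestion would become 
something like this, keyed off kafka_id instead (untested, and the 
department name below is just a placeholder):

    metric_relabel_configs:
      - source_labels: [kafka_id]
        regex: lkc-0x3v22
        target_label: departmentID
        replacement: Analytics
      # etc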

I did in fact try to wrap my head around using file_sd_configs but could not 
work out the params part of it, so I gave up on that. It would be nice 
though, since our list of clusters keeps growing every week.

Let me try some of your suggestions over the weekend and report back here.

Thanks again.

/Christian Oelsner

On Friday, 18 November 2022 at 15:05:10 UTC+1, Brian Candler wrote:

> > How would I go about adding labels to the metrics?
>
> You have this:
>
>    static_configs: 
>       - targets:
>         - api.telemetry.confluent.cloud 
>
> This means you are only scraping one endpoint, one time.  If you wanted to 
> add the same labels to every metric received from that endpoint, you would 
> do this:
>
>    static_configs: 
>       - labels:
>           foo: bar
>           baz: qux
>         targets:
>         - api.telemetry.confluent.cloud 
>
> Of course, that's not what you're asking.
>
> The question now is, do the metrics that you get back all carry a label 
> which identifies the cluster, such as {cluster="lkc-1"}?
>
> If so, then it's a simple case of metric relabelling to add the department 
> labels corresponding to each cluster ID.  Add to the scrape job:
>
>     metric_relabel_configs:
>       - source_labels: [cluster]
>         regex: lkc-1
>         target_label: departmentID
>         replacement: Accounts
>       - source_labels: [cluster]
>         regex: lkc-2
>         target_label: departmentID
>         replacement: Engineering
>       # etc
>
> If you don't have such a label, then you will need to scrape the API 
> endpoint separately, once for each value of resource.kafka.id
>
> The dumb option is multiple scrape jobs:
>
> scrape_configs:
>   - job_name: Confluent Cloud lkc-1
>     scrape_interval: 1m
>     scrape_timeout: 1m
>     static_configs:
>       - labels:
>           department: Accounts
>
>         targets:
>           - api.telemetry.confluent.cloud
>     scheme: https
>     basic_auth:
>       username: <Cloud API Key>
>       password: <Cloud API Secret>
>     metrics_path: /v2/metrics/cloud/export
>     params:
>         "resource.kafka.id": [lkc-1]
>   - job_name: Confluent Cloud lkc-2
>     scrape_interval: 1m
>     scrape_timeout: 1m
>     static_configs:
>       - labels:
>           department: Engineering
>
>         targets:
>           - api.telemetry.confluent.cloud
>     scheme: https
>     basic_auth:
>       username: <Cloud API Key>
>       password: <Cloud API Secret>
>     metrics_path: /v2/metrics/cloud/export
>     params:
>         "resource.kafka.id": [lkc-2]
>   # ... etc
>
> That should work just fine, but is annoyingly verbose and repetitive.
>
> The second option, which I would normally use in this situation, is to set 
> the query parameter using a __param_XXXX label:
>
> scrape_configs:
>   - job_name: Confluent Cloud
>     scrape_interval: 1m
>     scrape_timeout: 1m
>     static_configs:
>       - labels:
>           department: Accounts
>           "__param_resource.kafka.id": lkc-1
>         targets:
>           - api.telemetry.confluent.cloud
>       - labels:
>           department: Engineering
>           "__param_resource.kafka.id": lkc-2
>         targets:
>           - api.telemetry.confluent.cloud
>       - labels:
>           department: Special Projects
>           "__param_resource.kafka.id": lkc-3
>         targets:
>           - api.telemetry.confluent.cloud
>       # etc
>
>     scheme: https
>     basic_auth:
>       username: <Cloud API Key>
>       password: <Cloud API Secret>
>     metrics_path: /v2/metrics/cloud/export
>
> Here, the parameter value is set to a single value each time using the 
> magic label "__param_<paramname>" instead of using "params: { name: [ 
> list_of_values ] }"
>
> Unfortunately, the problem is that I'm not sure that __param supports 
> parameter names with dots in them, because dots are technically not valid 
> in a label name 
> <https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels>.  
> You would need to try it to find out if it works, and I wouldn't be 
> surprised if it were rejected.
>
> Aside:
> - You should almost never use "honor_timestamps" so I have removed it in 
> the examples above.  If you do use it, you have to be very sure why, and 
> understand how it may break things.
> - When there are multiple targets like this I would use file_sd_configs 
> rather than static_configs for this (it's easier to maintain).
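>
> For example, something like this (an untested sketch; the file path is just 
> an example):
>
>     scrape_configs:
>       - job_name: Confluent Cloud
>         scrape_interval: 1m
>         scrape_timeout: 1m
>         scheme: https
>         basic_auth:
>           username: <Cloud API Key>
>           password: <Cloud API Secret>
>         metrics_path: /v2/metrics/cloud/export
>         file_sd_configs:
>           - files:
>               - /etc/prometheus/confluent-clusters.json
>
> where /etc/prometheus/confluent-clusters.json holds the same target groups 
> as the static_configs above, and can be regenerated as clusters are added:
>
>     [
>       {
>         "targets": ["api.telemetry.confluent.cloud"],
>         "labels": {
>           "department": "Accounts",
>           "__param_resource.kafka.id": "lkc-1"
>         }
>       },
>       {
>         "targets": ["api.telemetry.confluent.cloud"],
>         "labels": {
>           "department": "Engineering",
>           "__param_resource.kafka.id": "lkc-2"
>         }
>       }
>     ]
>
> (with the same caveat as above about the dot in the label name).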
>
> The downside to these approaches is that you are now hitting the same API 
> endpoint N times (each returning 1/Nth of the data).  This only matters if 
> you get charged per API call.
>
> If you still want to fetch the responses in a single API call as you are 
> now, then you will have to use metric_relabelling, and somehow decide for 
> each metric that comes back which kafka cluster it came from by examining 
> the labels - which is the first approach I proposed.
>
> HTH,
>
> Brian.
>
> On Friday, 18 November 2022 at 09:51:41 UTC [email protected] wrote:
>
>> Hello,
>>
>> I am trying to add labels to metrics fetched from Confluent Cloud.
>> We are monitoring some 35 Kafka clusters.
>>
>> scrape_configs:
>>   - job_name: Confluent Cloud 
>>     scrape_interval: 1m
>>     scrape_timeout: 1m 
>>     honor_timestamps: true
>>     static_configs: 
>>       - targets:
>>         - api.telemetry.confluent.cloud 
>>     scheme: https
>>     basic_auth: 
>>       username: <Cloud API Key> 
>>       password: <Cloud API Secret> 
>>     metrics_path: /v2/metrics/cloud/export 
>>     params: 
>>         "resource.kafka.id": 
>>            - lkc-1 
>>            - lkc-2
>>            - lkc-3
>>            - lkc-4 
>>            - lkc-5
>>            - lkc-6
>>            - lkc-etc etc
>>
>>
>> Each lkc-xxxx represents a cluster which belongs to a department.
>> I would like to add a departmentID label to the metrics belonging to each 
>> cluster.
>> For example, lkc-1 and lkc-5 would belong to the department "analytics".
>>
>> How would I go about adding labels to the metrics?
>>
>> Best regards
>> Christian Oelsner
>>
>>
