Hello,

We're experiencing difficulties in using Flink metrics in a generic way
since various properties are included in the name of the metric itself.
This makes it difficult to generate sensible (and general) dashboards (with
aggregations).

One example is the RocksDB estimated live data size metric (
state.backend.rocksdb.metrics.estimate-live-data-size). Its name appears as:
flink_taskmanager_job_task_operator_<my_state_descriptor_name>_state_rocksdb_estimate_live_data_size

If, on the other hand, the state name were included as a label, this would
facilitate aggregation across states, e.g.:
flink_taskmanager_job_task_operator_state_rocksdb_estimate_live_data_size{state_descriptor="my_state_descriptor"}
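
To make the difference concrete, here is a minimal Python sketch of the
rewrite that is otherwise needed on the consumer side. It is only an
illustration: the function name split_state_label and the regular expression
are assumptions derived from the single example name above, not anything
Flink provides.

```python
import re

# Assumed pattern, derived only from the example metric name in this mail.
# Anything between the operator prefix and "_state_rocksdb_" is treated as
# the state descriptor name.
_PATTERN = re.compile(
    r"^(?P<prefix>flink_taskmanager_job_task_operator)_"
    r"(?P<state>.+)_"
    r"(?P<suffix>state_rocksdb_.+)$"
)


def split_state_label(metric_name):
    """Return (generic_name, labels) for a metric name that embeds a state name.

    Names that do not match the assumed pattern are returned unchanged
    with an empty label dict.
    """
    match = _PATTERN.match(metric_name)
    if match is None:
        return metric_name, {}
    generic_name = f"{match['prefix']}_{match['suffix']}"
    return generic_name, {"state_descriptor": match["state"]}


name, labels = split_state_label(
    "flink_taskmanager_job_task_operator_my_state_descriptor"
    "_state_rocksdb_estimate_live_data_size"
)
print(name, labels)
```

Parsing like this is fragile (a state name that itself contains
"_state_rocksdb_" would be split in the wrong place), which is why having
the state name as a label from the start would be preferable.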

I think this is a general issue with Flink metrics. When I define a
custom metric, there is no supported way to attach labels (
https://prometheus.io/docs/practices/naming/#labels) dynamically.

Thanks!

Lars
