> I think this is a general issue with the Flink metrics.
Not quite. There are a few instances in Flink where code wasn't updated to
encode metadata as additional labels, and the RocksDB metrics may be one
of them.
Also for RocksDB, you could try setting
"state.backend.rocksdb.metrics.column-family-as-variable: true" to
resolve this particular problem.
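As a rough sketch, the relevant entries in flink-conf.yaml might look like
the following. The first key is the metric from the original message and is
assumed to be enabled already; the second is the option suggested above.

```yaml
# Expose the estimated live data size per state (metric from the original message).
state.backend.rocksdb.metrics.estimate-live-data-size: true
# Report the column family (state name) as a variable instead of
# embedding it in the metric name.
state.backend.rocksdb.metrics.column-family-as-variable: true
```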
> If I define a custom metric, it is not supported to use labels
You can do so via MetricGroup#addGroup(String key, String value).
See
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/ops/metrics/#user-variables
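A minimal sketch of what that could look like in a user function, assuming
the Flink 1.17 API from the link above. The class name, the key
"state_descriptor" and the counter name "events_seen" are made up for
illustration; only MetricGroup#addGroup(String key, String value) and
counter(String) are Flink calls.

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;

// Hypothetical pass-through operator that registers a counter under a
// key/value group, so the value is reported as a user variable rather
// than being baked into the metric name.
public class LabeledEventCounter extends RichMapFunction<String, String> {

    private final String stateDescriptorName;

    // Metrics are not serializable; they are created in open() on the task.
    private transient Counter eventsSeen;

    public LabeledEventCounter(String stateDescriptorName) {
        this.stateDescriptorName = stateDescriptorName;
    }

    @Override
    public void open(Configuration parameters) {
        eventsSeen = getRuntimeContext()
                .getMetricGroup()
                // key = variable name, value = variable value
                .addGroup("state_descriptor", stateDescriptorName)
                .counter("events_seen");
    }

    @Override
    public String map(String value) {
        eventsSeen.inc();
        return value;
    }
}
```

Reporters that support variables, such as the Prometheus reporter, should
then expose the value as a label, which is what allows aggregating across
different values of the same key.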
On 17/10/2023 14:31, Lars Skjærven wrote:
Hello,
We're experiencing difficulties in using Flink metrics in a generic
way since various properties are included in the name of the metric
itself. This makes it difficult to generate sensible (and general)
dashboards (with aggregations).
One example is the metric for rocksdb estimated live data size
(state.backend.rocksdb.metrics.estimate-live-data-size). The name
appears as:
flink_taskmanager_job_task_operator_<my_state_descriptor_name>_state_rocksdb_estimate_live_data_size
If, on the other hand, the state name were included as a label, this
would facilitate aggregation across states, e.g.:
flink_taskmanager_job_task_operator_state_rocksdb_estimate_live_data_size{state_descriptor="my_state_descriptor"}
I think this is a general issue with the Flink metrics. If I define a
custom metric, it is not supported to use labels
(https://prometheus.io/docs/practices/naming/#labels) in a dynamic way.
Thanks!
Lars