I’m using the Flink 1.3.0 release and am not seeing all of the metrics that I 
would expect to see. I have Flink configured to write out metrics via StatsD 
and I am consuming them with Telegraf. Initially I thought this was an issue 
with Telegraf parsing the generated data, so I dumped all of the metrics going 
into Telegraf using tcpdump. A lot of the data I expect is missing from the 
capture as well, so the problem appears to be upstream of Telegraf.
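
For anyone who wants to repeat the check, here is a minimal Python sketch of how the captured StatsD payloads can be summarized. It assumes each line has the usual StatsD shape `name:value|type`. The function names and the sample payload are made up for illustration and are not part of Flink or Telegraf.

```python
from collections import Counter


def parse_statsd_line(line):
    """Split one StatsD line 'name:value|type' into (name, value, type)."""
    # Split on the last ':' so the name keeps any earlier characters intact.
    name, _, rest = line.rpartition(":")
    value, _, metric_type = rest.partition("|")
    return name, value, metric_type


def count_by_prefix(payload, depth=2):
    """Count metrics in a payload by their first `depth` dotted name components."""
    counts = Counter()
    for line in payload.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, _ = parse_statsd_line(line)
        prefix = ".".join(name.split(".")[:depth])
        counts[prefix] += 1
    return counts


# Hypothetical payload text, e.g. pasted from a tcpdump capture.
sample = (
    "flink.jm.Status.JVM.CPU.Load:0.5|g\n"
    "flink.tm.abc.Status.JVM.Memory.Heap.Used:123|g\n"
    "flink.tm.abc.Status.Network.TotalMemorySegments:2048|g\n"
)
for prefix, count in sorted(count_by_prefix(sample).items()):
    print(prefix, count)
```

Raising `depth` makes it easier to see whether any names extend past the TaskManager level into job, task, or operator scopes.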

I’m using this as a reference for what metrics I expect:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html

I see all of the JobManager and TaskManager level metrics. Things like 
Status.JVM.* are coming through. The TaskManager Status.Network metrics are 
there (but not the Task level buffer metrics). The ‘Cluster’ metrics are there 
too.

The IO section of that page lists task and operator level metrics (like what 
is available on the dashboard). I’m not seeing any of these metrics coming 
through when using StatsD.

This is the Flink configuration I’m using:

metrics.reporters: statsd
metrics.reporter.statsd.class: org.apache.flink.metrics.statsd.StatsDReporter
metrics.reporter.statsd.host: hostname
metrics.reporter.statsd.port: 8125

# Customized Scopes
metrics.scope.jm: flink.jm
metrics.scope.jm.job: flink.jm.<job_name>
metrics.scope.tm: flink.tm.<tm_id>
metrics.scope.tm.job: flink.tm.<tm_id>.<job_name>
metrics.scope.task: flink.tm.<tm_id>.<job_name>.<task_name>.<subtask_index>
metrics.scope.operator: flink.tm.<tm_id>.<job_name>.<operator_name>.<subtask_index>

I have tried with and without specifically setting the metrics.scope values.
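
To make it concrete which names I would expect on the wire, here is a rough Python sketch that expands the task and operator scope formats above. It is only an illustration of the placeholder substitution, not Flink's actual implementation. The variable values are hypothetical, and `numRecordsIn` is used as an example metric from the IO section of the docs.

```python
import re


def expand_scope(template, variables):
    """Replace each <name> placeholder with its value; leave unknown ones as-is."""
    def substitute(match):
        key = match.group(1)
        return variables.get(key, match.group(0))
    return re.sub(r"<([a-z_]+)>", substitute, template)


# Scope formats copied from the configuration above.
scopes = {
    "metrics.scope.task":
        "flink.tm.<tm_id>.<job_name>.<task_name>.<subtask_index>",
    "metrics.scope.operator":
        "flink.tm.<tm_id>.<job_name>.<operator_name>.<subtask_index>",
}

# Hypothetical values, chosen only for illustration.
example_values = {
    "tm_id": "tm-1",
    "job_name": "wordcount",
    "task_name": "Map",
    "operator_name": "Map",
    "subtask_index": "0",
}

for key, template in scopes.items():
    print(key, "->", expand_scope(template, example_values) + ".numRecordsIn")
```

Names of this shape are the ones I cannot find in the capture.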

Is anyone else having similar issues with metrics in 1.3?

Thanks

Chris Dail
Director, Software Engineering
Dell EMC | Infrastructure Solutions Group
mobile +1 506 863 4675
christopher.d...@dell.com


