The scopes look OK to me.
Let's try to narrow down the problem areas a bit:
1. Did this work with the same setup before 1.3?
2. Are all task/operator metrics available in the metrics tab of the
dashboard?
3. Are there any warnings in the TaskManager logs from the
MetricRegistry or StatsDReporter?
My *guess* would be that the operator/task metric names contain
characters that either StatsD or telegraf doesn't allow,
which causes those metrics to be dropped.
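One way to test that guess is to scan the metric names for characters that have a special meaning in the StatsD line format (`<name>:<value>|<type>`). The following is only a minimal sketch: the reserved set, the extra space check, the helper name `find_problem_chars`, and the sample operator name are all assumptions for illustration, and I have not verified which characters telegraf actually rejects.

```python
# Characters with a special meaning in the StatsD line format
# "<name>:<value>|<type>[|@<rate>]".
# Assumption: this set is illustrative, not an authoritative list of
# what StatsD or telegraf refuse.
STATSD_RESERVED = set(":|@")


def find_problem_chars(metric_name, extra_disallowed=" "):
    """Return the sorted list of suspicious characters found in metric_name."""
    disallowed = STATSD_RESERVED | set(extra_disallowed)
    return sorted({ch for ch in metric_name if ch in disallowed})


if __name__ == "__main__":
    # Hypothetical operator-scope name; real names depend on the job.
    sample = "flink.tm.abc123.MyJob.Source: Custom Source.0.numRecordsOut"
    print(find_problem_chars(sample))
```

If names like the sample turn up, renaming the operators in the job would be a quick way to confirm whether that is the cause.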
On 12.06.2017 20:32, Dail, Christopher wrote:
I’m using the Flink 1.3.0 release and am not seeing all of the metrics
that I would expect to see. I have Flink configured to write out
metrics via statsd, and I am consuming them with telegraf. Initially I
thought this was an issue with telegraf parsing the generated data. I
dumped all of the metrics going into telegraf using tcpdump and found
that a lot of the data I expected was missing.
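A capture like this can be summarized by counting metric names per scope prefix, which shows at a glance which scopes arrive at all. This is a rough sketch that assumes the StatsD payloads have already been extracted from the capture as text lines of the form `<name>:<value>|<type>`. The helper `count_by_prefix` and the sample lines are made up for illustration.

```python
from collections import Counter


def count_by_prefix(lines, depth=4):
    """Count StatsD lines by the first `depth` dot-separated name segments."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and anything that is not "<name>:<value>|<type>".
        if not line or ":" not in line:
            continue
        # The value follows the last colon; everything before it is the name.
        name = line.rsplit(":", 1)[0]
        prefix = ".".join(name.split(".")[:depth])
        counts[prefix] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real capture.
    captured = [
        "flink.jm.Status.JVM.CPU.Load:0.1|g",
        "flink.tm.abc.Status.JVM.CPU.Load:0.5|g",
        "flink.tm.abc.Status.Network.TotalMemorySegments:2048|g",
    ]
    for prefix, n in sorted(count_by_prefix(captured).items()):
        print(prefix, n)
```

With the scope formats in the configuration further down, TaskManager-level metrics would group under `flink.tm.<tm_id>.Status`, while task and operator metrics would group under `flink.tm.<tm_id>.<job_name>`.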
I’m using this as a reference for what metrics I expect:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html
I see all of the JobManager and TaskManager level metrics. Things like
Status.JVM.* are coming through. TaskManager Status.Network are there
(but not Task level buffers). The ‘Cluster’ metrics are there.
The IO section of that page lists task- and operator-level metrics
(like those available on the dashboard). I’m not seeing any of these
metrics coming through when using statsd.
I’m running Flink with this configuration:
metrics.reporters: statsd
metrics.reporter.statsd.class: org.apache.flink.metrics.statsd.StatsDReporter
metrics.reporter.statsd.host: hostname
metrics.reporter.statsd.port: 8125
# Customized Scopes
metrics.scope.jm: flink.jm
metrics.scope.jm.job: flink.jm.<job_name>
metrics.scope.tm: flink.tm.<tm_id>
metrics.scope.tm.job: flink.tm.<tm_id>.<job_name>
metrics.scope.task: flink.tm.<tm_id>.<job_name>.<task_name>.<subtask_index>
metrics.scope.operator: flink.tm.<tm_id>.<job_name>.<operator_name>.<subtask_index>
I have tried with and without specifically setting the metrics.scope
values.
Is anyone else having similar issues with metrics in 1.3?
Thanks
*Chris Dail*
Director, Software Engineering
*Dell EMC* | Infrastructure Solutions Group
mobile +1 506 863 4675
christopher.d...@dell.com