I'm currently working on a metric system that
a) exposes several TaskManager metrics
b) allows gathering metrics in various parts of a task, most notably
user-defined functions.
The first version makes these metrics available via JMX on each
TaskManager.
While a mechanism to make that pluggable is /planned/, there are no
details on it yet.
I /guess/ that once it is merged you should be able to modify one of the
classes so that the data is exported directly to your tool, but I would
have to know more about your tool to make a definite assessment.
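
Since the first version exposes the metrics via JMX, an external tool
could in principle poll them with the standard javax.management API. A
minimal sketch, assuming nothing about the final metric names: the
ObjectName pattern and attribute used in main point at the JVM's built-in
Threading MBean purely as a stand-in, and JmxMetricPoller is a
hypothetical helper, not part of Flink.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper (not a Flink class): reads one attribute from every
// MBean whose name matches the given ObjectName pattern.
class JmxMetricPoller {

    static Map<String, Object> poll(MBeanServerConnection server,
                                    String pattern,
                                    String attribute) throws Exception {
        Map<String, Object> values = new TreeMap<>();
        // queryNames with a null query returns all MBeans matching the pattern.
        for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
            values.put(name.getCanonicalName(), server.getAttribute(name, attribute));
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in target: the JVM's own Threading MBean. The names under which
        // TaskManager metrics will be registered are not known yet.
        Map<String, Object> values = poll(
                ManagementFactory.getPlatformMBeanServer(),
                "java.lang:type=Threading",
                "ThreadCount");
        values.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

The same poll method would work against a remote TaskManager JVM, because
a JMX connector hands out an MBeanServerConnection as well.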
There are no plans to funnel all those metrics unaggregated through
Flink's accumulator mechanism;
only a selection will be aggregated locally and on the JobManager
for display in the Dashboard.
Out of curiosity, what metrics are you interested in?
On 14.04.2016 20:59, Maxim wrote:
Hi!
I'm looking into integrating Flink into our stack and one of the
requirements is to report metrics to an internal system. The current
Accumulators are not adequate to provide the visibility that we need to run
such a system in production. We want much more information about the
internal cluster state and the ability to calculate aggregates ourselves. The
core reporting API accepts a metric name, metric type (gauge, counter,
timer) and a set of key value pairs that act as dimensions.
The ideal solution for us would report the metrics through such an API and
provide a default binding to the existing Accumulators, but allow overriding it
to our internal reporting client.
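
To make the shape of that API concrete, here is a rough sketch. All names
(MetricType, MetricReporter, InMemoryReporter) are hypothetical, the
numeric value parameter is an assumption on top of the name, type and
dimensions mentioned above, and the in-memory default only stands in for
an Accumulator-backed binding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical metric kinds, matching the three types named above.
enum MetricType { GAUGE, COUNTER, TIMER }

// Hypothetical reporting interface: name, type, value, and dimensions.
interface MetricReporter {
    void report(String name, MetricType type, double value, Map<String, String> dimensions);
}

// Stand-in for the default binding; it just records formatted lines in memory.
class InMemoryReporter implements MetricReporter {
    final List<String> lines = new ArrayList<>();

    @Override
    public void report(String name, MetricType type, double value, Map<String, String> dimensions) {
        // Sort the dimensions so the recorded line is deterministic.
        lines.add(type + " " + name + "=" + value + " " + new TreeMap<>(dimensions));
    }
}

class MetricReporterDemo {
    public static void main(String[] args) {
        InMemoryReporter defaults = new InMemoryReporter();
        defaults.report("recordsIn", MetricType.COUNTER, 3, Map.of("host", "tm-1"));
        System.out.println(defaults.lines);

        // Overriding the binding: any other implementation can be swapped in,
        // e.g. a lambda that forwards to an internal reporting client.
        MetricReporter custom = (name, type, value, dims) ->
                System.out.println("forwarding " + name + " to internal client");
        custom.report("recordsIn", MetricType.COUNTER, 3, Map.of("host", "tm-1"));
    }
}
```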
Is this something that could be added to Flink, or are there other plans
for monitoring?
Thanks!
Maxim.