Flink exposes some metrics through its API/UI, but probably only because that was easy to do and convenient for users. I don't think Flink is intended to be a complete monitoring solution for your cluster.
Instead, take a look at collectd (https://collectd.org/), which is meant for monitoring OS-level metrics. It has, for example, a Graphite plugin that you can use to write to a Graphite server or statsd instance, or you can integrate it some other way depending on what you have and what you want.

-Shannon

From: Lydia Ickler <ickle...@googlemail.com>
Date: Wednesday, December 21, 2016 at 12:55 PM
To: <user@flink.apache.org>
Subject: Monitoring REST API

Hi all,

I have a question regarding the Monitoring REST API. I want to analyze the behavior of my program with regard to I/O MiB/s, Network MiB/s and CPU %, as the authors of this paper did (https://hal.inria.fr/hal-01347638v2/document).

From the JSON file at http://master:8081/jobs/jobid/ I get a summary including the read/write records and read/write bytes. Unfortunately, the entries for Network and CPU are either (unknown) or 0.0. I am running my program on a cluster with up to 32 nodes.

Where can I find the values for e.g. CPU or Network?

Thanks in advance!
Lydia
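
To make the suggestion in the reply above more concrete, here is a minimal Python sketch of what an OS-level collector does on each node: sample CPU usage from the operating system and ship it to Graphite. This is not collectd and not part of Flink; it is a hand-rolled stand-in for illustration only. It assumes a Linux node (it parses the `/proc/stat` format) and a Graphite carbon listener accepting the plaintext protocol, conventionally on port 2003. The function names and the metric path are hypothetical.

```python
import socket
import time


def parse_cpu_times(proc_stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in proc_stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Field order: user nice system idle iowait irq softirq steal ...
            # Guest time is already counted in user, so only the first 8 are summed.
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            total = sum(values[:8])
            return total - idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percent(before, after):
    """CPU utilisation in percent between two (busy, total) samples."""
    busy = after[0] - before[0]
    total = after[1] - before[1]
    return 100.0 * busy / total if total > 0 else 0.0


def sample_cpu_percent(interval=1.0, stat_path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(stat_path) as f:
        before = parse_cpu_times(f.read())
    time.sleep(interval)
    with open(stat_path) as f:
        after = parse_cpu_times(f.read())
    return cpu_percent(before, after)


def graphite_line(path, value, timestamp):
    """Format one metric as a newline-terminated 'path value timestamp' record."""
    return f"{path} {value:.2f} {int(timestamp)}\n"


def send_to_graphite(line, host, port=2003):
    """Send one plaintext record to a Graphite carbon listener (assumed reachable)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

On a node this could be driven by something like `send_to_graphite(graphite_line("flink.node1.cpu_percent", sample_cpu_percent(), time.time()), "graphite-host")`, where both the metric path and the host name are placeholders. In practice collectd already does this for CPU, network and disk, which is why it is the better choice for a 32-node cluster.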