Hi Declan, Thanks for reaching out, we always welcome new users to Apache Flink community :)
Your first question is a bit tricky. I am still trying to understand the motivation behind. In general there is no generic way to access the records which one of the operator currently processes. Are your referring to records which are buffered in the sink and not yet sind or are you referring to all record which are currently processed by the entire pipeline? Flink provides by default a set of metrics for each operator [1]. You can either collect via the REST API, or some kind of configured reporter (JMX, prometheus. The simplest way to look at the metrics is opening the web UI of Flink. It shows you an overview of the running pipeline and the metrics for all tasks. Overall, it seems you are trying to investigate some kind of performance problem, please correct me if I am wrong. You can also directly ask more detailed questions if you are seeing an unexpected behaviour. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/metrics/#io