Hi Declan,

Thanks for reaching out, we always welcome new users to Apache Flink community 
:)

Your first question is a bit tricky. I am still trying to understand the 
motivation behind. In general there is no generic way to access the records 
which one of the operator currently processes. 
Are your referring to records which are buffered in the sink and not yet sind 
or are you referring to all record which are currently processed by the 
entire pipeline?

Flink provides by default a set of metrics for each operator [1]. You can 
either collect via the REST API, or some kind of configured reporter (JMX, 
prometheus. The simplest way to look at the metrics is opening the web UI of 
Flink. It shows you an overview of the running pipeline and the 
metrics for all tasks.

Overall, it seems you are trying to investigate some kind of performance 
problem, please correct me if I am wrong. You can also directly ask more 
detailed questions if you are seeing an unexpected behaviour.

Best,
Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/metrics/#io

Reply via email to