Hello Flink Community,
I am working on a network scheduler and am currently reading Flink's
code to figure out how the data exchange works. It would be great if you
could help me with some of my issues and questions.
Basically I want to extract from flink the time when a data transmission
between two machines starts (1), their connection details (2), how much
data is involved (3) and when it ends (4).
So far I have understood that the scheduling of tasks is done via the
scheduleOrUpdateConsumers JobManagerMessage. In the function of the same
name in the class Execution I have been able to extract the IP/Port pair
of both the producer and the consumer(s) use.
Furthermore I understand that in the context of a "blocking" data
transmission Flink will first create a ResultPartition and store all the
data in the form of Buffers before starting the transmission. So in
principle I should be able to figure out what amount of data Flink will
communicate by looking at the respective
ResultSubpartition.totalNumberOfBytes, right?
However, in the process I would need to map each ResultSubpartition to a
slot or deployed task, so that I can associate this amount of data with
connection details of the sender and the receiver. Any hints on how to
do that?
Now from what I see the same is not possible in a "pipelined" context,
correct? Can anything be said about the data to be communicated?
Finally, I was unable to locate in the code and in the logs where a
Task's state is changing from RUNNING to FINISHED. Could you give me a
pointer?
It would be great if you could share your insights on the problems above ;).
Best regards,
Niklas
--
PhD Student / Research Assistant
INET, TU Berlin
Room 4.029
Marchstr 23
10587 Berlin
Tel: +49 30 314 78752