Hi all,
I was looking into the ClickEventCount example [1] from the Flink
training, trying to understand why one task manager was reading at
twice the rate of the other.

I had to debug a lot of Flink's internal code to understand that this
depends on the hash function that Flink uses to assign keys to
task managers: it was assigning 4 keys to one TM and 2 to the other. Is
there a smarter way to monitor this (e.g. a metric like
taskManager_numKeys)?
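
To make the skew easier to reproduce outside a debugger, here is a
minimal sketch of how I understand the key assignment to work: the key's
hashCode is run through a murmur-style hash, taken modulo the max
parallelism to get a key group, and the key group is then scaled down to
a parallel subtask index. The hash constants below are my reading of
Flink's internals and may not match every version; the key names, the
max parallelism of 128 and the parallelism of 2 are placeholders, not
values taken from the playground. Note also that this yields a subtask
index, which corresponds to a task manager only when each TM runs one
subtask. With Flink on the classpath,
KeyGroupRangeAssignment.assignKeyToParallelOperator(key, maxParallelism,
parallelism) should give the authoritative answer instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class KeyDistributionSketch {

    // Final bit-mixing step of the murmur-style hash
    // (assumption: modelled on Flink's internal hash, not copied from a release).
    static int bitMix(int in) {
        in ^= in >>> 16;
        in *= 0x85ebca6b;
        in ^= in >>> 13;
        in *= 0xc2b2ae35;
        in ^= in >>> 16;
        return in;
    }

    // Murmur-style hash of an int that always returns a non-negative value.
    static int murmurHash(int code) {
        code *= 0xcc9e2d51;
        code = Integer.rotateLeft(code, 15);
        code *= 0x1b873593;
        code = Integer.rotateLeft(code, 13);
        code = code * 5 + 0xe6546b64;
        code ^= 4;
        code = bitMix(code);
        if (code >= 0) {
            return code;
        } else if (code != Integer.MIN_VALUE) {
            return -code;
        } else {
            return 0;
        }
    }

    // Key -> key group in [0, maxParallelism).
    static int assignToKeyGroup(Object key, int maxParallelism) {
        return murmurHash(key.hashCode()) % maxParallelism;
    }

    // Key group -> parallel subtask index in [0, parallelism).
    static int keyGroupToSubtask(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Counts how many of the given keys land on each subtask index.
    static Map<Integer, Integer> countKeysPerSubtask(
            List<?> keys, int maxParallelism, int parallelism) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int i = 0; i < parallelism; i++) {
            counts.put(i, 0); // include subtasks that receive no keys
        }
        for (Object key : keys) {
            int keyGroup = assignToKeyGroup(key, maxParallelism);
            int subtask = keyGroupToSubtask(keyGroup, maxParallelism, parallelism);
            counts.merge(subtask, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical keys; substitute the real key values of the job.
        List<String> keys = Arrays.asList(
                "/page-a", "/page-b", "/page-c", "/page-d", "/page-e", "/page-f");
        // Assumed settings: max parallelism 128, parallelism 2.
        System.out.println(countKeysPerSubtask(keys, 128, 2));
    }
}
```

With only a handful of distinct keys, an uneven split like the 4/2 one
above is quite likely for any hash-based scheme, so running the real key
set through a check like this before choosing the parallelism might be a
cheap substitute for a metric.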

I also discovered that one cannot control how keys are partitioned
across task managers (i.e. use keyBy after a custom partitioning step
such as partitionCustom). Is there any development effort in this
direction?

Best,
Flavio

[1] https://github.com/apache/flink-playgrounds/blob/master/docker/ops-playground-image/java/flink-playground-clickcountjob/src/main/java/org/apache/flink/playgrounds/ops/clickcount/ClickEventCount.java
