Hi all, I was going through the ClickEventCount example [1] from the Flink training, trying to understand why one TaskManager was reading at twice the rate of the other.
I had to step through a lot of Flink's internal code to find out that this comes from the hash function Flink uses to assign keys to parallel subtasks (and hence to TaskManagers): it was assigning 4 keys to one TM and 2 to the other.

Is there a smarter way to monitor this (e.g. a metric like taskManager_numKeys)?

I also discovered that one cannot control how keys are partitioned across TaskManagers (e.g. by using keyBy after a custom partitioner such as partitionCustom). Is there any development effort in this direction?

Best,
Flavio

[1] https://github.com/apache/flink-playgrounds/blob/master/docker/ops-playground-image/java/flink-playground-clickcountjob/src/main/java/org/apache/flink/playgrounds/ops/clickcount/ClickEventCount.java
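
P.S. For anyone who wants to check a key distribution without stepping through a debugger, here is a minimal standalone sketch of the key-to-subtask mapping as I understand it. Everything in it is an assumption on my side: the class and method names are made up, the hashing is my re-implementation of what MathUtils.murmurHash and KeyGroupRangeAssignment appear to do, the six page names are what I believe the playground generator emits, and 128 is what I believe the default max parallelism is for small jobs. In a real project one would call KeyGroupRangeAssignment.assignKeyToParallelOperator(key, maxParallelism, parallelism) from flink-runtime instead of copying the logic.

```java
import java.util.Arrays;
import java.util.List;

// Standalone sketch (hypothetical class, not part of Flink) of how a key
// ends up on a parallel subtask: key -> key group -> subtask index.
class KeyDistributionSketch {

    // Scrambles key.hashCode(); assumed to mirror Flink's MathUtils.murmurHash.
    static int murmurHash(int code) {
        code *= 0xcc9e2d51;
        code = Integer.rotateLeft(code, 15);
        code *= 0x1b873593;
        code = Integer.rotateLeft(code, 13);
        code = code * 5 + 0xe6546b64;
        code ^= 4;
        // final bit mixing
        code ^= code >>> 16;
        code *= 0x85ebca6b;
        code ^= code >>> 13;
        code *= 0xc2b2ae35;
        code ^= code >>> 16;
        // fold the result into a non-negative int
        if (code >= 0) {
            return code;
        }
        return code != Integer.MIN_VALUE ? -code : 0;
    }

    // Key group of a key: scrambled hash modulo max parallelism.
    static int keyGroupFor(Object key, int maxParallelism) {
        return murmurHash(key.hashCode()) % maxParallelism;
    }

    // Key groups are split into contiguous ranges, one range per subtask.
    static int subtaskForKeyGroup(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Counts how many of the given keys land on each subtask.
    static int[] keysPerSubtask(List<?> keys, int maxParallelism, int parallelism) {
        int[] counts = new int[parallelism];
        for (Object key : keys) {
            int keyGroup = keyGroupFor(key, maxParallelism);
            counts[subtaskForKeyGroup(keyGroup, maxParallelism, parallelism)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed key set of the playground job (page names); adjust as needed.
        List<String> pages =
                Arrays.asList("/help", "/index", "/shop", "/jobs", "/about", "/news");
        // Assumed settings: max parallelism 128, parallelism 2 (two TaskManagers).
        System.out.println(Arrays.toString(keysPerSubtask(pages, 128, 2)));
    }
}
```

Running main prints the number of keys per subtask for the assumed key set. With only a handful of distinct keys an uneven split is quite likely, since each key lands on a subtask more or less at random.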