Re: Monitor number of keys per Taskmanager

2019-10-25 Thread Flavio Pompermaier
Thank you all for the replies. Maybe I could set up some metrics and count the keys per subtask/slot myself. However, in the playground example there are 6 keys and they get distributed across the 2 slots as 4 and 2: is this a bug (since Piotr said that key groups can have sizes +/- 1 and in thi

Re: Monitor number of keys per Taskmanager

2019-10-23 Thread Till Rohrmann
Currently, we are not working on ensuring that the key groups are spread as evenly as possible. As a workaround, I would suggest increasing the number of key groups or changing the key function. Cheers, Till On Wed, Oct 23, 2019 at 1:42 PM Piotr Nowojski wrote: > Hi, > > This is a

Re: Monitor number of keys per Taskmanager

2019-10-23 Thread Piotr Nowojski
Hi, This is a known issue in Flink. For example, key groups can have sizes +/- 1 and they are currently randomly distributed across the cluster, so some machines will get more keys to handle than others. If the number of keys is relatively small, like 3 keys per key group, the load differenc
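
To make the key-group mechanism discussed above concrete, here is a minimal Python sketch of how keys can end up unevenly spread over subtasks. It is an illustration under stated assumptions, not Flink's code: the hash is a stand-in (CRC32) rather than the murmur hash Flink applies to the key's hashCode, the page names are hypothetical keys, and the function names are made up for this example. Only the last step, mapping a key group to a subtask as keyGroup * parallelism / maxParallelism, is meant to mirror Flink's contiguous key-group ranges.

```python
import zlib


def key_group_for(key: str, max_parallelism: int) -> int:
    # Assumption: CRC32 is a stand-in hash, NOT the hash Flink actually uses.
    return zlib.crc32(key.encode("utf-8")) % max_parallelism


def operator_index_for(key_group: int, max_parallelism: int, parallelism: int) -> int:
    # Each subtask owns one contiguous range of key groups.
    return key_group * parallelism // max_parallelism


def keys_per_subtask(keys, max_parallelism=128, parallelism=2):
    # Count how many distinct keys land on each subtask index.
    counts = {index: 0 for index in range(parallelism)}
    for key in keys:
        key_group = key_group_for(key, max_parallelism)
        counts[operator_index_for(key_group, max_parallelism, parallelism)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical keys; with only 6 keys nothing forces a 3/3 split.
    pages = ["/help", "/index", "/shop", "/jobs", "/about", "/news"]
    print(keys_per_subtask(pages))
```

With so few keys, the split depends entirely on where the hash happens to place each key, which is why changing the key function or the number of key groups can change the balance.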

Monitor number of keys per Taskmanager

2019-10-22 Thread Flavio Pompermaier
Hi all, I was looking into the ClickEventCount[1] example from the Flink training, trying to understand why one task manager was reading at twice the speed of the other. I had to debug a lot of Flink's internal code to understand that it depends on the adopted hash function (used by Fl