Hi everybody,

I want to improve the performance of broadcasts in Flink. Till suggested that I start a FLIP on this topic to discuss how to solve the current issues with broadcasts.
The problem in a nutshell: instead of sending broadcast data to each TaskManager only once, Flink currently sends it to each task. This means that with 3 slots on each TaskManager, we send the data 3 times instead of once.

There are multiple ways to tackle this problem, and I have started to research and investigate them. You can follow my thought process here:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-5%3A+Only+send+data+to+each+taskmanager+once+for+broadcasts

This is my first FLIP, so please correct me if I did something wrong. I am interested in your thoughts on how to solve this issue. Do you think my approach is heading in the right direction, or should we follow a totally different one? I am happy about any comment :)

Best regards,
Felix