Hi, I'm using Flink broadcast state similar to what Fabian explained in [1]. One difference might be the size of the broadcasted data; the size is around 150MB.
I've launched 32 TMs by setting - taskmanager.numberOfTaskSlots : 6 - parallelism of the non-broadcast side : 192 Here's some questions: 1) AFAIK, the broadcasted data (150MB) is sent to all 192 tasks. Is it right? 2) Any recommended way to broadcast data only to 32 TMs so that 6 tasks in each TM can read the broadcasted data? I'm considering implementing a static class for the non-broadcast side to directly load data only once on each TaskManager instead of the broadcast state (FYI, I'm using per-job clusters on YARN, so each TM is only for a single job). However, I'd like to use Flink native facilities if possible. The type of broadcasted data is Map<Long, Int> with around 600K entries, so every time the data is broadcasted a lot of GC is inevitable on each TM due to the (de)serialization cost. Any advice would be much appreciated. Best, Dongwon [1] https://flink.apache.org/2019/06/26/broadcast-state.html