Hello! We have a strangely behaving iterative Flink job: when we give it more memory, it gets much slower (more than 10 times). The problem seems to be mostly caused by garbage collection. Enabling object reuse didn't help.
With some profiling and debugging, we traced the problem to the operators requesting new memory segments from the memory manager at every superstep of the iteration. The memory manager satisfies these requests by allocating new memory segments from the Java heap [1], and the old ones eventually have to be reclaimed by garbage collection.

We found the option "taskmanager.memory.preallocate", which mostly solved the GC problem, but we would like to understand the situation better. What is the reason for this setting defaulting to false? Is there a downside to enabling it?

If the only downside is slower startup of the task managers, then we could have the best of both worlds by modifying the memory manager to pool segments once they have been released: the memory manager would return segments to the pool when the operators release them, even when "preallocate" is false, and `allocatePages` would use a new method of the memory pool, which would first check whether the pool has segments and then call `requestSegmentFromPool` or `allocateNewSegment` accordingly. (The current behaviour basically disables pooling when the "preallocate" setting is false.) See the sketch in the P.S. below.

[1] https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/memory/MemoryManager.java#L293-L307

Best,
Gábor and Márton
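P.S. To make the proposal concrete, here is a minimal sketch of the pool logic we have in mind. This is our own simplified illustration, not the actual MemoryManager code: the class and the new method `requestOrAllocateSegment` are hypothetical, while `allocateNewSegment` and `requestSegmentFromPool` stand in for the existing pool operations.

import java.util.ArrayDeque;

class MemorySegment {
    final byte[] buffer;

    MemorySegment(int size) {
        this.buffer = new byte[size];
    }
}

class SimplifiedMemoryPool {
    private final ArrayDeque<MemorySegment> availableMemory = new ArrayDeque<>();
    private final int segmentSize;

    SimplifiedMemoryPool(int segmentSize) {
        this.segmentSize = segmentSize;
    }

    // Current behaviour with preallocate=false: every request hits the
    // allocator, and released segments become garbage for the GC to collect.
    MemorySegment allocateNewSegment() {
        return new MemorySegment(segmentSize);
    }

    // Current behaviour with preallocate=true: requests are served from the pool.
    MemorySegment requestSegmentFromPool() {
        return availableMemory.remove();
    }

    // Proposed new method: prefer a pooled segment, and fall back to a fresh
    // allocation only when the pool is empty.
    MemorySegment requestOrAllocateSegment() {
        MemorySegment segment = availableMemory.poll();
        return segment != null ? segment : allocateNewSegment();
    }

    // Releases always return the segment to the pool instead of dropping it,
    // so segments get recycled across supersteps even when preallocate=false.
    void returnSegmentToPool(MemorySegment segment) {
        availableMemory.add(segment);
    }
}

With this, the allocate/release cycle that the iteration performs at every superstep would recycle segments through the pool instead of generating garbage, while task managers would still start up without the upfront cost of full preallocation.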