Hi Yufei,
My prime suspect would be changes to the memory configuration introduced in
1.11 [1].
Piotrek
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/release-notes/flink-1.11.html#memory-management
On Mon, 28 Dec 2020 at 09:52, Till Rohrmann wrote:
> Hi Yufei,
>
> I cannot
Hi Yufei,
I cannot remember exactly the changes in this area between Flink 1.10.0 and
Flink 1.12.0. It sounds a bit as if we were not releasing memory segments
fast enough or had a memory leak. One thing to try is to increase the
restart delay to see whether the former is the cause. Alternativ
Hi, Yufei.
Can you reproduce this issue in 1.10.0? The deterministic slot sharing
introduced in 1.12.0 is one possible reason. Before 1.12.0, the
distribution of tasks in slots was not deterministic. Even if the network
buffers are sufficient from the perspective of the cluster, a bad
distribution of tasks