Thanks for debugging and resolving the issue, and for driving the
discussion, Yun!
Of the proposed solutions, I prefer option 1 (supplying another Dockerfile
that uses jemalloc as the default memory allocator), for the following
reasons:
1. It's hard to say that jemalloc is always better than ptmalloc (glibc
malloc); otherwise glibc would already have adopted it as the default
memory allocator. And as indicated in [1], in some cases jemalloc can
consume as much as twice the memory that glibc does.
2. All existing Flink docker images use glibc. If we change the default
memory allocator to jemalloc and only supply one series of images, we will
leave users who get better performance with glibc no choice but to stay
on the old images. In other words, with option 2 there's a risk of
introducing new problems while fixing an existing one.
There is also a third option, considering the effort of maintaining more
images when the memory leak issue is not widely observed: we could
document the steps for building an image with jemalloc as the default
allocator, so users could build it when needed. That leaves the burden to
our users, though, so for me it's not the best option.
Best Regards,
Yu
[1] https://stackoverflow.com/a/33993215
On Tue, 13 Oct 2020 at 15:34, Yun Tang <myas...@live.com> wrote:
Hi all
Users have reported a serious memory leak when submitting jobs
continuously in session mode on k8s (please refer to FLINK-18712 [1]).
I reproduced the problem and found that it is caused by memory
fragmentation in glibc [2][3]. I provided two solutions to fix it:
* A quick but not very clean solution: limit glibc's memory pool by
  setting MALLOC_ARENA_MAX to 2.
* A more general solution: rebuild the image to install libjemalloc-dev
  and add libjemalloc.so to LD_PRELOAD.
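
For illustration, the two workarounds above could be applied in a
container entrypoint roughly as follows. This is only a sketch: the
JEMALLOC_LIB variable and the library path are assumptions (the actual
location of libjemalloc.so depends on the distribution and architecture),
and falling back from one workaround to the other is a choice made for
this example, not something the reports prescribe.

```shell
#!/bin/sh
# Assumed path: where libjemalloc.so is installed varies by distribution
# and CPU architecture, so allow it to be overridden from the environment.
JEMALLOC_LIB="${JEMALLOC_LIB:-/usr/lib/x86_64-linux-gnu/libjemalloc.so}"

if [ -f "$JEMALLOC_LIB" ]; then
    # General solution: preload jemalloc so that processes started from
    # this shell (e.g. the JVM) use it instead of glibc malloc.
    export LD_PRELOAD="$JEMALLOC_LIB"
else
    # Quick solution: stay on glibc malloc but cap the number of arenas.
    export MALLOC_ARENA_MAX=2
fi
```

Either variable has to be set before the JVM process starts; changing it
for an already running process has no effect.
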
The reporter eventually adopted the 2nd solution to fix the issue. That
made me think about whether we should change our Dockerfile to adopt
jemalloc as the default memory allocator [4].
From my point of view, we have two choices:
1. Introduce another Dockerfile using jemalloc as the default memory
allocator, which means Flink would need two additional image tags for
the jemalloc-based images, while the default images would still use glibc.
2. Set jemalloc as the default memory allocator in our existing
Dockerfiles, which means Flink would offer docker images with jemalloc by
default.
I prefer the 2nd option, as our company already uses jemalloc as the
default memory allocator for the JDK in our production environment, after
our OS team warned us about glibc's memory fragmentation.
Moreover, I found several open source projects that adopt jemalloc as the
default memory allocator in their images to resolve memory fragmentation
problems, e.g. Fluent Bit [5] and home-assistant [6].
What do you guys think of this issue?
[1] https://issues.apache.org/jira/browse/FLINK-18712
[2] https://www.gnu.org/software/libc/manual/html_mono/libc.html#Freeing-after-Malloc
[3] https://sourceware.org/bugzilla/show_bug.cgi?id=15321
[4] https://issues.apache.org/jira/browse/FLINK-19125
[5] https://docs.fluentbit.io/manual/v/1.0/installation/docker#why-there-is-no-fluent-bit-docker-image-based-on-alpine-linux
[6] https://github.com/home-assistant/core/pull/33237
Best
Yun Tang