Hello,
in our setup we have:
- Flink 1.11.2
- job submission via the REST API (first we upload the jar, then we submit
multiple jobs with it)
- additional jars embedded in the lib directory of the main jar (this is the
crucial part)
When we submit jobs this way, Flink creates new temp jar files via the
PackagedProgram.extractContainedLibraries method.
We observe that they are not removed after the job finishes. It seems that
PackagedProgram.deleteExtractedLibraries is not invoked when using the REST
API.
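
To make the extract/cleanup pairing concrete, here is a minimal sketch of the
pattern as we understand it: nested lib/*.jar entries are copied to temp files,
and something has to delete those files once the job is done. This is not
Flink's code. The class and method names (NestedJarExtraction,
extractNestedJars, deleteExtracted, buildDemoJar) are made up for illustration,
and the demo jar contains dummy bytes rather than real jars.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical illustration only; names are not taken from Flink.
final class NestedJarExtraction {

    /** Copies every lib/*.jar entry of the outer jar into its own temp file. */
    static List<Path> extractNestedJars(Path outerJar) {
        List<Path> extracted = new ArrayList<>();
        // try-with-resources closes the outer jar, so no handle to it leaks here
        try (JarFile jar = new JarFile(outerJar.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (!entry.isDirectory() && name.startsWith("lib/") && name.endsWith(".jar")) {
                    Path tmp = Files.createTempFile("nested_", ".jar");
                    try (InputStream in = jar.getInputStream(entry)) {
                        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
                    }
                    extracted.add(tmp);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return extracted;
    }

    /** Deletes the temp files again and returns how many were actually removed. */
    static int deleteExtracted(List<Path> extracted) {
        int deleted = 0;
        for (Path p : extracted) {
            try {
                if (Files.deleteIfExists(p)) {
                    deleted++;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return deleted;
    }

    /** Builds a small outer jar with the given entry names (dummy content). */
    static Path buildDemoJar(String... entryNames) {
        try {
            Path jar = Files.createTempFile("outer_", ".jar");
            try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
                for (String name : entryNames) {
                    out.putNextEntry(new JarEntry(name));
                    out.write(new byte[] {1, 2, 3});
                    out.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path outer = buildDemoJar("lib/a.jar", "lib/b.jar", "Main.class");
        List<Path> libs = extractNestedJars(outer);
        try {
            System.out.println("extracted temp jars: " + libs);
        } finally {
            // Without this step the temp files stay on disk, which is what we see.
            System.out.println("deleted: " + deleteExtracted(libs));
            Files.deleteIfExists(outer);
        }
    }
}
```

The point of the try/finally in main is that cleanup has to be tied to the end
of the job's lifecycle. Our suspicion is that the REST submission path lacks
the equivalent of that finally step.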
What's more, it seems that those jars remain open in the JobManager process.
When we delete them manually via scripts, the disk space is not reclaimed
until the process is restarted. We also see via heap dump inspection that
java.util.zip.ZipFile$Source objects remain, pointing to those files. This
is quite a problem for us, as we submit quite a few jobs, and after a while
we run out of either heap or disk space on the JobManager process/host.
Unfortunately, so far I cannot find where this leak happens...
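
For anyone who wants to check the same symptom, a rough sketch of one way to
list files that a process still holds open after they were deleted. It assumes
Linux, where /proc/<pid>/fd contains symlinks whose target ends in
" (deleted)" for such files, and it assumes the caller is allowed to read that
directory (same user or root). The class name DeletedOpenFiles is made up. On
other systems it simply returns an empty list.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; Linux-only by design.
final class DeletedOpenFiles {

    /**
     * Returns the link targets of all descriptors of the given process that
     * point to deleted files. Pass "self" to inspect the current JVM.
     */
    static List<String> find(String pid) {
        List<String> result = new ArrayList<>();
        Path fdDir = Paths.get("/proc", pid, "fd");
        if (!Files.isDirectory(fdDir)) {
            return result; // not Linux, no such process, or not accessible
        }
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    String target = Files.readSymbolicLink(fd).toString();
                    if (target.endsWith(" (deleted)")) {
                        result.add(target);
                    }
                } catch (IOException | UnsupportedOperationException | SecurityException e) {
                    // descriptor was closed in the meantime or is unreadable; skip it
                }
            }
        } catch (IOException | DirectoryIteratorException e) {
            // listing failed part-way; return what was collected so far
        }
        return result;
    }

    public static void main(String[] args) {
        String pid = args.length > 0 ? args[0] : "self";
        for (String target : find(pid)) {
            System.out.println(target);
        }
    }
}
```

Run with the JobManager pid as the only argument. Entries ending in .jar would
match what we see in the heap dump.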
Does anybody have pointers on where we should look, or on how to fix this
behaviour?
thanks,
maciek