Generally I see two possible explanations:
a) There's a memory leak somewhere. It would be good to know how the
baseline heap usage during idleness evolves over time. Are the same 20
jobs running continuously or are they (or others) periodically re-submitted?
b) The JVM simply hasn't had a reason to run a garbage collection yet.
That isn't unreasonable given that there's plenty of memory to go
around.
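
To tell a) and b) apart, one thing you could try is comparing heap usage
before and after an explicitly requested collection. If usage drops
sharply, most of it was collectible garbage (b); if it stays high, it is
live data, which may or may not be a leak (a). Below is a minimal sketch
using the standard java.lang.management API. HeapProbe is a made-up name,
not something Flink ships, and it only measures the JVM it runs in, so it
would have to run inside the JM/TM process. For an already running
process, `jcmd <pid> GC.run` requests a collection from the outside.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper (not part of Flink): compares heap usage
// before and after asking the JVM for a garbage collection.
final class HeapProbe {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    /** Heap bytes currently in use: live objects plus not-yet-collected garbage. */
    static long usedHeapBytes() {
        return MEMORY.getHeapMemoryUsage().getUsed();
    }

    /** Requests a collection and returns the heap usage measured afterwards. */
    static long usedHeapBytesAfterGcRequest() {
        // Equivalent to System.gc(): only a request, the JVM may ignore it
        // (for example when started with -XX:+DisableExplicitGC).
        MEMORY.gc();
        return usedHeapBytes();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        long after = usedHeapBytesAfterGcRequest();
        System.out.printf("heap used before GC request: %d MiB%n", before >> 20);
        System.out.printf("heap used after GC request:  %d MiB%n", after >> 20);
    }
}
```

The "after" value is the closest thing to the baseline mentioned in a);
logging it periodically gives the curve to watch.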
Overall, unless you run into OutOfMemoryErrors or the usage during
idleness keeps steadily rising, I wouldn't worry about it too much at
this time.
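
As for what "steadily rising" could mean in practice: a rough check is to
fit a straight line through periodic baseline samples (ideally taken
right after a GC) and look at the slope. The sketch below does that with
an ordinary least-squares fit. BaselineTrend and the threshold parameter
are made up for illustration, and a sensible threshold depends on your
sampling interval and heap size.

```java
// Hypothetical helper: estimates whether baseline heap samples trend upwards.
final class BaselineTrend {

    /** Least-squares slope of the samples, in bytes per sampling interval. */
    static double slope(long[] samples) {
        int n = samples.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double meanX = (n - 1) / 2.0; // sample indices are 0..n-1
        double meanY = 0;
        for (long s : samples) {
            meanY += s;
        }
        meanY /= n;

        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            double dx = i - meanX;
            num += dx * (samples[i] - meanY);
            den += dx * dx;
        }
        return num / den;
    }

    /** True if the fitted growth exceeds the given bytes-per-interval threshold. */
    static boolean steadilyRising(long[] samples, double minBytesPerInterval) {
        return slope(samples) > minBytesPerInterval;
    }
}
```

A sawtooth that keeps returning to the same floor gives a slope near zero,
which matches b); a floor that climbs with every sample gives a clearly
positive slope, which points towards a).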
On 1/27/2021 8:12 AM, Daniel Peled wrote:
Hi,
We have a Flink cluster with 1 JM and 7 TMs running about 20 jobs.
We have noticed that both the JM and the TMs are consuming a huge amount
of memory (several GB) *although the jobs are doing nothing*, meaning no
records are passing through the pipeline.
Checkpoints are enabled and the interval between checkpoints is 10
seconds (but again, no records are coming in).
Attached are screenshots of the metrics for the JM and one of the TMs.
Is that normal?
Any tips for debugging this issue?
BR,
Danny