Hi,

I have a pipeline running on Flink which ingests around 6k messages per
second. Each message is around 1 KB and passes through various stages,
such as a filter, a 5-second tumbling window per key, etc., and finally a
flatMap that does some computation before sending results to a Kafka sink.
The data is first ingested as protocol buffers, which subsequent operators
convert into POJOs.

There are lots of objects created inside the user functions, and some of
them are cached as well. I have been running this pipeline on 48 task slots
across 3 task managers, each allocated 22 GB of memory.
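To illustrate what I mean by object creation in the user functions, here is a minimal standalone sketch of the reuse pattern I have been trying to apply (the Event type and the filter condition are made up for illustration, not my actual message type or logic):

```java
// Sketch of per-message object reuse: with enableObjectReuse() set, an
// operator may mutate and re-emit one scratch instance instead of
// allocating a fresh POJO per message, which reduces young-gen pressure.
public class ReuseDemo {
    // Hypothetical POJO standing in for the real pipeline's message type.
    static final class Event {
        long timestamp;
        double value;
        Event set(long timestamp, double value) { // mutate in place
            this.timestamp = timestamp;
            this.value = value;
            return this;
        }
    }

    // Simulates processing n messages through a single reused instance.
    static long process(int n) {
        Event scratch = new Event(); // one allocation for the whole loop
        long matches = 0;
        for (int i = 0; i < n; i++) {
            Event e = scratch.set(i, i * 0.5); // no per-message garbage
            if (e.value > 100) {               // stand-in for filter logic
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(process(1_000_000)); // prints 999799
    }
}
```

Reusing one instance like this is only safe when downstream operators do not buffer references to emitted records, which is also the caveat for Flink's object-reuse mode.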

The issue I'm having is that within a period of 10 hours, almost 19k young
generation GCs have run, which is roughly one every 2 seconds, and the
reported GC time-taken value is more than 2 million. I have also enabled
object reuse. Any suggestions on how this issue could be resolved? Thanks.
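To dig further, I am planning to turn on GC logging for the task manager JVMs. This is a sketch assuming the env.java.opts.taskmanager option in flink-conf.yaml and JDK 8 style flags (on JDK 9+ the -Xlog:gc* flag would replace them):

```yaml
# flink-conf.yaml: enable GC logging on the TaskManager JVMs
env.java.opts.taskmanager: "-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/tm-gc.log"
```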

Regards,
Govind
