Re: Flink CDC job getting failed due to G1 old gc

Ayush Chauhan Fri, 30 Jul 2021 03:08:30 -0700

I am using RocksDB as the state backend. My pipeline's checkpoint size is
only about 100 KB.


I will add the GC logging and heap dump configuration and let you know of any findings.
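
As a rough sketch of the lightweight check David describes below (this is not
from the thread, and the class name GcSnapshot and its helper methods are
hypothetical), the standard java.lang.management beans expose cumulative GC
counters similar to what jstat prints. It only observes the JVM it runs in, so
it would have to run inside the task manager process or be adapted to a remote
JMX connection:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: samples GC activity of the current JVM.
class GcSnapshot {

    /** One line per collector: name, cumulative collection count, cumulative time in ms. */
    public static List<String> collectorLines() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s count=%d timeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    /** Used heap as a fraction of the maximum heap, or -1 if the maximum is undefined. */
    public static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples one second apart; the sample count is an arbitrary choice.
        for (int i = 0; i < 3; i++) {
            System.out.printf("heapUsedFraction=%.2f%n", heapUsedFraction());
            for (String line : collectorLines()) {
                System.out.println("  " + line);
            }
            Thread.sleep(1000);
        }
    }
}
```

Under G1 the collectors are typically reported as `G1 Young Generation` and
`G1 Old Generation`. An old-generation count that climbs quickly while the heap
fraction stays close to 1.0 would match the symptom in the subject line.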

Right now I suspect there is a memory leak either in the Flink CDC
code or in the Iceberg sink https://iceberg.apache.org/flink/#overwrite-data

On Fri, Jul 30, 2021 at 12:31 PM David Morávek <d...@apache.org> wrote:

> Hi Ayush,
>
> This would suggest that at least one of your task managers is running out
> of memory, which would cause frequent old-gen GC because a single cycle is
> not able to free up enough memory.
>
> What state backend are you using? If in-memory, off-loading state to
> RocksDB might help.
>
> Anyway, the general approach here would be the same as for any Java
> application:
> - You can enable GC logs and validate this really happens (a more
> lightweight check would be just using something like `jstat -gccause <pid>
> ...`).
> - Take a heap dump of the affected TM to see what exactly is consuming
> your memory (Eclipse MAT is fairly good with large heaps).
>
> Best,
> D.
>
>

-- 
 Ayush Chauhan
 Data Platform
 +91 9990747111

