Re: OutOfMemory Failure on Savepoint

2021-07-07 Thread Arvid Heise
Hi Abhishek, does your job use checkpointing? It looks like this is the first time the respective checkpoint/savepoint thread pool is touched, and at that point there are not enough handles available. Do you have a way to inspect the ulimits on the task managers? If you don't have a way to change the limits,
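For anyone wanting to run the ulimit check mentioned above, here is a minimal sketch (not part of the original message). It assumes you can execute a Python process inside the task manager container as the same user as the Flink JVM, which may not be possible on a managed service such as KDA. The helper names `describe_limit` and `read_limits` are made up for illustration; only the standard-library `resource` module is used. It reports the limits of the process it runs in, so the numbers only reflect the task manager if it runs in the same environment.

```python
import resource


def describe_limit(value):
    # RLIM_INFINITY is the sentinel the OS uses for "no limit".
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_limits():
    # Map a readable label to the name of the rlimit constant.
    # RLIMIT_NPROC is not defined on every platform, hence the getattr guard.
    names = {
        "open files (ulimit -n)": "RLIMIT_NOFILE",
        "processes/threads (ulimit -u)": "RLIMIT_NPROC",
    }
    limits = {}
    for label, const_name in names.items():
        rlimit = getattr(resource, const_name, None)
        if rlimit is None:
            continue
        soft, hard = resource.getrlimit(rlimit)
        limits[label] = (describe_limit(soft), describe_limit(hard))
    return limits


if __name__ == "__main__":
    for label, (soft, hard) in read_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```

A low soft limit on processes/threads or open files would be consistent with a thread pool failing the first time it is used, but this check alone does not confirm the cause.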

OutOfMemory Failure on Savepoint

2021-06-30 Thread Abhishek SP
Hello, I am observing a failure whenever I trigger a savepoint on my Flink application, which otherwise runs without issues. The app is deployed via AWS KDA (Kubernetes) with 256 KPU (6 task managers with 43 slots each; 1 KPU = 1 vCPU, 4 GB memory, and 50 GB disk space). It uses the RocksDB backend. The sav