Re: Limiting backpressure during checkpoints

2022-10-24 Thread Zakelly Lan
Hi Robin, You said that during the checkpoint async phase the CPU is stable at 100%, which is pretty strange to me. Normally the CPU usage of the TaskManager process could exceed 100%, depending on what all the threads are doing. I'm wondering if there is any scheduling mechanism controlling the C

Re: Limiting backpressure during checkpoints

2022-10-24 Thread Robin Cassan via user
Hello Yuan Mei! Thanks a lot for your answer :) About the CPU usage, it is pretty stable at 80% normally. Every 15 minutes we trigger a checkpoint, and during this time it is stable at 100%. I am starting to wonder if CPU is the real limiting factor, because when checking the Flink UI I see that mo

Re: Limiting backpressure during checkpoints

2022-10-19 Thread Yuan Mei
Hey Robin, Thanks for sharing the detailed information. May I ask: when you say "CPU usage is around 80% when checkpoints aren't running, and capped at 100% when they are", do you see zigzag patterns in CPU usage, or does it stay capped at 100%? I think one possibility is that the sy

Limiting backpressure during checkpoints

2022-10-13 Thread Robin Cassan via user
Hello all, hope you're well :) We are attempting to build a Flink job with minimal and stable latency (as much as possible) that consumes data from Kafka. Currently our main limitation happens when our job checkpoints the RocksDB state: backpressure is applied on the stream, causing latency. I am w