Hi Robin,
You said that during the checkpoint async phase the CPU is stable at
100%, which is pretty strange to me. Normally the CPU usage of the
TaskManager process could exceed 100%, depending on what all the
threads are doing. I'm wondering if there is any scheduling mechanism
controlling the C
Hello Yuan Mei! Thanks a lot for your answer :)
About the CPU usage, it is pretty stable at 80% normally. Every 15 minutes
we trigger a checkpoint, and during this time it is stable at 100%.
I am starting to wonder if CPU is the real limiting factor, because when
checking the Flink UI I see that mo
Hey Robin,
Thanks for sharing the detailed information. May I ask, when you are
saying "CPU usage is around 80% when checkpoints aren't running, and capped
at 100% when they are", do you see zigzag patterns of CPU usage, or does it
stay capped at 100%?
I think one possibility is that the sy
Hello all, hope you're well :)
We are attempting to build a Flink job with minimal and stable latency (as
much as possible) that consumes data from Kafka. Currently our main
limitation happens when our job checkpoints the RocksDB state: backpressure
is applied on the stream, causing latency. I am w