Hello,
I'm new to Flink and am doing a POC. I have a Flink job that reads events
from a Kafka source topic, performs some calculations, and writes the
results to a couple of SQL sinks.
I deployed this to a standalone cluster running on my Linux virtual
machine (all default settings).

Parallelism=3
NoOfTaskSlots allowed in config.yml=10
NoOfTaskSlots required for my job=3
Rest of the settings are default.

The job runs fine for the first 100,000 events, and the response is near
real time. After that, the first operator of the job starts to show
Busy (max): 100% and processing slows down significantly (see the picture
below).
Heap is at 50%.
Source lag (Kafka consumer lag) is 0. Source Kafka cluster CPU is <3%.


1. How can I triage what is causing the slowness? Is it a CPU or memory
issue, and how do I find out? Everything looks normal to me, and there are
no exceptions in the logs.
2. Why did the job process the first 100K events very fast and then start
slowing down? Any theory on this?
Please suggest. Thank you!


[image: Picture 1, Picture]
