Re: Diagnosing bottlenecks in Flink jobs

JING ZHANG Wed, 16 Jun 2021 22:30:10 -0700

Hi Dan,
Could you describe the problem with your job? Is it high latency or low
throughput?
Please check the job's throughput and latency first.
If the throughput keeps up with the rate at which the sources produce data
and the latency metric looks good, the job may simply be running well
without a bottleneck.
If you find abnormal throughput or latency, please check the following:
1. Check whether any operator is back pressured.
2. Check whether the checkpoint duration is long and whether the checkpoint
size is what you expect.
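
Both checks can be done in the Flink web UI. If you prefer to script them,
here is a minimal sketch against the JobManager REST API. It assumes the
REST endpoint is reachable at a base URL such as http://localhost:8081, and
that the responses use the field names I remember from the REST docs
("subtasks", "ratio", "latest", "completed", "end_to_end_duration",
"state_size"). Please verify them against the docs for your Flink version.
The helper names and the 0.5 threshold are my own, not part of Flink.

```python
import json
from urllib.request import urlopen


def fetch_json(base_url, path):
    """GET a JobManager REST path and decode the JSON body."""
    with urlopen(f"{base_url}{path}", timeout=10) as resp:
        return json.load(resp)


def backpressured_subtasks(payload, threshold=0.5):
    """Return (subtask index, ratio) pairs whose back pressure ratio
    exceeds the threshold. The threshold is an arbitrary choice."""
    return [
        (s["subtask"], s["ratio"])
        for s in payload.get("subtasks", [])
        if s.get("ratio", 0.0) > threshold
    ]


def latest_checkpoint_summary(payload):
    """Extract duration and size of the latest completed checkpoint,
    or None if no checkpoint has completed yet."""
    latest = (payload.get("latest") or {}).get("completed")
    if not latest:
        return None
    return {
        "duration_ms": latest.get("end_to_end_duration"),
        "state_size_bytes": latest.get("state_size"),
    }


def report(base_url, job_id, vertex_id):
    """Print both checks for one job vertex (needs a running cluster)."""
    bp = fetch_json(
        base_url, f"/jobs/{job_id}/vertices/{vertex_id}/backpressure"
    )
    print("back pressured subtasks:", backpressured_subtasks(bp))
    cp = fetch_json(base_url, f"/jobs/{job_id}/checkpoints")
    print("latest checkpoint:", latest_checkpoint_summary(cp))
```

You would call report("http://localhost:8081", job_id, vertex_id) once per
vertex of the job. A subtask with a high ratio is being slowed down by
something downstream of it, so the bottleneck is usually the first operator
after the last back pressured one.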


If you find anything abnormal about the job, please share the details in
this thread for deeper analysis.

Best,
JING ZHANG

Dan Hill <quietgol...@gmail.com> wrote on Thu, Jun 17, 2021 at 12:44 PM:

> We have a job that has been running but none of the AWS resource metrics
> for the EKS, EC2, MSK and EBS show any bottlenecks.  I have multiple 8
> cores allocated but only ~2 cores are used.  Most of the memory is not
> consumed.  MSK does not show much use.  EBS metrics look mostly idle.  I
> assumed I'd be able to see whichever resource is the bottleneck.
>
> Is there a good way to diagnose where the bottleneck is for a Flink job?
>
