Re: Diagnosing bottlenecks in Flink jobs

2021-06-17 Thread Dan Hill
Thanks Jing!

On Wed, Jun 16, 2021 at 11:30 PM JING ZHANG wrote:
> Hi Dan,
> It's better to split the Kafka partition into multiple partitions.
> Here is a way to try without splitting the Kafka partition. Add a
> rebalance shuffle between source and the downstream operators, set multiple
> paral

Re: Diagnosing bottlenecks in Flink jobs

2021-06-16 Thread JING ZHANG
Hi Dan, It's better to split the Kafka partition into multiple partitions. If you want to try without splitting the Kafka partition, add a rebalance shuffle between the source and the downstream operators, and set a parallelism greater than one for the downstream operators. But this way would introduce extra cpu c
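
To illustrate the rebalance suggestion above, here is a minimal sketch in plain Python (not Flink code) of how a round-robin shuffle spreads the records read by one source subtask across several downstream subtasks. The function name `rebalance`, the record values, and the parallelism of 4 are hypothetical and chosen only for illustration. In the Flink DataStream API the corresponding calls are `rebalance()` on the source stream and `setParallelism(...)` on the downstream operator; Flink's actual channel selection differs in detail from this simulation.

```python
from collections import Counter
from itertools import cycle


def rebalance(records, downstream_parallelism):
    """Assign each record to a downstream subtask index in round-robin order."""
    targets = cycle(range(downstream_parallelism))
    return [(next(targets), record) for record in records]


# Hypothetical records read by the single source subtask
# that is attached to the one Kafka partition.
records = [f"event-{i}" for i in range(12)]

# Spread them over 4 downstream subtasks (assumed parallelism).
assignments = rebalance(records, downstream_parallelism=4)

# Count how many records each downstream subtask receives.
load = Counter(subtask for subtask, _ in assignments)
print(dict(load))
```

The source still reads the partition with a single subtask, so this only helps when the expensive work is in the downstream operators rather than in the read itself.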

Re: Diagnosing bottlenecks in Flink jobs

2021-06-16 Thread Dan Hill
Thanks, JING ZHANG! I have one subtask for one Kafka source that is getting backpressure. Is there an easy way to split a single Kafka partition into multiple subtasks? Or do I need to split the Kafka partition?

On Wed, Jun 16, 2021 at 10:29 PM JING ZHANG wrote:
> Hi Dan,
> Would you please d

Re: Diagnosing bottlenecks in Flink jobs

2021-06-16 Thread JING ZHANG
Hi Dan, Would you please describe the problem with your job? Is it high latency or low throughput? Please first check the job throughput and latency. If the job throughput matches the rate at which the sources produce data and the latency metric is good, the job may be working well without bottlenecks. If

Diagnosing bottlenecks in Flink jobs

2021-06-16 Thread Dan Hill
We have a job that has been running, but none of the AWS resource metrics for EKS, EC2, MSK and EBS show any bottlenecks. I have multiple 8 cores allocated but only ~2 cores are used. Most of the memory is not consumed. MSK does not show much use. EBS metrics look mostly idle. I assumed I'd