Hi Dan,
The cleanest fix is to split the Kafka topic into more partitions. If you cannot repartition the topic, here is a workaround: add a rebalance shuffle between the source and the downstream operators, and set a higher parallelism for the downstream operators. However, this introduces extra CPU cost for serialization/deserialization and extra network cost for shuffling the data, so I'm not sure the benefit would offset the additional cost.
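In the DataStream API this is `stream.rebalance()` before the downstream operator, combined with `setParallelism(n)` on that operator. As a rough illustration (plain Python, not Flink code — the record values and parallelism are made up), `rebalance()` hands records from each source subtask round-robin to all downstream subtasks, so one hot Kafka partition gets processed by n parallel workers:

```python
# Sketch of the round-robin redistribution that rebalance() performs.
downstream_parallelism = 3          # assumed downstream operator parallelism
records = list(range(10))           # records from the single hot partition

# Each downstream subtask receives every n-th record.
subtasks = [[] for _ in range(downstream_parallelism)]
for i, record in enumerate(records):
    subtasks[i % downstream_parallelism].append(record)

for idx, work in enumerate(subtasks):
    print(f"subtask {idx}: {work}")
# subtask 0: [0, 3, 6, 9]
# subtask 1: [1, 4, 7]
# subtask 2: [2, 5, 8]
```

Note that this is why the serialization and network costs appear: every record now crosses an operator boundary instead of staying chained in the source task.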
Best,
JING ZHANG

Dan Hill <quietgol...@gmail.com> wrote on Thursday, June 17, 2021 at 1:49 PM:

> Thanks, JING ZHANG!
>
> I have one subtask for one Kafka source that is getting backpressure. Is
> there an easy way to split a single Kafka partition into multiple
> subtasks? Or do I need to split the Kafka partition?
>
> On Wed, Jun 16, 2021 at 10:29 PM JING ZHANG <beyond1...@gmail.com> wrote:
>
>> Hi Dan,
>> Would you please describe the problem with your job? High latency
>> or low throughput?
>> Please first check the job's throughput and latency.
>> If the throughput matches the speed at which the sources produce data and
>> the latency metric is good, the job may be working well without bottlenecks.
>> If you find abnormal throughput or latency, please check the following:
>> 1. Check the backpressure.
>> 2. Check whether the checkpoint duration is long and whether the
>> checkpoint size is as expected.
>>
>> Please share the details in this email for deeper analysis if you find
>> anything abnormal about the job.
>>
>> Best,
>> JING ZHANG
>>
>> Dan Hill <quietgol...@gmail.com> wrote on Thursday, June 17, 2021 at 12:44 PM:
>>
>>> We have a job that has been running, but none of the AWS resource metrics
>>> for EKS, EC2, MSK, and EBS show any bottlenecks. I have multiple 8-core
>>> nodes allocated, but only ~2 cores are used. Most of the memory is not
>>> consumed. MSK does not show much use. EBS metrics look mostly idle. I
>>> assumed I'd be able to see whichever resource is the bottleneck.
>>>
>>> Is there a good way to diagnose where the bottleneck is for a Flink job?
>>>
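For the backpressure check mentioned in the steps above, besides the web UI you can sample it through the JobManager's REST API. A sketch assuming the default REST port 8081 on localhost; `<job-id>` and `<vertex-id>` are placeholders you would fill in from the earlier responses:

```shell
# List running jobs to find the job ID.
curl http://localhost:8081/jobs

# Per-vertex detail for the job (vertex IDs, parallelism, metrics).
curl http://localhost:8081/jobs/<job-id>

# Sample backpressure for one operator (vertex) of the job.
curl http://localhost:8081/jobs/<job-id>/vertices/<vertex-id>/backpressure
```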