How many taskmanagers and server do you have? Can you also share the task managers page of flink dashboard?
On Mon, Apr 15, 2024 at 10:58 AM Oscar Perez via user <user@flink.apache.org> wrote: > Hi community! > > We have an interesting problem with Flink after increasing parallelism in > a certain way. Here is the summary: > > 1) We identified that our job bottleneck were some Co-keyed process > operators that were affecting on previous operators causing backpressure. > 2( What we did was to increase the parallelism to all the operators from 6 > to 12 but keeping 6 these operators that read from kafka. The main reason > was that all our topics have 6 partitions so increasing the parallelism > will not yield better performance > > See attached job layout prior and after the changes: > What happens was that some operations that were chained in the same > operator like reading - filter - map - filter now are rebalanced and the > overall performance of the job is suffering (keeps throwing exceptions now > and then) > > Is the rebalance operation going over the network or this happens in the > same node? How can we effectively improve performance of this job with the > given resources? > > Thanks for the input! > Regards, > Oscar > > >