Hi everyone,

I have deployed a Flink cluster using the Flink Kubernetes operator and then submitted an Apache Beam pipeline using the FlinkRunner.
I submitted two jobs, one with *parallelism=20* and another with *parallelism=1*, but both took almost the same time to complete (a difference of a few seconds), which is very surprising. In the Flink UI, I can see that the parallelism is set to 20 and that each task has 20 subtasks. However, 19 of the subtasks finish within a few seconds to a few minutes, and only one subtask runs for the majority of the time. I have attached a screenshot below, in which one subtask took nearly 1 hour 14 minutes and the remaining 19 subtasks took less than 2 minutes each. So I am not getting any benefit from parallelism here. The task is simple and does not rely on any state, yet it is not using the available resources to parallelize the work.

*Is there any way to force parallelism here?*

[image: image.png]

The task has multiple steps, and the final step writes the output to the bucket. The output is written to multiple files, so the task should be parallelizable, but only one subtask is doing the actual work.

[image: image.png]

Can someone help me figure out the right configuration and setup needed to parallelize the work?

Regards,
Dipak
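
P.S. To make the skew concrete, here is a rough Python sketch of the timing arithmetic using the numbers from the screenshot. It is only an illustration, not part of my pipeline: the function names are hypothetical, and I am assuming 2 minutes as an upper bound for each of the 19 fast subtasks. It shows that the stage finishes only when the slowest subtask does, and what the duration could look like if the same work were spread evenly.

```python
# Durations in minutes, taken from the Flink UI screenshot described above.
# ASSUMPTION: "less than 2 minutes" is rounded up to 2.0 for each fast subtask,
# so the totals below are upper bounds, not measurements.
SLOW_SUBTASK_MIN = 74.0   # the one subtask that ran for about 1 h 14 min
FAST_SUBTASK_MIN = 2.0    # upper bound for each of the other 19 subtasks
PARALLELISM = 20


def wall_clock(durations):
    """A stage finishes only when its slowest subtask finishes."""
    return max(durations)


def balanced_wall_clock(durations, parallelism):
    """Hypothetical duration if the same total work were split evenly."""
    return sum(durations) / parallelism


durations = [SLOW_SUBTASK_MIN] + [FAST_SUBTASK_MIN] * (PARALLELISM - 1)

if __name__ == "__main__":
    print(f"observed stage time : {wall_clock(durations):.1f} min")
    print(f"evenly balanced time: {balanced_wall_clock(durations, PARALLELISM):.1f} min")
```

Under these assumptions, the observed time is dominated entirely by the single slow subtask, which matches what I see with both parallelism settings.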