Hi Ky Alexandre,

I would recommend reading this section, which explains slot sharing between tasks: <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/concepts/flink-architecture/#task-slots-and-resources>
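To make that concrete, here is a minimal sketch (assuming PyFlink's DataStream API and a running Flink cluster or local minicluster; the group name "heavy" and the operator bodies are illustrative, not from your job):

```python
from pyflink.datastream import StreamExecutionEnvironment, RuntimeExecutionMode

env = StreamExecutionEnvironment.get_execution_environment()
env.set_runtime_mode(RuntimeExecutionMode.STREAMING)

source = env.from_collection([1, 2, 3])

# By default every operator belongs to the "default" slot sharing group,
# so the whole pipeline can run inside one slot. Assigning an operator to
# a different slot sharing group forces it (and downstream operators,
# which inherit the group) into separate slots.
heavy = source.map(lambda x: x * 2).slot_sharing_group("heavy")
heavy.print()

env.execute("slot-sharing-demo")
```

With the default group your three tasks all fit in one slot, which is exactly what you observed in the TaskManager UI.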
Quote:

> By default, Flink allows subtasks to share slots even if they are subtasks of different tasks, so long as they are from the same job. The result is that one slot may hold an entire pipeline of the job.

For your use case, I would suggest either of two options:

1. Set a different parallelism for each task (say 3 for the heavy compute task, 2 for the medium one, and 1 for the lightest). This ensures you utilize all 3 task slots.
2. Use fine-grained resource management on the job: <https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/finegrained_resource/>

Hope this helps.

Regards,
Saurabh

On Fri, Jul 12, 2024 at 8:47 PM Alexandre KY <alexandre...@magellium.fr> wrote:

> Hello,
>
> I am trying to run a pipeline made of 3 tasks (have a look at flink_job.png):
>
> - Task 1: Source, FlatMap, Map, keyBy
> - Task 2: Window, Map, keyBy
> - Task 3: FlatMap, Map, Sink
>
> From what I have read, in streaming mode, all the tasks run simultaneously. Therefore, each task should take one task slot, as described here: <https://medium.com/@LIME0040/flink-trouble-shooting-techniques-i-resource-management-73c61fcc8022>. However, as you can see in the picture flink_tm (I run the job on a cluster made of 1 JobManager and 1 TaskManager), the TaskManager has 3 slots, but only 1 of them is being used even though all 3 tasks are running. The first task is still creating more data (it is supposed to produce 25 outputs) to send to the second one, and even when the third task receives data, the number of task slots used remains 1.
> I don't understand why Flink doesn't use all the task slots, which leads it to behave similarly to batch mode: it tends to produce all the outputs of Task 1, then produces only the outputs of Task 2 and crashes because my computer runs out of memory, since it keeps accumulating the outputs of Task 2 in memory before sending them to Task 3, despite my setting `env.set_runtime_mode(RuntimeExecutionMode.STREAMING)`.
>
> I said "tends" because while Task 2 was processing the 25 products, Task 3 received 2 of them and produced 2 outputs, but after that it stopped (the number of records received remained 2) and Flink only ran Task 2 (I see it in the logs) until the memory exploded.
>
> To sum up, I have no idea why Flink doesn't use all the task slots available despite having more than 1 task. Also, shouldn't the backpressure mechanism stop Task 2, since its output buffer is getting full? Or maybe I should reduce the number of buffers to make backpressure kick in?
>
> Thanks in advance and best regards,
>
> Ky Alexandre
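For reference, option 1 above could look roughly like this in PyFlink (a sketch only: it assumes PyFlink is installed and uses placeholder operator bodies, not the actual FlatMap/Window/Sink logic from the job):

```python
from pyflink.datastream import StreamExecutionEnvironment, RuntimeExecutionMode

env = StreamExecutionEnvironment.get_execution_environment()
env.set_runtime_mode(RuntimeExecutionMode.STREAMING)

stream = env.from_collection(range(100))

# With default slot sharing, a job needs as many slots as its *highest*
# operator parallelism, so parallelisms 3 / 2 / 1 below occupy all
# 3 slots of a 3-slot TaskManager.
heavy = stream.map(lambda x: x * x).set_parallelism(3)   # heavy compute task
medium = heavy.map(lambda x: x + 1).set_parallelism(2)   # medium task
medium.map(str).set_parallelism(1).print()               # lightest task / sink

env.execute("per-operator-parallelism-demo")
```

Note that even with one slot in use, all three tasks were genuinely running concurrently inside that slot; spreading them over more slots helps with resource isolation, but the out-of-memory accumulation you describe is more likely a buffering/backpressure tuning question than a slot-count one.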