I have a sample application that reads around 2 GB of CSV files, converts each record into an Avro object, and sends it to Kafka. I use a custom FileReader that reads the files in a directory. I have set taskmanager.numberOfTaskSlots to 4. If I use setParallelism(3), 3 subtasks are created, but if I use setMaxParallelism(3), only 1 subtask is created.
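
To make the difference between the two settings concrete, here is a minimal sketch of how I understand Flink's key-group model. setMaxParallelism fixes the number of key groups, which is only an upper bound for later rescaling. setParallelism fixes how many subtasks actually run. If only the maximum is set, the operator presumably falls back to the configured default parallelism, which would explain the single subtask if that default is 1. The class and method names below are hypothetical, and this is plain Java with no Flink dependency. The index formula is my assumption of what Flink's KeyGroupRangeAssignment does, not code copied from Flink.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-alone model of key-group assignment; this is not Flink code.
public class KeyGroupSketch {

    // Assumed formula: key group g is handled by subtask g * parallelism / maxParallelism.
    public static int subtaskForKeyGroup(int maxParallelism, int parallelism, int keyGroup) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Collects the distinct subtask indices that receive at least one key group.
    public static Set<Integer> activeSubtasks(int maxParallelism, int parallelism) {
        Set<Integer> subtasks = new TreeSet<>();
        for (int g = 0; g < maxParallelism; g++) {
            subtasks.add(subtaskForKeyGroup(maxParallelism, parallelism, g));
        }
        return subtasks;
    }

    public static void main(String[] args) {
        // Max parallelism 3 with parallelism 1: all key groups land on subtask 0.
        System.out.println(activeSubtasks(3, 1));
        // Max parallelism 3 with parallelism 3: one key group per subtask.
        System.out.println(activeSubtasks(3, 3));
        // A larger max parallelism does not add subtasks; parallelism still decides.
        System.out.println(activeSubtasks(128, 3));
    }
}
```

In this model the number of subtasks always equals the parallelism argument, whatever the maximum is. That matches what I observe: setMaxParallelism(3) on its own leaves the job at 1 subtask.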

On Wed, Mar 28, 2018 at 12:29 PM, Jörn Franke <jornfra...@gmail.com> wrote:

> What was the input format, the size and the program that you tried to
> execute
>
> On 28. Mar 2018, at 08:18, Data Engineer <dataenginee...@gmail.com> wrote:
>
> I went through the explanation on MaxParallelism in the official docs here:
> https://ci.apache.org/projects/flink/flink-docs-master/ops/production_ready.html#set-maximum-parallelism-for-operators-explicitly
>
> However, I am not able to figure out how Flink decides the parallelism
> value.
> For instance, if I setMaxParallelism to 3, I see that for my job, there is
> only 1 subtask that is created. How did Flink decide that 1 subtask was
> enough?
>
> Regards,
> James