-artisans.com]
Sent: Wednesday, March 28, 2018 8:54 AM
To: Data Engineer
Cc: Jörn Franke ; user@flink.apache.org
Subject: Re: How does setMaxParallelism work
Flink does not decide the parallelism based on your job.
There is a default parallelism (configured via parallelism.default [1],
by default 1) which is used if you do not specify it yourself.
Nico
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/config.html#common-options
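The fallback described above can be sketched in plain Java. This is only an illustration, not Flink code: the class ParallelismSketch, the method effectiveParallelism and the UNSET marker are hypothetical names. It assumes that a value set on an operator takes precedence over one set on the execution environment, and that parallelism.default is used when neither is set.

```java
public class ParallelismSketch {
    // Hypothetical marker for "the user did not call setParallelism here".
    static final int UNSET = -1;

    // Hypothetical helper, not a Flink API: returns the most specific
    // parallelism that was set, falling back to the configured default.
    static int effectiveParallelism(int operatorLevel,
                                    int environmentLevel,
                                    int configuredDefault) {
        if (operatorLevel != UNSET) {
            return operatorLevel;
        }
        if (environmentLevel != UNSET) {
            return environmentLevel;
        }
        return configuredDefault;
    }

    public static void main(String[] args) {
        int parallelismDefault = 1; // parallelism.default when left unchanged

        // Nothing set anywhere: the configured default applies.
        System.out.println(effectiveParallelism(UNSET, UNSET, parallelismDefault));

        // setParallelism(3) on the operator: three subtasks.
        System.out.println(effectiveParallelism(3, UNSET, parallelismDefault));
    }
}
```

Under these assumptions the job gets one subtask simply because nothing more specific was configured, not because Flink inspected the workload.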
On
Agreed. But how did Flink decide that it should allot 1 subtask? Why not 2
or 3?
I am trying to understand the implications of using setMaxParallelism vs
setParallelism
On Wed, Mar 28, 2018 at 2:58 PM, Nico Kruber wrote:
Hi James,
the number of subtasks being used is defined by the parallelism; the max
parallelism, however, "... determines the maximum parallelism to which
you can scale operators" [1]. That is, once set, you cannot ever (even
after restarting your program from a savepoint) increase the operator's
pa
I have a sample application that reads around 2 GB of CSV files, converts
each record into an Avro object and sends it to Kafka.
I use a custom FileReader that reads the files in a directory.
I have set taskmanager.numberOfTaskSlots to 4.
I see that if I use setParallelism(3), 3 subtasks are created.
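On the setParallelism versus setMaxParallelism question, a small plain-Java model may help. It is a sketch under assumptions, not Flink code: the class KeyGroupSketch and its methods are hypothetical names. It assumes that, for keyed state, the max parallelism fixes the number of key groups and that a key group is mapped to a subtask by the integer formula keyGroup * parallelism / maxParallelism. The parallelism alone decides how many subtasks exist.

```java
import java.util.Arrays;

public class KeyGroupSketch {
    // Assumed mapping from a key group to the index of the subtask that owns it.
    static int subtaskForKeyGroup(int maxParallelism, int parallelism, int keyGroup) {
        return keyGroup * parallelism / maxParallelism;
    }

    // Counts how many key groups each subtask would own.
    static int[] keyGroupsPerSubtask(int maxParallelism, int parallelism) {
        int[] counts = new int[parallelism];
        for (int keyGroup = 0; keyGroup < maxParallelism; keyGroup++) {
            counts[subtaskForKeyGroup(maxParallelism, parallelism, keyGroup)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Parallelism 3 with 128 key groups: three subtasks share the groups.
        System.out.println(Arrays.toString(keyGroupsPerSubtask(128, 3)));
        // prints [43, 43, 42]

        // Parallelism 5 with only 4 key groups: one subtask is left with nothing.
        System.out.println(Arrays.toString(keyGroupsPerSubtask(4, 5)));
        // prints [1, 1, 1, 1, 0]
    }
}
```

In this model, raising the max parallelism does not create more subtasks; it only leaves room to raise the parallelism later. The second call hints at why a parallelism above the max parallelism makes no sense: there would be subtasks without any key group to work on.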
What was the input format, the size and the program that you tried to execute?
> On 28. Mar 2018, at 08:18, Data Engineer wrote:
I went through the explanation on MaxParallelism in the official docs here:
https://ci.apache.org/projects/flink/flink-docs-master/ops/production_ready.html#set-maximum-parallelism-for-operators-explicitly
However, I am not able to figure out how Flink decides the parallelism
value.
For instance,