>
> Increasing network memory buffers (fraction, min, max) seems to increase
> tasks slightly.

That's weird. I don't think the number of network memory buffers has
anything to do with the number of tasks.

Let me try to clarify a few things.

Please be aware that how many tasks a Flink job has and how many slots a
Flink cluster has are two different things.
- The number of tasks is decided by your job's parallelism and topology.
E.g., if your job graph has 3 vertices A, B and C, with parallelisms 2, 3
and 4 respectively, then you would have 9 (2+3+4) tasks in total.
- The number of slots is decided by the number of TMs and the number of
slots per TM.
- For streaming jobs, you have to make sure the number of slots is enough
to execute all your tasks. By default, the number of slots needed to
execute your job is the max parallelism among your job graph's vertices.
In the above example, you would need 4 slots, because that is the max of
the vertices' parallelisms (2, 3, 4).
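The arithmetic above can be sketched in a few lines of plain Java (this is
not Flink API code; the class and method names are just for illustration):

```java
// Total tasks = sum of the vertex parallelisms.
// Required slots (by default) = max of the vertex parallelisms.
public class TaskSlotMath {
    static int totalTasks(int[] parallelisms) {
        int sum = 0;
        for (int p : parallelisms) sum += p;
        return sum;
    }

    static int requiredSlots(int[] parallelisms) {
        int max = 0;
        for (int p : parallelisms) max = Math.max(max, p);
        return max;
    }

    public static void main(String[] args) {
        int[] parallelisms = {2, 3, 4}; // vertices A, B, C from the example
        System.out.println("tasks=" + totalTasks(parallelisms)
                + " slots=" + requiredSlots(parallelisms));
        // prints "tasks=9 slots=4"
    }
}
```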

In your case, the screenshot shows that your job has 9621 tasks in total
(not around 18000; the dark box shows total tasks while the green box
shows running tasks), and 600 slots are in use (658 - 58), suggesting that
the max parallelism among your job graph's vertices is 600.

If you want to increase the number of tasks, you should increase your
job's parallelism. There are several ways to do that.

   - In your job code (assuming you are using the DataStream API):
      - Use `StreamExecutionEnvironment#setParallelism()` to set the
      parallelism for all operators.
      - Use `SingleOutputStreamOperator#setParallelism()` to set the
      parallelism for a specific operator. (Only supported for subclasses
      of `SingleOutputStreamOperator`.)
   - When submitting your job, pass `-p <parallelism>` as an argument to
   the `flink run` command, to set the parallelism for all operators.
   - Set `parallelism.default` in your `flink-conf.yaml`, to set a default
   parallelism for your jobs. This will be used for jobs that have not set
   a parallelism with either of the above methods.
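As a concrete sketch of the last two options (the value 600 and the jar
path below are placeholders for illustration, not taken from your setup):

```yaml
# flink-conf.yaml -- assumed values for illustration only
parallelism.default: 600        # used by jobs that set no parallelism themselves
taskmanager.numberOfTaskSlots: 15
```

Alternatively, override it per submission with, e.g.,
`flink run -p 600 path/to/your-job.jar`; the `-p` flag takes precedence
over `parallelism.default`.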


Thank you~

Xintong Song



On Sat, May 23, 2020 at 1:11 AM Vijay Balakrishnan <bvija...@gmail.com>
wrote:

> Hi Xintong,
> Thx for your reply.  Increasing network memory buffers (fraction, min,
> max) seems to increase tasks slightly.
>
> Streaming job
> Standalone
>
> Vijay
>
> On Fri, May 22, 2020 at 2:49 AM Xintong Song <tonysong...@gmail.com>
> wrote:
>
>> Hi Vijay,
>>
>> I don't think your problem is related to the number of open files. The
>> parallelism of your job is decided before Flink actually tries to open the
>> files. And if the OS limit on open files were reached, you would see a job
>> execution failure, instead of a successful execution with a lower
>> parallelism.
>>
>> Could you share some more information about your use case?
>>
>>    - What kind of job are you executing? Is it a streaming or batch
>>    processing job?
>>    - Which Flink deployment do you use? Standalone? Yarn?
>>    - It would be helpful if you can share the Flink logs.
>>
>>
>> Thank you~
>>
>> Xintong Song
>>
>>
>>
>> On Wed, May 20, 2020 at 11:50 PM Vijay Balakrishnan <bvija...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> I have increased the number of slots available but the Job is not using
>>> all the slots but runs into this approximate 18000 Tasks limit. Looking
>>> into the source code, it seems to be opening file -
>>> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/io/FileOutputFormat.java#L203
>>> So, do I have to tune the ulimit or something similar at the Ubuntu O/S
>>> level to increase the number of tasks available? What I am confused about
>>> is that the ulimit is per machine but the ExecutionGraph is across many
>>> machines? Please pardon my ignorance here. Does the number of tasks equate
>>> to the number of open files? I am using 15 slots per TaskManager on AWS
>>> m5.4xlarge, which has 16 vCPUs.
>>>
>>> TIA.
>>>
>>> On Tue, May 19, 2020 at 3:22 PM Vijay Balakrishnan <bvija...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> The Flink Dashboard UI seems to show a hard limit of around 18000 for
>>>> the Tasks column on an Ubuntu Linux box.
>>>> I kept increasing the number of slots per task manager to 15 and the
>>>> number of slots increased to 705, but the task count stayed at around
>>>> 18000. Below 18000 tasks, the Flink job is able to start up.
>>>> Even though I increased the number of slots, it still works when only
>>>> 312 slots are being used.
>>>>
>>>> taskmanager.numberOfTaskSlots: 15
>>>>
>>>> What knob can I tune to increase the number of Tasks ?
>>>>
>>>> Pls find attached the Flink Dashboard UI.
>>>>
>>>> TIA,
>>>>
>>>>
