Hey Punit,

You need to give the task managers more network buffers, as Robert
suggested. Using the formula from the docs, can you please set the
number of network buffers (taskmanager.network.numberOfBuffers) to
147456 (96^2*4*4)? Each buffer is 32 KB, meaning that you give 4.5 GB
of memory to the network stack. You might have to adjust the heap
memory (taskmanager.heap.mb) you give to the task managers accordingly.

Does this solve it?

– Ufuk


On Sat, May 7, 2016 at 10:50 AM, Punit Naik <naik.puni...@gmail.com> wrote:
> I am afraid not.
>
> On 07-May-2016 1:24 PM, "Aljoscha Krettek" <aljos...@apache.org> wrote:
>>
>> Could it be that the TaskManagers are configured with not enough memory?
>>
>> On Thu, 5 May 2016 at 13:35 Robert Metzger <rmetz...@apache.org> wrote:
>>>
>>> The default value of taskmanager.network.numberOfBuffers is 2048. I would
>>> recommend using a multiple of that value, for example 16384 (given that you
>>> have enough memory per TaskManager).
>>>
>>> I recommend checking out these slides I created a while ago. They explain
>>> what the network buffers are needed for:
>>> http://www.slideshare.net/robertmetzger1/apache-flink-hands-on#37
>>>
>>>
>>> On Thu, May 5, 2016 at 1:30 PM, Punit Naik <naik.puni...@gmail.com>
>>> wrote:
>>>>
>>>> Yes, I followed it and changed it to 298, but again it said the same
>>>> thing. The only change was that it now said "required 298, but only 200
>>>> available".
>>>>
>>>> Why did it say that?
>>>>
>>>> On Thu, May 5, 2016 at 4:50 PM, Robert Metzger <rmetz...@apache.org>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I think you've chosen a good initial value for the parallelism.
>>>>> The higher the parallelism, the more network buffers are needed. I
>>>>> would follow the recommendation from the exception and increase the number
>>>>> of network buffers.
>>>>>
>>>>> On Thu, May 5, 2016 at 11:23 AM, Punit Naik <naik.puni...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hello
>>>>>>
>>>>>> I was running a program with 'parallelism.default' of 384 as I read in
>>>>>> the documentation on Flink's official page that 'parallelism.default' is
>>>>>> "the total number of CPUs in the cluster". I have four machines with 96
>>>>>> cores on each of them. So 96*4=384. But the program threw an error saying:
>>>>>>
>>>>>> Caused by: java.io.IOException: Insufficient number of network
>>>>>> buffers: required 384, but only 298 available. The total number of
>>>>>> network buffers is currently set to 2048. You can increase this number
>>>>>> by setting the configuration key 'taskmanager.network.numberOfBuffers'.
>>>>>>
>>>>>> What does this mean? And how do I choose a proper value for parallelism?
>>>>>>
>>>>>> --
>>>>>> Thank You
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Punit Naik
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Thank You
>>>>
>>>> Regards
>>>>
>>>> Punit Naik
