Hi Ufuk,

Thanks for the detailed answer. I will definitely try this and get back to you.

On 09-May-2016 2:08 PM, "Ufuk Celebi" <u...@apache.org> wrote:
> Hey Punit,
>
> you need to give the task managers more network buffers as Robert
> suggested. Using the formula from the docs, can you please use 147456
> (96^2*4*4) for the number of network buffers. Each buffer is 32 KB,
> meaning that you give 4.5 GB of memory to the network stack. You might
> have to adjust the heap memory (taskmanager.heap.mb) you give to the
> task managers accordingly.
>
> Does this solve it?
>
> – Ufuk
>
>
> On Sat, May 7, 2016 at 10:50 AM, Punit Naik <naik.puni...@gmail.com> wrote:
> > I am afraid not.
> >
> > On 07-May-2016 1:24 PM, "Aljoscha Krettek" <aljos...@apache.org> wrote:
> >>
> >> Could it be that the TaskManagers are configured with not enough memory?
> >>
> >> On Thu, 5 May 2016 at 13:35 Robert Metzger <rmetz...@apache.org> wrote:
> >>>
> >>> The default value of taskmanager.network.numberOfBuffers is 2048. I would
> >>> recommend using a multiple of that value, for example 16384 (given that
> >>> you have enough memory per TaskManager).
> >>>
> >>> I recommend checking out these slides I created a while ago. They explain
> >>> what the network buffers are needed for:
> >>> http://www.slideshare.net/robertmetzger1/apache-flink-hands-on#37
> >>>
> >>>
> >>> On Thu, May 5, 2016 at 1:30 PM, Punit Naik <naik.puni...@gmail.com>
> >>> wrote:
> >>>>
> >>>> Yes I followed it and changed it to 298 but again it said the same
> >>>> thing. The only change was that it now said "required 298, but only 200
> >>>> available".
> >>>>
> >>>> Why did it say that?
> >>>>
> >>>> On Thu, May 5, 2016 at 4:50 PM, Robert Metzger <rmetz...@apache.org>
> >>>> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> I think you've chosen a good initial value for the parallelism.
> >>>>> The higher the parallelism, the more network buffers are needed. I
> >>>>> would follow the recommendation from the exception and increase the
> >>>>> number of network buffers.
> >>>>>
> >>>>> On Thu, May 5, 2016 at 11:23 AM, Punit Naik <naik.puni...@gmail.com>
> >>>>> wrote:
> >>>>>>
> >>>>>> Hello
> >>>>>>
> >>>>>> I was running a program with 'parallelism.default' of 384 as I read in
> >>>>>> the documentation on Flink's official page that 'parallelism.default' is
> >>>>>> "the total number of CPUs in the cluster". I have four machines with 96
> >>>>>> cores on each of them. So 96*4=384. But the program threw an error saying:
> >>>>>>
> >>>>>> Caused by: java.io.IOException: Insufficient number of network
> >>>>>> buffers: required 384, but only 298 available. The total number of
> >>>>>> network buffers is currently set to 2048. You can increase this number
> >>>>>> by setting the configuration key 'taskmanager.network.numberOfBuffers'.
> >>>>>>
> >>>>>> What does this mean? And how to choose a proper value for parallelism?
> >>>>>>
> >>>>>> --
> >>>>>> Thank You
> >>>>>>
> >>>>>> Regards
> >>>>>>
> >>>>>> Punit Naik
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Thank You
> >>>>
> >>>> Regards
> >>>>
> >>>> Punit Naik
>
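
For reference, the arithmetic in Ufuk's reply above can be checked with a short sketch. It assumes the formula is (slots per TaskManager)^2 * (number of TaskManagers) * 4, which is how the 96^2*4*4 in the reply reads, and it uses the 32 KB buffer size stated there. The function names are hypothetical and only serve the illustration; nothing here is Flink API.

```python
# Sketch of the network buffer sizing arithmetic quoted in the thread.
# Assumption: buffers = slots_per_tm^2 * num_task_managers * 4, matching
# the "96^2*4*4" given in the reply. Function names are hypothetical.

BUFFER_SIZE_BYTES = 32 * 1024  # 32 KB per buffer, as stated in the thread


def network_buffers(slots_per_tm: int, num_task_managers: int, factor: int = 4) -> int:
    """Number of network buffers suggested by the quoted formula."""
    return slots_per_tm ** 2 * num_task_managers * factor


def buffer_memory_bytes(num_buffers: int, buffer_size: int = BUFFER_SIZE_BYTES) -> int:
    """Total memory taken by the given number of network buffers."""
    return num_buffers * buffer_size


# Four machines with 96 cores each, one slot per core.
buffers = network_buffers(slots_per_tm=96, num_task_managers=4)
memory_gib = buffer_memory_bytes(buffers) / 1024 ** 3

print(f"taskmanager.network.numberOfBuffers: {buffers}")
print(f"network stack memory: {memory_gib} GiB")
```

With these inputs the result is 147456 buffers and 4.5 GB of memory for the network stack, so the TaskManager heap (taskmanager.heap.mb) has to leave room for that amount.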