Hi Ufuk,

Thanks for this. Really appreciated.

Cheers

On Tue, Feb 23, 2016 at 8:04 PM, Ufuk Celebi <u...@apache.org> wrote:

> I would go with one task manager with 48 slots per machine. This
> reduces the communication overhead between task managers.
>
> Regarding memory configuration: Given that the machines have plenty of
> memory, I would configure a bigger heap than the 4 GB you had
> previously. Furthermore, you can also consider adding more network
> buffers, which should improve job throughput.
>
> – Ufuk
>
> On Tue, Feb 23, 2016 at 11:57 AM, Welly Tambunan <if05...@gmail.com>
> wrote:
> > Hi Ufuk and Fabian,
> >
> > Is it better to start 48 task managers (one slot each) on one machine
> > than a single task manager with 48 slots? Are there any trade-offs we
> > should know about?
> >
> > Cheers
> >
> > On Tue, Feb 23, 2016 at 3:03 PM, Welly Tambunan <if05...@gmail.com>
> > wrote:
> >>
> >> Hi Ufuk,
> >>
> >> Thanks for the explanation.
> >>
> >> Yes. Our jobs are all streaming jobs.
> >>
> >> Cheers
> >>
> >> On Tue, Feb 23, 2016 at 2:48 PM, Ufuk Celebi <u...@apache.org> wrote:
> >>>
> >>> The new default is equivalent to the previous "streaming mode". The
> >>> community decided to get rid of this distinction, because it was
> >>> confusing to users.
> >>>
> >>> The difference between "streaming mode" and "batch mode" was how
> >>> Flink's managed memory was allocated, either lazily when required
> >>> ("streaming mode") or eagerly on task manager start up ("batch mode").
> >>> Now it's lazy by default.
> >>>
> >>> This is not something you need to worry about, but if you are mostly
> >>> using the DataSet API where pre allocation has benefits, you can get
> >>> the "batch mode" behaviour by using the following configuration key:
> >>>
> >>> taskmanager.memory.preallocate: true
> >>>
> >>> But you are using the DataStream API anyways, right?
> >>>
> >>> – Ufuk
> >>>
> >>>
> >>> On Tue, Feb 23, 2016 at 6:36 AM, Welly Tambunan <if05...@gmail.com>
> >>> wrote:
> >>> > Hi Fabian,
> >>> >
> >>> > Previously, when using Flink 0.9-0.10, we started the cluster in
> >>> > streaming mode or batch mode. I see that this is gone in the
> >>> > Flink 1.00 snapshot? So is this now taken care of by Flink and
> >>> > optimized by the runtime?
> >>> >
> >>> > On Mon, Feb 22, 2016 at 5:26 PM, Fabian Hueske <fhue...@gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> Hi Welly,
> >>> >>
> >>> >> sorry for the late response.
> >>> >>
> >>> >> The number of network buffers primarily depends on the maximum
> >>> >> parallelism of your job.
> >>> >> The given formula assumes a specific cluster configuration (1 task
> >>> >> manager per machine, one parallel task per CPU).
> >>> >> The formula can be translated to:
> >>> >>
> >>> >> taskmanager.network.numberOfBuffers: p ^ 2 * t * 4
> >>> >>
> >>> >> where p is the maximum parallelism of the job and t is the number of
> >>> >> task managers.
> >>> >> You can process more than one parallel task per TM if you configure
> >>> >> more than one processing slot per machine
> >>> >> (taskmanager.numberOfTaskSlots). The TM will divide its memory among
> >>> >> all its slots. So it would be possible to start one TM for each
> >>> >> machine with 100GB+ memory and 48 slots each.
> >>> >>
> >>> >> We can compute the number of network buffers if you give a few more
> >>> >> details about your setup:
> >>> >> - How many task managers do you start? I assume more than one TM per
> >>> >> machine given that you assign only 4GB of memory out of 128GB to
> >>> >> each TM.
> >>> >> - What is the maximum parallelism of your program?
> >>> >> - How many processing slots do you configure for each TM?
> >>> >>
> >>> >> In general, pipelined shuffles with a high parallelism require a lot
> >>> >> of memory.
> >>> >> If you configure batch instead of pipelined transfer, the memory
> >>> >> requirement goes down
> >>> >> (ExecutionConfig.setExecutionMode(ExecutionMode.BATCH)).
> >>> >>
> >>> >> Eventually, we want to merge the network buffer and the managed
> >>> >> memory pools. So the "taskmanager.network.numberOfBuffers"
> >>> >> configuration will hopefully disappear at some point in the future.
> >>> >>
> >>> >> Best, Fabian
> >>> >>
> >>> >> 2016-02-19 9:34 GMT+01:00 Welly Tambunan <if05...@gmail.com>:
> >>> >>>
> >>> >>> Hi All,
> >>> >>>
> >>> >>> We are trying to run our job on a cluster with the following
> >>> >>> specification:
> >>> >>>
> >>> >>> 1. # of machines: 16
> >>> >>> 2. memory: 128 GB
> >>> >>> 3. # of cores: 48
> >>> >>>
> >>> >>> However, when we try to run it we get an exception:
> >>> >>>
> >>> >>> "insufficient number of network buffers. 48 required but only 10
> >>> >>> available. the total number of network buffers is currently set to
> >>> >>> 2048"
> >>> >>>
> >>> >>> After looking at the documentation, we set the configuration based
> >>> >>> on the docs:
> >>> >>>
> >>> >>> taskmanager.network.numberOfBuffers: # core ^ 2 * # machine * 4
> >>> >>>
> >>> >>> However, we then faced another error from the JVM:
> >>> >>>
> >>> >>> java.io.IOException: Cannot allocate network buffer pool: Could not
> >>> >>> allocate enough memory segments for NetworkBufferPool (required
> >>> >>> (Mb): 2304, allocated (Mb): 698, missing (Mb): 1606). Cause: Java
> >>> >>> heap space
> >>> >>>
> >>> >>> We tweaked the heap setting to taskmanager.heap.mb: 4096
> >>> >>>
> >>> >>> Finally the cluster is running.
> >>> >>>
> >>> >>> However, I'm still not sure whether this configuration and the
> >>> >>> tweaked task manager heap are really well tuned. So my questions
> >>> >>> are:
> >>> >>>
> >>> >>> Am I doing it right for numberOfBuffers?
> >>> >>> How much should we allocate for taskmanager.heap.mb given this
> >>> >>> information?
> >>> >>> Any suggestions on which configuration we need to set to make it
> >>> >>> optimal for the cluster?
> >>> >>> Is there any chance that this will be resolved automatically by
> >>> >>> the memory/network buffer manager?
> >>> >>>
> >>> >>> Thanks a lot for the help
> >>> >>>
> >>> >>> Cheers
> >>> >>>
> >>> >>> --
> >>> >>> Welly Tambunan
> >>> >>> Triplelands
> >>> >>>
> >>> >>> http://weltam.wordpress.com
> >>> >>> http://www.triplelands.com

--
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com <http://www.triplelands.com/blog/>
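
For reference, the buffer formula quoted in the thread (p ^ 2 * t * 4) can be worked through for the cluster in the original post. This is only a minimal sketch, not Flink code: the function names are hypothetical, p and t follow Fabian's definitions (maximum parallelism of the job and number of task managers), and the values 48 and 16 assume one task manager per machine with one parallel task per core. The per-buffer size is passed in explicitly because the thread does not state it; the 32 KiB used below is a placeholder assumption.

```python
def network_buffers(p: int, t: int) -> int:
    """Buffer count from the formula quoted in the thread: p^2 * t * 4."""
    if p <= 0 or t <= 0:
        raise ValueError("p and t must be positive")
    return p * p * t * 4


def buffer_memory_mib(num_buffers: int, buffer_size_bytes: int) -> float:
    """Total memory the given number of buffers would occupy, in MiB."""
    return num_buffers * buffer_size_bytes / (1024 * 1024)


if __name__ == "__main__":
    p = 48  # assumed: maximum parallelism equals cores per machine
    t = 16  # assumed: one task manager per machine
    assumed_buffer_size = 32 * 1024  # placeholder; not stated in the thread

    buffers = network_buffers(p, t)
    memory = buffer_memory_mib(buffers, assumed_buffer_size)
    print(f"taskmanager.network.numberOfBuffers: {buffers}")
    print(f"approximate buffer memory: {memory:.0f} MiB")
```

If the job's maximum parallelism differs from 48, pass that value as p instead; the resulting memory figure has to fit inside the configured task manager heap, which is why the heap setting had to be raised in the original post.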