[ https://issues.apache.org/jira/browse/FLINK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894230#comment-15894230 ]
ASF GitHub Bot commented on FLINK-4545:
---------------------------------------

GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/3467

    [FLINK-4545] preparations for removing the network buffers parameter

    This PR includes some preparations for follow-up PRs that will
    ultimately remove the network buffers parameter, which was hard to tune.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-4545-prep

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3467.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3467

----
commit dfea1bac97dbbf30a2e049618cc41fdca53ea6d3
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-02-10T13:36:37Z

    [FLINK-4545] remove (unused) persistent partition type

commit 11557c004450bcbbe680f1575f0e41d164424eae
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-02-10T15:11:08Z

    [docs] improve some documentation around network buffers

commit cd999061d04ae803c79473241ac1f9b39c1f2731
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-02-10T15:12:19Z

    [hotfix][network] add some assertions documenting on which locks we rely

commit 8f529bb3f42916c816c5091228569952917ad9b5
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-03-01T13:33:44Z

    [FLINK-4545] remove fixed-size BufferPool instances

    These were unused except for unit tests and will be replaced with
    bounded BufferPool instances.

----

> Flink automatically manages TM network buffer
> ---------------------------------------------
>
>                 Key: FLINK-4545
>                 URL: https://issues.apache.org/jira/browse/FLINK-4545
>             Project: Flink
>          Issue Type: Wish
>          Components: Network
>            Reporter: Zhenzhong Xu
>
> Currently, the number of network buffers per task manager is preconfigured,
> and the memory is pre-allocated, through the
> taskmanager.network.numberOfBuffers config.
> In a job DAG with a shuffle phase, this number can grow very large,
> depending on the TM cluster size. The formula for calculating the buffer
> count is documented here
> (https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers):
>
> #slots-per-TM^2 * #TMs * 4
>
> In a standalone deployment, we may need to control the task manager cluster
> size dynamically and then leverage the upcoming Flink feature that supports
> changing job parallelism/rescaling at runtime.
> If the buffer count config is static at runtime and cannot be changed
> without restarting the task manager process, this may add latency and
> complexity to the scaling process. I am wondering whether there has already
> been any discussion about having Flink manage the network buffers
> automatically, or at least expose some API that allows them to be
> reconfigured. Let me know if there is an existing JIRA that I should follow.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
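
To make the sizing rule quoted in the issue concrete, here is a minimal Java sketch that evaluates #slots-per-TM^2 * #TMs * 4 for a few cluster sizes. The class name NetworkBufferEstimate, the method requiredBuffers, and the sample slot and task manager counts are hypothetical and chosen only for illustration; none of this is Flink code, and it merely restates the documented rule of thumb.

```java
public class NetworkBufferEstimate {

    /**
     * Evaluates the rule of thumb quoted in the issue:
     * #slots-per-TM^2 * #TMs * 4.
     * Hypothetical helper for illustration, not part of Flink's API.
     */
    public static long requiredBuffers(int slotsPerTaskManager, int taskManagers) {
        if (slotsPerTaskManager <= 0 || taskManagers <= 0) {
            throw new IllegalArgumentException("slot and task manager counts must be positive");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        long slotsSquared = Math.multiplyExact((long) slotsPerTaskManager, (long) slotsPerTaskManager);
        return Math.multiplyExact(Math.multiplyExact(slotsSquared, (long) taskManagers), 4L);
    }

    public static void main(String[] args) {
        int slots = 8; // assumed example value
        // The result grows linearly with the number of task managers, so a
        // value sized for one cluster size is too small after scaling out.
        for (int taskManagers : new int[] {10, 20, 40}) {
            System.out.println(taskManagers + " TMs x " + slots + " slots -> "
                    + requiredBuffers(slots, taskManagers) + " buffers");
        }
    }
}
```

Because the estimate depends on the number of task managers, any statically configured value has to be revisited whenever the cluster is resized, which is the concern raised in the issue.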