[ https://issues.apache.org/jira/browse/FLINK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967540#comment-15967540 ]
ASF GitHub Bot commented on FLINK-4545:
---------------------------------------

GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/3721

    [FLINK-4545] replace the network buffers parameter

    (based on #3708 and #3713)

    Instead, allow the configuration with the following three new (more flexible) parameters:

    * `taskmanager.network.memory.fraction`: fraction of JVM memory to use for network buffers (default: 0.1)
    * `taskmanager.network.memory.min`: minimum memory size for network buffers (default: 64 MB)
    * `taskmanager.network.memory.max`: maximum memory size for network buffers (default: 1 GB)

    Note that I needed to adapt two unit tests which would have been killed on
    Travis CI because these defaults result in ~150MB memory being used for
    network buffers which apparently was too much there.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-4545

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3721.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3721

----
commit e61f7bc4debce332c421cb645ff1025b4d03d8d0
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-11T09:26:29Z

    [FLINK-6292] fix transfer.sh upload by using https

    Seems the upload via http is not supported anymore.

commit 362ceec0823b179719449d0ed244c591dfcf51f4
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-12T09:09:03Z

    [FLINK-6299] make all IT cases extend from TestLogger

    This way, currently executed tests and their failures are properly logged.

commit 973099ef55701fe63951639d37b4f01765b06a01
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-06T12:41:52Z

    [FLINK-4545] replace the network buffers parameter

    Instead, allow the configuration with the following three new (more flexible) parameters:

    * "taskmanager.network.memory.fraction": fraction of JVM memory to use for network buffers (default: 0.1)
    * "taskmanager.network.memory.min": minimum memory size for network buffers (default: 64 MB)
    * "taskmanager.network.memory.max": maximum memory size for network buffers (default: 1 GB)

    # Please enter the commit message for your changes. Lines starting

commit 09a981189b59ac13bd39000cc77913c0b03289fd
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-11T12:20:40Z

    [hotfix] fix typo in error message

commit 0960a809c8da51b9787f3f726945716933051fc3
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-11T13:29:41Z

    [hotfix] fix typo in taskmanager.sh usage string

commit 298bb69451a1405df774451de11eb5684534c956
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-06T15:58:14Z

    [FLINK-4545] adapt taskmanager.sh to take network buffers memory into account

commit ea2fb24f4a6eb18cc3f8d3ebd83a49c0f1386a8a
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-10T09:43:50Z

    [FLINK-4545] add configuration checks for the new network buffer memory config

commit 5133d250c4dba4a5e72baad95c841d2b03cb49ea
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-10T16:22:10Z

    [FLINK-4545] add unit tests using the new network configuration parameters and methods

commit a24a548e6ff7e36581f7f7457099656362ca3974
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-11T16:52:56Z

    [FLINK-4545] add unit tests for heap size calculation in shell scripts

    These verify that the results are the same as in the calculation done by Java.

commit d55153d559bf110a931b5de849df812038ba4a7a
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-12T16:11:37Z

    [FLINK-4545] update the docs with the changed network buffer parameter

    Also update the descriptions of taskmanager.memory.fraction not being relative to
    the full size of taskmanager.heap.mb but that network buffer memory is subtracted before!

commit c48beb0d67e8ef847ef845835e342d4a49127e7d
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-12T16:25:27Z

    [FLINK-4545] fix some tests being killed on Travis CI

    Due to the increased defaults for network buffer memory use, some builds on
    Travis CI fail with unit tests being killed.
    This affects
    * RocksDbBackendEventTimeWindowCheckpointingITCase and
    * HBaseConnectorITCase

    We fix this by limiting the maximum amount of network buffer memory to 80MB
    (current defaults would yield 150MB, previously 64MB were used).

----

> Flink automatically manages TM network buffer
> ---------------------------------------------
>
>                 Key: FLINK-4545
>                 URL: https://issues.apache.org/jira/browse/FLINK-4545
>             Project: Flink
>          Issue Type: Wish
>          Components: Network
>            Reporter: Zhenzhong Xu
>
> Currently, the number of network buffer per task manager is preconfigured and
> the memory is pre-allocated through taskmanager.network.numberOfBuffers
> config. In a Job DAG with shuffle phase, this number can go up very high
> depends on the TM cluster size. The formula for calculating the buffer count
> is documented here
> (https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers).
>
>     #slots-per-TM^2 * #TMs * 4
>
> In a standalone deployment, we may need to control the task manager cluster
> size dynamically and then leverage the up-coming Flink feature to support
> scaling job parallelism/rescaling at runtime.
> If the buffer count config is static at runtime and cannot be changed without
> restarting task manager process, this may add latency and complexity for
> scaling process.
> I am wondering if there is already any discussion around
> whether the network buffer should be automatically managed by Flink or at
> least expose some API to allow it to be reconfigured. Let me know if there is
> any existing JIRA that I should follow.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
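
For illustration only, the two sizing rules mentioned in this message can be sketched in a few lines of Python. The first function evaluates the buffer-count formula quoted in the issue (#slots-per-TM^2 * #TMs * 4). The second is a guess at how the three new parameters might combine: it assumes the fraction is applied directly to a given JVM memory size and the result is clamped between the minimum and maximum. The function names, the argument names, and the sanity checks are hypothetical and are not taken from the pull request; the actual calculation in Flink may differ in its details.

```python
MB = 1024 * 1024


def legacy_buffer_count(slots_per_tm: int, num_tms: int) -> int:
    """Evaluate the formula quoted in the issue: #slots-per-TM^2 * #TMs * 4."""
    if slots_per_tm < 1 or num_tms < 1:
        raise ValueError("slots_per_tm and num_tms must be positive")
    return slots_per_tm ** 2 * num_tms * 4


def network_memory_bytes(
    jvm_memory_bytes: int,
    fraction: float = 0.1,        # taskmanager.network.memory.fraction
    min_bytes: int = 64 * MB,     # taskmanager.network.memory.min
    max_bytes: int = 1024 * MB,   # taskmanager.network.memory.max
) -> int:
    """Assumed rule: take a fraction of JVM memory, then clamp to [min, max]."""
    # Hypothetical sanity checks; the real configuration checks are not shown
    # in this message.
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    if min_bytes > max_bytes:
        raise ValueError("min_bytes must not exceed max_bytes")
    wanted = int(jvm_memory_bytes * fraction)
    return max(min_bytes, min(wanted, max_bytes))


if __name__ == "__main__":
    # 4 slots per task manager, 10 task managers
    print(legacy_buffer_count(4, 10))             # 640
    # Small JVM: 10% would be below the minimum, so the minimum applies
    print(network_memory_bytes(512 * MB) // MB)   # 64
    # Large JVM: 10% would exceed the maximum, so the maximum applies
    print(network_memory_bytes(20480 * MB) // MB) # 1024
```

Under these assumptions the network memory grows with the size of the JVM instead of being fixed by a buffer count, which is the flexibility the issue asks for; the old formula, by contrast, depends on the cluster size and has to be recomputed whenever the number of task managers changes.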