[ 
https://issues.apache.org/jira/browse/FLINK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967540#comment-15967540
 ] 

ASF GitHub Bot commented on FLINK-4545:
---------------------------------------

GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/3721

    [FLINK-4545] replace the network buffers parameter

    (based on #3708 and #3713)
    
    Instead, allow the configuration with the following three new (more flexible) parameters:
    * `taskmanager.network.memory.fraction`: fraction of JVM memory to use for network buffers (default: 0.1)
    * `taskmanager.network.memory.min`: minimum memory size for network buffers (default: 64 MB)
    * `taskmanager.network.memory.max`: maximum memory size for network buffers (default: 1 GB)
    
    Note that I had to adapt two unit tests that would otherwise have been killed on Travis CI:
    these defaults result in ~150 MB of memory being used for network buffers, which was
    apparently too much there.
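
As a rough illustration of how the three parameters above might interact, the following sketch clamps `fraction * JVM memory` into the `[min, max]` range. The class and method names are hypothetical and are not taken from the pull request, and the clamping rule is an assumption based on the parameter descriptions, not a copy of Flink's actual calculation.

```java
// Minimal sketch, NOT Flink's implementation: the names below are made up for illustration.
public class NetworkMemorySketch {
    static final long MB = 1024L * 1024L;

    // Assumed rule: take the configured fraction of JVM memory,
    // then clamp the result into [minBytes, maxBytes].
    static long networkMemoryBytes(long jvmMemoryBytes, double fraction,
                                   long minBytes, long maxBytes) {
        long byFraction = (long) (jvmMemoryBytes * fraction);
        return Math.max(minBytes, Math.min(maxBytes, byFraction));
    }

    public static void main(String[] args) {
        // Illustrative JVM sizes only; the defaults (0.1, 64 MB, 1 GB) are from the description above.
        long[] jvmSizesMb = {256, 2048, 20480};
        for (long sizeMb : jvmSizesMb) {
            long bytes = networkMemoryBytes(sizeMb * MB, 0.1, 64 * MB, 1024 * MB);
            System.out.println(sizeMb + " MB JVM memory -> "
                    + (bytes / MB) + " MB network buffer memory");
        }
    }
}
```

Under this assumed rule, a small JVM is held at the 64 MB minimum, a very large one is capped at 1 GB, and lowering the maximum (as done for the two adapted tests) caps the result regardless of the fraction.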

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-4545

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3721.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3721
    
----
commit e61f7bc4debce332c421cb645ff1025b4d03d8d0
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-11T09:26:29Z

    [FLINK-6292] fix transfer.sh upload by using https
    
    Seems the upload via http is not supported anymore.

commit 362ceec0823b179719449d0ed244c591dfcf51f4
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-12T09:09:03Z

    [FLINK-6299] make all IT cases extend from TestLogger
    
    This way, currently executed tests and their failures are properly logged.

commit 973099ef55701fe63951639d37b4f01765b06a01
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-06T12:41:52Z

    [FLINK-4545] replace the network buffers parameter
    
    Instead, allow the configuration with the following three new (more flexible) parameters:
     * "taskmanager.network.memory.fraction": fraction of JVM memory to use for network buffers (default: 0.1)
     * "taskmanager.network.memory.min": minimum memory size for network buffers (default: 64 MB)
     * "taskmanager.network.memory.max": maximum memory size for network buffers (default: 1 GB)
    

commit 09a981189b59ac13bd39000cc77913c0b03289fd
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-11T12:20:40Z

    [hotfix] fix typo in error message

commit 0960a809c8da51b9787f3f726945716933051fc3
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-11T13:29:41Z

    [hotfix] fix typo in taskmanager.sh usage string

commit 298bb69451a1405df774451de11eb5684534c956
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-06T15:58:14Z

    [FLINK-4545] adapt taskmanager.sh to take network buffers memory into account

commit ea2fb24f4a6eb18cc3f8d3ebd83a49c0f1386a8a
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-10T09:43:50Z

    [FLINK-4545] add configuration checks for the new network buffer memory config

commit 5133d250c4dba4a5e72baad95c841d2b03cb49ea
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-10T16:22:10Z

    [FLINK-4545] add unit tests using the new network configuration parameters and methods

commit a24a548e6ff7e36581f7f7457099656362ca3974
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-11T16:52:56Z

    [FLINK-4545] add unit tests for heap size calculation in shell scripts
    
    These verify that the results are the same as in the calculation done by Java.

commit d55153d559bf110a931b5de849df812038ba4a7a
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-12T16:11:37Z

    [FLINK-4545] update the docs with the changed network buffer parameter
    
    Also update the description of taskmanager.memory.fraction: it is not
    relative to the full size of taskmanager.heap.mb, because the network
    buffer memory is subtracted first.

commit c48beb0d67e8ef847ef845835e342d4a49127e7d
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-04-12T16:25:27Z

    [FLINK-4545] fix some tests being killed on Travis CI
    
    Due to the increased defaults for network buffer memory use, some builds on
    Travis CI fail with unit tests being killed. This affects
    * RocksDbBackendEventTimeWindowCheckpointingITCase and
    * HBaseConnectorITCase
    
    We fix this by limiting the maximum amount of network buffer memory to 80MB
    (current defaults would yield 150MB, previously 64MB were used).

----
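
The documentation commit above notes that `taskmanager.memory.fraction` applies to the heap size after network buffer memory has been subtracted. A minimal arithmetic sketch of that statement follows; the class and method names are hypothetical, the numbers are illustrative, and the real calculation may include further terms.

```java
// Minimal sketch, NOT Flink's implementation: hypothetical names, illustrative values.
public class ManagedMemorySketch {
    // Assumed rule: the fraction is applied to what remains of the heap
    // once network buffer memory has been taken out.
    static long managedMemoryBytes(long heapBytes, long networkBytes, double memoryFraction) {
        return (long) ((heapBytes - networkBytes) * memoryFraction);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long heap = 1024 * mb;    // illustrative heap size
        long network = 100 * mb;  // illustrative network buffer memory
        double fraction = 0.5;    // illustrative fraction, not a documented default

        long managed = managedMemoryBytes(heap, network, fraction);
        long naive = (long) (heap * fraction);
        System.out.println("fraction of (heap - network): " + (managed / mb) + " MB");
        System.out.println("fraction of full heap:        " + (naive / mb) + " MB");
    }
}
```

The point of the comparison is only that the result is smaller than applying the fraction to the full heap.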


> Flink automatically manages TM network buffer
> ---------------------------------------------
>
>                 Key: FLINK-4545
>                 URL: https://issues.apache.org/jira/browse/FLINK-4545
>             Project: Flink
>          Issue Type: Wish
>          Components: Network
>            Reporter: Zhenzhong Xu
>
> Currently, the number of network buffers per task manager is preconfigured, 
> and the memory is pre-allocated, through the 
> taskmanager.network.numberOfBuffers config. In a job DAG with a shuffle 
> phase, this number can grow very large depending on the TM cluster size. The 
> formula for calculating the buffer count is documented here 
> (https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers):
>
> #slots-per-TM^2 * #TMs * 4
>
> In a standalone deployment, we may need to control the task manager cluster 
> size dynamically and then leverage the upcoming Flink feature for rescaling 
> job parallelism at runtime. 
> If the buffer count config is static at runtime and cannot be changed without 
> restarting the task manager process, this may add latency and complexity to 
> the scaling process. I am wondering whether there has already been any 
> discussion about whether network buffers should be managed automatically by 
> Flink, or whether Flink could at least expose some API to allow them to be 
> reconfigured. Let me know if there is any existing JIRA that I should follow.
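
To make the formula quoted in the issue concrete, here is a small sketch that evaluates `#slots-per-TM^2 * #TMs * 4` for a few cluster sizes. The class and method names are hypothetical and the slot and task manager counts are illustrative only.

```java
// Minimal sketch of the documented buffer-count formula; names and inputs are illustrative.
public class BufferCountSketch {
    // #slots-per-TM^2 * #TMs * 4, computed in long to avoid int overflow.
    static long requiredBuffers(int slotsPerTaskManager, int taskManagers) {
        return (long) slotsPerTaskManager * slotsPerTaskManager * taskManagers * 4L;
    }

    public static void main(String[] args) {
        int slots = 8; // illustrative slots per task manager
        for (int taskManagers : new int[] {10, 20, 40}) {
            System.out.println(taskManagers + " TMs x " + slots + " slots -> "
                    + requiredBuffers(slots, taskManagers) + " buffers");
        }
    }
}
```

Because the count grows linearly with the number of task managers, a statically configured value sized for a small cluster becomes too low as the cluster is scaled up, which is the concern raised in the issue.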



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
