[jira] [Commented] (FLINK-4545) Flink automatically manages TM network buffer

ASF GitHub Bot (JIRA) Fri, 03 Mar 2017 04:09:07 -0800

    [ https://issues.apache.org/jira/browse/FLINK-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894230#comment-15894230 ]


ASF GitHub Bot commented on FLINK-4545:
---------------------------------------

GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/3467

    [FLINK-4545] preparations for removing the network buffers parameter

    This PR contains preparatory changes for follow-up PRs that will 
ultimately remove the network buffers parameter, which is hard to tune.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink flink-4545-prep

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3467.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3467
    
----
commit dfea1bac97dbbf30a2e049618cc41fdca53ea6d3
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-02-10T13:36:37Z

    [FLINK-4545] remove (unused) persistent partition type

commit 11557c004450bcbbe680f1575f0e41d164424eae
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-02-10T15:11:08Z

    [docs] improve some documentation around network buffers

commit cd999061d04ae803c79473241ac1f9b39c1f2731
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-02-10T15:12:19Z

    [hotfix][network] add some assertions documenting on which locks we rely

commit 8f529bb3f42916c816c5091228569952917ad9b5
Author: Nico Kruber <n...@data-artisans.com>
Date:   2017-03-01T13:33:44Z

    [FLINK-4545] remove fixed-size BufferPool instances
    
    These were unused except for unit tests and will be replaced with bounded
    BufferPool instances.

----


> Flink automatically manages TM network buffer
> ---------------------------------------------
>
>                 Key: FLINK-4545
>                 URL: https://issues.apache.org/jira/browse/FLINK-4545
>             Project: Flink
>          Issue Type: Wish
>          Components: Network
>            Reporter: Zhenzhong Xu
>
> Currently, the number of network buffers per task manager is preconfigured, 
> and the memory is pre-allocated, through the 
> taskmanager.network.numberOfBuffers config option. In a job DAG with a 
> shuffle phase, this number can become very high, depending on the TM cluster 
> size. The formula for calculating the buffer count is documented here 
> (https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#configuring-the-network-buffers):
>
>     #slots-per-TM^2 * #TMs * 4
>
> In a standalone deployment, we may need to control the task manager cluster 
> size dynamically and then leverage the upcoming Flink feature that supports 
> rescaling job parallelism at runtime. 
> If the buffer count config is static at runtime and cannot be changed without 
> restarting the task manager process, this may add latency and complexity to 
> the scaling process. I am wondering whether there has already been any 
> discussion about having Flink manage the network buffers automatically, or 
> at least expose an API that allows them to be reconfigured. Let me know if 
> there is any existing JIRA that I should follow.
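
To illustrate why a static value is awkward when the cluster is resized, here is a minimal sketch that evaluates the formula quoted above for a few cluster sizes. The function names and the sample cluster sizes are hypothetical and chosen only for illustration. The 32 KiB buffer size is an assumption about the default segment size, not something stated in this issue, so treat the memory figures as rough estimates.

```python
# Sketch only: evaluates the documented rule of thumb
#   #slots-per-TM^2 * #TMs * 4
# All names below are illustrative and not part of Flink.

# ASSUMPTION: each network buffer is 32 KiB. This value is not stated in
# the issue; adjust it to match your configuration.
ASSUMED_BUFFER_SIZE_BYTES = 32 * 1024


def required_network_buffers(slots_per_tm: int, num_tms: int) -> int:
    """Return the buffer count suggested by the formula quoted in the issue."""
    if slots_per_tm < 1 or num_tms < 1:
        raise ValueError("slots_per_tm and num_tms must be positive")
    return slots_per_tm ** 2 * num_tms * 4


def estimated_buffer_memory_bytes(slots_per_tm: int, num_tms: int) -> int:
    """Estimate pre-allocated memory per task manager under the assumed size."""
    return required_network_buffers(slots_per_tm, num_tms) * ASSUMED_BUFFER_SIZE_BYTES


def config_line(slots_per_tm: int, num_tms: int) -> str:
    """Format the value as an entry for the config key named in the issue."""
    count = required_network_buffers(slots_per_tm, num_tms)
    return f"taskmanager.network.numberOfBuffers: {count}"


if __name__ == "__main__":
    # Growing the cluster changes the recommended value, which today
    # means editing the config and restarting every task manager.
    for num_tms in (10, 20, 40):
        mib = estimated_buffer_memory_bytes(8, num_tms) / (1024 * 1024)
        print(f"{config_line(8, num_tms)}  # ~{mib:.0f} MiB per TM (assumed 32 KiB buffers)")
```

With 8 slots per task manager, the recommended count doubles each time the number of task managers doubles, so a value tuned for one cluster size is either too small or wasteful for another.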



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
