Yes. Also, please consider the new, simplified network buffer configuration available from Flink 1.3 onwards:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/config.html#configuring-the-network-buffers

Nico

On Thursday, 8 June 2017 05:26:57 CEST Zhijiang(wangzhijiang999) wrote:
> Hi Ray,
>
> For your question: "Does that say that each parallel task inside the
> TaskManager talk to all parallel tasks inside the same TaskManager or to
> all parallel tasks across all task managers?"
>
> Each task talks to all of its parallel upstream and downstream tasks,
> both those in the same TaskManager and those in different TaskManagers.
> The consumer and producer tasks may be deployed in the same TaskManager
> or in different TaskManagers.
>
> Within the same TaskManager, the local data shuffle is done directly by
> memory copy, and the required buffers can be determined by
> #slots-per-TM^2.
>
> Across TaskManagers, the remote data shuffle is done by network
> transport, and a single TCP connection between two TaskManagers is
> reused by all the tasks inside them. So the required buffers can be
> determined by #TMs.
>
> Considering both cases, the formula is #slots-per-TM^2 * #TMs. I hope
> this helps.
>
> Cheers,
> Zhijiang
>
> ------------------------------------------------------------------
> From: Ray Ruvinskiy <ray.ruvins...@arcticwolf.com>
> Sent: Wednesday, 7 June 2017, 23:59
> To: user@flink.apache.org <user@flink.apache.org>
> Subject: Question regarding configuring number of network buffers
>
> The documentation provides the formula #slots-per-TM^2 * #TMs * 4 to
> determine the number of network buffers we should configure. The
> documentation also says, “A logical network connection exists for each
> point-to-point exchange of data over the network, which typically
> happens at repartitioning- or broadcasting steps (shuffle phase). In
> those, each parallel task inside the TaskManager has to be able to talk
> to all other parallel tasks.”
>
> Does that say that each parallel task inside the TaskManager talks to
> all parallel tasks inside the same TaskManager, or to all parallel tasks
> across all task managers? Intuitively, I would assume the latter, but
> then wouldn’t the formula for determining the number of network buffers
> be more along the lines of (#slots-per-TM * #TMs)^2?
>
> Thanks,
> Ray
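
For comparison, the three formulas mentioned in this thread can be put side by side in a small Python sketch. This is only an illustration of the arithmetic, not Flink code: the function names are made up for this example, and the 32 KiB buffer size used for the memory estimate is an assumption that should be checked against the segment size configured on the cluster.

```python
# Illustrative arithmetic only; none of these names come from Flink itself.

def documented_buffers(slots_per_tm: int, num_tms: int) -> int:
    """Formula quoted from the documentation: #slots-per-TM^2 * #TMs * 4."""
    return slots_per_tm ** 2 * num_tms * 4


def base_buffers(slots_per_tm: int, num_tms: int) -> int:
    """Formula derived in Zhijiang's reply: #slots-per-TM^2 * #TMs."""
    return slots_per_tm ** 2 * num_tms


def all_to_all_buffers(slots_per_tm: int, num_tms: int) -> int:
    """Ray's alternative guess: (#slots-per-TM * #TMs)^2."""
    return (slots_per_tm * num_tms) ** 2


def buffer_memory_mib(num_buffers: int, buffer_size_bytes: int = 32 * 1024) -> float:
    """Memory taken by the buffers in MiB (32 KiB per buffer is an assumption)."""
    return num_buffers * buffer_size_bytes / (1024 * 1024)


if __name__ == "__main__":
    slots, tms = 8, 10  # hypothetical cluster: 10 TaskManagers with 8 slots each
    for name, formula in [
        ("documented", documented_buffers),
        ("base", base_buffers),
        ("all-to-all", all_to_all_buffers),
    ]:
        n = formula(slots, tms)
        print(f"{name:>10}: {n} buffers, {buffer_memory_mib(n):.1f} MiB")
```

With the hypothetical 8 slots and 10 TaskManagers, the documented formula gives 2560 buffers (80 MiB at the assumed buffer size), while the all-to-all guess gives 6400. The gap grows quadratically with the number of TaskManagers, which is why reusing one TCP connection per TaskManager pair matters for the estimate.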