Luke Hutchison created FLINK-6057:
-------------------------------------

             Summary: Better default needed for num network buffers
                 Key: FLINK-6057
                 URL: https://issues.apache.org/jira/browse/FLINK-6057
             Project: Flink
          Issue Type: Bug
          Components: Core
    Affects Versions: 1.2.0
            Reporter: Luke Hutchison

Using the default environment,

{code}
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
{code}

my code will sometimes fail with an error saying that Flink ran out of network buffers. To fix this, I have to do:

{code}
// Size the parallelism, the task slots and the network buffers from the core count.
Configuration config = new Configuration();
int numTasks = Runtime.getRuntime().availableProcessors();
config.setInteger(ConfigConstants.DEFAULT_PARALLELISM_KEY, numTasks);
config.setInteger(ConfigConstants.TASK_MANAGER_NUM_TASK_SLOTS, numTasks);
config.setInteger(ConfigConstants.TASK_MANAGER_NETWORK_NUM_BUFFERS_KEY, numTasks * 2048);
{code}

The default of 2048 network buffers is not enough once I increase the degree of parallelism for a large Flink pipeline, hence the fix of setting the number of buffers to numTasks * 2048.

This is particularly problematic because a pipeline can work fine on one machine and then fail when you start it on a machine with more cores.

The default execution environment should pick a saner default based on the level of parallelism (or on whatever else is needed to ensure that the number of network buffers will not be exceeded for a given execution environment).
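
To make the requested behaviour concrete, here is a minimal sketch of the scaling rule from the workaround above, written as plain Java with no Flink dependency. The class name NetworkBufferDefaults and the method buffersFor are hypothetical and not part of Flink. The 2048-per-task factor is only the heuristic from this report, not a value Flink itself derives.

```java
/** Hypothetical helper, not part of Flink: scales the buffer count with the parallelism. */
final class NetworkBufferDefaults {

    /** The fixed default quoted in this report, used here as a per-task factor (assumption). */
    static final int BUFFERS_PER_TASK = 2048;

    private NetworkBufferDefaults() {}

    /** Returns parallelism * 2048, clamped to Integer.MAX_VALUE instead of overflowing. */
    static int buffersFor(int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be >= 1, got " + parallelism);
        }
        long buffers = (long) parallelism * BUFFERS_PER_TASK;
        return (int) Math.min(buffers, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // Same input as the workaround: one task per available core.
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(cores + " cores -> " + buffersFor(cores) + " network buffers");
    }
}
```

The result could be passed to config.setInteger(ConfigConstants.TASK_MANAGER_NETWORK_NUM_BUFFERS_KEY, ...) in place of the hard-coded numTasks * 2048. Every buffer costs memory, so a linear rule like this trades a larger footprint on many-core machines for not running out of buffers.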