[ https://issues.apache.org/jira/browse/FLINK-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
liupengcheng updated FLINK-18852:
---------------------------------
    Description:
Currently, the parallelism of StreamTableSourceScan/DataStreamScan is not inherited from the upstream input; it is taken from the configuration instead. I think this is unexpected.

I found this issue through a unit test. Here is an example:
{code:java}
// env parallelism is set to 4
val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = StreamTableEnvironment.create(env)
StreamITCase.testResults = new mutable.MutableList[String]
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
env.setParallelism(4)

// DataSource parallelism is set to 1
val table1 = env.fromCollection(left)
  .setParallelism(1)
  .assignTimestampsAndWatermarks(new TimestampAndWatermarkWithOffset[(Long, String)](0))
  .toTable(tEnv, 'a, 'b)
val table2 = env.fromCollection(right)
  .setParallelism(1)
  .assignTimestampsAndWatermarks(new TimestampAndWatermarkWithOffset[(Long, String)](0))
  .toTable(tEnv, 'a, 'b)
{code}
However, when you start the execution and visualize the execution plan, the "from" operator (the StreamScan) has a parallelism of 4.

!image-2020-08-07-21-22-57-843.png!

  was:
Identical text, except that the screenshot was embedded as a thumbnail (!image-2020-08-07-21-22-57-843.png|thumbnail!).

> StreamScan should keep the same parallelism as the input
> --------------------------------------------------------
>
>                 Key: FLINK-18852
>                 URL: https://issues.apache.org/jira/browse/FLINK-18852
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 1.11.1
>            Reporter: liupengcheng
>            Priority: Major
>         Attachments: image-2020-08-07-21-22-57-843.png
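
To make the difference between the reported and the requested behaviour concrete, here is a minimal sketch in plain Scala. It is only a toy model: `Upstream`, `ScanParallelism`, `observed` and `expected` are hypothetical names made up for illustration and are not part of Flink or its planner. It assumes the scan would reuse the input's parallelism when one has been set explicitly, and fall back to the configured default otherwise.

```scala
// Toy model only: these types are hypothetical and are NOT Flink classes.
final case class Upstream(name: String, parallelism: Option[Int])

object ScanParallelism {
  // Behaviour described in the report: the scan ignores its input
  // and always takes the parallelism from the configuration.
  def observed(input: Upstream, configDefault: Int): Int = configDefault

  // Behaviour the report asks for (an assumption about the fix):
  // reuse the input's parallelism when it was set explicitly.
  def expected(input: Upstream, configDefault: Int): Int =
    input.parallelism.getOrElse(configDefault)
}

object Demo extends App {
  val source = Upstream("fromCollection(left)", Some(1)) // .setParallelism(1)
  val envParallelism = 4                                  // env.setParallelism(4)

  println(s"observed scan parallelism: ${ScanParallelism.observed(source, envParallelism)}")
  println(s"expected scan parallelism: ${ScanParallelism.expected(source, envParallelism)}")
}
```

With the values from the example above (source parallelism 1, environment parallelism 4), the modeled "observed" rule yields 4, matching the execution plan in the screenshot, while the "expected" rule yields 1.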