[ https://issues.apache.org/jira/browse/FLINK-18295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhu Zhu updated FLINK-18295:
----------------------------
    Description:
Currently an {{IntermediateDataSet}} can have multiple consumer {{JobEdge}}s. That is why the consumers of an {{IntermediateResultPartition}} are stored as {{List<List<ExecutionEdge>>}}.

However, the scheduler/{{ExecutionGraph}} assumes that one {{IntermediateResultPartition}} can be consumed by only one {{ExecutionJobVertex}}. This results in a lot of hacky logic that assumes the partition consumers contain a single list.

Given that there is no plan yet to support multiple consumer {{JobEdge}}s of one {{IntermediateDataSet}}, I propose to refactor {{IntermediateDataSet}} to have only one consumer {{JobEdge}}, so that the scheduler can get rid of this hacky logic.

  was:
Currently an {{IntermediateDataSet}} can have multiple {{JobVertex}} as its consumers. That is why the consumers of an `IntermediateResultPartition` are stored as {{List<List<ExecutionEdge>>}}.

However, the scheduler/{{ExecutionGraph}} assumes that one `IntermediateResultPartition` can be consumed by only one `ExecutionJobVertex`. This results in a lot of hacky logic that assumes the partition consumers contain a single list.

We should remove this hacky logic. The idea is to change `IntermediateResultPartition#consumers` to be `List<ExecutionEdge>`. The `ExecutionGraph` building logic should be adjusted accordingly, under the assumption that an `IntermediateResult` can have only one consumer vertex. In `JobGraph`, there should also be a check that enforces this assumption.

> Remove the hack logics of result consumers
> ------------------------------------------
>
>                 Key: FLINK-18295
>                 URL: https://issues.apache.org/jira/browse/FLINK-18295
>             Project: Flink
>          Issue Type: Technical Debt
>          Components: Runtime / Coordination
>            Reporter: Zhu Zhu
>            Priority: Major
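
The difference between the nested consumer list and a flat one can be sketched roughly as follows. This is only an illustration under stated assumptions: `Edge`, `NestedPartition`, `FlatPartition` and `getOnlyConsumers` are hypothetical stand-ins invented for this example, not Flink's actual classes or methods. The flat shape mirrors the `List<ExecutionEdge>` idea from the earlier wording of the description.

```java
import java.util.ArrayList;
import java.util.List;

public class ConsumerShapeSketch {

    /** Hypothetical stand-in for an execution edge; it only carries a label. */
    static final class Edge {
        final String consumerTask;

        Edge(String consumerTask) {
            this.consumerTask = consumerTask;
        }
    }

    /** Nested shape: one inner list per consumer job edge. */
    static final class NestedPartition {
        final List<List<Edge>> consumers = new ArrayList<>();

        /** The kind of workaround the issue describes: insist on exactly one inner list. */
        List<Edge> getOnlyConsumers() {
            if (consumers.size() != 1) {
                throw new IllegalStateException(
                        "Expected exactly one consumer group but found " + consumers.size());
            }
            return consumers.get(0);
        }
    }

    /** Flat shape: callers read the edges directly, with no unwrapping step. */
    static final class FlatPartition {
        final List<Edge> consumers = new ArrayList<>();
    }

    public static void main(String[] args) {
        // Nested shape: the edges sit inside a single inner list.
        NestedPartition nested = new NestedPartition();
        List<Edge> group = new ArrayList<>();
        group.add(new Edge("map-0"));
        group.add(new Edge("map-1"));
        nested.consumers.add(group);

        // Flat shape: the same edges, stored without the outer list.
        FlatPartition flat = new FlatPartition();
        flat.consumers.add(new Edge("map-0"));
        flat.consumers.add(new Edge("map-1"));

        System.out.println("nested consumers: " + nested.getOnlyConsumers().size());
        System.out.println("flat consumers: " + flat.consumers.size());
    }
}
```

In the nested shape every caller has to repeat the single-list check (or silently call `get(0)`), whereas the flat shape makes the single-consumer assumption part of the type itself.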