harker2015 commented on code in PR #20739:
URL: https://github.com/apache/flink/pull/20739#discussion_r970409594


##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/strategy/PipelinedRegionSchedulingStrategy.java:
##########
@@ -207,72 +214,97 @@ public void onExecutionStateChange(
             final ExecutionVertexID executionVertexId, final ExecutionState executionState) {
         if (executionState == ExecutionState.FINISHED) {
             maybeScheduleRegions(
-                    getDownstreamRegionsOfVertex(schedulingTopology.getVertex(executionVertexId)));
+                    getBlockingDownstreamRegionsOfVertex(
+                            schedulingTopology.getVertex(executionVertexId)));
         }
     }
 
     @Override
     public void onPartitionConsumable(final IntermediateResultPartitionID resultPartitionId) {}
 
     private void maybeScheduleRegions(final Set<SchedulingPipelinedRegion> regions) {
-        final List<SchedulingPipelinedRegion> regionsSorted =
-                SchedulingStrategyUtils.sortPipelinedRegionsInTopologicalOrder(
-                        schedulingTopology, regions);
+        final Set<SchedulingPipelinedRegion> regionsToSchedule = new LinkedHashSet<>();
+        LinkedHashSet<SchedulingPipelinedRegion> nextRegions = new LinkedHashSet<>(regions);
+        while (!nextRegions.isEmpty()) {
+            nextRegions = addSchedulableAndGetNextRegions(nextRegions, regionsToSchedule);
+        }
+        // schedule regions in topological order.
+        SchedulingStrategyUtils.sortPipelinedRegionsInTopologicalOrder(
+                        schedulingTopology, regionsToSchedule)
+                .forEach(this::scheduleRegion);
+    }
 
+    private LinkedHashSet<SchedulingPipelinedRegion> addSchedulableAndGetNextRegions(
+            Set<SchedulingPipelinedRegion> currentRegions,
+            Set<SchedulingPipelinedRegion> regionsToSchedule) {
+        LinkedHashSet<SchedulingPipelinedRegion> nextRegions = new LinkedHashSet<>();
+        // cache each consumedPartitionGroup's consumable status to avoid computing it repeatedly.
         final Map<ConsumedPartitionGroup, Boolean> consumableStatusCache = new HashMap<>();
-        final Set<SchedulingPipelinedRegion> downstreamSchedulableRegions = new HashSet<>();
-        for (SchedulingPipelinedRegion region : regionsSorted) {
-            if (maybeScheduleRegion(region, consumableStatusCache)) {
-                downstreamSchedulableRegions.addAll(
-                        consumedPartitionGroupsOfRegion.getOrDefault(region, Collections.emptySet())
-                                .stream()
-                                .flatMap(
-                                        consumedPartitionGroups ->
-                                                partitionGroupConsumerRegions
-                                                        .getOrDefault(
-                                                                consumedPartitionGroups,
-                                                                Collections.emptySet())
-                                                        .stream())
-                                .collect(Collectors.toSet()));
+        final Set<ConsumedPartitionGroup> visitedConsumedPartitionGroups = new HashSet<>();
+
+        for (SchedulingPipelinedRegion currentRegion : currentRegions) {
+            if (isRegionSchedulable(currentRegion, consumableStatusCache, regionsToSchedule)) {
+                regionsToSchedule.add(currentRegion);
+                producedPartitionGroupsOfRegion
+                        .getOrDefault(currentRegion, Collections.emptySet())
+                        .forEach(
+                                (consumedPartitionGroup) -> {

Review Comment:
   For consistency, shall we use producedPartitionGroup instead of consumedPartitionGroup?
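
The new scheduling loop quoted in the diff repeatedly expands a frontier of candidate regions until no further region becomes schedulable, then hands the collected set to the topological sort. The frontier pattern itself can be sketched outside Flink — the following is a minimal, hypothetical model using plain `int` region ids and adjacency maps in place of `SchedulingPipelinedRegion` and `ConsumedPartitionGroup`, not the actual Flink API:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the frontier-expansion loop in maybeScheduleRegions:
// a region is schedulable once all of its upstream regions have been collected,
// and scheduling a region may unlock its downstream consumers in the next pass.
public class FrontierSketch {

    static List<Integer> collectSchedulable(
            Map<Integer, List<Integer>> downstream, // producer region -> consumer regions
            Map<Integer, List<Integer>> upstream,   // consumer region -> producer regions
            Set<Integer> sourceRegions) {
        Set<Integer> scheduled = new LinkedHashSet<>();
        Set<Integer> frontier = new LinkedHashSet<>(sourceRegions);
        while (!frontier.isEmpty()) {
            Set<Integer> next = new LinkedHashSet<>();
            for (Integer region : frontier) {
                // schedulable when every upstream region is already collected;
                // otherwise the region is dropped here and re-enters the frontier
                // later, via the last of its producers to be scheduled
                if (scheduled.containsAll(upstream.getOrDefault(region, List.of()))) {
                    scheduled.add(region);
                    next.addAll(downstream.getOrDefault(region, List.of()));
                }
            }
            next.removeAll(scheduled); // never revisit already-collected regions
            frontier = next;
        }
        return new ArrayList<>(scheduled);
    }

    public static void main(String[] args) {
        // diamond topology: 0 -> {1, 2} -> 3
        Map<Integer, List<Integer>> down =
                Map.of(0, List.of(1, 2), 1, List.of(3), 2, List.of(3));
        Map<Integer, List<Integer>> up =
                Map.of(1, List.of(0), 2, List.of(0), 3, List.of(1, 2));
        System.out.println(collectSchedulable(down, up, Set.of(0))); // [0, 1, 2, 3]
    }
}
```

In the real patch the collected set is additionally passed through `sortPipelinedRegionsInTopologicalOrder` before scheduling; the sketch above happens to collect in a valid order only because regions are admitted after all their producers.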


