Sxnan commented on code in PR #19653:
URL: https://github.com/apache/flink/pull/19653#discussion_r884592825


##########
flink-runtime/src/main/java/org/apache/flink/runtime/deployment/TaskDeploymentDescriptorFactory.java:
##########
@@ -244,7 +278,50 @@ public static TaskDeploymentDescriptorFactory fromExecutionVertex(
                 internalExecutionGraphAccessor.getPartitionLocationConstraint(),
                 executionVertex.getAllConsumedPartitionGroups(),
                 internalExecutionGraphAccessor::getResultPartitionOrThrow,
-                internalExecutionGraphAccessor.getBlobWriter());
+                internalExecutionGraphAccessor.getBlobWriter(),
+                clusterPartitionShuffleDescriptors);
+    }
+
+    private static Map<IntermediateDataSetID, ShuffleDescriptor[]>
+            getClusterPartitionShuffleDescriptors(ExecutionVertex executionVertex) {
+        final InternalExecutionGraphAccessor internalExecutionGraphAccessor =
+                executionVertex.getExecutionGraphAccessor();
+        final List<IntermediateDataSetID> consumedClusterDataSetIds =
+                executionVertex.getJobVertex().getJobVertex().getIntermediateDataSetIdToConsume();
+        Map<IntermediateDataSetID, ShuffleDescriptor[]> clusterPartitionShuffleDescriptors =
+                new HashMap<>();
+
+        for (IntermediateDataSetID consumedClusterDataSetId : consumedClusterDataSetIds) {
+            Collection<? extends ShuffleDescriptor> shuffleDescriptors =
+                    internalExecutionGraphAccessor.getClusterPartitionShuffleDescriptors(
+                            consumedClusterDataSetId);
+
+            Preconditions.checkState(
+                    executionVertex.getTotalNumberOfParallelSubtasks() == shuffleDescriptors.size(),
+                    "The parallelism (%s) of the cache consuming job vertex is "
+                            + "different from the number of shuffle descriptors (%s) of the intermediate data set",
+                    executionVertex.getTotalNumberOfParallelSubtasks(),
+                    shuffleDescriptors.size());
+
+            shuffleDescriptors =
+                    shuffleDescriptors.stream()
+                            .filter(
+                                    descriptor ->
+                                            descriptor
+                                                            .getResultPartitionID()
+                                                            .getPartitionId()
+                                                            .getPartitionNumber()
+                                                    == executionVertex.getParallelSubtaskIndex())
+                            .collect(Collectors.toList());
+
+            Preconditions.checkState(

Review Comment:
   Yes, the producer and consumer of the cluster partition should have the same 
parallelism, and each consumer task consumes exactly one output partition of the 
producer. It is up to the job graph generator to make sure this assumption 
holds.
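
To make the assumption concrete, here is a minimal, self-contained sketch of the one-to-one mapping described above. It does not use the Flink classes: `Descriptor`, `selectForSubtask`, and `ClusterPartitionSelection` are hypothetical stand-ins, and the "exactly one match" check is an assumption of this sketch, since the quoted diff is cut off at the second `checkState` call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ClusterPartitionSelection {

    /** Hypothetical stand-in for a shuffle descriptor; it only carries a partition number. */
    static final class Descriptor {
        final int partitionNumber;

        Descriptor(int partitionNumber) {
            this.partitionNumber = partitionNumber;
        }
    }

    /**
     * Returns the single descriptor whose partition number equals the consumer's subtask index.
     * Fails if the consumer parallelism differs from the number of descriptors, or if the
     * filter does not yield exactly one descriptor (the latter check is assumed, not quoted).
     */
    static Descriptor selectForSubtask(
            List<Descriptor> descriptors, int subtaskIndex, int parallelism) {
        if (descriptors.size() != parallelism) {
            throw new IllegalStateException(
                    "Parallelism (" + parallelism + ") differs from the number of descriptors ("
                            + descriptors.size() + ")");
        }

        List<Descriptor> matching =
                descriptors.stream()
                        .filter(d -> d.partitionNumber == subtaskIndex)
                        .collect(Collectors.toList());

        if (matching.size() != 1) {
            throw new IllegalStateException(
                    "Expected exactly one descriptor for subtask " + subtaskIndex
                            + " but found " + matching.size());
        }
        return matching.get(0);
    }

    public static void main(String[] args) {
        // Descriptors may arrive in any order; selection is by partition number, not position.
        List<Descriptor> descriptors =
                Arrays.asList(new Descriptor(2), new Descriptor(0), new Descriptor(1));
        for (int subtask = 0; subtask < 3; subtask++) {
            Descriptor d = selectForSubtask(descriptors, subtask, 3);
            System.out.println("subtask " + subtask + " -> partition " + d.partitionNumber);
        }
    }
}
```

If the job graph generator produced a consumer with a different parallelism than the producer, the first check in this sketch would fail fast rather than silently dropping or duplicating partitions.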


