gaoyunhaii commented on a change in pull request #14734:
URL: https://github.com/apache/flink/pull/14734#discussion_r567823301



##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##########
@@ -350,13 +340,15 @@ public CheckpointCoordinator(
                         this.minPauseBetweenCheckpoints,
                         this.pendingCheckpoints::size,
                         
this.checkpointsCleaner::getNumberOfCheckpointsToClean);
+
         this.cachedTasksById =
-                new LinkedHashMap<ExecutionAttemptID, 
ExecutionVertex>(tasksToWaitFor.length) {
+                new LinkedHashMap<ExecutionAttemptID, ExecutionVertex>(
+                        attemptMappingProvider.getNumberOfTasks()) {
 
                     @Override
                     protected boolean removeEldestEntry(
                             Map.Entry<ExecutionAttemptID, ExecutionVertex> 
eldest) {
-                        return size() > 
CheckpointCoordinator.this.tasksToWaitFor.length;
+                        return size() > 
attemptMappingProvider.getNumberOfTasks();

Review comment:
       Yes, since the incomplete tasks' metric is reported after its checkpoint 
is declined, if new checkpoints occur and `tasksToWaitFor` changed between the 
first checkpoint get declined and the metrics are reported, some tasks might be 
missed from the cache.  Thus it would be better to always consider all the 
tasks. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to