gaoyunhaii commented on a change in pull request #14734: URL: https://github.com/apache/flink/pull/14734#discussion_r567823301
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##########
@@ -350,13 +340,15 @@ public CheckpointCoordinator(
                 this.minPauseBetweenCheckpoints,
                 this.pendingCheckpoints::size,
                 this.checkpointsCleaner::getNumberOfCheckpointsToClean);
+
         this.cachedTasksById =
-                new LinkedHashMap<ExecutionAttemptID, ExecutionVertex>(tasksToWaitFor.length) {
+                new LinkedHashMap<ExecutionAttemptID, ExecutionVertex>(
+                        attemptMappingProvider.getNumberOfTasks()) {
                     @Override
                     protected boolean removeEldestEntry(
                             Map.Entry<ExecutionAttemptID, ExecutionVertex> eldest) {
-                        return size() > CheckpointCoordinator.this.tasksToWaitFor.length;
+                        return size() > attemptMappingProvider.getNumberOfTasks();

Review comment:
Yes. The metrics of an incomplete task are reported only after its checkpoint has been declined. If new checkpoints are triggered and `tasksToWaitFor` changes between the moment the first checkpoint is declined and the moment the metrics are reported, some tasks might be missing from the cache. Thus it would be better to always bound the cache by the total number of tasks.
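The eviction behaviour discussed above can be illustrated with a small standalone sketch. It is not the Flink code: `BoundedTaskCacheSketch`, `newBoundedCache`, and the string task IDs are hypothetical stand-ins, and the bound is fixed at construction time, whereas the coordinator reads it on every insertion. It only shows that a `LinkedHashMap` bounded by a subset size drops the eldest task, while one bounded by the total number of tasks keeps all of them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the coordinator's task cache; not Flink code. */
final class BoundedTaskCacheSketch {

    /** Creates an insertion-ordered map that evicts its eldest entry once it holds more than maxEntries. */
    static <K, V> Map<K, V> newBoundedCache(int maxEntries) {
        return new LinkedHashMap<K, V>(maxEntries) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called by put() after the new entry has been inserted.
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        // Assumed task IDs, standing in for ExecutionAttemptID values.
        String[] allTasks = {"attempt-1", "attempt-2", "attempt-3"};

        // Bound taken from a subset of the tasks (2) versus from all tasks (3).
        Map<String, String> boundedBySubset = newBoundedCache(2);
        Map<String, String> boundedByAllTasks = newBoundedCache(allTasks.length);

        for (String id : allTasks) {
            boundedBySubset.put(id, "vertex-of-" + id);
            boundedByAllTasks.put(id, "vertex-of-" + id);
        }

        // The subset-bounded cache has already evicted the first task.
        System.out.println(boundedBySubset.containsKey("attempt-1"));   // prints "false"
        System.out.println(boundedByAllTasks.containsKey("attempt-1")); // prints "true"
    }
}
```

With the smaller bound, a late metrics report for `attempt-1` would not find its entry, which is the situation the comment describes.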