rkhachatryan commented on code in PR #21981: URL: https://github.com/apache/flink/pull/21981#discussion_r1116959052
##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/allocator/StateLocalitySlotAssigner.java:
##########
@@ -157,13 +165,21 @@ public Map<AllocationID, Integer> calculateScore(
                                     .getMaxParallelism(),
                             parallelism.get(evi.getJobVertexId()),
                             evi.getSubtaskIndex());
+            // Estimate state size per key group. For scoring, assume 1 if size estimate is 0 to
+            // accommodate for averaging non-zero states
+            Optional<Long> kgSizeMaybe =
+                    stateSizeEstimates.estimate(evi.getJobVertexId()).map(e -> Math.max(e, 1L));
+            if (!kgSizeMaybe.isPresent()) {
+                continue;
+            }

Review Comment:
   We still need to consider this state: if we place the task on a different TM, it will have to download all of its SST files (or am I missing something?).

   There are two methods for keyed state:
   1. `handle.getStateSize()` returns the full state size
   2. [`handle.getCheckpointedSize()`](https://github.com/apache/flink/blob/464ded1c2a0497255b70f711167c3b7ae52ea0f7/flink-runtime/src/main/java/org/apache/flink/runtime/state/CompositeStateHandle.java#L62) returns the "incremental" state size

   As per above, `getStateSize` is used.
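
   To make the distinction between the two sizes concrete, here is a minimal sketch. `HandleSketch`, its fields, and both helper methods are hypothetical stand-ins written for illustration, not Flink classes; only the accessor names `getStateSize` and `getCheckpointedSize` come from the comment above. It assumes an incremental handle consists of files shared with earlier checkpoints plus files uploaded by the latest one.

```java
import java.util.Arrays;
import java.util.List;

public class StateSizeSketch {

    /** Hypothetical stand-in for an incremental keyed state handle (not a Flink class). */
    static final class HandleSketch {
        final long sharedBytes; // assumed: files already uploaded by earlier checkpoints
        final long newBytes;    // assumed: files uploaded by the latest checkpoint only

        HandleSketch(long sharedBytes, long newBytes) {
            this.sharedBytes = sharedBytes;
            this.newBytes = newBytes;
        }

        /** Full size: everything needed to restore the state from scratch. */
        long getStateSize() {
            return sharedBytes + newBytes;
        }

        /** "Incremental" size: only what the latest checkpoint added. */
        long getCheckpointedSize() {
            return newBytes;
        }
    }

    /** Bytes a task would have to fetch if it is restored on a TM without its local state. */
    static long downloadCostOnOtherTm(List<HandleSketch> handles) {
        long total = 0;
        for (HandleSketch h : handles) {
            total += h.getStateSize();
        }
        return total;
    }

    /** Sum of the latest increments only; this underestimates the cost of moving a task. */
    static long lastIncrementOnly(List<HandleSketch> handles) {
        long total = 0;
        for (HandleSketch h : handles) {
            total += h.getCheckpointedSize();
        }
        return total;
    }

    public static void main(String[] args) {
        List<HandleSketch> handles =
                Arrays.asList(new HandleSketch(900, 100), new HandleSketch(500, 0));
        System.out.println("full = " + downloadCostOnOtherTm(handles));      // full = 1500
        System.out.println("incremental = " + lastIncrementOnly(handles));   // incremental = 100
    }
}
```

   In this sketch the second handle added nothing in the latest checkpoint, so its incremental size is 0, yet moving its task would still cost 500 bytes of download. That gap is why the full size is the relevant input when scoring locality.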