rkhachatryan commented on code in PR #21981:
URL: https://github.com/apache/flink/pull/21981#discussion_r1116959052


##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptive/allocator/StateLocalitySlotAssigner.java:
##########
@@ -157,13 +165,21 @@ public Map<AllocationID, Integer> calculateScore(
                                     .getMaxParallelism(),
                             parallelism.get(evi.getJobVertexId()),
                             evi.getSubtaskIndex());
+            // Estimate state size per key group. For scoring, assume 1 if size estimate is 0 to
+            // accommodate for averaging non-zero states
+            Optional<Long> kgSizeMaybe =
+                    stateSizeEstimates.estimate(evi.getJobVertexId()).map(e -> Math.max(e, 1L));
+            if (!kgSizeMaybe.isPresent()) {
+                continue;
+            }

Review Comment:
   We still need to consider this state: if we place the task on a different TM, it will have to download all of its SST files (or am I missing something?)
   
   There are two methods for keyed state:
   1. `handle.getStateSize()` returns the full state size
   2. [`handle.getCheckpointedSize()`](https://github.com/apache/flink/blob/464ded1c2a0497255b70f711167c3b7ae52ea0f7/flink-runtime/src/main/java/org/apache/flink/runtime/state/CompositeStateHandle.java#L62) returns the "incremental" state size
   
   As per above, `getStateSize` is used.
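
To illustrate the distinction between the two sizes, here is a minimal, self-contained sketch. `Handle` is a hypothetical stand-in that only mirrors the two getters named above; it is not Flink's actual handle class, and the helper names and byte counts are made up for the example. It assumes, as suggested above, that a task placed on a different TM has to fetch its full state, while the checkpointed size only reflects what the latest incremental checkpoint added.

```java
import java.util.Arrays;
import java.util.List;

public class StateSizeSketch {

    /** Hypothetical stand-in for a keyed state handle; NOT Flink's real class. */
    public static final class Handle {
        private final long stateSize;        // full size of the state, in bytes
        private final long checkpointedSize; // bytes added by the latest incremental checkpoint

        public Handle(long stateSize, long checkpointedSize) {
            this.stateSize = stateSize;
            this.checkpointedSize = checkpointedSize;
        }

        public long getStateSize() {
            return stateSize;
        }

        public long getCheckpointedSize() {
            return checkpointedSize;
        }
    }

    /** Sum of full state sizes: what a task would need to fetch on a different TM (assumption). */
    public static long fullSize(List<Handle> handles) {
        long total = 0L;
        for (Handle h : handles) {
            total += h.getStateSize();
        }
        return total;
    }

    /** Sum of incremental sizes: only what the latest checkpoint newly persisted. */
    public static long incrementalSize(List<Handle> handles) {
        long total = 0L;
        for (Handle h : handles) {
            total += h.getCheckpointedSize();
        }
        return total;
    }

    public static void main(String[] args) {
        // Second handle: nothing changed since the previous checkpoint,
        // so its incremental size is 0 although its full state is 50 bytes.
        List<Handle> handles = Arrays.asList(new Handle(100L, 10L), new Handle(50L, 0L));
        System.out.println("full = " + fullSize(handles));               // full = 150
        System.out.println("incremental = " + incrementalSize(handles)); // incremental = 10
    }
}
```

In this sketch, scoring by the incremental size would treat the second handle as if it carried no state, even though moving the task would still mean transferring its 50 bytes.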


