Weihua Hu created FLINK-31771:
---------------------------------

             Summary: Improve select available slot from SlotPool
                 Key: FLINK-31771
                 URL: https://issues.apache.org/jira/browse/FLINK-31771
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Coordination
            Reporter: Weihua Hu

DefaultScheduler requests slots from SlotPool for tasks one by one. For each task, PhysicalSlotProviderImpl#tryAllocateFromAvailable retrieves all available slots from DefaultAllocatedSlotPool#getFreeSlotsInformation and then selects the best slot using the SlotSelectionStrategy.

Currently, DefaultAllocatedSlotPool#getFreeSlotsInformation always calculates the taskExecutorUtilization. This makes task scheduling too slow when there are many slots, for example 20000 slots in total, even though only EvenlySpreadOutLocationPreferenceSlotSelectionStrategy uses this utilization.

So I would like to move the calculation of taskExecutorUtilization to the place where it is used: DefaultAllocatedSlotPool provides a function, getTaskExecutorUtilization, which is called only by EvenlySpreadOutLocationPreferenceSlotSelectionStrategy.

In my local IDE, this change reduces the latency of allocating 20000 slots from 72s to 12s.
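
The idea can be illustrated with a minimal, self-contained sketch. This is not Flink's actual code: the classes Slot and SimpleSlotPool and the two selection methods are hypothetical, simplified stand-ins, and only the method name getTaskExecutorUtilization is taken from the proposal. The pool hands out free slots without computing any utilization, and only the evenly-spreading selection asks for it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleFunction;

class LazyUtilizationSketch {

    /** Hypothetical, simplified slot; not Flink's real slot type. */
    static final class Slot {
        final String slotId;
        final String taskExecutorId;

        Slot(String slotId, String taskExecutorId) {
            this.slotId = slotId;
            this.taskExecutorId = taskExecutorId;
        }
    }

    /** Hypothetical pool that tracks total and free slots per task executor. */
    static final class SimpleSlotPool {
        private final List<Slot> freeSlots = new ArrayList<>();
        private final Map<String, Integer> totalPerExecutor = new HashMap<>();
        private final Map<String, Integer> freePerExecutor = new HashMap<>();
        int utilizationCalls = 0; // how often utilization was actually computed

        void addFreeSlot(Slot slot) {
            freeSlots.add(slot);
            totalPerExecutor.merge(slot.taskExecutorId, 1, Integer::sum);
            freePerExecutor.merge(slot.taskExecutorId, 1, Integer::sum);
        }

        void addOccupiedSlot(String taskExecutorId) {
            totalPerExecutor.merge(taskExecutorId, 1, Integer::sum);
        }

        /** Returns the free slots without computing any utilization. */
        List<Slot> getFreeSlots() {
            return new ArrayList<>(freeSlots);
        }

        /** Computed on demand: fraction of an executor's slots that are occupied. */
        double getTaskExecutorUtilization(String taskExecutorId) {
            utilizationCalls++;
            int total = totalPerExecutor.getOrDefault(taskExecutorId, 0);
            if (total == 0) {
                return 0.0;
            }
            int free = freePerExecutor.getOrDefault(taskExecutorId, 0);
            return (double) (total - free) / total;
        }
    }

    /** A strategy that ignores utilization: it never triggers the calculation. */
    static Slot selectFirst(List<Slot> free) {
        return free.isEmpty() ? null : free.get(0);
    }

    /** A strategy that spreads load: it asks for utilization only when selecting. */
    static Slot selectLeastUtilized(List<Slot> free, ToDoubleFunction<String> utilization) {
        Slot best = null;
        double bestUtilization = Double.MAX_VALUE;
        for (Slot slot : free) {
            double u = utilization.applyAsDouble(slot.taskExecutorId);
            if (u < bestUtilization) {
                best = slot;
                bestUtilization = u;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        SimpleSlotPool pool = new SimpleSlotPool();
        pool.addOccupiedSlot("tm-1");
        pool.addFreeSlot(new Slot("s1", "tm-1"));
        pool.addFreeSlot(new Slot("s2", "tm-2"));
        pool.addFreeSlot(new Slot("s3", "tm-2"));

        Slot first = selectFirst(pool.getFreeSlots());
        System.out.println(first.slotId + ", utilization calls: " + pool.utilizationCalls);

        Slot spread = selectLeastUtilized(pool.getFreeSlots(), pool::getTaskExecutorUtilization);
        System.out.println(spread.slotId + ", utilization calls: " + pool.utilizationCalls);
    }
}
```

In this sketch the utilization is still computed once per candidate slot inside the spreading strategy; the saving comes from strategies that never need it. Whether the real change also caches the value per task executor is not stated in the issue.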