Weihua Hu created FLINK-31771:
---------------------------------
Summary: Improve select available slot from SlotPool
Key: FLINK-31771
URL: https://issues.apache.org/jira/browse/FLINK-31771
Project: Flink
Issue Type: Improvement
Components: Runtime / Coordination
Reporter: Weihua Hu
DefaultScheduler requests slots from SlotPool one task at a time.
For each task, PhysicalSlotProviderImpl#tryAllocateFromAvailable retrieves
all available slots from DefaultAllocatedSlotPool#getFreeSlotsInformation
and then selects the best slot using the SlotSelectionStrategy.
Currently, DefaultAllocatedSlotPool#getFreeSlotsInformation always calculates
the taskExecutorUtilization. This makes task scheduling too slow when there
are many slots, for example 20000 slots in total. However, only the
EvenlySpreadOutLocationPreferenceSlotSelectionStrategy uses this utilization.
So I would like to defer the calculation of taskExecutorUtilization until it
is actually needed. DefaultAllocatedSlotPool would provide a method,
getTaskExecutorUtilization, which would be called only by
EvenlySpreadOutLocationPreferenceSlotSelectionStrategy.
In a test in my local IDE, this change reduced the time to allocate 20000
slots from 72s to 12s.
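
A minimal sketch of the idea follows. It is not the actual Flink code:
SketchSlotPool, FreeSlot, and SketchStrategies are hypothetical, simplified
stand-ins, and the utilization formula (occupied slots divided by total slots)
is an assumption. It only illustrates that utilization is computed on demand
by the one strategy that needs it, while other strategies never pay for it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical free-slot descriptor; carries no precomputed utilization. */
final class FreeSlot {
    final String slotId;
    final String taskExecutorId;

    FreeSlot(String slotId, String taskExecutorId) {
        this.slotId = slotId;
        this.taskExecutorId = taskExecutorId;
    }
}

/** Hypothetical, simplified stand-in for a slot pool. */
class SketchSlotPool {
    // taskExecutorId -> {freeSlots, totalSlots}; insertion order is kept.
    private final Map<String, int[]> slotCounts = new LinkedHashMap<>();
    int utilizationCalls = 0; // counts on-demand utilization computations

    void addTaskExecutor(String id, int free, int total) {
        slotCounts.put(id, new int[] {free, total});
    }

    /** Returns the free slots without computing any utilization. */
    List<FreeSlot> getFreeSlots() {
        List<FreeSlot> free = new ArrayList<>();
        for (Map.Entry<String, int[]> e : slotCounts.entrySet()) {
            for (int i = 0; i < e.getValue()[0]; i++) {
                free.add(new FreeSlot(e.getKey() + "-slot-" + i, e.getKey()));
            }
        }
        return free;
    }

    /** Computes utilization only when a caller asks for it. */
    double getTaskExecutorUtilization(String taskExecutorId) {
        utilizationCalls++;
        int[] counts = slotCounts.get(taskExecutorId);
        return (double) (counts[1] - counts[0]) / counts[1];
    }
}

class SketchStrategies {
    /** Takes the first free slot; never touches utilization. */
    static FreeSlot selectFirst(SketchSlotPool pool) {
        List<FreeSlot> free = pool.getFreeSlots();
        return free.isEmpty() ? null : free.get(0);
    }

    /** Picks a free slot on the least utilized task executor. */
    static FreeSlot selectEvenlySpreadOut(SketchSlotPool pool) {
        // Cache per task executor so each utilization is computed once.
        Map<String, Double> utilization = new HashMap<>();
        FreeSlot best = null;
        double bestUtilization = Double.MAX_VALUE;
        for (FreeSlot slot : pool.getFreeSlots()) {
            double u = utilization.computeIfAbsent(
                    slot.taskExecutorId, pool::getTaskExecutorUtilization);
            if (u < bestUtilization) {
                best = slot;
                bestUtilization = u;
            }
        }
        return best;
    }
}
```

In this sketch, selectFirst leaves utilizationCalls at zero, while
selectEvenlySpreadOut computes utilization once per task executor rather than
once per free slot on every request.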