Weihua Hu created FLINK-31771:
---------------------------------

             Summary: Improve select available slot from SlotPool
                 Key: FLINK-31771
                 URL: https://issues.apache.org/jira/browse/FLINK-31771
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Coordination
            Reporter: Weihua Hu


DefaultScheduler requests slots from the SlotPool for tasks one at a time.
For each task, PhysicalSlotProviderImpl#tryAllocateFromAvailable retrieves 
all available slots from 
DefaultAllocatedSlotPool#getFreeSlotsInformation and then selects the best 
slot using the SlotSelectionStrategy.

Currently, DefaultAllocatedSlotPool#getFreeSlotsInformation always calculates 
the taskExecutorUtilization, even though only the 
EvenlySpreadOutLocationPreferenceSlotSelectionStrategy uses it. This makes 
task scheduling too slow when there are many slots, for example 20000 slots 
in total.

So I would like to defer the calculation of taskExecutorUtilization until it 
is actually needed. DefaultAllocatedSlotPool would provide a function, 
getTaskExecutorUtilization, that is called only by 
EvenlySpreadOutLocationPreferenceSlotSelectionStrategy.
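
To illustrate the idea, here is a minimal, self-contained sketch. It is not 
Flink code: Slot, Pool, Strategy, FIRST_FREE, EVENLY_SPREAD, newPool and 
allocate are hypothetical stand-ins, and only the method name 
getTaskExecutorUtilization is taken from the proposal above. The pool hands 
out free slots without touching utilization, and a strategy asks for 
utilization only if it needs it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

final class LazyUtilizationSketch {

    /** Hypothetical stand-in for a slot; not one of Flink's slot types. */
    static final class Slot {
        final String taskExecutor;
        boolean free = true;

        Slot(String taskExecutor) {
            this.taskExecutor = taskExecutor;
        }
    }

    /** Simplified pool that computes utilization only when asked. */
    static final class Pool {
        final List<Slot> slots = new ArrayList<>();
        int utilizationCalls = 0; // counts how often the expensive path runs

        /** Returns free slots without computing any utilization. */
        List<Slot> getFreeSlots() {
            List<Slot> free = new ArrayList<>();
            for (Slot s : slots) {
                if (s.free) {
                    free.add(s);
                }
            }
            return free;
        }

        /** Fraction of the given TaskExecutor's slots that are in use. */
        double getTaskExecutorUtilization(String taskExecutor) {
            utilizationCalls++;
            int total = 0;
            int used = 0;
            for (Slot s : slots) {
                if (s.taskExecutor.equals(taskExecutor)) {
                    total++;
                    if (!s.free) {
                        used++;
                    }
                }
            }
            return total == 0 ? 0.0 : (double) used / total;
        }
    }

    /** A strategy receives the free slots plus a lookup it may or may not call. */
    interface Strategy {
        Slot select(List<Slot> free, ToDoubleFunction<String> utilization);
    }

    /** Ignores utilization entirely, so the lookup is never invoked. */
    static final Strategy FIRST_FREE = (free, utilization) -> free.get(0);

    /** Picks a free slot on the least-utilized TaskExecutor. */
    static final Strategy EVENLY_SPREAD = (free, utilization) -> {
        Slot best = null;
        double bestUtilization = Double.MAX_VALUE;
        for (Slot s : free) {
            double u = utilization.applyAsDouble(s.taskExecutor);
            if (u < bestUtilization) {
                best = s;
                bestUtilization = u;
            }
        }
        return best;
    };

    static Pool newPool(int executors, int slotsPerExecutor) {
        Pool pool = new Pool();
        for (int e = 0; e < executors; e++) {
            for (int i = 0; i < slotsPerExecutor; i++) {
                pool.slots.add(new Slot("te-" + e));
            }
        }
        return pool;
    }

    /** Allocates one slot per task and returns the number of utilization lookups. */
    static int allocate(Pool pool, Strategy strategy, int tasks) {
        for (int i = 0; i < tasks; i++) {
            Slot s = strategy.select(pool.getFreeSlots(), pool::getTaskExecutorUtilization);
            s.free = false;
        }
        return pool.utilizationCalls;
    }
}
```

In this sketch, allocating with FIRST_FREE never triggers a utilization 
lookup, while EVENLY_SPREAD pays for it only because it needs the value. The 
sketch recomputes utilization per free slot for brevity; a real 
implementation would likely cache it per TaskExecutor.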

This change could reduce the latency of allocating 20000 slots from 72s to 
12s in my local IDE.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
