RocMarshal commented on code in PR #25647:
URL: https://github.com/apache/flink/pull/25647#discussion_r1852202449


##########
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SlotSharingExecutionSlotAllocatorFactory.java:
##########
@@ -40,13 +41,16 @@ public SlotSharingExecutionSlotAllocatorFactory(
             PhysicalSlotProvider slotProvider,
             boolean slotWillBeOccupiedIndefinitely,
             PhysicalSlotRequestBulkChecker bulkChecker,
-            Duration allocationTimeout) {
+            Duration allocationTimeout,
+            TaskManagerOptions.TaskManagerLoadBalanceMode 
taskManagerLoadBalanceMode) {
         this(
                 slotProvider,
                 slotWillBeOccupiedIndefinitely,
                 bulkChecker,
                 allocationTimeout,
-                new LocalInputPreferredSlotSharingStrategy.Factory());
+                taskManagerLoadBalanceMode == 
TaskManagerOptions.TaskManagerLoadBalanceMode.TASKS
+                        ? new 
TaskBalancedPreferredSlotSharingStrategy.Factory()

Review Comment:
   In fact, the phenomenon of uneven task scheduling results is mainly caused 
by two stages:
   1. tasks->slot
   2. slots->taskmanager
   Because prior to the initial feature freeze, we anticipated that the entire 
FLIP would be at risk. Our original intention in initiating this PR was to 
release this change as early as possible to address the imbalance issue in the 
task ->slot dimension. But now there is no time risk with this feature, so we 
plan to release both dimensions of functionality together.
   BTW, the optimizations and changes mentioned above are highly likely not to 
be fixed in the current PR as it may be abandoned.
   But we will place the corresponding optimization in the appropriate PR 
position in the future. If you are willing to help with the review, I would 
greatly appreciate it! :)



##########
docs/layouts/shortcodes/generated/all_taskmanager_section.html:
##########
@@ -90,7 +90,7 @@
             <td><h5>taskmanager.load-balance.mode</h5></td>
             <td style="word-wrap: break-word;">NONE</td>
             <td><p>Enum</p></td>
-            <td>Mode for the load-balance allocation strategy across all 
available <code class="highlighter-rouge">TaskManagers</code>.<ul><li>The <code 
class="highlighter-rouge">SLOTS</code> mode tries to spread out the slots 
evenly across all available <code 
class="highlighter-rouge">TaskManagers</code>.</li><li>The <code 
class="highlighter-rouge">NONE</code> mode is the default mode without any 
specified strategy.</li></ul><br /><br />Possible 
values:<ul><li>"NONE"</li><li>"SLOTS"</li></ul></td>
+            <td>Mode for the load-balance allocation strategy across all 
available <code class="highlighter-rouge">TaskManagers</code>.<ul><li>The <code 
class="highlighter-rouge">SLOTS</code> mode tries to spread out the slots 
evenly across all available <code 
class="highlighter-rouge">TaskManagers</code>.</li><li>The <code 
class="highlighter-rouge">TASKS</code> mode tries to schedule evenly all tasks 
based on its' number across all available <code 
class="highlighter-rouge">TaskManagers</code>. Note: Currently, enabling this 
parameter can only achieve the balancing effect of the slot level dimension of 
<code class="highlighter-rouge">DefaultScheduler</code>.</li><li>The <code 
class="highlighter-rouge">NONE</code> mode is the default mode without any 
specified strategy.</li></ul><br /><br />Possible 
values:<ul><li>"NONE"</li><li>"SLOTS"</li><li>"TASKS"</li></ul></td>

Review Comment:
   Good idea~, we do need a more detailed and precise description, as well as 
traceable design documents.
   How about making the following changes based on your suggestion?
   1. Optimize the configuration description section in the code, including 
adding FLIP addresses
   2. The explanation of how to use and the differences between it and other 
configurations, as well as the creation of charts, can be placed in 
https://issues.apache.org/jira/browse/FLINK-33392 Completed in the middle



##########
flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java:
##########
@@ -707,6 +707,12 @@ public class TaskManagerOptions {
                                                     "The %s mode tries to 
spread out the slots evenly across all available %s.",
                                                     
code(TaskManagerLoadBalanceMode.SLOTS.name()),
                                                     code("TaskManagers")),
+                                            text(
+                                                    "The %s mode tries to 
schedule evenly all tasks based on its' number across all available %s. "

Review Comment:
   The meaning expressed here is that the scheduler will try its best to ensure 
that the scheduling results are concentrated and the number of tasks on each 
Taskmanager is similar.
   I will add a more detailed description
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to