Hi Gen & Zhu,

-> 1. Can we also show "Blocked Slots" in the resource card, so that users can easily figure out how many slots are available/blocked/in-use?

I think we should describe "available" and "blocked" more clearly. In my opinion, users should be interested in the number of slots in the following three states:
1. free and unblocked. I think it's OK to call this state "available".
2. free and blocked. I don't think it's appropriate to call this state simply "blocked", because "blocked" should include both "free and blocked" and "in-use and blocked".
3. in-use

The sum of the above three kinds of slots should be the total number of slots in this cluster.

WDYT?

Best,
Lijie

Gen Luo <luogen...@gmail.com> wrote on Fri, Jul 8, 2022 at 16:14:

> Hi Zhu,
>
> Thanks for the feedback!
>
> 1. Good idea. Users should be more familiar with the slots as the resource
> units.
>
> 2. You remind me that the "speculative attempts" are execution attempts
> started by the SpeculativeScheduler when slow tasks are detected, while the
> current execution attempts other than the "most current" one are not really
> the speculative attempts. I agree we should modify the field name.
>
> 3. ArchivedSpeculativeExecutionVertex seems to have been introduced with
> speculative execution to handle the speculative attempts as a part of the
> execution history. Since this FLIP handles the attempts in a more
> proper way, I agree that we can remove the
> ArchivedSpeculativeExecutionVertex.
>
> Thanks again and I'll update the FLIP later according to these suggestions.
>
> On Thu, Jul 7, 2022 at 4:35 PM Zhu Zhu <reed...@gmail.com> wrote:
>
> > Thanks for writing this FLIP and initiating the discussion, Gen, Yun and
> > Junhan!
> > It will be very useful to have these improvements on the web UI for
> > speculative execution users, allowing them to know what is happening.
> > I just have a few comments regarding the design details:
> >
> > 1. Can we also show "Blocked Slots" in the resource card, so that users
> > can easily figure out how many slots are available/blocked/in-use?
> > 2. I think "speculative-attempts" is not accurate, because the
> > root/fastest current attempt can be a speculative execution attempt, and
> > in this case "speculative-attempts" will contain the initial execution
> > attempt. How about naming it "other-concurrent-attempts"?
> > 3. I think ArchivedSpeculativeExecutionVertex is not necessarily
> > needed. We can rework the ArchivedExecutionVertex to contain a set of
> > current execution attempts. The set will have only one element in
> > non-speculative cases though. In this way, we can have unified
> > processing for ArchivedExecutionVertex in speculative/non-speculative
> > cases.
> >
> > Thanks,
> > Zhu
> >
> > Gen Luo <luogen...@gmail.com> wrote on Tue, Jul 5, 2022 at 15:10:
> >
> > > Hi everyone,
> > >
> > > The speculative execution for batch jobs has been proposed and accepted
> > > in FLIP-168[1], as well as the related blocklist mechanism in
> > > FLIP-224[2]. As a follow-up step, the Flink Web UI needs to be enhanced
> > > to display the related information if the speculative execution
> > > mechanism is enabled.
> > >
> > > Junhan Yang, Yun Gao and I would like to start the discussion about the
> > > Web UI enhancement and the corresponding REST API changes in
> > > FLIP-249[3], including:
> > > - show the speculative executions in the subtask list and the
> > > backpressure page, where the fastest is shown directly while others are
> > > folded;
> > > - show the number of the blocked task managers in the Task Managers and
> > > Slots card, when the number is not 0;
> > > - show the BLOCKED label in the task manager list and the task manager
> > > detail page for the blocked task managers.
> > >
> > > All changes are expected to be transparent to users who don't use
> > > speculative execution.
> > >
> > > Please see the FLIP page[3] for more details. Looking forward to your
> > > feedback.
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job
> > > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-224%3A+Blocklist+Mechanism
> > > [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-249%3A+Flink+Web+UI+Enhancement+for+Speculative+Execution
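
To make the three slot states from Lijie's reply concrete (free and unblocked, free and blocked, in-use), here is a minimal Java sketch of how such counts might be derived. The classes `SlotInfo` and `SlotSummary` and their fields are hypothetical names invented for this illustration. They are not part of Flink or of the FLIP-249 REST API, and the sketch assumes each slot can be described by just two flags.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical slot description used only for this illustration; not a Flink class. */
final class SlotInfo {
    final boolean inUse;    // a task currently occupies the slot
    final boolean blocked;  // the slot belongs to a blocked task manager

    SlotInfo(boolean inUse, boolean blocked) {
        this.inUse = inUse;
        this.blocked = blocked;
    }
}

/** Counts slots in three mutually exclusive states, so the counts sum to the total. */
final class SlotSummary {
    final int available;    // free and unblocked
    final int freeBlocked;  // free and blocked
    final int inUse;        // in use, whether blocked or not

    private SlotSummary(int available, int freeBlocked, int inUse) {
        this.available = available;
        this.freeBlocked = freeBlocked;
        this.inUse = inUse;
    }

    static SlotSummary of(List<SlotInfo> slots) {
        int available = 0;
        int freeBlocked = 0;
        int inUse = 0;
        for (SlotInfo slot : slots) {
            if (slot.inUse) {
                inUse++;            // in-use slots are counted once, blocked or not
            } else if (slot.blocked) {
                freeBlocked++;
            } else {
                available++;
            }
        }
        return new SlotSummary(available, freeBlocked, inUse);
    }

    int total() {
        return available + freeBlocked + inUse;
    }
}

class SlotSummaryDemo {
    public static void main(String[] args) {
        List<SlotInfo> slots = Arrays.asList(
                new SlotInfo(false, false),  // free and unblocked
                new SlotInfo(false, true),   // free and blocked
                new SlotInfo(true, false),   // in use
                new SlotInfo(true, true));   // in use and blocked
        SlotSummary summary = SlotSummary.of(slots);
        System.out.println(summary.available + " available, "
                + summary.freeBlocked + " free but blocked, "
                + summary.inUse + " in use, "
                + summary.total() + " total");
    }
}
```

Because an "in-use and blocked" slot is counted only under in-use, the three counts never overlap, which is what lets them add up to the cluster total as the reply suggests.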