Hi Gen & Zhu,

-> 1. Can we also show "Blocked Slots" in the resource card, so that users
can easily figure out how many slots are available/blocked/in-use?

I think we should define "available" and "blocked" more clearly. In my
opinion, users are interested in the number of slots in each of the
following 3 states:
1. free and unblocked: I think it's OK to call this state "available".
2. free and blocked: I don't think it's appropriate to call this simply
"blocked", because "blocked" should cover both the "free and blocked" and
the "in-use and blocked" slots.
3. in-use

The sum of the above 3 kinds of slots should be the total number of
slots in the cluster.
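
To make the accounting concrete, here is a rough sketch of how the three counts could be derived. The class and field names (SlotBreakdown, Slot, Counts) are made up for illustration and are not existing Flink classes or part of the FLIP; the only point is that every slot falls into exactly one bucket, so the three numbers add up to the total.

```java
import java.util.Arrays;
import java.util.List;

public class SlotBreakdown {
    // Hypothetical slot model: a slot is free or in use, and may sit on a blocked node.
    static final class Slot {
        final boolean inUse;
        final boolean blocked;

        Slot(boolean inUse, boolean blocked) {
            this.inUse = inUse;
            this.blocked = blocked;
        }
    }

    // The three mutually exclusive states proposed above.
    static final class Counts {
        final int available;      // free and unblocked
        final int freeButBlocked; // free and blocked
        final int inUse;          // in use, whether blocked or not

        Counts(int available, int freeButBlocked, int inUse) {
            this.available = available;
            this.freeButBlocked = freeButBlocked;
            this.inUse = inUse;
        }

        int total() {
            return available + freeButBlocked + inUse;
        }
    }

    static Counts count(List<Slot> slots) {
        int available = 0;
        int freeButBlocked = 0;
        int inUse = 0;
        for (Slot slot : slots) {
            if (slot.inUse) {
                inUse++;          // in-use slots are counted once, even if also blocked
            } else if (slot.blocked) {
                freeButBlocked++;
            } else {
                available++;
            }
        }
        return new Counts(available, freeButBlocked, inUse);
    }

    public static void main(String[] args) {
        List<Slot> slots = Arrays.asList(
                new Slot(false, false),
                new Slot(false, true),
                new Slot(true, false),
                new Slot(true, true));
        Counts c = count(slots);
        System.out.println("available=" + c.available
                + " freeButBlocked=" + c.freeButBlocked
                + " inUse=" + c.inUse
                + " total=" + c.total());
    }
}
```

Note that an "in-use and blocked" slot is counted under in-use here, which is why a plain "blocked" count would overlap with it.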

WDYT?

Best,
Lijie

Gen Luo <luogen...@gmail.com> wrote on Fri, Jul 8, 2022 at 16:14:

> Hi Zhu,
> Thanks for the feedback!
>
> 1.Good idea. Users should be more familiar with the slots as the resource
> units.
>
> 2.You remind me that the "speculative attempts" are execution attempts
> started by the SpeculativeScheduler when slow tasks are detected, while the
> current execution attempts other than the "most current" one are not really
> the speculative attempts. I agree we should modify the field name.
>
> 3.ArchivedSpeculativeExecutionVertex seems to be introduced with the
> speculative execution to handle the speculative attempts as a part of the
> execution history. Since this FLIP handles the attempts in a more
> proper way, I agree that we can remove the
> ArchivedSpeculativeExecutionVertex.
>
> Thanks again and I'll update the FLIP later according to these suggestions.
>
> On Thu, Jul 7, 2022 at 4:35 PM Zhu Zhu <reed...@gmail.com> wrote:
>
> > Thanks for writing this FLIP and initiating the discussion, Gen, Yun and
> > Junhan!
> > It will be very useful to have these improvements on the web UI for
> > speculative execution users, allowing them to know what is happening.
> > I just have a few comments regarding the design details:
> >
> > 1. Can we also show "Blocked Slots" in the resource card, so that users
> > can easily figure out how many slots are available/blocked/in-use?
> > 2. I think "speculative-attempts" is not accurate, because the
> > root/fastest current attempt can be a speculative execution attempt, and
> > in this case "speculative-attempts" will contain the initial execution
> > attempt. How about naming it "other-concurrent-attempts"?
> > 3. I think ArchivedSpeculativeExecutionVertex is not necessarily
> > needed. We can rework the ArchivedExecutionVertex to contain a set of
> > current execution attempts. The set will have only one element in
> > non-speculative cases though. In this way, we can have a unified
> > processing for ArchivedExecutionVertex in speculative/non-speculative
> > cases.
> >
> > Thanks,
> > Zhu
> >
> > Gen Luo <luogen...@gmail.com> wrote on Tue, Jul 5, 2022 at 15:10:
> >
> > >
> > > Hi everyone,
> > >
> > > The speculative execution for batch jobs has been proposed and accepted in
> > > FLIP-168[1], as well as the related blocklist mechanism in FLIP-224[2]. As
> > > a follow-up step, the Flink Web UI needs to be enhanced to display the
> > > related information if the speculative execution mechanism is enabled.
> > >
> > > Junhan Yang, Yun Gao and I would like to start the discussion about the Web
> > > UI enhancement and the corresponding REST API changes in FLIP-249[3],
> > > including:
> > > - show the speculative executions in the subtask list and the backpressure
> > > page, where the fastest is shown directly while others are folded;
> > > - show the number of the blocked task managers in the Task Managers and
> > > Slots card, when the number is not 0;
> > > - show the BLOCKED label in the task manager list and the task manager
> > > detail page for the blocked task managers.
> > >
> > > All changes are expected to be transparent to users who don’t use
> > > speculative execution.
> > >
> > > Please see the FLIP page[3] for more details. Looking forward to your
> > > feedback.
> > >
> > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-168%3A+Speculative+Execution+for+Batch+Job
> > > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-224%3A+Blocklist+Mechanism
> > > [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-249%3A+Flink+Web+UI+Enhancement+for+Speculative+Execution
> >
>
