Hi Xintong Song

2. We could switch between a detailed mode (showing cpu, task heap,
task off-heap, shuffle, on-heap managed, off-heap managed) and a summary
mode (showing only cpu and total memory), which is easy to do in the UI design.

4. I think the key point is not pagination in the Web UI, but that the REST API
will totally *break* without pagination under the current design.
In my opinion, pagination is better than nothing: it is a solution that
keeps the log API working, and it would be great if there were another
way to keep it working with huge log data.
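
To make point 4 more concrete, here is a minimal sketch of serving a log file page by page while still allowing a keyword search over the whole file. It is only an illustration under my own assumptions and is not the Flink REST API: `PAGE_SIZE`, `read_page` and `search_pages` are hypothetical names, and a page is assumed to be a fixed-size byte range of the log file.

```python
PAGE_SIZE = 64 * 1024  # hypothetical page size in bytes


def read_page(path, page, page_size=PAGE_SIZE):
    """Return the bytes of one zero-based page without loading the whole file."""
    with open(path, "rb") as f:
        f.seek(page * page_size)
        return f.read(page_size)


def search_pages(path, keyword, page_size=PAGE_SIZE):
    """Stream the file line by line and yield (page, line) for every match.

    The page number is the page in which the matching line starts, so a
    client could jump straight to that page instead of clicking through all.
    """
    needle = keyword.encode("utf-8")
    offset = 0  # byte offset of the start of the current line
    with open(path, "rb") as f:
        for line in f:
            if needle in line:
                text = line.rstrip(b"\n").decode("utf-8", "replace")
                yield offset // page_size, text
            offset += len(line)
```

Both functions keep memory usage bounded by one page or one line, so the cost of a search is one sequential pass over the file rather than holding the whole log in memory.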

On Mon, Sep 30, 2019 at 7:19 PM Xintong Song <tonysong...@gmail.com> wrote:

> @Yadong
>
> 2. I agree that we can update the task executor UI after FLIP-56 is done.
> But I would suggest keeping it under discussion to come up with a proper UI
> design for task executor resources. I don't think the mentioned image from
> FLIP-56 is a good choice. That image is a simplified figure with cpu and
> total memory only, for the purpose of demonstrating dynamic slot
> allocation. In fact, there are 6 fields to be displayed (cpu, task heap,
> task off-heap, shuffle, on-heap managed, off-heap managed). If we display
> cpu and total memory only, then users will be confused when they see a task
> executor with enough remaining resources onto which tasks still cannot be
> deployed (because the desired type of memory might be used up).
>
> 4. I've been using the Blink web UI, which already has log pagination. It's
> quite common that we need to search for some keywords (e.g., exception,
> error, warning) in a large amount of logs when diagnosing problems. I find
> it very inconvenient that I have to click into each page to search for the
> keywords, and I'd rather take the effort to find the original log files
> on the filesystem to view the log. Personally speaking, if keyword
> searching cannot be supported, I would prefer to take some time loading the
> non-paginated logs over the paginated ones. Or we may at least have a
> button on the web UI for switching between the two alternatives.
>
> @Till
>
> Thanks for the inputs.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Sep 30, 2019 at 5:55 PM Till Rohrmann <trohrm...@apache.org>
> wrote:
>
> > For 3. At the moment the log and stdout file serving requires the
> > TaskExecutor to be running. But in some scenarios, such as with an NFS, it
> > should be enough to know where the file is located. However, this
> > assumption does not hold in the general case.
> >
> > Cheers,
> > Till
> >
> > On Mon, Sep 30, 2019 at 11:43 AM Yadong Xie <vthink...@gmail.com> wrote:
> >
> > > Hi Xintong Song
> > >
> > > Thanks for your comments!
> > >
> > > 1. I think it is a good idea to align the CPU and memory usage with
> > > FLIP-49 if it is released in version 1.10.
> > > 2. We can update the task executor UI design after FLIP-56 is merged into
> > > master. Actually, the image
> > > <https://cwiki.apache.org/confluence/download/attachments/125309297/BlinkResourceTM.png?version=1&modificationDate=1566223821000&api=v2>
> > > in FLIP-56 is a good UI design; we can follow it in the Flink web UI.
> > > 3. I have no idea about this; maybe someone familiar with the runtime part
> > > could answer it? But it would be great to add it to the web UI, in my opinion.
> > > 4. I'm not sure whether keyword searching across all the pages would cost too
> > > many resources in the job manager, but I think it would be very useful if the
> > > REST API could support it.
> > >
> > > Best,
> > > Yadong
> > >
> > > On Sun, Sep 29, 2019 at 8:11 PM Xintong Song <tonysong...@gmail.com> wrote:
> > >
> > > > Thanks for drafting the FLIP and starting this discussion, Yadong.
> > > >
> > > >
> > > > I have some comments:
> > > >
> > > >
> > > >    - I can see that the proposed memory and cpu usage to be displayed (in
> > > >    section 1.1) are aligned with the current ResourceProfile fields. However,
> > > >    we are working on changing the memory fields in 1.10 with FLIP-49 [1]. I
> > > >    suggest we align the UI design with the new FLIP-49 memory fields.
> > > >    - The task executor overview design (in section 1.2) is based on the
> > > >    current slot model. The coming FLIP-56 [2], which is also planned for 1.10,
> > > >    is changing the model so that task executors no longer have a fixed number of
> > > >    slots, but rather allocated slots (which may have different resources) and
> > > >    available resources.
> > > >       - I can see that there are discussions in the google doc about using
> > > >       different colors for available resources. However, the resource availability
> > > >       for different fields can be different, and may not be simply displayed by a
> > > >       different color. E.g., a task executor may have two slots, where slot 1
> > > >       takes (20% cpu, 10% heap mem, 50% managed mem, etc.), slot 2 takes (10%
> > > >       cpu, 35% heap mem, 0% managed mem, etc.), and the remaining resources in
> > > >       the task executor are (70% cpu, 55% heap mem, 50% managed mem, etc.). How
> > > >       do you plan to display that?
> > > >       - I would suggest having multiple bars for each task executor, where
> > > >       each bar represents one of the resource fields. In addition, we may have a
> > > >       number (or some other figure) showing how many slots are allocated from
> > > >       the task executor.
> > > >    - Is there any way we can provide access to the logs of terminated task
> > > >    executors? It often happens to us that a job fails because a task executor
> > > >    failed or was lost, and we have to find the logs of the failed task executors by
> > > >    manually accessing the file system. I think it would be helpful if we could
> > > >    find the logs of failed task executors directly in the Flink web UI.
> > > >    - Regarding log pagination, is there any way to provide keyword
> > > >    searching across all the pages?
> > > >
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
> > > > [2]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation
> > > >
> > > > On Fri, Sep 27, 2019 at 3:57 PM Paul Lam <paullin3...@gmail.com> wrote:
> > > >
> > > > > I filed a JIRA to track this [1]. Thanks a lot.
> > > > >
> > > > > [1] https://issues.apache.org/jira/browse/FLINK-14242
> > > > >
> > > > > Best,
> > > > > Paul Lam
> > > > >
> > > > > > On Sep 27, 2019, at 14:34, Yadong Xie <vthink...@gmail.com> wrote:
> > > > > >
> > > > > > Hi Paul
> > > > > > Thanks for your suggestion.
> > > > > > I think it is easy to implement. Could you create a JIRA for me?
> > > > > >
> > > > > > On Fri, Sep 27, 2019 at 11:11 AM Paul Lam <paullin3...@gmail.com> wrote:
> > > > > >
> > > > > >> Hi Yadong,
> > > > > >>
> > > > > >> Thanks a lot for summing up the Web UI efforts.
> > > > > >>
> > > > > >> I have a minor suggestion: can we provide a collapse button for the task
> > > > > >> names in the job graph visualization? For some complex jobs, especially SQL
> > > > > >> jobs, the task names are quite long, which makes the job graph hard to read.
> > > > > >>
> > > > > >> Best,
> > > > > >> Paul Lam
> > > > > >>
> > > > > >>> On Sep 27, 2019, at 10:13, Yadong Xie <vthink...@gmail.com> wrote:
> > > > > >>>
> > > > > >>> Hi all
> > > > > >>>
> > > > > >>> The Flink Web UI is the main platform for most users to monitor their jobs
> > > > > >>> and clusters. We rebuilt the Flink web UI in version 1.9.0, but there are
> > > > > >>> still some shortcomings.
> > > > > >>>
> > > > > >>> This discussion thread aims to provide a better experience for Flink UI
> > > > > >>> users.
> > > > > >>>
> > > > > >>> Here is the design doc I drafted:
> > > > > >>>
> > > > > >>> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
> > > > > >>>
> > > > > >>>
> > > > > >>> The FLIP can be found at [2].
> > > > > >>>
> > > > > >>> Please keep the discussion here, in the mailing list.
> > > > > >>>
> > > > > >>> Looking forward to your opinions; any feedback is welcome.
> > > > > >>>
> > > > > >>> [1]:
> > > > > >>> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
> > > > > >>>
> > > > > >>> [2]:
> > > > > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-75%3A+Flink+Web+UI+Improvement+Proposal
> > > > > >>
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
