Re: [DISCUSS] Flink Cluster Overview Dashboard Improvement Proposal

Fabian Wollert Tue, 09 Oct 2018 03:55:07 -0700

Argh, I think the screenshot is missing (at least Nabble is not showing
anything). Here is a link to the mockup:


https://drive.google.com/file/d/1p3wVP028_AFFLZ6fjPb41yAI8zUhgDTO/view?usp=sharing

Cheers

--


*Fabian Wollert, Zalando SE*

E-Mail: [email protected]


On Tue, 9 Oct 2018 at 12:46, Fabian Wollert <[email protected]> wrote:

> Hi everyone,
>
> Disclaimer: I read the contribution guide on improvement requests (i.e.
> I should really just open a Jira ticket), but I thought it would make
> sense to run this through the mailing list first. After collecting
> some input I would then create the Jira ticket.
>
> When I open the Flink Web Dashboard (which I do almost every day to
> check the status of a job), I have recently felt that the information
> shown in the top portion of the start page could be improved a lot. I
> created a first mockup by moving HTML elements around and want to
> share it now:
>
> [image: image.png]
>
> With the exception of the metrics (see below), none of this information
> is new; it is only reorganized to speed up investigation and
> monitoring:
>
>    - Complete overview of the cluster status and health, without clicking
>    through a lot of pages:
>       - Active and stand-by Job Managers. Their health is shown as a
>       color (as a first suggestion: last heartbeat is inside
>       heartbeat.timeout).
>       - Currently registered Task Managers
>          - The little bar on the side indicates task slot usage. I did
>          not color it since a fully utilised task manager is not necessarily
>          something bad.
>          - The color indicates the health of the task manager (as a first
>          suggestion: last heartbeat is inside heartbeat.timeout).
>       - Overview of some cluster metrics
>
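The health rule suggested in the list above (last heartbeat inside
heartbeat.timeout) and the slot usage bar could be sketched roughly as
follows. This is only an illustration, not dashboard code: the class and
function names are hypothetical, the two colors are a placeholder choice,
and the 50 000 ms timeout is assumed to be the configured heartbeat.timeout.

```python
from dataclasses import dataclass

# Assumption: heartbeat.timeout is configured in milliseconds; 50 000 ms is
# used here as a placeholder value and should be read from the cluster config.
HEARTBEAT_TIMEOUT_MS = 50_000


@dataclass
class TaskManagerStatus:
    """Hypothetical record holding what the mockup needs per Task Manager."""
    last_heartbeat_ms: int  # epoch milliseconds of the last heartbeat
    slots_total: int
    slots_free: int


def health_color(last_heartbeat_ms: int, now_ms: int,
                 timeout_ms: int = HEARTBEAT_TIMEOUT_MS) -> str:
    """Green while the last heartbeat is inside the timeout, red otherwise."""
    return "green" if now_ms - last_heartbeat_ms <= timeout_ms else "red"


def slot_usage(tm: TaskManagerStatus) -> float:
    """Fraction of used task slots, i.e. the fill level of the side bar."""
    if tm.slots_total <= 0:
        return 0.0
    return (tm.slots_total - tm.slots_free) / tm.slots_total
```

The usage value is deliberately not mapped to a color, matching the point
that a fully utilised Task Manager is not necessarily something bad.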
> Some points to notice:
>
>    - All data in the screenshot is mock data; the numbers are not
>    consistent with each other. The colors, however, already match the
>    numbers they indicate.
>    - All of this could also be done with whatever monitoring solution
>    someone might have in their company, by reading out JMX metrics and
>    plotting them there (e.g. in Grafana). But this out-of-the-box solution
>    would save everyone from building it themselves, and they could trust
>    the metrics shown here.
>    - Some of the metrics can only be implemented once FLINK-7286
>    <https://issues.apache.org/jira/browse/FLINK-7286> is done. So I
>    would split the implementation into two parts (cluster overview and
>    metrics) and do them separately.
>    - This first mockup is targeted at what we here at Zalando would like
>    to see at first glance, so it fits our use case very well. We mostly use
>    long-running session clusters.
>    - I'm more of a backend guy with some frontend expertise (mostly in
>    React; no experience with Angular 1, which the Flink Web Dashboard is
>    currently built with) and not at all a designer.
>
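For anyone who wants to try the "do it on your own" route mentioned above,
here is a rough sketch that summarizes slot usage from the REST API instead
of JMX. It assumes the JobManager exposes a /taskmanagers endpoint whose
entries carry "slotsNumber" and "freeSlots" fields; please check these names
against the REST documentation of your Flink version. The function names
and the base URL argument are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_taskmanagers(base_url: str) -> list:
    """Fetch the Task Manager list; the endpoint and key are assumptions."""
    with urlopen(f"{base_url}/taskmanagers") as resp:
        return json.load(resp)["taskmanagers"]


def summarize(taskmanagers: list) -> dict:
    """Aggregate the numbers a cluster overview would show at first glance."""
    total = sum(tm["slotsNumber"] for tm in taskmanagers)
    free = sum(tm["freeSlots"] for tm in taskmanagers)
    return {
        "taskmanagers": len(taskmanagers),
        "slots_total": total,
        "slots_used": total - free,
    }
```

Calling summarize(fetch_taskmanagers(base_url)) against a running session
cluster would then give the counts for the overview; summarize() can also be
fed hand-written sample data, as in the mockup.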
> What do you think? I would be glad to get some feedback on this,
> especially on whether it makes sense for the broader community. I will
> implement this in some form no matter what: if not in the Flink master
> branch, then as an open-source project that anyone can deploy next to
> their Flink clusters. But I first wanted to run it through here to see
> if it sparks any interest.
>
> Please also let me know if you already see difficulties in implementing
> this; maybe I have overlooked something.
>
> Can't wait for your input.
>
> Cheers
>
> --
>
>
> *Fabian Wollert, Zalando SE*
>
> E-Mail: [email protected]
>
