Hi, Emre.

In large-scale production jobs, data skew often occurs. Having a metric on the UI that reflects data skew, without the need to manually inspect each vertex by clicking on it, would be quite cool. This could help users quickly identify problematic nodes, simplifying development and operations.

I'm mainly curious about two minor points:

1. How will the colors of vertices with high data skew scores be unified with the existing backpressure and high-busyness colors on the UI? Users should be able to distinguish at a glance which vertices in the entire job graph are skewed.
2. Can you tell me if you prefer to unify the Data Skew Score and the Exception tab? In my opinion, the Data Skew Score is in the same category as the existing Backpressured and Busy metrics.

Looking forward to your reply.

--
Best!
Xuyang

At 2024-01-16 00:59:57, "Kartoglu, Emre" <kar...@amazon.co.uk.INVALID> wrote:
>Hello,
>
>I’m opening this thread to discuss a FLIP[1] to make data skew more visible on
>Flink Dashboard.
>
>Data skew is currently not as visible as it should be. Users have to click
>each operator and check how much data each sub-task is processing and compare
>the sub-tasks against each other. This is especially cumbersome and
>error-prone for jobs with big job graphs and high parallelism. I’m proposing
>this FLIP to improve this.
>
>Kind regards,
>Emre
>
>[1]
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard