Hi Krzysztof,

Thank you for the feedback! Please find my comments below.

1. Configurability

Adding a feature flag / configuration to enable this is still on the table as 
far as I am concerned. However, I believe adding a new metric shouldn't warrant 
a flag/configuration. One might argue that we should have one for showing the 
metrics on the Flink UI, and I'd appreciate input on this. My default position 
is to not have a configuration/flag unless there is a good reason (e.g. it 
turns out there is an impact on the Flink UI for some as-yet unknown reason). 
This is because the proposed change should only improve the experience without 
any unwanted side effects.

2. Metrics

I agree the new metrics should be compatible with the rest of the Flink metric 
reporting mechanism. I will update the FLIP and propose names for the metrics.

Kind regards,
Emre

On 23/01/2024, 10:31, "Krzysztof Dziołak" <kdzio...@live.com> wrote:


Hi Emre,


Thank you for driving this proposal. I've got two questions about the 
extensions to the proposal that are not captured in the FLIP.




1. Configurability - what kind of configuration would you propose to maintain 
for this feature? Would an on/off switch and/or the aggregation period length 
be configurable? Should we capture the toggles in the FLIP?
2. Metrics - are we planning to emit the skew metric via the metric reporters 
mechanism? Should we capture the proposed metric schema in the FLIP?


Kind regards,
Krzysztof


________________________________
From: Kartoglu, Emre <kar...@amazon.co.uk.INVALID>
Sent: Monday, January 15, 2024 4:59 PM
To: dev@flink.apache.org <dev@flink.apache.org>
Subject: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard


Hello,


I’m opening this thread to discuss a FLIP[1] to make data skew more visible on 
the Flink Dashboard.


Data skew is currently not as visible as it should be. Users have to click each 
operator and check how much data each sub-task is processing and compare the 
sub-tasks against each other. This is especially cumbersome and error-prone for 
jobs with big job graphs and high parallelism. I’m proposing this FLIP to 
improve this.


Kind regards,
Emre


[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard








