Hi Lincoln Lee,
Thanks for your feedback!
For the 2nd question, I have added descriptions of the configuration
options to guide users on when and how to adjust them. Typically, users
can refer to the "Read Bytes" metric of the subtasks to adjust
the "skewed-factor" and "skewed-threshold". I've
Thank you Yang for your reply!
The updated options in the FLIP addressed my 1st question.
And for the 2nd question, I understand the mechanism of your
implementation, but since this is a public option, it's important for users
to have a specific metric on which to base the option's value when
they need to
Hi Lincoln Lee,
Thanks for your feedback!
For the 1st question, thank you for the reminder. This optimization is only
available for Table jobs in batch mode, and I have put these new options
into the table module. I also replaced the "enable" and "force" configurations
with a new enum type configuratio
Thanks for bringing this up! It would be a useful feature for batch users.
For the FLIP, I have some questions:
1st, the implementation plan is to rewrite the optimization based on the
execnode of the table planner, but the config option for the optimization
is under the flink-core module, does it me
+1 for the FLIP
Long-tail tasks caused by skewed data usually pose significant
challenges for users. It's great that Flink can mitigate such
issues automatically.
Thanks,
Zhu
On Fri, Aug 16, 2024 at 11:18, Lei Yang wrote:
> Hi devs,
>
>
> Junrui Lee, Xia Sun and I would like to initiate a discussion about
>
Hi devs,
Junrui Lee, Xia Sun and I would like to initiate a discussion about
FLIP-475: Support Adaptive Skewed Join Optimization [1].
In a Join query, when certain keys occur frequently, it can lead to an
uneven distribution of data across partitions. This may affect the
execution performance o
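
To make the skew described above concrete, here is a minimal sketch in plain Python. It is not Flink code and not the FLIP's algorithm. It assumes simple modulo partitioning and a made-up input in which one hypothetical "hot" key (7) occurs far more often than the others. The function name partition_sizes is illustrative only.

```python
from collections import Counter
from statistics import median


def partition_sizes(keys, num_partitions):
    """Count how many records land in each partition under modulo partitioning."""
    counts = Counter(key % num_partitions for key in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical join input: key 7 is "hot" and appears 1000 times,
# while keys 0..99 appear once each.
keys = [7] * 1000 + list(range(100))

sizes = partition_sizes(keys, num_partitions=4)
skew_ratio = max(sizes) / median(sizes)

print(sizes)       # records per partition
print(skew_ratio)  # largest partition relative to the median partition
```

With this input, one partition receives almost all of the records, so the subtask that processes it becomes the long tail of the join stage.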