Hi all,

For the upcoming 1.11 release, I started looking into adding Hadoop 3
support[1] to Flink. I have already explored adding a shaded Hadoop 3 to
“flink-shaded”, along with some mechanisms for switching between Hadoop 2
and 3 dependencies in the Flink build.

However, Chesnay made me aware that we could also go a different route: we
could let Flink depend on vanilla Hadoop dependencies and stop providing
shaded fat jars for Hadoop through “flink-shaded”.

Why?
- Maintaining properly shaded Hadoop fat jars is a lot of work (we have
insufficient test coverage for all kinds of Hadoop features)
- For Hadoop 2, there are already some known and unresolved issues with our
shaded jars that we didn’t manage to fix

Under this approach, users who run Flink with Hadoop would have to rely on
vanilla or vendor-provided Hadoop dependencies.

What do you think?

Best,
Robert

[1] https://issues.apache.org/jira/browse/FLINK-11086
