Hi all,

For the upcoming 1.11 release, I started looking into adding Hadoop 3 support [1] to Flink. I have already explored adding a shaded Hadoop 3 to “flink-shaded”, along with some mechanisms for switching between Hadoop 2 and 3 dependencies in the Flink build.

However, Chesnay made me aware that we could also go a different route: we let Flink depend on vanilla Hadoop dependencies and stop providing shaded fat jars for Hadoop through “flink-shaded”.

Why?
- Maintaining properly shaded Hadoop fat jars is a lot of work (we have insufficient test coverage for all kinds of Hadoop features)
- For Hadoop 2, there are already some known and unresolved issues with our shaded jars that we didn’t manage to fix

Users will have to use Flink with Hadoop by relying on vanilla or vendor-provided Hadoop dependencies.

What do you think?

Best,
Robert

[1] https://issues.apache.org/jira/browse/FLINK-11086