I guess upgrading the minimal version should also mean cleaning up the codebase, i.e. removing code segments that only exist to support older versions. In my opinion, the overall goal should be to improve the Flink codebase. Considering what David said in the old thread about Hadoop users usually lagging behind with version upgrades [1], should we do this version bump in two phases, i.e. add deprecation notes first and do the actual cleanup later on?

I think Gabor has a point about the minimum version not really being mentioned anywhere in the docs (the only place in the docs I could find about the Hadoop version is [2]). In that sense, the support for older Hadoop versions was kind of implicit: we talk about compiling Flink with Hadoop 2.8.5 but also mention older Hadoop versions, which leaves room for interpretation.
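To make the kind of cleanup I have in mind a bit more concrete, here is a rough sketch of such a version-gated code path (the class and method names are made up for illustration; this is not Flink's actual code):

import org.apache.hadoop.util.VersionInfo;

/** Illustrative only: a runtime gate on the Hadoop version found on the classpath. */
public final class HadoopVersionGate {

    private HadoopVersionGate() {}

    /** Returns true if the Hadoop version on the classpath is at least major.minor. */
    public static boolean isAtLeast(int major, int minor) {
        // VersionInfo.getVersion() returns a string such as "2.8.5" or "3.3.4".
        final String[] parts = VersionInfo.getVersion().split("\\.");
        final int foundMajor = Integer.parseInt(parts[0]);
        final int foundMinor = Integer.parseInt(parts[1]);
        return foundMajor > major || (foundMajor == major && foundMinor >= minor);
    }

    public static void main(String[] args) {
        if (isAtLeast(2, 10)) {
            System.out.println("Hadoop >= 2.10 on the classpath - use the current API directly.");
        } else {
            // Legacy branch, e.g. a reflection-based workaround for an API that is
            // missing in older Hadoop releases. Raising the minimum supported
            // version makes this branch (and the gate itself) dead code.
            System.out.println("Older Hadoop detected - fall back to the legacy code path.");
        }
    }
}

With the minimum raised to 2.10.2, a check like isAtLeast(2, 10) is always true, so both the gate and the legacy branch behind it could simply be deleted.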
Additionally, having code that hasn't been touched for a while increases the risk of it going stale and becoming a maintenance burden.

Matthias

[1] https://lists.apache.org/thread/w7www13tossxrxo1mttgb68v81rf6fks
[2] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/yarn/#supported-hadoop-versions

On Fri, Oct 21, 2022 at 4:13 AM Xintong Song <tonysong...@gmail.com> wrote:

> I believe there are some reflection-based approaches in the `flink-yarn`
> module for supporting outdated APIs in early Hadoop versions.
>
> I haven't done a thorough check; these are what I found:
> - AMRMClientAsyncReflector
> - ApplicationSubmissionContextReflector
> - ContainerRequestReflector
> - RegisterApplicationMasterResponseReflector
> - ResourceInformationReflector
>
> Are we removing these as well? If yes, then Flink can no longer work with
> the old Hadoop versions. (That's how I understand "bumping the minimal
> supported Hadoop version".) I personally am not super eager to get rid of
> these, because the relevant parts of the code are no longer frequently
> changing, thus the maintenance overhead is low.
>
> Best,
>
> Xintong
>
>
> On Thu, Oct 20, 2022 at 8:00 PM Yang Wang <danrtsey...@gmail.com> wrote:
>
> > Given that we do not bundle any Hadoop classes in the Flink binary, do
> > you mean simply bumping the Hadoop version in the parent pom?
> > If so, why don't we use the latest stable Hadoop version, 3.3.4? It
> > seems that our cron build has verified that Hadoop 3 could work.
> >
> > Best,
> > Yang
> >
> > David Morávek <david.mora...@gmail.com> 于2022年10月19日周三 16:29写道:
> >
> > > +1; anything below 2.10.x seems to be EOL
> > >
> > > Best,
> > > D.
> > >
> > > On Mon, Oct 17, 2022 at 10:48 AM Márton Balassi <
> > > balassi.mar...@gmail.com> wrote:
> > >
> > > > Hi Martijn,
> > > >
> > > > +1 for 2.10.2. Do you expect to have bandwidth in the near term to
> > > > implement the bump?
> > > >
> > > > On Wed, Oct 5, 2022 at 5:00 PM Gabor Somogyi <
> > > > gabor.g.somo...@gmail.com> wrote:
> > > >
> > > > > Hi Martijn,
> > > > >
> > > > > Thanks for bringing this up! Lately I have been thinking about
> > > > > bumping the Hadoop version to at least 2.6.1 to clean up issues
> > > > > like this:
> > > > >
> > > > > https://github.com/apache/flink/blob/8d05393f5bcc0a917b2dab3fe81a58acaccabf13/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/util/HadoopUtils.java#L157-L159
> > > > >
> > > > > All in all, +1 from my perspective.
> > > > >
> > > > > Just a question here: are we stating the minimum Hadoop version
> > > > > for users somewhere in the docs, or do they need to find it out
> > > > > from the source code like this?
> > > > >
> > > > > https://github.com/apache/flink/blob/3a4c11371e6f2aacd641d86c1d5b4fd86435f802/tools/azure-pipelines/build-apache-repo.yml#L113
> > > > >
> > > > > BR,
> > > > > G
> > > > >
> > > > > On Wed, Oct 5, 2022 at 5:02 AM Martijn Visser <
> > > > > martijnvis...@apache.org> wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > A little over a year ago, a discussion thread was opened on
> > > > > > changing the minimal supported version of Hadoop and bringing
> > > > > > that to 2.8.5. [1] In this discussion thread, I would like to
> > > > > > propose bringing the minimal supported version of Hadoop to
> > > > > > 2.10.2.
> > > > > >
> > > > > > Hadoop 2.8.5 is vulnerable to multiple CVEs which are classified
> > > > > > as Critical [2] [3]. While Flink is not directly impacted by
> > > > > > those, we do see vulnerability scanners flag Flink as being
> > > > > > vulnerable. We could easily mitigate that by bumping the minimal
> > > > > > supported version of Hadoop to 2.10.2.
> > > > > >
> > > > > > I'm looking forward to your opinions on this topic.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Martijn
> > > > > > https://twitter.com/MartijnVisser82
> > > > > > https://github.com/MartijnVisser
> > > > > >
> > > > > > [1] https://lists.apache.org/thread/81fhnwfxomjhyy59f9bbofk9rxpdxjo5
> > > > > > [2] https://nvd.nist.gov/vuln/detail/CVE-2022-25168
> > > > > > [3] https://nvd.nist.gov/vuln/detail/CVE-2022-26612