I would be +1 on dropping Hadoop 2.7.3 when moving to Iceberg 2.0.0. If people want to use that Hadoop version, the Iceberg 1.x.y releases will still be available.
Another related topic is the current Hive support, which essentially makes it difficult for the project to upgrade to a newer JDK, since Hive requires JDK 8. With Iceberg 2.0.0, an upgrade to JDK 17 (LTS) would be ideal: even now, many of the dependencies and plugins we use are no longer maintained for JDK 8, and new (potentially critical) fixes are only available in newer versions built for JDK 11/17.

Overall, I think it would make sense to discuss the scope of what should be shipped with Iceberg 2.0.0 and where we'd drop support.

Eduard

On Fri, Feb 16, 2024 at 10:10 AM Fokko Driesprong <fo...@apache.org> wrote:

> Hi everyone,
>
> I want to discuss adding the Hadoop upgrade to the list after moving to
> Iceberg 2.0. We still compile against Hadoop 2.7.3 to ensure we support as
> many users as possible. Hadoop 2.7.3 was released in August 2016
> <https://hadoop.apache.org/release/2.7.3.html> and has not been maintained
> <https://endoflife.date/apache-hadoop> for a long time.
>
> My main reason for doing the upgrade is that on the Parquet MR project,
> I've been pushing back the Hadoop upgrade to ensure compatibility with
> Iceberg. However, at some point, we have to pull the trigger. This will
> simplify things on the Parquet side and avoid having to check whether a
> given Java API exists, and such.
>
> Since Hadoop 3.3+ officially supports Java 11
> <https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions>,
> I would suggest dropping everything below that. I wanted to check on the
> mailing list whether there are any thoughts or concerns.
>
> Kind regards,
> Fokko
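For context on the "check if the Java API exists" point: libraries that must run against both old and new Hadoop typically probe for an API via reflection at startup. The sketch below is illustrative only (not Parquet's actual code); the probed method, `FSDataInputStream.readFully(long, ByteBuffer)`, is an example of an API that exists in Hadoop 3.3+ but not in 2.7.3.

```java
import java.lang.reflect.Method;

public class HadoopApiCheck {

    // Returns true if the named class exists on the classpath and exposes
    // the given method; returns false on older Hadoop versions (or when
    // Hadoop is absent entirely). Illustrative sketch, not Parquet's code.
    static boolean hasMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            Class<?> cls = Class.forName(className);
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Probe for a positioned ByteBuffer read, which Hadoop 2.7.3 lacks.
        // On a classpath without Hadoop this is also false.
        boolean hasByteBufferRead = hasMethod(
                "org.apache.hadoop.fs.FSDataInputStream",
                "readFully", long.class, java.nio.ByteBuffer.class);
        System.out.println("ByteBuffer readFully available: " + hasByteBufferRead);
    }
}
```

Dropping Hadoop 2.7.x would let this kind of branching be deleted and the newer API called directly.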