+1, does this mean we would mark Hadoop 2 deprecated in Druid 27? Also, do we have a broader plan to remove Hadoop in general from core dependencies and make it an optional extension?
On Tue, Jun 27, 2023 at 11:53 PM Karan Kumar <karankumar1...@gmail.com> wrote:

> In favour of dropping hadoop 2 support. Another point is the lack of
> security and vulnerability fixes in hadoop2.
>
> On Wed, Jun 28, 2023 at 12:17 PM Clint Wylie <cwy...@apache.org> wrote:
>
> > obvious +1 from me
> >
> > On Tue, Jun 27, 2023 at 11:42 PM Gian Merlino <g...@apache.org> wrote:
> > >
> > > I'd like to propose dropping support for Hadoop 2 in Druid 28. Not the
> > > very next release (which I assume will be Druid 27) but the one after
> > > that, likely late 2023 timeframe.
> > >
> > > In 2021, we had a discussion about moving away from Hadoop 2:
> > > https://lists.apache.org/thread/zmc389trnkh6x444so8mdb2h0x0noqq4. For
> > > various reasons, it didn't seem like the right time. However, I believe
> > > now is the right time:
> > >
> > > 1) We didn't support Hadoop 3 in 2021, but we support it now. There is
> > > now a Hadoop 3 build profile, as well as convenience binaries on
> > > https://druid.apache.org/downloads.html.
> > >
> > > 2) We have SQL-based ingest with MSQ tasks, which provides a built-in /
> > > scalable / robust alternative to using Hadoop at all.
> > >
> > > 3) It has been an additional two years. Hadoop 2 is that much older,
> > > that much more time has passed since it was superseded by Hadoop 3, and
> > > people have had that much more time to migrate.
> > >
> > > 4) The original main reason for wanting to move away from Hadoop 2 is
> > > still relevant. It keeps us on various old dependencies, including an
> > > ancient version of Guava, which in turn has been keeping us on an
> > > ancient version of Calcite. The Calcite community has graciously
> > > decided to support this old version of Guava for at least one release,
> > > but plans to drop support by Calcite 1.36, leaving us back in the same
> > > position. Managing this situation is time-consuming for both Druid and
> > > Calcite maintainers.
> > >
> > > 5) Other solutions beyond dropping Hadoop 2 support were proposed in
> > > 2021, such as reworking Hadoop support to be purely extension based,
> > > and reworking extensions to be more isolated from each other. However,
> > > these are both substantially more complex than dropping support, and in
> > > the two years since the original thread, these more complex solutions
> > > have not been implemented. So, I think we need to move on with the
> > > simpler solution of dropping support.
> > >
> > > Gian
>
> --
> Thanks
> Karan