I think the question isn't "Who ... is still using that?" but rather "Are we actually going to support it?" If we're building against a version that old, it would appear that we've basically abandoned it, although there do appear to have been refactoring commits (for other things) in the last couple of years. I would be in favor of removing it from 5.0, but at the very least, could it be moved into a separate repo/package so that it doesn't pull a relatively large Hadoop dependency subtree into our main codebase?

Cheers,

Derek

On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:

> Hi list,
>
> I stumbled upon Hadoop package again. I think there was some discussion
> about the relevancy of Hadoop code some time ago but I would like to ask
> this again.
>
> Do you think Hadoop code (1) is still relevant in 5.0? Who in the industry
> is still using that?
>
> We might drop a lot of code and some Hadoop dependencies too (3) (even
> their scope is "provided"). The version of Hadoop we build upon is 1.0.3
> which was released 10 years ago. This code does not have any tests nor
> documentation on the website.
>
> There seems to be issues like this (2) and it seems like the solution is
> to, basically, use Spark Cassandra connector instead which I would say is
> quite reasonable.
>
> Regards
>
> (1)
> https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
> (2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
> (3)
> https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589

--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC   |
+---------------------------------------------------------------+