What about asking somebody from the Hadoop project to update it directly in Cassandra? I think these people have loads of experience with integrations like this. If we bumped the version to something like 3.3.x, refreshed the code, and put some tests on top, I think we could just leave it there for a couple more years.
________________________________________
From: Derek Chen-Becker <de...@chen-becker.org>
Sent: Thursday, March 9, 2023 15:55
To: dev@cassandra.apache.org
Subject: Re: Role of Hadoop code in Cassandra 5.0

I think the question isn't "Who ... is still using that?" but more "are we actually going to support it?" If we're on a version that old, it would appear that we've basically abandoned it, although there do appear to have been refactoring commits (for other things) in the last couple of years. I would be in favor of removal from 5.0, but at the very least, could it be moved into a separate repo/package so that it's not pulling a relatively large dependency subtree from Hadoop into our main codebase?

Cheers,

Derek

On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan <stefan.mikloso...@netapp.com> wrote:

Hi list,

I stumbled upon the Hadoop package again. I think there was some discussion about the relevancy of the Hadoop code some time ago, but I would like to ask this again.

Do you think the Hadoop code (1) is still relevant in 5.0? Who in the industry is still using that?

We might drop a lot of code and some Hadoop dependencies too (3) (even though their scope is "provided"). The version of Hadoop we build upon is 1.0.3, which was released 10 years ago. This code has neither tests nor documentation on the website.

There seem to be issues like this (2), and it seems like the solution is, basically, to use the Spark Cassandra connector instead, which I would say is quite reasonable.

Regards

(1) https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
(2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
(3) https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589

--
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC   |
+---------------------------------------------------------------+