Re: Role of Hadoop code in Cassandra 5.0

Miklosovic, Stefan Thu, 09 Mar 2023 09:37:36 -0800

Would deprecation mean that the code has to stay for the whole of 5.0, so that 
we can remove it for real in 6.0?


________________________________________
From: Ekaterina Dimitrova <[email protected]>
Sent: Thursday, March 9, 2023 18:32
To: [email protected]
Subject: Re: Role of Hadoop code in Cassandra 5.0

Deprecation sounds good to me, but I am not completely sure in which version we 
can do it. If it is possible to add a deprecation warning in the 4.x series, or 
at least in 4.1.x, I vote for that.

On Thu, 9 Mar 2023 at 12:14, Jacek Lewandowski 
<[email protected]> wrote:
Is it possible to deprecate it in the 4.1.x patch release? :)


- - -- --- ----- -------- -------------
Jacek Lewandowski


On Thu, 9 Mar 2023 at 18:11, Brandon Williams 
<[email protected]> wrote:
This is my feeling too, but I think we should accomplish this by
deprecating it first.  I don't expect anything will change after the
deprecation period.

Kind Regards,
Brandon

On Thu, Mar 9, 2023 at 11:09 AM Jacek Lewandowski
<[email protected]> wrote:
>
> I vote for removing it entirely.
>
> thanks
> - - -- --- ----- -------- -------------
> Jacek Lewandowski
>
>
> On Thu, 9 Mar 2023 at 18:07, Miklosovic, Stefan 
> <[email protected]> wrote:
>>
>> Derek,
>>
>> I have a couple more points ... I do not think that extracting it to a 
>> separate repository is a "win". That code is on Hadoop 1.0.3. We would be 
>> spending a lot of work on extracting it just to extract 10-year-old code 
>> with occasional updates (in my humble opinion, just to make it compilable 
>> again when the code around it changes). What good is there in that? We would 
>> have one more place to take care of ... Now we at least have it all in one 
>> place.
>>
>> I believe we have four options:
>>
>> 1) leave it there, so it stays as it is for years to come, with 
>> questionable and diminishing usage
>> 2) update it to Hadoop 3.3 (I wonder who is going to do that)
>> 3) do 2) and extract it to a separate repository, but if we do 2) we can 
>> just leave it there
>> 4) remove it
>>
>> ________________________________________
>> From: Derek Chen-Becker <[email protected]>
>> Sent: Thursday, March 9, 2023 15:55
>> To: [email protected]
>> Subject: Re: Role of Hadoop code in Cassandra 5.0
>>
>> I think the question isn't "Who ... is still using that?" but more "are we 
>> actually going to support it?" If we're on a version that old it would 
>> appear that we've basically abandoned it, although there do appear to have 
>> been refactoring commits (for other things) in the last couple of years. I 
>> would be in favor of removal from 5.0, but at the very least, could it be 
>> moved into a separate repo/package so that it's not pulling a relatively 
>> large dependency subtree from Hadoop into our main codebase?
>>
>> Cheers,
>>
>> Derek
>>
>> On Thu, Mar 9, 2023 at 6:44 AM Miklosovic, Stefan 
>> <[email protected]> wrote:
>> Hi list,
>>
>> I stumbled upon the Hadoop package again. I think there was some discussion 
>> about the relevance of the Hadoop code some time ago, but I would like to 
>> ask this again.
>>
>> Do you think Hadoop code (1) is still relevant in 5.0? Who in the industry 
>> is still using that?
>>
>> We might drop a lot of code and some Hadoop dependencies too (3) (even 
>> though their scope is "provided"). The version of Hadoop we build upon is 
>> 1.0.3, which was released 10 years ago. This code has neither tests nor 
>> documentation on the website.
>>
>> There seem to be issues like this (2), and it seems the solution is, 
>> basically, to use the Spark Cassandra Connector instead, which I would say 
>> is quite reasonable.
>>
>> Regards
>>
>> (1) 
>> https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/hadoop
>> (2) https://lists.apache.org/thread/jdy5hdc2l7l29h04dqol5ylroqos1y2p
>> (3) 
>> https://github.com/apache/cassandra/blob/trunk/.build/parent-pom-template.xml#L507-L589
>>
>>
>> --
>> +---------------------------------------------------------------+
>> | Derek Chen-Becker                                             |
>> | GPG Key available at https://keybase.io/dchenbecker and       |
>> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
>> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
>> +---------------------------------------------------------------+
>>
