Re: [DISCUSS] Hadoop ingestion support

2025-07-02 Thread Abhishek Agarwal
An alternate approach is to remove Hadoop in 35 entirely but allow backports to the 34 release branch. Any bugs with reasonable severity can be backported to the 34 branch. When we make a release, we do a major release and a patch release for 34. I suggest we do this till Druid 36, and then discontinue

Re: [DISCUSS] Hadoop ingestion support

2025-07-01 Thread Karan Kumar
I have not heard any plans about deprecating `druid-hdfs-storage`, which serves as a deep storage implementation. This thread is strictly about Hadoop *ingestion* support. On Wed, Jul 2, 2025 at 8:28 AM Eyal Yurman wrote: > Druid also includes druid-hdfs-storage core extension which bundles > h

Re: [DISCUSS] Hadoop ingestion support

2025-07-01 Thread Eyal Yurman
Druid also includes druid-hdfs-storage core extension which bundles hadoop-client-api. I assume there isn't a plan to deprecate this extension? On Tue, Jul 1, 2025 at 8:19 AM Gian Merlino wrote: > We are in a tough situation where our hand is being forced on dropping > Java 11 support by the Je

Re: [DISCUSS] Hadoop ingestion support

2025-07-01 Thread Gian Merlino
We are in a tough situation where our hand is being forced on dropping Java 11 support by the Jetty 9 EOL situation. It isn't a good idea to continue using Jetty 9 given it's no longer receiving security updates, and Jetty 12 (the only currently-supported version) requires Java 17. However, the

Re: [DISCUSS] Hadoop ingestion support

2025-06-23 Thread Lucas Capistrant
Thanks for your input from the Roku user point of view, Krishna. We are definitely in a tough spot here because of Hadoop support preventing us from dropping Java 11 support. And then the domino effect being we can’t upgrade off of EOL dependencies such as Jetty 9. In the Java 11 support discussion, h

Re: [DISCUSS] Hadoop ingestion support

2025-06-18 Thread Krishna Thirumalasetty
Hi everyone, Adding to the voices from Netflix and Target — at Roku Inc., we also rely heavily on Hadoop-based batch ingestion for a significant portion of our Druid datasources. This approach allows us to leverage our existing Hadoop infrastructure efficiently and cost-effectively for large-scale

Re: [DISCUSS] Hadoop ingestion support

2025-06-17 Thread Eyal Yurman
Sharing as another data point - We still use YARN to run Hadoop-based batch ingestion. Very useful on-premise for resource sharing, where autoscaling isn't always an option. But we plan to move to Kubernetes for ingestion sometime next year. On Tue, Jun 17, 2025 at 12:20 PM Gian Merlino wrote:

Re: [DISCUSS] Hadoop ingestion support

2025-06-17 Thread Gian Merlino
I'm on board with this. I also think we should deprecate it ASAP, starting in the next major release. It'd be nice to also build a migration guide that helps people move from Hadoop ingestion to SQL/MSQ ingestion, and from YARN to K8S pod runners. Gian On 2025/06/09 20:10:03 Clint Wylie wrote:

Re: [DISCUSS] Hadoop ingestion support

2025-06-09 Thread Clint Wylie
Following up on this, I want to propose the first release of 2026 for removal, which I think would be Druid 36, to give some lead time for those affected to prepare. On Wed, Apr 9, 2025 at 8:42 AM Frank Chen wrote: > > We don't use Hadoop ingestion, it's OK for us to drop the support of Hadoop. >

Re: [DISCUSS] Hadoop ingestion support

2025-04-10 Thread Lucas Capistrant
Yes, I’m in favor of removing it from the core release and also in favor of officially announcing deprecation with a timeline for removal, if we have not yet. It stinks to lose the Hadoop ingest support, but if that project is going to hold back Druid, it seems we don’t have much choice. Thanks, L

Re: [DISCUSS] Hadoop ingestion support

2025-04-09 Thread Frank Chen
We don't use Hadoop ingestion, it's OK for us to drop the support of Hadoop. We can make an announcement to deprecate it first (from 33?), remove it from the official distribution (but keep the ability to build it as suggested above, from 34?), and remove it completely at a proper time. On Wed, Apr

Re: [DISCUSS] Hadoop ingestion support

2025-04-08 Thread Maytas Monsereenusorn
I'm in favor of removing too but we should not rush the removal and make sure we give enough time for users to migrate to other types of ingestion. Similar to what Lucas said, if Hadoop is holding back Druid then we should remove it. Druid also supports many other types of ingestion compared to bac

Re: [DISCUSS] Hadoop ingestion support

2025-04-08 Thread Karan Kumar
I like the plan of having a Hadoop profile, not shipping it as part of the Apache release, and then eventually removing it in a release or two. Does that work for you folks, Maytas, Lucas? On Mon, Apr 7, 2025 at 3:59 PM Zoltan Haindrich wrote: > Hey, > > I was also bumping into this while I was

Re: [DISCUSS] Hadoop ingestion support

2025-04-07 Thread Zoltan Haindrich
Hey, I was also bumping into this while I was running dependency-checks for Druid-33 * I've encountered a CVE [1] in hadoop-runtime-3.3.6 which is a shaded jar * we have a PR to upgrade to 3.4.0 ; so I checked also 3.4.1 - but they are also affected as they ship with (jetty is 9.4.53.v20231009)

Re: [DISCUSS] Hadoop ingestion support

2025-01-08 Thread Abhishek Agarwal
@Adarsh - FYI since you are the release manager for 32. On Wed, Jan 8, 2025 at 11:53 AM Abhishek Agarwal wrote: > I don't want to kick that can too far down the road either :) We don't > want to give a false hope that it's going to remain around forever. But yes > let's deprecate both Hadoop and

Re: [DISCUSS] Hadoop ingestion support

2025-01-07 Thread Abhishek Agarwal
I don't want to kick that can too far down the road either :) We don't want to give a false hope that it's going to remain around forever. But yes let's deprecate both Hadoop and Java 11 support in the upcoming 32 release. It's unfortunate that Hadoop still doesn't support Java 17. We shouldn't let

Re: [DISCUSS] Hadoop ingestion support

2025-01-07 Thread Karan Kumar
Okay, from what I can gather, a few folks still need Hadoop ingestion. So let's kick the can down the road regarding removal of that support, but let's agree on the deprecation plan. Since Druid 32 is around the corner, let's at least deprecate Hadoop ingestion so that any new users are not onboarded to

Re: [DISCUSS] Hadoop ingestion support

2024-12-12 Thread Maytas Monsereenusorn
We at Netflix are in a similar situation to Target Corporation (Lucas C's email above). We currently rely on Hadoop ingestion for all our batch ingestion jobs. The main reason for this is that we already have a large Hadoop cluster supporting our Spark workloads that we can leverage for Druid ingesti

Re: [DISCUSS] Hadoop ingestion support

2024-12-12 Thread Lucas Capistrant
Apologies for the empty email… fat fingers. Just wanted to say that we at Target Corporation (USA), still rely heavily on Hadoop ingest. We’d selfishly want support forever, but if forced to pivot to a new ingestion style for our larger batch ingest jobs that currently leverage the cheap compute o

Re: [DISCUSS] Hadoop ingestion support

2024-12-12 Thread Lucas Capistrant
On Wed, Dec 11, 2024 at 9:10 PM Karan Kumar wrote: > +1 for removal of Hadoop based ingestion. It's a maintenance overhead and > stops us from moving to java 17. > I am not aware of any gaps in sql based ingestion which limits users to > move off from hadoop. If there are any, please feel free to

Re: [DISCUSS] Hadoop ingestion support

2024-12-11 Thread Karan Kumar
+1 for removal of Hadoop based ingestion. It's a maintenance overhead and stops us from moving to java 17. I am not aware of any gaps in sql based ingestion which limits users to move off from hadoop. If there are any, please feel free to reach out via slack/github. On Thu, Dec 12, 2024 at 3:22 AM

[DISCUSS] Hadoop ingestion support

2024-12-11 Thread Clint Wylie
Hey everyone, It is about that time again to take a pulse on how commonly Hadoop based ingestion is used with Druid in order to determine if we should keep supporting it or not going forward. In my view, Hadoop based ingestion has unofficially been on life support for quite some time as we do not