Thanks Richard for starting the discussion, and Sigee and Davide for sharing your thoughts.

The health of Apache Storm has been a concern for a while now, and it is very much a project in maintenance mode. Richard has done an amazing job of keeping things going, but at times it feels like it is StormCrawler and its community that are keeping it alive. In typical Apache fashion, the project boasts a large number of committers, but effectively only a handful of individuals are actively involved. Emeritus would probably be a better status for most people there.

Anyway, I think it wouldn't necessarily take a lot of effort to keep Storm going. We got rid of the Clojure parts, so it is all in Java, which makes it easier for people to contribute. The work on Storm is mostly about upgrading dependencies and the odd improvement here and there, none of which requires a deep understanding of the internal intricacies of the code. And these days, AI is there to help us understand other people's code better anyway. As Richard said, all we need is for a few of us in the StormCrawler community to get involved in Storm. Simply validating an RC would already be a fantastic way to help.

The alternative is essentially to start a new project on a different platform by borrowing code from SC, similar to how I adapted code from Apache Nutch for SC. That is quite a lot of work with no guarantee of success, whereas what we have now is pretty solid and used by various organisations worldwide. Some of them have downloaded tens of billions of pages using SC, so overall SC might have fetched trillions of URLs in total, who knows?

Web crawling is a niche activity, but people are using StormCrawler. The question is: why don't we get more contributions and potential committers? If our community grows, so does the Storm one. I used to invest time in finding out who was using SC and convincing them to make that visible and then to contribute to the project. Perhaps that needs to be done again?
Julien

On Tue, 19 May 2026 at 09:25, Davide Polato <[email protected]> wrote:

> Hi Richard,
>
> The contributor route makes more sense to me than ripping out our cluster
> layer for another engine.
>
> For what it's worth, anyone coming from StormCrawler isn't starting from
> zero on Storm. We work on top of its model every day. The internals are the
> part we don't know, but that's still a smaller jump than picking up a new
> framework from scratch.
>
> It's a direction I'd be interested in, even if not something I can jump on
> immediately.
>
> Best,
> Davide
>
> On Mon, 18 May 2026 at 16:55, Dávid Szigecsán <[email protected]> wrote:
>
> > Hi,
> >
> > To be honest, it was kind of a coincidence that I showed up there. I
> > started to look a bit deeper into Storm in the last few days, but I still
> > don't know much about it. I started to check out the project (cloned the
> > repository and tried to build it) and hit some errors, because I use
> > Windows and Storm is not so Windows-friendly. :D
> > Anyway, I did not think I could make a huge impact on it (especially as I
> > am unfortunately not the most active member of the community :( ).
> > But I don't want to let Storm die. If I can help, I want to. In the last
> > few days I read a lot about Storm's history, and it deserves to live.
> > I am going on a long vacation from 24 May to 8 June, but after that I am
> > happy to discuss how I can help.
> >
> > Regards,
> > Sigee
> >
> > On Mon, 18 May 2026 at 16:35, Richard Zowalla <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I wanted to raise something that's been on my mind for a while
> > > regarding the sustainability of Apache Storm itself. From what I've
> > > observed, getting the 3 votes required for releases and decisions has
> > > become quite cumbersome - sometimes really hard - and that makes me
> > > worry about how viable Storm is as a foundation for us going forward.
> > >
> > > On a more positive note, I noticed that Dávid recently showed up in one
> > > of the Storm issues. I think it would be a good idea to try to get a
> > > few more people from the StormCrawler side involved in Storm directly.
> > > One thing that helps here: Storm doesn't differentiate between
> > > Committer and PMC - they vote new people straight into the PMC. So it
> > > could be a relatively clean way to inject some fresh contributors and
> > > voting power into the project.
> > >
> > > If we don't manage to do something along those lines, I'm afraid we'll
> > > have to seriously consider re-inventing or migrating our underlying
> > > cluster technology to another stream processing framework sooner rather
> > > than later, and let Storm die (in the attic).
> > > I'd rather avoid that if we can since the technology has a proven
> > > record in web crawling projects.
> > >
> > > Any thoughts?
> > >
> > > Regards,
> > > Richard
