I think that storm-kafka would make sense as a contrib module since it's widely used. I'm not sure what to do with the other storm-contrib modules. I figure the less code that's part of the initial repo the better, because there will be fewer contribution/legal issues to sort out. How about this: we plan to include storm-kafka under a contrib folder of the Apache Storm project (just because a lot of people depend on it), and we can pull other storm-contrib modules in if community members show initiative in working on and maintaining them?
If that all sounds good I'll update the proposal accordingly.

On Sep 4, 2013, at 6:41 PM, Joe Stein <crypt...@gmail.com> wrote:

> What does this mean for storm contribs (
> https://github.com/nathanmarz/storm-contrib)? (spouts & bolts) e.g The
> Apache Kafka spout already it is hard to know which to use and which is
> best for 0.7.X and 0.8.X-betaX... Is the Apache Storm project going to
> help corral that or is it only for Storm core as the proposal implies with
> only the storm code base https://github.com/nathanmarz/storm being part of
> the project?
>
> A lot of traffic on the existing user list is about spouts (e.g. the Kafka
> Spout) and I was not sure if that would still be talked about or funneled
> somewhere else or what the thoughts/plans where for the parts built within
> Storm that are existing now?
>
> /*******************************************
> Joe Stein
> Founder, Principal Consultant
> Big Data Open Source Security LLC
> http://www.stealth.ly
> Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
>
>
> On Wed, Sep 4, 2013 at 4:34 PM, Nathan Marz <nat...@nathanmarz.com> wrote:
>
>> We definitely need a storm-user list as the existing google groups mailing
>> list for Storm is quite active. So we'll need to transition that over. I
>> agree on adding a storm-commits list and added it to the proposal.
>>
>>
>> On Wed, Sep 4, 2013 at 11:50 AM, Henry Saputra <henry.sapu...@gmail.com> wrote:
>>
>>> Excited about Storm coming to Apache. Small comment about the mailing list,
>>> you may want to propose having:
>>> * storm-dev
>>> * storm-commits
>>> * storm-private (with moderated subscriptions)
>>>
>>> instead for starting into incubator.
>>>
>>> However, Storm has been a well known open source project, maybe it does
>>> valid to have storm-user from the beginning. But I think you may need
>>> storm-commits list to separate commits log from dev discussions.
>>> Mentors can chime in about this.
>>>
>>> Thanks,
>>>
>>> Henry
>>>
>>>
>>>
>>> On Wed, Sep 4, 2013 at 1:07 AM, Nathan Marz <nat...@nathanmarz.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I'd like to propose Storm to be an Apache Incubator project. After much
>>>> thought I believe this is the right next step for the project, and I look
>>>> forward to hearing everyone's thoughts and feedback!
>>>>
>>>> Here's a link to the proposal:
>>>> https://wiki.apache.org/incubator/StormProposal
>>>>
>>>> The proposal is also pasted below.
>>>>
>>>> -Nathan
>>>>
>>>>
>>>> = Storm Proposal =
>>>>
>>>> == Abstract ==
>>>>
>>>> Storm is a distributed, fault-tolerant, and high-performance realtime
>>>> computation system that provides strong guarantees on the processing of
>>>> data.
>>>>
>>>> == Proposal ==
>>>>
>>>> Storm is a distributed real-time computation system. Similar to how Hadoop
>>>> provides a set of general primitives for doing batch processing, Storm
>>>> provides a set of general primitives for doing real-time computation. Its
>>>> use cases span stream processing, distributed RPC, continuous computation,
>>>> and more. Storm has become a preferred technology for near-realtime
>>>> big-data processing by many organizations worldwide (see a partial list at
>>>> https://github.com/nathanmarz/storm/wiki/Powered-By). As an open source
>>>> project, Storm’s developer community has grown rapidly to 46 members.
>>>>
>>>> == Background ==
>>>>
>>>> The past decade has seen a revolution in data processing. MapReduce,
>>>> Hadoop, and related technologies have made it possible to store and process
>>>> data at scales previously unthinkable. Unfortunately, these data processing
>>>> technologies are not realtime systems, nor are they meant to be. The lack
>>>> of a "Hadoop of realtime" has become the biggest hole in the data
>>>> processing ecosystem. Storm fills that hole.
>>>>
>>>> Storm was initially developed and deployed at BackType in 2011. After 7
>>>> months of development BackType was acquired by Twitter in July 2011. Storm
>>>> was open sourced in September 2011.
>>>>
>>>> Storm has been under continuous development on its Github repository since
>>>> being open-sourced. It has undergone four major releases (0.5, 0.6, 0.7,
>>>> 0.8) and many minor ones.
>>>>
>>>> == Rationale ==
>>>>
>>>> Storm is a general platform for low-latency big-data processing. It is
>>>> complementary to the existing Apache projects, such as Hadoop. Many
>>>> applications are actually exploring using both Hadoop and Storm for
>>>> big-data processing. Bringing Storm into Apache is very beneficial to both
>>>> Apache community and Storm community.
>>>>
>>>> The rapid growth of Storm community is empowered by open source. We believe
>>>> the Apache foundation is a great fit as the long-term home for Storm, as it
>>>> provides an established process for community-driven development and
>>>> decision making by consensus. This is exactly the model we want for future
>>>> Storm development.
>>>>
>>>> == Initial Goals ==
>>>>
>>>> * Move the existing codebase to Apache
>>>> * Integrate with the Apache development process
>>>> * Ensure all dependencies are compliant with Apache License version 2.0
>>>> * Incremental development and releases per Apache guidelines
>>>>
>>>> == Current Status ==
>>>>
>>>> Storm has undergone four major releases (0.5, 0.6, 0.7, 0.8) and many minor
>>>> ones. Storm 0.9 is about to be released. Storm is being used in production
>>>> by over 50 organizations. Storm codebase is currently hosted at github.com,
>>>> which will seed the Apache git repository.
>>>>
>>>> === Meritocracy ===
>>>>
>>>> We plan to invest in supporting a meritocracy. We will discuss the
>>>> requirements in an open forum.
>>>> Several companies have already expressed
>>>> interest in this project, and we intend to invite additional developers to
>>>> participate. We will encourage and monitor community participation so that
>>>> privileges can be extended to those that contribute.
>>>>
>>>> === Community ===
>>>>
>>>> The need for a low-latency big-data processing platform in the open source
>>>> is tremendous. Storm is currently being used by at least 50 organizations
>>>> worldwide (see https://github.com/nathanmarz/storm/wiki/Powered-By), and is
>>>> the most starred Java project on Github. By bringing Storm into Apache, we
>>>> believe that the community will grow even bigger.
>>>>
>>>> === Core Developers ===
>>>>
>>>> Storm was started by Nathan Marz at BackType, and now has developers from
>>>> Yahoo!, Microsoft, Alibaba, Infochimps, and many other companies.
>>>>
>>>> === Alignment ===
>>>>
>>>> In the big-data processing ecosystem, Storm is a very popular low-latency
>>>> platform, while Hadoop is the primary platform for batch processing. We
>>>> believe that it will help the further growth of big-data community by
>>>> having Hadoop and Storm aligned within Apache foundation. The alignment is
>>>> also beneficial to other Apache communities (such as Zookeeper, Thrift,
>>>> Mesos). We could include additional sub-projects, Storm-on-YARN and
>>>> Storm-on-Mesos, in the near future.
>>>>
>>>> == Known Risks ==
>>>>
>>>> === Orphaned Products ===
>>>>
>>>> The risk of the Storm project being abandoned is minimal. There are at
>>>> least 50 organizations (Twitter, Yahoo!, Microsoft, Groupon, Baidu,
>>>> Alibaba, Alipay, Taobao, PARC, RocketFuel etc) are highly incentivized to
>>>> continue development. Many of these organizations have built critical
>>>> business applications upon Storm, and have devoted significant internal
>>>> infrastructure investment in Storm.
>>>>
>>>> === Inexperience with Open Source ===
>>>>
>>>> Storm has existed as a healthy open source project for several years.
>>>> During that time, we have curated an open-source community successfully,
>>>> attracting over 40 developers from a diverse group of companies including
>>>> Twitter, Yahoo!, and Alibaba.
>>>>
>>>> === Homogenous Developers ===
>>>>
>>>> The initial committers are employed by large companies (including Twitter,
>>>> Yahoo!, Alibaba, Microsoft) and well-funded startups. Storm has an active
>>>> community of developers, and we are committed to recruiting additional
>>>> committers based on their contributions to the project.
>>>>
>>>> === Reliance on Salaried Developers ===
>>>>
>>>> It is expected that Storm development will occur on both salaried time and
>>>> on volunteer time, after hours. The majority of initial committers are paid
>>>> by their employer to contribute to this project. However, they are all
>>>> passionate about the project, and we are confident that the project will
>>>> continue even if no salaried developers contribute to the project. We are
>>>> committed to recruiting additional committers including non-salaried
>>>> developers.
>>>>
>>>> === Relationships with Other Apache Products ===
>>>>
>>>> As mentioned in the Alignment section, Storm is closely integrated with
>>>> Hadoop, Zookeeper, Thrift, YARN and Mesos in a numerous ways. We look
>>>> forward to collaborating with those communities, as well as other Apache
>>>> communities (including Apache S4 which focuses on stateful low-latency
>>>> processing).
>>>>
>>>> === An Excessive Fascination with the Apache Brand ===
>>>>
>>>> Storm is already a healthy and well known open source project. This
>>>> proposal is not for the purpose of generating publicity. Rather, the
>>>> primary benefits to joining Apache are those outlined in the Rationale
>>>> section.
>>>>
>>>> == Documentation ==
>>>>
>>>> The reader will find these websites highly relevant:
>>>>
>>>> * Storm website: http://storm-project.net
>>>> * Storm documentation: https://github.com/nathanmarz/storm/wiki
>>>> * Codebase: https://github.com/nathanmarz/storm
>>>> * User group: https://groups.google.com/group/storm-user
>>>>
>>>> == Source and Intellectual Property Submission Plan ==
>>>>
>>>> The Storm codebase is currently hosted on Github:
>>>> https://github.com/nathanmarz/storm.
>>>>
>>>> This is the exact codebase that we would migrate to the Apache foundation.
>>>>
>>>> The Storm source code is currently licensed under Eclipse Public License
>>>> Version 1.0. Some source code was contributed under a contributor agreement
>>>> based on the Sun contributor agreement (v1.5). More recent code has been
>>>> contributed under an Apache style agreement (see
>>>> https://dl.dropboxusercontent.com/u/133901206/storm-apache-style-cla.txt).
>>>>
>>>> Upon entering Apache, Storm will migrate to an Apache License 2.0 with all
>>>> contributions licensed to the Apache Foundation. In certain cases where
>>>> individuals or organizations hold copyright, we will ensure they grant a
>>>> license to the Apache Foundation. Going forward, all commits will be
>>>> licensed directly to the Apache foundation through our signed Individual
>>>> Contributor License Agreements for all committers on the project.
>>>>
>>>> Yahoo! is also willing to move Storm-on-YARN code from github to be a
>>>> subproject of Apache Storm project. Storm-on-YARN is currently licensed
>>>> under Apache License 2.0 and receive contribution under Apache style CLA.
>>>> Upon entering Apache, Yahoo! will sign over copyright to Apache foundation.
>>>>
>>>> == External Dependencies ==
>>>>
>>>> To the best of our knowledge, all of Storm dependencies (except 0MQ/JMQ)
>>>> are distributed under Apache compatible licenses.
>>>> Upon acceptance to the
>>>> incubator, we would begin a thorough analysis of all transitive
>>>> dependencies to verify this fact and introduce license checking into the
>>>> build and release process (for instance integrating Apache Rat).
>>>>
>>>> Storm has used 0MQ and JMQ as the default mechanism for internal messaging
>>>> layer, and 0MQ/JMQ is licensed under GNU Lesser General Public License.
>>>> Recently, we have made Storm messaging layer pluggable, and plan to use
>>>> Netty (which is licensed under Apache License v2) as our default messaging
>>>> plugin (while keep 0MQ as an optional plugin).
>>>>
>>>> == Cryptography ==
>>>>
>>>> We do not expect Storm to be a controlled export item due to the use of
>>>> encryption.
>>>>
>>>> Storm enable encryptions via 2 plugins:
>>>>
>>>> * SASL authentication plugins … Currently, we have provide “no-op”
>>>> authentication and digest authentication. In near future, we will introduce
>>>> Kerberos authentication.
>>>> * Tuple payload serialization plugins … Storm provides plugins for
>>>> plain-object serialization and blowfish encryption.
>>>>
>>>> == Required Resources ==
>>>>
>>>> === Mailing lists ===
>>>>
>>>> * storm-user
>>>> * storm-dev
>>>> * storm-private (with moderated subscriptions)
>>>>
>>>> === Subversion Directory ===
>>>>
>>>> Git is the preferred source control system: git://git.apache.org/storm
>>>>
>>>> === Issue Tracking ===
>>>>
>>>> JIRA Storm (STORM)
>>>>
>>>> == Initial Committers ==
>>>>
>>>> * Nathan Marz <nathan at nathanmarz dot com>
>>>> * James Xu <xumingmingv at gmail dot com>
>>>> * Jason Jackson <jason at cvk dot ca>
>>>> * Andy Feng <afeng at yahoo-inc dot com>
>>>> * Flip Kromer <flip at infochimps dot com>
>>>> * David Lao <davidlao at microsoft dot com>
>>>> * P. Taylor Goetz <ptgoetz at gmail dot com>
>>>>
>>>> == Affiliations ==
>>>>
>>>> * Nathan Marz - Nathan’s Startup
>>>> * James Xu - Alibaba
>>>> * Jason Jackson - Twitter
>>>> * Andy Feng - Yahoo!
>>>> * Flip Kromer - Infochimps
>>>> * David Lao - Microsoft
>>>> * P. Taylor Goetz - Health Market Science
>>>>
>>>> == Sponsors ==
>>>>
>>>> === Champion ===
>>>>
>>>> * Doug Cutting <cutting at apache dot org>
>>>>
>>>> === Nominated Mentors ===
>>>>
>>>> * Ted Dunning <tdunning at maprtech.com>
>>>> * Arvind Prabhaker <arvind at apache dot org>
>>>> * Devaraj Das <ddas at hortonworks dot com>
>>>>
>>>> === Sponsoring Entity ===
>>>>
>>>> The Apache Incubator
>>
>>
>>
>> --
>> Twitter: @nathanmarz
>> http://nathanmarz.com
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org