All,

I'm requesting that the community hold off on closing this vote for a few extra days to address concerns raised by the Storm community and others.
Please don't interpret this as a -1, but do interpret it as a need for some extra due diligence due to code usage from Apache Storm within Heron.

John

On Sun, Jun 18, 2017 at 6:35 AM John D. Ament <johndam...@apache.org> wrote:

> +1, however a few things to note about the proposal (and follow up will be
> required when bringing Heron in):
>
> - There is no ASF 2.0 license (missed when putting together the proposal)
> - The IP section doesn't mention anything about a SGA being sent, is your
> intention to not send an SGA?
> - The NOTICE for the repo indicates there is some source code from Yahoo!.
>
> - The contents of https://github.com/twitter/heron/tree/master/third_party
> seem to be mostly binary files, and you'll need to clean that up for your
> first release.
> - Your 3rd party section mentions everything is ASF 2.0, however this
> includes glog and similar tools that include an odd buildchain license that
> is actually GPL, we'll need to get clearance if this is actually compliant
> or not. Some of the contents in third_party are missing license headers.
>
> John
>
>
> On Fri, Jun 16, 2017 at 4:41 PM Bill Graham <billgra...@gmail.com> wrote:
>
>> Hi,
>>
>> Based on the discussion on the incubator mailing list[1] I would like to
>> call a vote to add Heron to the Apache Incubator.
>>
>> The full proposal is available below, and is also available on the Apache
>> Incubator wiki at:
>> https://wiki.apache.org/incubator/HeronProposal
>>
>> Please vote:
>> [ ] +1, bring Heron into Incubator
>> [ ] -1, do not bring Heron into Incubator, because...
>>
>> The vote will be open for 7 days until Friday June 23 at 14:00 PT.
>>
>> Thank you
>>
>> 1 -
>> https://lists.apache.org/thread.html/fb91f527ef479bb5df45bf2c9d93b7786c3fa6cdbfeba3128599df79@%3Cgeneral.incubator.apache.org%3E
>>
>>
>> = Heron Proposal =
>>
>> = Abstract =
>> Heron is a real-time, distributed, fault-tolerant stream processing engine
>> initially developed by Twitter.
>>
>> = Proposal =
>>
>> Heron is a real-time stream processing engine built for high performance,
>> ease of manageability, performance predictability and developer
>> productivity[1]. We wish to develop a community around Heron to increase
>> contributions and see Heron thrive in an open forum.
>>
>> = Background =
>>
>> Heron provides the ability for developers to compose directed acyclic
>> graphs (DAGs) of real-time query execution logic (i.e. a topology) and
>> submit the topology to execute on a pluggable job scheduling system (e.g.,
>> Apache Aurora, YARN, Marathon). Users can employ either the native
>> Heron API or the Apache Storm API to develop the topology. Heron supports
>> the Storm API for ease of migration, but beyond that Heron’s architecture
>> differs considerably from Storm’s.
>>
>> Users submit a topology to the scheduler using the Heron client, which uses
>> the Heron binary libraries to deploy all daemons required to run and manage
>> the topology. The topology therefore has no reliance on centrally managed
>> Heron services, only on a generic job scheduling system, which lends itself
>> well to running on top of Apache Aurora/Mesos or Apache Hadoop/YARN (among
>> others).
>>
>> The scheduler runs each topology as a job consisting of multiple
>> containers. One of the containers runs the topology master, responsible for
>> managing the topology. The remaining containers each run a stream manager
>> responsible for data routing, a metrics manager that collects and reports
>> various metrics and a number of processes called Heron instances which run
>> the user-defined logic on the stream of tuples. Parallelism is achieved via
>> process-based isolation of Heron instances, which provides predictable
>> performance while simplifying debugging. The containers are allocated and
>> managed by the scheduler framework based on resource availability of nodes
>> in the cluster.
>> The metadata for the topology, such as the physical plan
>> and execution details, is stored in the pluggable Heron State Manager
>> (e.g. Apache ZooKeeper).
>>
>> = Rationale =
>>
>> Heron is a general-purpose, modular and extensible platform that can be
>> leveraged to support common, real-time analytics use cases. There is an
>> increasing demand for open-source, scalable real-time analytics systems. We
>> believe that Heron can be leveraged by other organizations to build
>> streaming applications that can benefit from its robustness, high
>> performance, adaptability to cloud environments and ease of use. Moreover,
>> we hope that open-sourcing Heron will help to further evolve the technology
>> as the project attracts contributors with diverse backgrounds and areas of
>> expertise.
>>
>> We believe the Apache foundation is a great fit as the long-term home for
>> Heron, as it provides an established process for community-driven
>> development and decision making by consensus. This is exactly the model we
>> want for future Heron development.
>>
>> = Initial Goals =
>>
>> * Move the existing codebase, website, documentation, and mailing lists to
>> Apache-hosted infrastructure.
>> * Integrate with the Apache development process.
>> * Ensure all dependencies are compliant with Apache License version 2.0.
>> * Incrementally develop and release per Apache guidelines.
>>
>> = Current Status =
>>
>> Heron is a stable project used in production at Twitter since 2014 and open
>> sourced under the ASL v2 license in 2016. The Heron source code is
>> currently hosted at github.com (https://github.com/twitter/heron), which
>> will seed the Apache git repository.
>>
>> = Meritocracy =
>>
>> By submitting this incubator proposal, we’re expressing our intent to build
>> a diverse developer community around Heron that will conduct itself
>> according to The Apache Way and use a meritocratic means of building its
>> committer base.
>> Several companies and universities have already expressed
>> interest in and contributed to Heron. Our goal is to grow the Heron
>> community by encouraging open communication, contribution and participation
>> of all types, and ensuring that contributors are recognized appropriately.
>>
>> = Community =
>>
>> Heron is currently being used by Twitter, Google, Machine Zone and
>> ndustrial.io and has received significant contributions from Microsoft and
>> Streamlio. By bringing Heron into the Apache ecosystem, we believe we can
>> attract even more developers who are interested in creating real-time
>> systems to build the project's contributor base.
>>
>> == Core Developers ==
>>
>> Current core developers are engineers from Twitter, Google, Microsoft and
>> Streamlio.
>>
>> == Alignment ==
>>
>> Heron utilizes a number of Apache technologies. Heron leverages Apache
>> ZooKeeper for coordination and has scheduler implementations to integrate
>> with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via Apache REEF)
>> as well as spout implementations to integrate with Apache Kafka and metrics
>> implementations to integrate with Scribe. Heron also implements the Apache
>> Storm user-level API, which allows topologies written against Storm to run
>> in Heron. We believe that having Heron at Apache will help further the
>> growth of the streaming compute community, as well as encourage cooperation
>> and developer cross pollination with other Apache projects.
>>
>> = Known Risks =
>>
>> == Orphaned Products ==
>>
>> The risk of the Heron project being abandoned is minimal. It is used in
>> production at Twitter and Google and other companies are evaluating or
>> adopting it for production use.
>>
>> == Inexperience with Open Source ==
>>
>> All of the core contributors to the project have considerable experience
>> with open source software development.
>> Bill Graham[2], Ashvin Agrawal[3]
>> and Supun Kamburugamuve[4], committers on the project, are PMC members on
>> other Apache projects and Bill and Ashvin have gone through the Apache
>> incubator process. Twitter has already donated numerous projects to the ASF
>> (e.g., Apache Mesos, Apache Aurora, Apache Parquet). We also plan to be
>> mentored by experienced ASF members who can help with any roadblocks.
>>
>> == Homogenous Developers ==
>>
>> Initial committers come from 5 separate organizations. Our intention is
>> to increase the diversity of contributing developers and their affiliations.
>> To date GitHub contributions have come from approximately 50 contributors
>> from outside the Twitter team.
>>
>> == Reliance on Salaried Developers ==
>>
>> It is expected that Heron development will occur on both salaried time and
>> on volunteer time. The majority of initial committers are paid by their
>> employers to contribute to this project. We are committed to recruiting
>> additional committers from other organizations as well as non-salaried
>> committers to join the project.
>>
>> == Relationships with Other Apache Products ==
>>
>> As mentioned in the Alignment section, Heron implements the Apache Storm
>> API and integrates with multiple Apache schedulers (Apache Mesos, Apache
>> Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and Apache
>> Thrift.
>>
>> == An Excessive Fascination with the Apache Brand ==
>>
>> Heron's popularity is growing in the streaming compute space and we are
>> long time supporters of the Apache brand. This proposal is not for the
>> purpose of generating publicity. Rather, the primary benefits to
>> joining Apache are those of community building and open decision making
>> outlined in the Rationale section.
>>
>> == Documentation ==
>>
>> This proposal exists online as
>> http://wiki.apache.org/incubator/HeronProposal.
>> Extensive documentation can
>> be found on GitHub at https://twitter.github.io/heron and the source code
>> is well documented.
>>
>> == Source and Intellectual Property Submission Plan ==
>>
>> The Heron codebase is currently hosted on GitHub:
>> https://github.com/twitter/heron. During incubation, the codebase will be
>> migrated to Apache infrastructure. The source code is already ASF 2.0
>> licensed.
>>
>> == External Dependencies ==
>>
>> All external libraries have ASF 2.0 compatible licenses except for pylint.
>> The pylint library is GPL licensed, but is only used for pre-build Python
>> style checks and is neither bundled with, nor relied upon by, the Heron
>> source or binary release artifacts.
>>
>> == Cryptography ==
>>
>> Heron does not use any cryptography libraries.
>>
>> = Required Resources =
>>
>> == Mailing lists ==
>>
>> * priv...@heron.incubator.apache.org (with moderated subscriptions)
>> * d...@heron.incubator.apache.org
>> * comm...@heron.incubator.apache.org
>> * u...@heron.incubator.apache.org
>>
>> == Subversion Directory ==
>>
>> Git is the preferred source control system: git://git.apache.org/heron
>>
>> == Issue Tracking ==
>>
>> JIRA: Heron (HERON)
>>
>> == Initial Committers ==
>>
>> * Andrew Jorgensen (andrew at andrewjorgensen dot com)
>> * Ashvin Agrawal (ashvin at apache dot org)*
>> * Avrilia Floratou (avrilia dot floratou at gmail dot com)
>> * Bill Graham (billgraham at apache dot org)*
>> * Brian Hatfield (bmhatfield at gmail dot com)
>> * Chris Kellogg (cckellogg at gmail dot com)
>> * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
>> * Karthik Ramasamy (karthik at gmail dot com)
>> * Maosong Fu (maosongfu at gmail dot com)
>> * Neng Lu (freeneng at gmail dot com)
>> * Runhang Li (obj dot runhang at gmail dot com)
>> * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
>> * Supun Kamburugamuve (supun at apache dot org)*
>> * Thomas Sun (tom dot ssf at gmail dot com)
>> * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
>>
>> == Affiliations ==
>>
>> * Andrew Jorgensen (Google)
>> * Ashvin Agrawal (Microsoft)
>> * Avrilia Floratou (Microsoft)
>> * Bill Graham (Twitter)
>> * Brian Hatfield (Google)
>> * Chris Kellogg (Twitter)
>> * Huijun Wu (Twitter)
>> * Karthik Ramasamy (Streamlio)
>> * Maosong Fu (Twitter)
>> * Neng Lu (Twitter)
>> * Runhang Li (Twitter)
>> * Sanjeev Kulkarni (Streamlio)
>> * Supun Kamburugamuve (Indiana University)
>> * Thomas Sun (Twitter)
>> * Yaliang Wang (Twitter)
>>
>> = Sponsors =
>>
>> == Champion ==
>>
>> * Julien Le Dem (julien at apache dot org)
>>
>> == Nominated Mentors ==
>>
>> * Jake Farrell (jfarrell at apache dot org)
>> * Jacques Nadeau (jacques at apache dot org)
>> * Julien Le Dem (julien at apache dot org)
>> * P. Taylor Goetz (ptgoetz at apache dot org)
>>
>> == Sponsoring Entity ==
>>
>> The Apache Incubator
>>
>> == Footnotes ==
>>
>> * 1 - Papers detailing Heron are available at
>> http://dl.acm.org/citation.cfm?id=2742788 and
>> http://sites.computer.org/debull/A15dec/p15.pdf.
>> * 2 - http://home.apache.org/phonebook.html?uid=billgraham
>> * 3 - http://home.apache.org/phonebook.html?uid=ashvin
>> * 4 - http://home.apache.org/phonebook.html?uid=supun
>>