+1 (non-binding)

On Mon, Jun 19, 2017 at 6:15 AM, Jake Farrell <jfarr...@apache.org> wrote:
> Thanks John
> Comments inline, will ensure that your points are addressed before the
> first release candidate.
>
> -Jake
>
> On Sun, Jun 18, 2017 at 6:35 AM, John D. Ament <johndam...@apache.org>
> wrote:
>
>> +1, however a few things to note about the proposal (and follow up will
>> be required when bringing Heron in):
>>
>> - There is no ASF 2.0 license (missed when putting together the proposal)
>
> Will ensure that all licensing checkboxes are addressed before the first
> release candidate goes up for a vote.
>
>> - The IP section doesn't mention anything about an SGA being sent, is
>> your intention to not send an SGA?
>
> An SGA is not required to be filed prior to an incubator acceptance vote;
> it is 100% required before the codebase can be imported by infra, which
> the mentors will ensure does occur. (I've already asked the project to
> get this rolling)
>
>> - The NOTICE for the repo indicates there is some source code from Yahoo!.
>> - The contents of
>> https://github.com/twitter/heron/tree/master/third_party seem to be
>> mostly binary files, and you'll need to clean that up for your first
>> release.
>> - Your 3rd party section mentions everything is ASF 2.0, however this
>> includes glog and similar tools that include an odd buildchain license
>> that is actually GPL; we'll need to get clearance on whether this is
>> actually compliant or not. Some of the contents in third_party are
>> missing license headers.
>
> This is similar to other projects using a local third_party cache
> directory that have come to the Apache Incubator; Cassandra, Mesos and
> Aurora are a couple that come to mind. We will ensure that this is
> addressed and that no source release contains any of these files.
>
>> John
>>
>> On Fri, Jun 16, 2017 at 4:41 PM Bill Graham <billgra...@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > Based on the discussion on the incubator mailing list[1] I would like
>> > to call a vote to add Heron to the Apache Incubator.
>> >
>> > The full proposal is available below, and is also available on the
>> > Apache Incubator wiki at:
>> > https://wiki.apache.org/incubator/HeronProposal
>> >
>> > Please vote:
>> > [ ] +1, bring Heron into Incubator
>> > [ ] -1, do not bring Heron into Incubator, because...
>> >
>> > The vote will open for 7 days until Friday June 23 at 14:00 PT.
>> >
>> > Thank you
>> >
>> > 1 -
>> > https://lists.apache.org/thread.html/fb91f527ef479bb5df45bf2c9d93b7786c3fa6cdbfeba3128599df79@%3Cgeneral.incubator.apache.org%3E
>> >
>> > = Heron Proposal =
>> >
>> > = Abstract =
>> > Heron is a real-time, distributed, fault-tolerant stream processing
>> > engine initially developed by Twitter.
>> >
>> > = Proposal =
>> >
>> > Heron is a real-time stream processing engine built for high
>> > performance, ease of manageability, performance predictability and
>> > developer productivity[1]. We wish to develop a community around Heron
>> > to increase contributions and see Heron thrive in an open forum.
>> >
>> > = Background =
>> >
>> > Heron provides the ability for developers to compose directed acyclic
>> > graphs (DAGs) of real-time query execution logic (i.e. a topology) and
>> > submit the topology to execute on a pluggable job scheduling system
>> > (e.g., Apache Aurora, YARN, Marathon, etc). Users can employ either the
>> > native Heron API or the Apache Storm API to develop the topology. Heron
>> > supports the Storm API for ease of migration, but beyond that Heron’s
>> > architecture differs considerably from Storm’s.
>> >
>> > Users submit a topology to the scheduler using the Heron client, which
>> > uses the Heron binary libraries to deploy all daemons required to run
>> > and manage the topology.
>> > The topology therefore has no reliance on centrally managed Heron
>> > services, only on a generic job scheduling system, which lends itself
>> > well to running on top of Apache Aurora/Mesos or Apache Hadoop/YARN
>> > (among others).
>> >
>> > The scheduler runs each topology as a job consisting of multiple
>> > containers. One of the containers runs the topology master, responsible
>> > for managing the topology. The remaining containers each run a stream
>> > manager responsible for data routing, a metrics manager that collects
>> > and reports various metrics and a number of processes called Heron
>> > instances which run the user-defined logic on the stream of tuples.
>> > Parallelism is achieved via process-based isolation of Heron instances,
>> > which provides predictable performance while simplifying debugging. The
>> > containers are allocated and managed by the scheduler framework based
>> > on resource availability of nodes in the cluster. The metadata for the
>> > topology, such as the physical plan and execution details, are stored
>> > in the pluggable Heron State Manager (e.g. Apache ZooKeeper).
>> >
>> > = Rationale =
>> >
>> > Heron is a general-purpose, modular and extensible platform that can be
>> > leveraged to support common, real-time analytics use cases. There is an
>> > increasing demand for open-source, scalable real-time analytics
>> > systems. We believe that Heron can be leveraged by other organizations
>> > to build streaming applications that can benefit from its robustness,
>> > high performance, adaptability to cloud environments and ease of use.
>> > Moreover, we hope that open-sourcing Heron will help to further evolve
>> > the technology as the project attracts contributors with diverse
>> > backgrounds and areas of expertise.
>> >
>> > We believe the Apache foundation is a great fit as the long-term home
>> > for Heron, as it provides an established process for community-driven
>> > development and decision making by consensus. This is exactly the model
>> > we want for future Heron development.
>> >
>> > = Initial Goals =
>> >
>> > * Move the existing codebase, website, documentation, and mailing
>> > lists to Apache-hosted infrastructure.
>> > * Integrate with the Apache development process.
>> > * Ensure all dependencies are compliant with Apache License version
>> > 2.0.
>> > * Incrementally develop and release per Apache guidelines.
>> >
>> > = Current Status =
>> >
>> > Heron is a stable project used in production at Twitter since 2014 and
>> > open sourced under the ASL v2 license in 2016. The Heron source code is
>> > currently hosted at github.com (https://github.com/twitter/heron),
>> > which will seed the Apache git repository.
>> >
>> > = Meritocracy =
>> >
>> > By submitting this incubator proposal, we’re expressing our intent to
>> > build a diverse developer community around Heron that will conduct
>> > itself according to The Apache Way and use a meritocratic means of
>> > building its committer base. Several companies and universities have
>> > already expressed interest in and contributed to Heron. Our goal is to
>> > grow the Heron community by encouraging open communication,
>> > contribution and participation of all types, and ensuring that
>> > contributors are recognized appropriately.
>> >
>> > = Community =
>> >
>> > Heron is currently being used by Twitter, Google, Machine Zone and
>> > ndustrial.io and has received significant contributions from Microsoft
>> > and Streamlio. By bringing Heron into the Apache ecosystem, we believe
>> > we can attract even more developers who are interested in creating
>> > real-time systems to build the project's contributor base.
>> >
>> > == Core Developers ==
>> >
>> > Current core developers are engineers from Twitter, Google, Microsoft
>> > and Streamlio.
>> >
>> > == Alignment ==
>> >
>> > Heron utilizes a number of Apache technologies. Heron leverages Apache
>> > ZooKeeper for coordination and has scheduler implementations to
>> > integrate with Apache Mesos, Apache Aurora and Apache Hadoop's YARN
>> > (via Apache REEF) as well as spout implementations to integrate with
>> > Apache Kafka and metrics implementations to integrate with Scribe.
>> > Heron also implements the Apache Storm user-level API, which allows
>> > topologies written against Storm to run in Heron. We believe that
>> > having Heron at Apache will help further the growth of the streaming
>> > compute community, as well as encourage cooperation and developer
>> > cross-pollination with other Apache projects.
>> >
>> > = Known Risks =
>> >
>> > == Orphaned Products ==
>> >
>> > The risk of the Heron project being abandoned is minimal. It is used in
>> > production at Twitter and Google and other companies are evaluating or
>> > adopting it for production use.
>> >
>> > == Inexperience with Open Source ==
>> >
>> > All of the core contributors to the project have considerable
>> > experience with open source software development. Bill Graham[2],
>> > Ashvin Agrawal[3] and Supun Kamburugamuve[4], committers on the
>> > project, are PMC members on other Apache projects and Bill and Ashvin
>> > have gone through the Apache incubator process. Twitter has already
>> > donated numerous projects to the ASF (e.g., Apache Mesos, Apache
>> > Aurora, Apache Parquet). We also plan to be mentored by experienced ASF
>> > members that can help with any roadblocks.
>> >
>> > == Homogenous Developers ==
>> >
>> > Initial committers come from 5 separate organizations. Our intention is
>> > to increase the diversity of contributing developers and their
>> > affiliations.
>> > To date github contributions have come from approximately 50
>> > contributors from outside the Twitter team.
>> >
>> > == Reliance on Salaried Developers ==
>> >
>> > It is expected that Heron development will occur on both salaried time
>> > and on volunteer time. The majority of initial committers are paid by
>> > their employers to contribute to this project. We are committed to
>> > recruiting additional committers from other organizations as well as
>> > non-salaried committers to join the project.
>> >
>> > == Relationships with Other Apache Products ==
>> >
>> > As mentioned in the Alignment section, Heron implements the Apache
>> > Storm API and integrates with multiple Apache schedulers (Apache Mesos,
>> > Apache Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and
>> > Apache Thrift.
>> >
>> > == An Excessive Fascination with the Apache Brand ==
>> >
>> > Heron's popularity is growing in the streaming compute space and we are
>> > long time supporters of the Apache brand. This proposal is not for the
>> > purpose of generating publicity. Rather, the primary benefits to
>> > joining Apache are those of community building and open decision making
>> > outlined in the Rationale section.
>> >
>> > == Documentation ==
>> >
>> > This proposal exists online as
>> > http://wiki.apache.org/incubator/HeronProposal. Extensive documentation
>> > can be found on github at https://twitter.github.io/heron and the
>> > source code is well documented.
>> >
>> > == Source and Intellectual Property Submission Plan ==
>> >
>> > The Heron codebase is currently hosted on Github:
>> > https://github.com/twitter/heron. During incubation, the codebase will
>> > be migrated to Apache infrastructure. The source code is already ASF
>> > 2.0 licensed.
>> >
>> > == External Dependencies ==
>> >
>> > All external libraries have ASF 2.0 compatible licenses except for
>> > pylint.
>> > The pylint library is GPL licensed, but is only used for pre-build
>> > Python style checks and is neither bundled with, nor relied upon by,
>> > the Heron source or binary release artifacts.
>> >
>> > == Cryptography ==
>> >
>> > Heron does not use any cryptography libraries.
>> >
>> > = Required Resources =
>> >
>> > == Mailing lists ==
>> >
>> > * priv...@heron.incubator.apache.org (with moderated subscriptions)
>> > * d...@heron.incubator.apache.org
>> > * comm...@heron.incubator.apache.org
>> > * u...@heron.incubator.apache.org
>> >
>> > == Subversion Directory ==
>> >
>> > Git is the preferred source control system: git://git.apache.org/heron
>> >
>> > == Issue Tracking ==
>> >
>> > JIRA: Heron (HERON)
>> >
>> > == Initial Committers ==
>> >
>> > * Andrew Jorgensen (andrew at andrewjorgensen dot com)
>> > * Ashvin Agrawal (ashvin at apache dot org)*
>> > * Avrilia Floratou (avrilia dot floratou at gmail dot com)
>> > * Bill Graham (billgraham at apache dot org)*
>> > * Brian Hatfield (bmhatfield at gmail dot com)
>> > * Chris Kellogg (cckellogg at gmail dot com)
>> > * Huijun Wu (huijun dot wu dot 2010 at gmail dot com)
>> > * Karthik Ramasamy (karthik at gmail dot com)
>> > * Maosong Fu (maosongfu at gmail dot com)
>> > * Neng Lu (freeneng at gmail dot com)
>> > * Runhang Li (obj dot runhang at gmail dot com)
>> > * Sanjeev Kulkarni (sanjeevrk at gmail dot com)
>> > * Supun Kamburugamuve (supun at apache dot org)*
>> > * Thomas Sun (tom dot ssf at gmail dot com)
>> > * Yaliang Wang (yaliang dot w dot wang at ieee dot org)
>> >
>> > == Affiliations ==
>> >
>> > * Andrew Jorgensen (Google)
>> > * Ashvin Agrawal (Microsoft)
>> > * Avrilia Floratou (Microsoft)
>> > * Bill Graham (Twitter)
>> > * Brian Hatfield (Google)
>> > * Chris Kellogg (Twitter)
>> > * Huijun Wu (Twitter)
>> > * Karthik Ramasamy (Streamlio)
>> > * Maosong Fu (Twitter)
>> > * Neng Lu (Twitter)
>> > * Runhang Li (Twitter)
>> > * Sanjeev Kulkarni (Streamlio)
>> > * Supun Kamburugamuve (Indiana University)
>> > * Thomas Sun (Twitter)
>> > * Yaliang Wang (Twitter)
>> >
>> > = Sponsors =
>> >
>> > == Champion ==
>> >
>> > * Julien Le Dem (julien at apache dot org)
>> >
>> > == Nominated Mentors ==
>> >
>> > * Jake Farrell (jfarrell at apache dot org)
>> > * Jacques Nadeau (jacques at apache dot org)
>> > * Julien Le Dem (julien at apache dot org)
>> > * P. Taylor Goetz (ptgoetz at apache dot org)
>> >
>> > == Sponsoring Entity ==
>> >
>> > The Apache Incubator
>> >
>> > == Footnotes ==
>> >
>> > * 1 - Papers detailing Heron are available at
>> > http://dl.acm.org/citation.cfm?id=2742788 and
>> > http://sites.computer.org/debull/A15dec/p15.pdf.
>> > * 2 - http://home.apache.org/phonebook.html?uid=billgraham
>> > * 3 - http://home.apache.org/phonebook.html?uid=ashvin
>> > * 4 - http://home.apache.org/phonebook.html?uid=supun