On Thu, Jun 15, 2017 at 1:24 PM, Bill Graham <billgra...@gmail.com> wrote:
... One of our goals during incubation will be to use open forums of > communication, like the Apache mailing lists, and work to foster a truly > collaborative environment for both Apache Storm and Heron community members > to work within together. > > +1 > > On Thu, Jun 15, 2017 at 10:42 AM, Debo Dutta (dedutta) <dedu...@cisco.com> > wrote: > > > Am happy to help too! > > > > Thx > > Debo > > > > Sent from my iPhone > > > > > On Jun 14, 2017, at 8:05 PM, William Markito Oliveira < > > william.mark...@gmail.com> wrote: > > > > > > Howdy! > > > > > > If Heron is looking for some help around incubation process, I'd love > to > > > help while Geode experience is still fresh in my mind and given that > > it's a > > > project/space that I do have interest. Since I'm not an ASF member, I > > don't > > > think I can offer to be a mentor, but can probably still help and > > > participate on the process. > > > > > > Thanks! > > > > > >> On Wed, Jun 14, 2017 at 7:54 PM, P. Taylor Goetz <ptgo...@gmail.com> > > wrote: > > >> > > >> Hi Bill/Supun, > > >> > > >> Sorry for not being a little more clear. I was asking more about how > the > > >> Heron community would seek to engage with Storm community at the > > >> *community* level as opposed to the technical level (i.e. “Community > > over > > >> Code”). > > >> > > >> I’ve been asked by many why this has never happened, and have always > > >> struggled to answer. Maybe you could help answer that question as well > > as > > >> if and how that might change if Heron were to incubate. > > >> > > >> Another quick question: The proposal mentions Heron being used in > > >> production at Google, but some Google employees I recently spoke to > > seemed > > >> to contradict that. Could you explain? Note that’s nothing that would > > >> preclude the project from incubating, I’m just curious. > > >> > > >> -Taylor > > >> > > >>> On Jun 14, 2017, at 7:35 AM, Supun Kamburugamuve <supu...@gmail.com> > > >> wrote: > > >>> > > >>> Hi Taylor, > > >>> > > >>> For me, one of the interesting differences between Heron and Storm is > > the > > >>> execution model. Storm uses a shared memory model while Heron uses a > > >>> process based model. It will be interesting to see how these two > > evolve. > > >>> > > >>> Thanks, > > >>> Supun.. > > >>> > > >>> On Mon, Jun 12, 2017 at 4:15 PM, Bill Graham <billgra...@gmail.com> > > >> wrote: > > >>> > > >>>> Hi Taylor, > > >>>> > > >>>> Thanks for the mentor offer, we'd be glad to have your help. > > >>>> > > >>>> I think the best place for collaboration would be around the > evolution > > >> of > > >>>> the API. In addition we plan to look more into DSL solutions which > we > > >> could > > >>>> potentially collaborate on. This could be Trident, or Beam or > > something > > >>>> else, but there could be synergies for future development here. > > >>>> > > >>>> thanks, > > >>>> Bill > > >>>> > > >>>> On Fri, Jun 9, 2017 at 8:53 PM, P. Taylor Goetz <ptgo...@gmail.com> > > >> wrote: > > >>>> > > >>>>> Hi Bill, > > >>>>> > > >>>>> Could you comment on how/if the Heron community would be willing to > > >> work > > >>>>> with the Storm community? I've seen a number of new features in > Storm > > >>>> being > > >>>>> ported to Heron, but I have yet to see any attempt by the Heron > > >> community > > >>>>> to engage with the Apache Storm community. > > >>>>> > > >>>>> I don't think it would be too far off to say that the relationship > > >>>> between > > >>>>> Heron and Apache Storm has been somewhat adversarial. The pre- and > > >>>>> post-open sourcing marketing around Heron seemed, at least to me, > > >>>> somewhat > > >>>>> aggressively negative toward Storm. > > >>>>> > > >>>>> As a peer to Apache Storm, how would the proposed "Apache Heron" > > >>>> community > > >>>>> work to collaborate with the Storm community? If Heron is adopting > > API > > >>>>> changes in Storm, then it seems there is an opportunity for > > >>>> collaboration. > > >>>>> > > >>>>> Don't take any of this as an objection to incubating the project. I > > >> would > > >>>>> support it. I would also be willing to be a mentor, if you would > > >> consider > > >>>>> taking on another. > > >>>>> > > >>>>> -Taylor > > >>>>> > > >>>>>> On Jun 8, 2017, at 1:23 PM, Bill Graham <billgra...@gmail.com> > > wrote: > > >>>>>> > > >>>>>> Dear Apache Incubator Community, > > >>>>>> > > >>>>>> We are excited to share our proposal for discussion and feedback > > >>>>>> for entering Apache Incubation. Heron is a real-time, distributed, > > >>>>>> fault-tolerant stream processing engine. > > >>>>>> > > >>>>>> Our proposal can be found at https://wiki.apache.org/ > > >>>>> incubator/HeronProposal > > >>>>>> and is included below. > > >>>>>> > > >>>>>> > > >>>>>> Thank you, > > >>>>>> > > >>>>>> Bill Graham on behalf of the Heron developers > > >>>>>> > > >>>>>> > > >>>>>> # Heron Proposal > > >>>>>> > > >>>>>> ## Abstract > > >>>>>> Heron is a real-time, distributed, fault-tolerant stream > processing > > >>>>> engine > > >>>>>> initially developed by Twitter. > > >>>>>> > > >>>>>> ## Proposal > > >>>>>> > > >>>>>> Heron is a real-time stream processing engine built for high > > >>>> performance, > > >>>>>> ease of manageability, performance predictability and developer > > >>>>>> productivity[1]. We wish to develop a community around Heron to > > >>>> increase > > >>>>>> contributions and see Heron thrive in an open forum. > > >>>>>> > > >>>>>> ## Background > > >>>>>> > > >>>>>> Heron provides the ability for developers to compose directed > > acyclic > > >>>>>> graphs (DAGs) of real-time query execution logic (i.e. a topology) > > and > > >>>>>> submit the topology to execute on a pluggable job scheduling > system > > >>>>> (e.g., > > >>>>>> Apache Aurora, YARN, Marathon, etc). Users can employ either the > > >> native > > >>>>>> Heron API or the Apache Storm API to develop the topology. Heron > > >>>> supports > > >>>>>> the Storm API for ease of migration, but beyond that Heron’s > > >>>> architecture > > >>>>>> differs considerably from Storm’s. > > >>>>>> > > >>>>>> Users submit a topology to the scheduler using the Heron client, > > which > > >>>>> uses > > >>>>>> the Heron binary libraries to deploy all daemons required to run > and > > >>>>> manage > > >>>>>> the topology. The topology therefore has no reliance on centrally > > >>>> managed > > >>>>>> Heron services, only on a generic job scheduling system, which > lends > > >>>>> itself > > >>>>>> well to be run on top of Apache Aurora/Mesos or Apache Hadoop/YARN > > >>>> (among > > >>>>>> others). > > >>>>>> > > >>>>>> The scheduler runs each topology as a job consisting of multiple > > >>>>>> containers. One of the containers runs the topology master, > > >> responsible > > >>>>> for > > >>>>>> managing the topology. The remaining containers each runs a stream > > >>>>> manager > > >>>>>> responsible for data routing, a metrics manager that collects and > > >>>> reports > > >>>>>> various metrics and a number of processes called Heron instances > > which > > >>>>> run > > >>>>>> the user-defined logic on the stream of tuples. Parallelism is > > >> achieved > > >>>>> via > > >>>>>> process-based isolation of Heron instances, which provides > > predictable > > >>>>>> performance while simplifying debugging. The containers are > > allocated > > >>>> and > > >>>>>> managed by the scheduler framework based on resource availability > of > > >>>>> nodes > > >>>>>> in the cluster. The metadata for the topology, such as the > physical > > >>>> plan > > >>>>>> and execution details, are stored in the pluggable Heron State > > Manager > > >>>>>> (e.g. Apache ZooKeeper). > > >>>>>> > > >>>>>> ## Rationale > > >>>>>> > > >>>>>> Heron is a general-purpose, modular and extensible platform that > can > > >> be > > >>>>>> leveraged to support common, real-time analytics use cases. There > is > > >> an > > >>>>>> increasing demand for open-source, scalable real-time analytics > > >>>> systems. > > >>>>> We > > >>>>>> believe that Heron can be leveraged by other organizations to > build > > >>>>>> streaming applications that can benefit from its robustness, high > > >>>>>> performance, adaptability to cloud environments and ease of use. > > >>>>> Moreover, > > >>>>>> we hope that open-sourcing Heron will help to further evolve the > > >>>>> technology > > >>>>>> as the project attracts contributors with diverse backgrounds and > > >> areas > > >>>>> of > > >>>>>> expertise. > > >>>>>> > > >>>>>> We believe the Apache foundation is a great fit as the long-term > > home > > >>>> for > > >>>>>> Heron, as it provides an established process for community-driven > > >>>>>> development and decision making by consensus. This is exactly the > > >> model > > >>>>> we > > >>>>>> want for future Heron development. > > >>>>>> > > >>>>>> ## Initial Goals > > >>>>>> > > >>>>>> * Move the existing codebase, website, documentation, and mailing > > >> lists > > >>>>> to > > >>>>>> Apache-hosted infrastructure. > > >>>>>> * Integrate with the Apache development process. > > >>>>>> * Ensure all dependencies are compliant with Apache License > version > > >>>> 2.0. > > >>>>>> * Incrementally develop and release per Apache guidelines. > > >>>>>> > > >>>>>> ## Current Status > > >>>>>> > > >>>>>> Heron is a stable project used in production at Twitter since 2014 > > and > > >>>>> open > > >>>>>> sourced under the ASL v2 license in 2016. The Heron source code is > > >>>>>> currently hosted at github.com (https://github.com/twitter/heron > ), > > >>>> which > > >>>>>> will seed the Apache git repository. > > >>>>>> > > >>>>>> ### Meritocracy > > >>>>>> > > >>>>>> By submitting this incubator proposal, we’re expressing our intent > > to > > >>>>> build > > >>>>>> a diverse developer community around Heron that will conduct > itself > > >>>>>> according to The Apache Way and use a meritocratic means of > building > > >>>> it's > > >>>>>> committer base. Several companies and universities have already > > >>>> expressed > > >>>>>> interest in and contributed to Heron. Our goal is to grow the > Heron > > >>>>>> community by encouraging open communication, contribution and > > >>>>> participation > > >>>>>> of all types, and ensuring that contributors are recognized > > >>>>> appropriately. > > >>>>>> > > >>>>>> ### Community > > >>>>>> > > >>>>>> Heron is currently being used by Twitter, Google, Machine Zone and > > >>>>>> ndustrial.io and has received significant contributions by > > Microsoft > > >>>> and > > >>>>>> Streamlio. By bringing Heron into the Apache ecosystem, we believe > > we > > >>>> can > > >>>>>> attract even more developers who are interested in creating > > real-time > > >>>>>> systems to build the project's contributor base. > > >>>>>> > > >>>>>> ### Core Developers > > >>>>>> > > >>>>>> Current core developers are engineers from Twitter, Google, > > Microsoft > > >>>> and > > >>>>>> Streamlio. > > >>>>>> > > >>>>>> ### Alignment > > >>>>>> > > >>>>>> Heron utilizes a number of Apache technologies. Heron leverages > > Apache > > >>>>>> ZooKeeper for coordination and has scheduler implementations to > > >>>> integrate > > >>>>>> with Apache Mesos, Apache Aurora and Apache Hadoop's YARN (via > > Apache > > >>>>> REEF) > > >>>>>> as well as spout implementations to integrate with Apache Kafka > and > > >>>>> metrics > > >>>>>> implementations to integrate with Scribe. Heron also implements > the > > >>>>> Apache > > >>>>>> Storm user-level API, which allows topologies written against > Storm > > to > > >>>>> run > > >>>>>> in Heron. We believe that having Heron at Apache will help further > > the > > >>>>>> growth of the streaming compute community, as well as encourage > > >>>>> cooperation > > >>>>>> and developer cross pollination with other Apache projects. > > >>>>>> > > >>>>>> ## Known Risks > > >>>>>> > > >>>>>> ### Orphaned Products > > >>>>>> > > >>>>>> The risk of the Heron project being abandoned is minimal. It is > used > > >> in > > >>>>>> production at Twitter and Google and other companies are > evaluating > > or > > >>>>>> adopting it for production use. > > >>>>>> > > >>>>>> ### Inexperience with Open Source > > >>>>>> > > >>>>>> All of the core contributors to the project have considerable > > >>>> experience > > >>>>>> with open source software development. Bill Graham[2], Ashvin > > >>>> Agrawal[3] > > >>>>>> and Supun Kamburugamuve[4], committers on the project, are PMCs on > > >>>> other > > >>>>>> Apache projects and Bill and Ashvin have gone through the Apache > > >>>>> incubator > > >>>>>> process. Twitter has already donated numerous projects to the ASF > > >>>> (e.g., > > >>>>>> Apache Mesos, Apache Aurora, Apache Parquet). We also plan to be > > >>>> mentored > > >>>>>> by experienced ASF members that can help with any roadblocks. > > >>>>>> > > >>>>>> ### Homogenous Developers > > >>>>>> > > >>>>>> Initial committers come from 5 separate organizations. Our > intention > > >> is > > >>>>>> increase the diversity of contributing developers and their > > >>>> affiliations. > > >>>>>> To date github contributions have come from approximately 50 > > >>>> contributors > > >>>>>> from outside the Twitter team. > > >>>>>> > > >>>>>> ### Reliance on Salaried Developers > > >>>>>> > > >>>>>> It is expected that Heron development will occur on both salaried > > time > > >>>>> and > > >>>>>> on volunteer time. The majority of initial committers are paid by > > >> their > > >>>>>> employers to contribute to this project. We are committed to > > >> recruiting > > >>>>>> additional committers from other organizations as well as > > non-salaried > > >>>>>> committers to join project. > > >>>>>> > > >>>>>> ### Relationships with Other Apache Products > > >>>>>> > > >>>>>> As mentioned in the Alignment section, Heron implements the Apache > > >>>> Storm > > >>>>>> API and integrates with multiple Apache schedulers (Apache Mesos, > > >>>> Apache > > >>>>>> Aurora and Apache Hadoop's YARN) as well as Apache ZooKeeper and > > >> Apache > > >>>>>> Thrift. > > >>>>>> > > >>>>>> ### An Excessive Fascination with the Apache Brand > > >>>>>> > > >>>>>> Heron's popularity is growing in the streaming compute space and > we > > >> are > > >>>>>> long time supporters of the Apache brand. This proposal is not for > > the > > >>>>>> purpose of generating publicity through. Rather, the primary > > benefits > > >>>> to > > >>>>>> joining Apache are those of community building and open decision > > >> making > > >>>>>> outlined in the Rationale section. > > >>>>>> > > >>>>>> ## Documentation > > >>>>>> > > >>>>>> This proposal exists online as http://wiki.apache.org/ > > >>>>>> incubator/HeronProposal. Extensive documentation can be found on > > >> github > > >>>>> at > > >>>>>> https://twitter.github.io/heron and the source code is well > > >>>> documented. > > >>>>>> > > >>>>>> ## Source and Intellectual Property Submission Plan > > >>>>>> > > >>>>>> The Heron codebase is currently hosted on Github: > > >>>>>> https://github.com/twitter/heron. During incubation, the codebase > > >> will > > >>>>> be > > >>>>>> migrated to Apache infrastructure. The source code is already ASF > > 2.0 > > >>>>>> licensed. > > >>>>>> > > >>>>>> ## External Dependencies > > >>>>>> > > >>>>>> All external libraries have ASF 2.0 compatible licenses except for > > >>>>> pylint. > > >>>>>> The pylint library is GPL licensed, but is only used for pre-build > > >>>> Python > > >>>>>> style checks and is neither bundled with, nor relied upon by, the > > >> Heron > > >>>>>> source or binary release artifacts. > > >>>>>> > > >>>>>> ## Cryptography > > >>>>>> > > >>>>>> Heron does not use any cryptography libraries. > > >>>>>> > > >>>>>> ## Required Resources > > >>>>>> > > >>>>>> ### Mailing lists > > >>>>>> > > >>>>>> priv...@heron.incubator.apache.org (with moderated subscriptions) > > >>>>>> d...@heron.incubator.apache.org > > >>>>>> comm...@heron.incubator.apache.org > > >>>>>> u...@heron.incubator.apache.org > > >>>>>> > > >>>>>> ## Subversion Directory > > >>>>>> > > >>>>>> Git is the preferred source control system: git:// > > >> git.apache.org/heron > > >>>>>> > > >>>>>> ## Issue Tracking > > >>>>>> > > >>>>>> JIRA: Heron (HERON) > > >>>>>> > > >>>>>> ## Initial Committers > > >>>>>> > > >>>>>> * Andrew Jorgensen (andrew at andrewjorgensen dot com) > > >>>>>> * Ashvin Agrawal (ashvin at apache dot org)* > > >>>>>> * Avrilia Floratou (avrilia dot floratou at gmail dot com) > > >>>>>> * Bill Graham (billgraham at apache dot org)* > > >>>>>> * Brian Hatfield (bmhatfield at gmail dot com) > > >>>>>> * Chris Kellogg (cckellogg at gmail dot com) > > >>>>>> * Huijun Wu (huijun dot wu dot 2010 at gmail dot com) > > >>>>>> * Karthik Ramasamy (karthik at gmail dot com) > > >>>>>> * Maosong Fu (maosongfu at gmail dot com) > > >>>>>> * Neng Lu(freeneng at gmail dot com) > > >>>>>> * Runhang Li (obj dot runhang at gmail dot com) > > >>>>>> * Sanjeev Kulkarni (sanjeevrk at gmail dot com) > > >>>>>> * Supun Kamburugamuve (supun at apache dot org)* > > >>>>>> * Thomas Sun (tom dot ssf at gmail dot com) > > >>>>>> * Yaliang Wang (yaliang dot w dot wang at ieee dot org) > > >>>>>> > > >>>>>> ## Affiliations > > >>>>>> > > >>>>>> * Andrew Jorgensen (Google) > > >>>>>> * Ashvin Agrawal (Microsoft) > > >>>>>> * Avrilia Floratou (Microsoft) > > >>>>>> * Bill Graham (Twitter) > > >>>>>> * Brian Hatfield (Google) > > >>>>>> * Chris Kellogg (Twitter) > > >>>>>> * Huijun Wu (Twitter) > > >>>>>> * Karthik Ramasamy (Streamlio) > > >>>>>> * Maosong Fu (Twitter) > > >>>>>> * Neng Lu (Twitter) > > >>>>>> * Runhang Li (Twitter) > > >>>>>> * Sanjeev Kulkarni (Streamlio) > > >>>>>> * Supun Kamburugamuve (Indiana University) > > >>>>>> * Thomas Sun (Twitter) > > >>>>>> * Yaliang Wang (Twitter) > > >>>>>> > > >>>>>> ## Sponsors > > >>>>>> > > >>>>>> ### Champion > > >>>>>> > > >>>>>> * Julien Le Dem (julien at apache dot org) > > >>>>>> > > >>>>>> ### Nominated Mentors > > >>>>>> > > >>>>>> * Jake Farrell (jfarrell at apache dot org) > > >>>>>> * Jacques Nadeau (jacques at apache dot org) > > >>>>>> * Julien Le Dem (julien at apache dot org) > > >>>>>> > > >>>>>> ### Sponsoring Entity > > >>>>>> > > >>>>>> The Apache Incubator > > >>>>>> > > >>>>>> ### Footnotes > > >>>>>> > > >>>>>> 1 - Papers detailing Heron are available at > > >> http://dl.acm.org/citation > > >>>> . > > >>>>>> cfm?id=2742788 and http://sites.computer.org/ > debull/A15dec/p15.pdf. > > >>>>>> 2 - http://home.apache.org/phonebook.html?uid=billgraham > > >>>>>> 3 - http://home.apache.org/phonebook.html?uid=ashvin > > >>>>>> 4 - http://home.apache.org/phonebook.html?uid=supun > > >>>>> > > >>>>> ------------------------------------------------------------ > > --------- > > >>>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org > > >>>>> For additional commands, e-mail: general-h...@incubator.apache.org > > >>>>> > > >>>>> > > >>>> > > >>> > > >>> > > >>> > > >>> -- > > >>> Supun Kamburugamuve > > >>> Member, Apache Software Foundation; http://www.apache.org > > >>> E-mail: supun@apache.o <supu...@gmail.com>rg; Mobile: +1 812 219 > 2563 > > >>> <(812)%20219-2563> > > >> > > >> > > >> --------------------------------------------------------------------- > > >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org > > >> For additional commands, e-mail: general-h...@incubator.apache.org > > >> > > >> > > > > > > > > > -- > > > ~/William > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org > > For additional commands, e-mail: general-h...@incubator.apache.org > > > > >