+1 (from the cheap seats).

Best regards,
Pierre Smits

ORRTIZ.COM <http://www.orrtiz.com>
OFBiz based solutions & services

OFBiz Extensions Marketplace
http://oem.ofbizci.net/oci-2/

On Wed, May 31, 2017 at 10:18 PM, tim shea <tim.s...@oracle.com> wrote:

> +1 (non-binding)
>
> Great project (and I've used it).
>
>
> On 5/31/17 11:59 AM, Kostas Sakellis wrote:
>
>> +1 (non-binding)
>>
>> On Wed, May 31, 2017 at 11:46 AM, Andrew Purtell <andrew.purt...@gmail.com>
>> wrote:
>>
>> +1 (binding)
>>>
>>> On May 31, 2017, at 6:03 AM, Sean Busbey <bus...@apache.org> wrote:
>>>>
>>>> Hi folks!
>>>>
>>>> I'm calling a vote to accept "Livy" into the Apache Incubator.
>>>>
>>>> The full proposal is available below, and is also available in the wiki:
>>>>
>>>> https://wiki.apache.org/incubator/LivyProposal
>>>>
>>>> For additional context, please see the discussion thread:
>>>>
>>>> https://s.apache.org/incubator-livy-proposal-thread
>>>>
>>>> Please cast your vote:
>>>>
>>>> [ ] +1, bring Livy into Incubator
>>>> [ ] -1, do not bring Livy into Incubator, because...
>>>>
>>>> The vote will be open for at least 72 hours, and only votes from the
>>>> Incubator PMC are binding.
>>>>
>>>> I start with my vote:
>>>> +1
>>>>
>>>> ----
>>>>
>>>> = Abstract =
>>>>
>>>> Livy is a web service that exposes a REST interface for managing long
>>>> running Apache Spark contexts in your cluster. With Livy, new
>>>> applications can be built on top of Apache Spark that require fine
>>>> grained interaction with many Spark contexts.
>>>>
>>>> = Proposal =
>>>>
>>>> Livy is an open-source REST service for Apache Spark. Livy enables
>>>> applications to submit Spark applications and retrieve results without a
>>>> co-location requirement on the Spark cluster.
>>>>
>>>> We propose to contribute the Livy codebase and associated artifacts
>>>> (e.g. documentation, web-site context etc) to the Apache Software
>>>> Foundation.
>>>>
>>>> = Background =
>>>>
>>>> Apache Spark is a fast and general purpose distributed compute engine,
>>>> with a versatile API. It enables processing of large quantities of
>>>> static data distributed over a cluster of machines, as well as
>>>> processing of continuous streams of data. It is the preferred
>>>> distributed data processing engine for data engineering, stream
>>>> processing and data science workloads. Each Spark application uses a
>>>> construct called the SparkContext, which is the application’s connection
>>>> or entry point to the Spark engine. Each Spark application will have its
>>>> own SparkContext.
>>>>
>>>> Livy enables clients to interact with one or more Spark sessions through
>>>> the Livy Server, which acts as a proxy layer. Livy Clients have fine
>>>> grained control over the lifecycle of the Spark sessions, as well as the
>>>> ability to submit jobs and retrieve results, all over HTTP. Clients have
>>>> two modes of interaction:
>>>>
>>>> * RPC Client API, available in Java and Python, which allows results to
>>>>   be retrieved as Java or Python objects. The serialization and
>>>>   deserialization of the results is handled by the Livy framework.
>>>> * HTTP based API that allows submission of code snippets, and retrieval
>>>>   of the results in different formats.
>>>>
>>>> Multi-tenant resource allocation and security: Livy enables multiple
>>>> independent Spark sessions to be managed simultaneously. Multiple
>>>> clients can also interact simultaneously with the same Spark session and
>>>> share the resources of that Spark session. Livy can also enforce secure,
>>>> authenticated communication between the clients and their respective
>>>> Spark sessions.
>>>>
>>>> More information on Livy can be found at the existing open source
>>>> website:
>>>> http://livy.io/
>>>>
>>>> = Rationale =
>>>>
>>>> Users want to use Spark’s powerful processing engine and API as the data
>>>> processing backend for interactive applications. However, the job
>>>> submission and application interaction mechanisms built into Apache
>>>> Spark are insufficient and cumbersome for multi-user interactive
>>>> applications.
>>>>
>>>> The primary mechanism for applications to submit Spark jobs is via
>>>> spark-submit
>>>> (http://spark.apache.org/docs/latest/submitting-applications.html),
>>>> which is available as a command line tool as well as a programmatic API.
>>>> However, spark-submit has the following limitations that make it
>>>> difficult to build interactive applications:
>>>>
>>>> * It is slow: each invocation of spark-submit involves a setup phase
>>>>   where cluster resources are acquired, new processes are forked, etc.
>>>>   This setup phase runs for many seconds, or even minutes, and hence is
>>>>   too slow for interactive applications.
>>>> * It is cumbersome and lacks flexibility: application code and
>>>>   dependencies have to be pre-compiled and submitted as jars, and cannot
>>>>   be submitted interactively.
>>>>
>>>> Apache Spark comes with an ODBC/JDBC server, which can be used to submit
>>>> SQL queries to Spark. However, this solution is limited to SQL and does
>>>> not allow the client to leverage the rest of the Spark API, such as
>>>> RDDs, MLlib and Streaming.
>>>>
>>>> A third way of using Spark is via its command-line shell, which allows
>>>> the interactive submission of snippets of Spark code. However, the shell
>>>> entails running Spark code on the client machine and hence is not a
>>>> viable mechanism for remote clients to submit Spark jobs.
>>>>
>>>> Livy solves the limitations of the above three mechanisms, and provides
>>>> the full Spark API as a multi-tenant service to remote clients.
>>>>
>>>> Since the open source release of Livy in late 2015, we have seen
>>>> tremendous interest among a diverse set of application developers and
>>>> ISVs that want to build applications with Apache Spark. To make Livy a
>>>> robust and flexible solution that will enable a broad and growing set of
>>>> applications, it is important to grow a large and varied community of
>>>> contributors.
>>>>
>>>> = Initial Goals =
>>>>
>>>> * Move existing codebase, website, documentation and mailing lists to
>>>>   Apache-hosted infrastructure
>>>> * Work with the infrastructure team to implement and approve our code
>>>>   review, build, and testing workflows in the context of the ASF
>>>> * Incremental development and releases per Apache guidelines
>>>>
>>>> = Current Status =
>>>>
>>>> The Livy project began at Cloudera, as a part of the Hue project.
>>>> Cloudera soon realized the broad applicability of Livy, and separated it
>>>> out into an independent project in Nov 2015.
>>>>
>>>> == Releases ==
>>>>
>>>> Livy has undergone two public releases, tagged here:
>>>>
>>>> * https://github.com/cloudera/livy/releases/tag/v0.2.0
>>>> * https://github.com/cloudera/livy/releases/tag/v0.3.0
>>>>
>>>> Tarballs and zip files were created for each release and hosted on
>>>> github. Upon joining the incubator, we will adopt a more typical ASF
>>>> release process.
>>>>
>>>> == Source ==
>>>>
>>>> Livy’s source is currently hosted on Github at:
>>>> https://github.com/cloudera/livy
>>>>
>>>> This repository will be transitioned to Apache’s git hosting during
>>>> incubation.
>>>>
>>>> == Code review ==
>>>>
>>>> Livy’s code reviews are currently public and hosted on github as pull
>>>> request reviews at: https://github.com/cloudera/livy/pulls
>>>> The Livy developer community so far is happy with github pull request
>>>> reviews and hopes to continue this after being admitted to the ASF.
>>>>
>>>> == Issue Tracking ==
>>>>
>>>> Livy’s bug and feature tracking is hosted on JIRA at:
>>>> https://issues.cloudera.org/projects/LIVY/summary
>>>> This JIRA instance contains bugs and development discussion dating back
>>>> 1 year and will provide an initial seed for the ASF JIRA.
>>>>
>>>> == Community Discussion ==
>>>>
>>>> Livy has several public discussion forums:
>>>>
>>>> * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-dev
>>>> * https://groups.google.com/a/cloudera.org/forum/#!forum/livy-user
>>>>
>>>> == Development Practices ==
>>>>
>>>> The Livy project follows a review before commit philosophy. Every commit
>>>> automatically runs through the unit tests and generates coverage reports
>>>> presented as a pull request comment. Our experience with this process
>>>> leads us to believe that it helps ease new contributors into the
>>>> project. They get feedback quickly on common mistakes, lowering the
>>>> burden on reviewers. Those same reviewers get to lead by example,
>>>> showing the new contributors that we value feedback within our community
>>>> even when changes are done by more experienced folks.
>>>>
>>>> == Meritocracy ==
>>>>
>>>> We believe strongly in meritocracy when electing committers and PMC
>>>> members. In the past few months, the project has added two new
>>>> committers from two different organisations, in recognition of their
>>>> significant contributions to the project. We will encourage
>>>> contributions and participation of all types, and ensure that
>>>> contributors are appropriately recognized.
>>>>
>>>> == Community ==
>>>>
>>>> Though Livy is relatively new as a standalone open source project, it
>>>> has already seen promising growth in its community across several
>>>> organizations:
>>>>
>>>> * Cloudera is the original development sponsor for Livy
>>>> * Microsoft pushed the development of the interpreter fixing high
>>>>   availability issues and adding additional features.
>>>> * Hortonworks has contributed the security features to Livy allowing
>>>>   Kerberos and impersonation to work with Spark
>>>> * IBM is starting to make contributions to the Livy project
>>>> * A number of other patches contributed by community members
>>>>
>>>> Livy currently relies on Google Groups for mailing lists. These lists
>>>> have been active since the end of 2015/start of 2016. Currently, Livy’s
>>>> user mailing list has 173 subscribers and has hosted a total of 227
>>>> topic threads. Livy’s developer list has 49 subscribers and has hosted
>>>> 79 topic threads.
>>>>
>>>> == Core Developers ==
>>>>
>>>> The early contributions to Livy were made by Cloudera engineers. In
>>>> 2016, engineers from Microsoft and Hortonworks joined the core developer
>>>> community.
>>>>
>>>> == Alignment ==
>>>>
>>>> Livy is built upon Apache Spark, and other Apache projects like Apache
>>>> Hadoop YARN. It’s used as a building block by Apache Zeppelin. These
>>>> community connections combined with our focus on development practices
>>>> that emphasize community engagement with a path to meritocratic
>>>> recognition naturally align us with the ASF.
>>>>
>>>> = Known Risks =
>>>>
>>>> == Orphaned Products ==
>>>>
>>>> The risk of Livy being abandoned is low because it is supported by three
>>>> major big-data software vendors. Moreover, Livy is already used to power
>>>> multiple releases of services and products used in production.
>>>>
>>>> == Inexperience with Open Source ==
>>>>
>>>> Several of the initial committers are experienced open source
>>>> developers, several being committers and/or PMC members on other ASF
>>>> projects (Spark, YARN).
>>>>
>>>> == Homogenous Developers ==
>>>>
>>>> The project already has a diverse developer base. It has contributions
>>>> from 3 major organisations (Cloudera, Microsoft and Hortonworks), and is
>>>> used in diverse applications, in diverse settings (On-Prem and Cloud).
>>>>
>>>> == Reliance on salaried Developers ==
>>>>
>>>> The contributions to the Livy project to date have been made by salaried
>>>> engineers from Cloudera, Microsoft and Hortonworks. One of the
>>>> individuals on the initial committer list has since left Microsoft and
>>>> is currently unaffiliated. The remaining contributors are from Cloudera
>>>> and Hortonworks. Since there are at least two major organizations
>>>> involved, the risk of reliance on a single group of salaried developers
>>>> is mitigated. The Livy user base is diverse, with users from across the
>>>> globe, including users from academic settings. We aim to further
>>>> diversify the Livy user and contributor base.
>>>>
>>>> == Relationships with other Apache projects ==
>>>>
>>>> Livy is closely tied to the Apache Spark project and currently addresses
>>>> the scenarios for a REST based batch and interactive gateway for Spark
>>>> jobs on YARN. Given the growing number of integrations with Livy,
>>>> keeping it outside of Apache Spark aligns with the desire of the Apache
>>>> Spark community to reduce the number of external dependencies in the
>>>> Spark project.
>>>> Specifically, the Apache Spark community has previously expressed a
>>>> desire to keep job servers independent from the project.<<FootNote(See,
>>>> for example, discussion of the Ooyala Spark Job Server in SPARK-818)>>
>>>> Furthermore, while Livy's common usage is closely tied to Spark
>>>> deployments right now, its core building blocks can be reused elsewhere.
>>>> Livy’s Remote REPL could be used as a library for interactive scenarios
>>>> in non-Spark projects. In the future, integrations with cluster managers
>>>> like Apache Mesos and others could also be added.
>>>>
>>>> The features provided by Livy have already been integrated with existing
>>>> projects like Jupyter and Apache Zeppelin for their interactive Spark
>>>> use cases. This validates the need for a project like Livy and provides
>>>> an active downstream user base that the Livy community can interact with
>>>> to seed future interest in the project.
>>>>
>>>> Livy serves a similar purpose to Apache Toree (incubating) but differs
>>>> in making session management, security and impersonation a focal design
>>>> point.
>>>>
>>>> == An Excessive Fascination with the Apache Brand ==
>>>>
>>>> The primary motivation for submitting Livy to the ASF is to grow a
>>>> diverse and strong community. We wish to encourage diverse
>>>> organisations, including ISVs, to adopt Livy and contribute to Livy
>>>> without any concerns about ownership or licensing.
>>>>
>>>> = Documentation =
>>>>
>>>> Documentation can be found on the Livy website http://livy.io/
>>>>
>>>> The Livy web site is version controlled on the ‘gh-pages’ branch of the
>>>> above repository.
>>>> Additional documentation is provided on the github wiki:
>>>> https://github.com/cloudera/livy/wiki
>>>> APIs are documented within the source code as JavaDoc style
>>>> documentation comments.
>>>>
>>>> = Initial Source =
>>>>
>>>> The initial source code for Livy is hosted at
>>>> https://github.com/cloudera/livy
>>>>
>>>> = Source and Intellectual Property submission plan =
>>>>
>>>> The Livy codebase and web site are currently hosted on GitHub and will
>>>> be transitioned to the ASF repositories during incubation. Livy is
>>>> already licensed under the Apache 2.0 license. Cloudera has collected
>>>> ICLAs and CCLAs from all committers. There are, however, some recent
>>>> contributions from authors who have not signed the CCLA and ICLA. If
>>>> necessary for a successful SGA, we’ll seek the necessary documentation
>>>> or replace the contributions.
>>>>
>>>> The “Livy” name is not a registered trademark. We will need to do a
>>>> trademark search and make sure it is available for the Apache Foundation
>>>> prior to graduation.
>>>>
>>>> Cloudera currently owns the domain name: http://livy.io/. Once all the
>>>> documentation has moved over to ASF infrastructure, the main landing
>>>> page will become livy.incubator.apache.org and the old domain will just
>>>> act as a redirect.
>>>>
>>>> = External Dependencies =
>>>>
>>>> The list below covers the non-Apache dependencies of the project and
>>>> their licenses.
>>>>
>>>> * Jetty: Apache 2.0
>>>> * Dropwizard Metrics: Apache 2.0
>>>> * FasterXML Jackson: Apache 2.0
>>>> * Netty: Apache 2.0
>>>> * Scala: BSD
>>>> * Py4J: BSD
>>>> * Scalatra: BSD
>>>>
>>>> Build/test-only dependencies:
>>>>
>>>> * Mockito: MIT
>>>> * JUnit: Eclipse
>>>>
>>>> = Required Resources =
>>>>
>>>> == Mailing Lists ==
>>>>
>>>> * priv...@livy.incubator.apache.org (PPMC)
>>>> * d...@livy.incubator.apache.org (dev mailing list)
>>>> * u...@livy.incubator.apache.org (User questions)
>>>> * comm...@livy.incubator.apache.org (subscribers shouldn’t be able to
>>>>   post)
>>>> * iss...@livy.incubator.apache.org (subscribers shouldn’t be able to
>>>>   post)
>>>>
>>>> == Git Repository ==
>>>>
>>>> git://git.apache.org/incubator-livy
>>>>
>>>> == Issue Tracking ==
>>>>
>>>> We would like to import our current JIRA project into the ASF JIRA, such
>>>> that our historical commit messages and code comments continue to
>>>> reference the appropriate bug numbers.
>>>>
>>>> = Initial Committers =
>>>>
>>>> * Marcelo Vanzin (van...@cloudera.com)
>>>> * Alex Man (alex@alexman.space)
>>>> * Jeff Zhang (zjf...@gmail.com)
>>>> * Saisai Shao (ss...@hortonworks.com)
>>>> * Kostas Sakellis (kos...@cloudera.com)
>>>>
>>>> = Affiliations =
>>>>
>>>> The initial set of committers includes people employed by Cloudera and
>>>> Hortonworks as well as one currently independent contributor.
>>>>
>>>> = Additional Interested Contributors =
>>>>
>>>> Those interested in getting involved with the project as we enter
>>>> incubation are encouraged to list themselves here.
>>>>
>>>> * Ismaël Mejía (ieme...@apache.org)
>>>>
>>>> = Sponsors =
>>>>
>>>> == Champion ==
>>>>
>>>> Sean Busbey (bus...@apache.org)
>>>>
>>>> == Nominated Mentors ==
>>>>
>>>> * Bikas Saha (bi...@apache.org)
>>>> * Brock Noland (br...@phdata.io)
>>>> * Luciano Resende (lrese...@apache.org)
>>>>
>>>> == Sponsoring Entity ==
>>>>
>>>> We ask that the Incubator PMC sponsor this proposal.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>>>> For additional commands, e-mail: general-h...@incubator.apache.org