Re: [RESULT] [VOTE] Accept Coral into the Apache Incubator

Craig Russell Sun, 04 Feb 2018 21:27:38 -0800

Please add the proposal to the official incubator proposal wiki list

https://wiki.apache.org/incubator/ProjectProposals


Craig

> On Feb 4, 2018, at 1:10 PM, Byung-Gon Chun <bgc...@gmail.com> wrote:
> 
> Hi,
> 
> 72 hours have passed and the vote for accepting Coral into the Apache
> Incubator has passed with:
> 
> 9 binding "+1" votes, 1 non-binding "+1" vote, and no "-1" votes.
> 
> Binding votes:
> Kevin A. McGrail
> Davor Bonaci
> Dave Fisher
> Hyunsik Choi
> Leif Hedstrom
> Jean-Baptiste Onofré
> Romain Manni-Bucau
> Mark Struberg
> Byung-Gon Chun
> 
> Non-binding votes:
> Clebert Suconic
> 
> Thanks to everyone who voted.
> 
> On Thu, Feb 1, 2018 at 11:07 PM, Byung-Gon Chun <bgc...@gmail.com> wrote:
> 
>> Hi all,
>> 
>> I would like to start a VOTE to propose the Coral project as a podling
>> into the Apache Incubator.
>> 
>> The ASF voting rules are described at
>> https://www.apache.org/foundation/voting.html
>> 
>> A vote for accepting a new Apache Incubator podling is a majority vote for
>> which only Incubator PMC member votes are binding.
>> 
>> This vote will run for at least 72 hours. Please VOTE as follows.
>> [] +1 Accept Coral into the Apache Incubator
>> [] +0 Abstain
>> [] -1 Do not accept Coral into the Apache Incubator because ...
>> 
>> The proposal is listed below, but you can also access it on the wiki:
>> https://wiki.apache.org/incubator/CoralProposal
>> 
>> = CoralProposal =
>> 
>> == Abstract ==
>> Coral is a data processing system for flexible employment with different 
>> execution scenarios for various deployment characteristics on clusters.
>> 
>> == Proposal ==
>> Today, there is a wide variety of data processing systems with different 
>> designs for better performance and datacenter efficiency. They include 
>> processing data on specific resource environments and running jobs with 
>> specific attributes. Although each system successfully solves the problems 
>> it targets, most systems are designed in such a way that runtime behaviors are 
>> built tightly inside the system core to hide the complexity of distributed 
>> computing. This makes it hard for a single system to support different 
>> deployment characteristics with different runtime behaviors without 
>> substantial effort.
>> 
>> Coral is a data processing system that aims to flexibly control the runtime 
>> behaviors of a job to adapt to varying deployment characteristics. Moreover, 
>> it provides a means of extending the system’s capabilities and incorporating 
>> the extensions to the flexible job execution.
>> 
>> In order to be able to easily modify runtime behaviors to adapt to varying 
>> deployment characteristics, Coral exposes runtime behaviors to be flexibly 
>> configured and modified at both compile-time and runtime through a set of 
>> high-level graph pass interfaces.
>> 
>> We hope to contribute to the big data processing community by enabling more 
>> flexibility and extensibility in job executions. Furthermore, we can all 
>> benefit more when we work together as a community to mature 
>> the system with more use cases and understanding of diverse deployment 
>> characteristics. The Apache Software Foundation is the perfect place to 
>> achieve these aspirations.
>> 
>> == Background ==
>> Many data processing systems have distinctive runtime behaviors optimized 
>> and configured for specific deployment characteristics like different 
>> resource environments and for handling special job attributes.
>> 
>> For example, much research has been conducted to overcome the challenge of 
>> running data processing jobs on cheap, unreliable transient resources. 
>> Likewise, techniques for disaggregating different types of resources, like 
>> memory, CPU and GPU, are being actively developed to use datacenter 
>> resources more efficiently. Many researchers are also working to run data 
>> processing jobs in even more diverse environments, such as across distant 
>> datacenters. Similarly, for special job attributes, many works take 
>> different approaches, such as runtime optimization, to solve problems like 
>> data skew, and to optimize systems for data processing jobs with small-scale 
>> input data.
>> 
>> Although each of the systems performs well with the jobs and in the 
>> environments they target, they perform poorly with unconsidered cases, and 
>> do not consider supporting multiple deployment characteristics on a single 
>> system in their designs.
>> 
>> For an application writer to optimize an application to perform well on a 
>> certain system engraved with its underlying behaviors, it requires a deep 
>> understanding of the system itself, which is an overhead that often requires 
>> a lot of time and effort. Moreover, for a developer to modify such system 
>> behaviors, it requires modifications of the system core, which requires an 
>> even deeper understanding of the system itself.
>> 
>> With this background, Coral is designed to represent all of its jobs as an 
>> Intermediate Representation (IR) DAG. In the Coral compiler, user 
>> applications from various programming models (e.g., Apache Beam) are 
>> submitted, transformed to an IR DAG, and optimized/customized for the 
>> deployment characteristics. In the IR DAG optimization phase, the DAG is 
>> modified through a series of compiler “passes” which reshape or annotate the 
>> DAG with an expression of the underlying runtime behaviors. The IR DAG is 
>> then submitted as an execution plan for the Coral runtime. The runtime 
>> includes the unmodified parts of data processing in the backbone which is 
>> transparently integrated with configurable components exposed for further 
>> extension.
>> 
>> == Rationale ==
>> Coral’s vision lies in providing means for flexibly supporting a wide 
>> variety of job execution scenarios for users while facilitating system 
>> developers to extend the execution framework with various functionalities at 
>> the same time. The capabilities of the system can be extended as it grows to 
>> meet a wider variety of execution scenarios. We require input from users and 
>> developers from diverse domains in order to make it a more thriving and 
>> useful project. The Apache Software Foundation provides the best tools and 
>> community to support this vision.
>> 
>> == Initial Goals ==
>> Initial goals will be to move the existing codebase to Apache and integrate 
>> with the Apache development process. We further plan to develop our system 
>> to meet the needs of more execution scenarios for a wider variety of 
>> deployment characteristics.
>> 
>> == Current Status ==
>> The Coral codebase is currently hosted in a repository on GitHub. The 
>> current version has been developed by system developers at Seoul National 
>> University, Viva Republica, Samsung, and LG.
>> 
>> == Meritocracy ==
>> We plan to strongly support meritocracy. We will discuss the requirements in 
>> an open forum, and those that continuously contribute to Coral with the 
>> passion to strengthen the system will be invited as committers. Contributors 
>> that enrich Coral by providing various use cases, various implementations of 
>> the configurable components including ideas for optimization techniques will 
>> be especially welcome. Committers with a deep understanding of the system’s 
>> technical aspects as a whole and its philosophy will definitely be voted into 
>> the PMC. We will monitor community participation so that privileges can be 
>> extended to those that contribute.
>> 
>> == Community ==
>> We hope to expand our contribution community by becoming an Apache incubator 
>> project. The contributions will come from both users and system developers 
>> interested in flexibility and extensibility of job executions that Coral can 
>> support. We expect users to mainly contribute to diversify the use cases and 
>> deployment characteristics, and developers to contribute to implement them.
>> 
>> == Alignment ==
>> Apache Spark is one of many popular data processing frameworks. The system 
>> is designed towards optimizing jobs using RDDs in memory and many other 
>> optimizations built tightly within the framework. In contrast to Spark, 
>> Coral aims to provide more flexibility for job execution in an easy manner.
>> 
>> Apache Tez enables developers to build complex task DAGs with control over 
>> the control plane of job execution. In Coral, a high-level programming layer 
>> (e.g., Apache Beam) is automatically converted to a basic IR DAG and can be 
>> converted to any IR DAG through a series of easy, user-writable passes that 
>> can both reshape and modify the annotation (of execution properties) of the 
>> DAG. Moreover, Coral leaves more parts of the job execution configurable, 
>> such as the scheduler and the data plane. As opposed to providing a set of 
>> properties for solid optimization, Coral’s configurable parts can be easily 
>> extended and explored by implementing the pre-defined interfaces. For 
>> example, an arbitrary intermediate data store can be added.
>> 
>> Coral currently supports Apache Beam programs and we are working on 
>> supporting Apache Spark programs as well. Coral also utilizes Apache REEF 
>> for container management, which allows Coral to run in Apache YARN and 
>> Apache Mesos clusters. If necessary, we plan to contribute to and 
>> collaborate with these other Apache projects for the benefit of all. We plan 
>> to extend such integrations with more Apache software. The Apache Software 
>> Foundation already hosts many major big-data systems, and we expect to help 
>> further the growth of the big-data community by having Coral within the 
>> Apache Software Foundation.
>> 
>> == Known Risks ==
>> === Orphaned Products ===
>> The risk of the Coral project being orphaned is minimal. There is already 
>> plenty of work that arduously supports different deployment characteristics, 
>> and we propose a general way to implement them with flexible and extensible 
>> configuration knobs. The domain of data processing is already of high 
>> interest, and this domain is expected to evolve continuously with various 
>> other purposes, such as resource disaggregation and using transient 
>> resources for better datacenter resource utilization.
>> 
>> === Inexperience with Open Source ===
>> The initial committers include PMC members and committers of other Apache 
>> projects. They have experience with open source projects, starting from 
>> their incubation to the top-level. They have been involved in the open 
>> source development process, and are familiar with releasing code under an 
>> open source license.
>> 
>> === Homogeneous Developers ===
>> The initial set of committers is from a limited set of organizations, but we 
>> expect to attract new contributors from diverse organizations and will thus 
>> grow organically once approved for incubation. Our prior experience with 
>> other open source projects will help various contributors to actively 
>> participate in our project.
>> 
>> === Reliance on Salaried Developers ===
>> Many developers are from Seoul National University, so this is not applicable.
>> 
>> === Relationships with Other Apache Products ===
>> Coral positions itself among multiple Apache products. It runs on Apache 
>> REEF for container management. It also utilizes many useful development 
>> tools including Apache Maven, Apache Log4J, and multiple Apache Commons 
>> components. Coral supports the Apache Beam programming model for user 
>> applications. We are currently working on supporting the Apache Spark 
>> programming APIs as well.
>> 
>> === An Excessive Fascination with the Apache Brand ===
>> We hope to make Coral a powerful system for data processing, meeting various 
>> needs for different deployment characteristics, under a wider variety of 
>> environments. We see the limitations of simply putting code on GitHub, and 
>> we believe the Apache community will help Coral grow into a positively 
>> impactful and innovative open source project. We believe Coral is a great 
>> fit for the Apache Software Foundation 
>> due to the collaboration it aims to achieve from the big data processing 
>> community.
>> 
>> == Documentation ==
>> The current documentation for Coral is at https://snuspl.github.io/coral/.
>> 
>> == Initial Source ==
>> The Coral codebase is currently hosted at https://github.com/snuspl/coral.
>> 
>> == External Dependencies ==
>> To the best of our knowledge, all Coral dependencies are distributed under 
>> Apache compatible licenses. Upon acceptance to the incubator, we would begin 
>> a thorough analysis of all transitive dependencies to verify this fact and 
>> further introduce license checking into the build and release process.
>> 
>> == Cryptography ==
>> Not applicable.
>> 
>> == Required Resources ==
>> === Mailing Lists ===
>> We will operate two mailing lists as follows:
>>   * Coral PMC discussions: priv...@coral.incubator.apache.org
>>   * Coral developers: d...@coral.incubator.apache.org
>> 
>> === Git Repositories ===
>> Upon incubation: https://github.com/apache/incubator-coral.
>> After the incubation, we would like to move the existing repo 
>> https://github.com/snuspl/coral to the Apache infrastructure.
>> 
>> === Issue Tracking ===
>> Coral currently tracks its issues using the GitHub issue tracker: 
>> https://github.com/snuspl/coral/issues. We plan to migrate to Apache JIRA.
>> 
>> == Initial Committers ==
>>  * Byung-Gon Chun
>>  * Jeongyoon Eo
>>  * Geon-Woo Kim
>>  * Joo Yeon Kim
>>  * Gyewon Lee
>>  * Jung-Gil Lee
>>  * Sanha Lee
>>  * Wooyeon Lee
>>  * Yunseong Lee
>>  * JangHo Seo
>>  * Won Wook Song
>>  * Taegeon Um
>>  * Youngseok Yang
>> 
>> == Affiliations ==
>>  * SNU (Seoul National University)
>>    * Byung-Gon Chun
>>    * Jeongyoon Eo
>>    * Geon-Woo Kim
>>    * Gyewon Lee
>>    * Sanha Lee
>>    * Wooyeon Lee
>>    * Yunseong Lee
>>    * JangHo Seo
>>    * Won Wook Song
>>    * Taegeon Um
>>    * Youngseok Yang
>> 
>>  * LG
>>    * Jung-Gil Lee
>> 
>>  * Samsung
>>    * Joo Yeon Kim
>> 
>>  * Viva Republica
>>    * Geon-Woo Kim
>> 
>> == Sponsors ==
>> === Champions ===
>> Byung-Gon Chun
>> 
>> === Mentors ===
>>  * Hyunsik Choi
>>  * Byung-Gon Chun
>>  * Jean-Baptiste Onofré
>>  * Markus Weimer
>>  * Reynold Xin
>> 
>> === Sponsoring Entity ===
>> The Apache Incubator
>> 
>> 
>> Thanks!
>> Byung-Gon Chun
>> 
> 
> 
> 
> -- 
> Byung-Gon Chun

Craig L Russell
Secretary, Apache Software Foundation
c...@apache.org http://db.apache.org/jdo


---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
