[RESULT] [VOTE] Accept Coral into the Apache Incubator

Byung-Gon Chun Sun, 04 Feb 2018 13:12:06 -0800

Hi,

72 hours have passed, and the vote to accept Coral into the Apache
Incubator has passed with:


9 binding "+1" votes, 1 non-binding "+1" vote, and no "-1" votes.

Binding votes:
Kevin A. McGrail
Davor Bonaci
Dave Fisher
Hyunsik Choi
Leif Hedstrom
Jean-Baptiste Onofré
Romain Manni-Bucau
Mark Struberg
Byung-Gon Chun

Non-binding votes:
Clebert Suconic

Thanks to everyone who voted.

On Thu, Feb 1, 2018 at 11:07 PM, Byung-Gon Chun <bgc...@gmail.com> wrote:

> Hi all,
>
> I would like to start a VOTE to propose the Coral project as a podling
> into the Apache Incubator.
>
> The ASF voting rules are described at
> https://www.apache.org/foundation/voting.html
>
> A vote for accepting a new Apache Incubator podling is a majority vote for
> which only Incubator PMC member votes are binding.
>
> This vote will run for at least 72 hours. Please VOTE as follows.
> [] +1 Accept Coral into the Apache Incubator
> [] +0 Abstain
> [] -1 Do not accept Coral into the Apache Incubator because ...
>
> The proposal is listed below, but you can also access it on the wiki:
> https://wiki.apache.org/incubator/CoralProposal
>
> = CoralProposal =
>
> == Abstract ==
> Coral is a data processing system that can be flexibly employed across
> different execution scenarios to suit various deployment characteristics on
> clusters.
>
> == Proposal ==
> Today, there is a wide variety of data processing systems with different 
> designs for better performance and datacenter efficiency. They include 
> processing data on specific resource environments and running jobs with 
> specific attributes. Although each system successfully solves the problems it
> targets, most systems are designed in such a way that runtime behaviors are
> built tightly into the system core to hide the complexity of distributed
> computing. This makes it hard for a single system to support different 
> deployment characteristics with different runtime behaviors without 
> substantial effort.
>
> Coral is a data processing system that aims to flexibly control the runtime 
> behaviors of a job to adapt to varying deployment characteristics. Moreover, 
> it provides a means of extending the system’s capabilities and incorporating 
> the extensions to the flexible job execution.
>
> In order to be able to easily modify runtime behaviors to adapt to varying 
> deployment characteristics, Coral exposes runtime behaviors to be flexibly 
> configured and modified at both compile-time and runtime through a set of 
> high-level graph pass interfaces.
>
> We hope to contribute to the big data processing community by enabling more 
> flexibility and extensibility in job executions. Furthermore, we can all
> benefit more when we work together as a community to mature the system with
> more use cases and a better understanding of diverse deployment
> characteristics. The Apache Software Foundation is the perfect place to
> achieve these aspirations.
>
> == Background ==
> Many data processing systems have distinctive runtime behaviors optimized and 
> configured for specific deployment characteristics like different resource 
> environments and for handling special job attributes.
>
> For example, much research has been conducted to overcome the challenge of
> running data processing jobs on cheap, unreliable transient resources. 
> Likewise, techniques for disaggregating different types of resources, like 
> memory, CPU and GPU, are being actively developed to use datacenter resources 
> more efficiently. Many researchers are also working to run data processing 
> jobs in even more diverse environments, such as across distant datacenters. 
> Similarly, for special job attributes, many works take different approaches, 
> such as runtime optimization, to solve problems like data skew, and to 
> optimize systems for data processing jobs with small-scale input data.
>
> Although each of these systems performs well with the jobs and in the
> environments it targets, it performs poorly in cases it was not designed
> for, and its design does not consider supporting multiple deployment
> characteristics on a single system.
>
> For an application writer, optimizing an application to perform well on a
> system whose underlying behaviors are built in requires a deep
> understanding of the system itself, which often takes a lot of time and
> effort. Moreover, for a developer, modifying such system behaviors requires
> modifying the system core, which demands an even deeper understanding of
> the system.
>
> With this background, Coral is designed to represent all of its jobs as an 
> Intermediate Representation (IR) DAG. In the Coral compiler, user 
> applications from various programming models (e.g., Apache Beam) are submitted,
> transformed to an IR DAG, and optimized/customized for the deployment 
> characteristics. In the IR DAG optimization phase, the DAG is modified 
> through a series of compiler “passes” which reshape or annotate the DAG with 
> an expression of the underlying runtime behaviors. The IR DAG is then 
> submitted as an execution plan for the Coral runtime. The runtime includes 
> the unmodified parts of data processing in the backbone which is 
> transparently integrated with configurable components exposed for further 
> extension.
>
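To make the pass mechanism described above concrete, here is a minimal
sketch in Python. It is an illustration only and not Coral's actual API:
the names IRDag, annotate_placement, insert_checkpoints, optimize, and
build_example, as well as the "placement" property and its values, are
all hypothetical. It shows one pass that annotates the DAG and one pass
that reshapes it, applied in sequence.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class IRDag:
    """Toy IR DAG (hypothetical): vertices with property maps, plus edges."""
    vertices: Dict[str, Dict[str, str]] = field(default_factory=dict)
    edges: List[Tuple[str, str]] = field(default_factory=list)

    def connect(self, src: str, dst: str) -> None:
        # Register both endpoints, then record the directed edge.
        self.vertices.setdefault(src, {})
        self.vertices.setdefault(dst, {})
        self.edges.append((src, dst))


# A pass takes a DAG and returns a (possibly modified) DAG.
Pass = Callable[[IRDag], IRDag]


def annotate_placement(dag: IRDag) -> IRDag:
    """Annotating pass: source vertices go on 'transient' resources,
    all other vertices on 'reserved' resources (an assumed policy)."""
    has_incoming = {dst for _, dst in dag.edges}
    for name, props in dag.vertices.items():
        props["placement"] = "reserved" if name in has_incoming else "transient"
    return dag


def insert_checkpoints(dag: IRDag) -> IRDag:
    """Reshaping pass: split every transient-to-reserved edge with a
    checkpoint vertex placed on reserved resources."""
    new_edges: List[Tuple[str, str]] = []
    for src, dst in dag.edges:
        crosses = (dag.vertices[src].get("placement") == "transient"
                   and dag.vertices[dst].get("placement") == "reserved")
        if crosses:
            ckpt = f"ckpt({src}->{dst})"
            dag.vertices[ckpt] = {"placement": "reserved"}
            new_edges += [(src, ckpt), (ckpt, dst)]
        else:
            new_edges.append((src, dst))
    dag.edges = new_edges
    return dag


def optimize(dag: IRDag, passes: List[Pass]) -> IRDag:
    """Apply the passes in order; the result is the execution plan."""
    for p in passes:
        dag = p(dag)
    return dag


def build_example() -> IRDag:
    dag = IRDag()
    dag.connect("read", "map")
    dag.connect("map", "reduce")
    return optimize(dag, [annotate_placement, insert_checkpoints])
```

Under these assumptions, a different deployment would be targeted by
swapping the list of passes given to optimize, leaving the job
definition itself unchanged.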
> == Rationale ==
> Coral’s vision lies in providing means for flexibly supporting a wide variety 
> of job execution scenarios for users while facilitating system developers to 
> extend the execution framework with various functionalities at the same time. 
> The capabilities of the system can be extended as it grows to meet a wider
> variety of execution scenarios. We need input from users and developers
> from diverse domains in order to make it a more thriving and useful project.
> The Apache Software Foundation provides the best tools and community to 
> support this vision.
>
> == Initial Goals ==
> Initial goals will be to move the existing codebase to Apache and integrate 
> with the Apache development process. We further plan to develop our system to
> meet the needs of more execution scenarios for a wider variety of deployment
> characteristics.
>
> == Current Status ==
> The Coral codebase is currently hosted in a repository at github.com. The current
> version has been developed by system developers at Seoul National University, 
> Viva Republica, Samsung, and LG.
>
> == Meritocracy ==
> We plan to strongly support meritocracy. We will discuss the requirements in 
> an open forum, and those who continuously contribute to Coral with the
> passion to strengthen the system will be invited as committers. Contributors
> who enrich Coral by providing various use cases and various implementations
> of the configurable components, including ideas for optimization techniques,
> will be especially welcome. Committers with a deep understanding of the
> system’s technical aspects as a whole and its philosophy will definitely be
> voted onto the PMC. We will monitor community participation so that
> privileges can be extended to those who contribute.
>
> == Community ==
> We hope to expand our contributor community by becoming an Apache Incubator
> project. The contributions will come from both users and system developers
> interested in the flexibility and extensibility of job executions that Coral
> can support. We expect users to contribute mainly by diversifying the use
> cases and deployment characteristics, and developers to contribute by
> implementing them.
>
> == Alignment ==
> Apache Spark is one of many popular data processing frameworks. The system is 
> designed towards optimizing jobs using RDDs in memory and many other 
> optimizations built tightly within the framework. In contrast to Spark, Coral 
> aims to provide more flexibility for job execution in an easy manner.
>
> Apache Tez enables developers to build complex task DAGs with control over 
> the control plane of job execution. In Coral, a high-level programming layer 
> (e.g., Apache Beam) is automatically converted to a basic IR DAG and can be
> converted to any IR DAG through a series of easy, user-writable passes that
> can both reshape and modify the annotation (of execution properties) of the 
> DAG. Moreover, Coral leaves more parts of the job execution configurable, 
> such as the scheduler and the data plane. As opposed to providing a set of 
> properties for solid optimization, Coral’s configurable parts can be easily 
> extended and explored by implementing the pre-defined interfaces. For 
> example, an arbitrary intermediate data store can be added.
>
> Coral currently supports Apache Beam programs and we are working on 
> supporting Apache Spark programs as well. Coral also utilizes Apache REEF for 
> container management, which allows Coral to run in Apache YARN and Apache 
> Mesos clusters. If necessary, we plan to contribute to and collaborate with 
> these other Apache projects for the benefit of all. We plan to extend such
> integrations to more Apache software. The Apache Software Foundation already
> hosts many major big-data systems, and we expect to help further growth of
> the big-data community by having Coral within the Apache Software Foundation.
>
> == Known Risks ==
> === Orphaned Products ===
> The risk of the Coral project being orphaned is minimal. There is already 
> plenty of work that arduously supports different deployment characteristics,
> and we propose a general way to implement them with flexible and extensible 
> configuration knobs. The domain of data processing is already of high 
> interest, and this domain is expected to evolve continuously with various 
> other purposes, such as resource disaggregation and using transient resources 
> for better datacenter resource utilization.
>
> === Inexperience with Open Source ===
> The initial committers include PMC members and committers of other Apache 
> projects. They have experience with open source projects, starting from their 
> incubation to the top-level. They have been involved in the open source 
> development process, and are familiar with releasing code under an open 
> source license.
>
> === Homogeneous Developers ===
> The initial set of committers is from a limited set of organizations, but we 
> expect to attract new contributors from diverse organizations and will thus 
> grow organically once approved for incubation. Our prior experience with 
> other open source projects will help various contributors to actively 
> participate in our project.
>
> === Reliance on Salaried Developers ===
> Many developers are from Seoul National University. This is not applicable.
>
> === Relationships with Other Apache Products ===
> Coral positions itself among multiple Apache products. It runs on Apache REEF 
> for container management. It also utilizes many useful development tools 
> including Apache Maven, Apache Log4J, and multiple Apache Commons components. 
> Coral supports the Apache Beam programming model for user applications. We 
> are currently working on supporting the Apache Spark programming APIs as well.
>
> === An Excessive Fascination with the Apache Brand ===
> We hope to make Coral a powerful system for data processing, meeting various 
> needs for different deployment characteristics across a wider variety of
> environments. We see the limitations of simply putting code on GitHub, and we
> believe the Apache community will help Coral grow into an impactful and
> innovative open source project. We believe
> Coral is a great fit for the Apache Software Foundation due to the 
> collaboration it aims to achieve from the big data processing community.
>
> == Documentation ==
> The current documentation for Coral is at https://snuspl.github.io/coral/.
>
> == Initial Source ==
> The Coral codebase is currently hosted at https://github.com/snuspl/coral.
>
> == External Dependencies ==
> To the best of our knowledge, all Coral dependencies are distributed under 
> Apache-compatible licenses. Upon acceptance to the incubator, we would begin
> a thorough analysis of all transitive dependencies to verify this fact and 
> further introduce license checking into the build and release process.
>
> == Cryptography ==
> Not applicable.
>
> == Required Resources ==
> === Mailing Lists ===
> We will operate two mailing lists as follows:
>    * Coral PMC discussions: priv...@coral.incubator.apache.org
>    * Coral developers: d...@coral.incubator.apache.org
>
> === Git Repositories ===
> Upon incubation: https://github.com/apache/incubator-coral.
> After the incubation, we would like to move the existing repo
> https://github.com/snuspl/coral to the Apache infrastructure.
>
> === Issue Tracking ===
> Coral currently tracks its issues using the GitHub issue tracker:
> https://github.com/snuspl/coral/issues. We plan to migrate to Apache JIRA.
>
> == Initial Committers ==
>   * Byung-Gon Chun
>   * Jeongyoon Eo
>   * Geon-Woo Kim
>   * Joo Yeon Kim
>   * Gyewon Lee
>   * Jung-Gil Lee
>   * Sanha Lee
>   * Wooyeon Lee
>   * Yunseong Lee
>   * JangHo Seo
>   * Won Wook Song
>   * Taegeon Um
>   * Youngseok Yang
>
> == Affiliations ==
>   * SNU (Seoul National University)
>     * Byung-Gon Chun
>     * Jeongyoon Eo
>     * Geon-Woo Kim
>     * Gyewon Lee
>     * Sanha Lee
>     * Wooyeon Lee
>     * Yunseong Lee
>     * JangHo Seo
>     * Won Wook Song
>     * Taegeon Um
>     * Youngseok Yang
>
>   * LG
>     * Jung-Gil Lee
>
>   * Samsung
>     * Joo Yeon Kim
>
>   * Viva Republica
>     * Geon-Woo Kim
>
> == Sponsors ==
> === Champions ===
> Byung-Gon Chun
>
> === Mentors ===
>   * Hyunsik Choi
>   * Byung-Gon Chun
>   * Jean-Baptiste Onofré
>   * Markus Weimer
>   * Reynold Xin
>
> === Sponsoring Entity ===
> The Apache Incubator
>
>
> Thanks!
> Byung-Gon Chun
>



-- 
Byung-Gon Chun
