Please add the proposal to the official incubator proposal wiki list https://wiki.apache.org/incubator/ProjectProposals
Craig

> On Feb 4, 2018, at 1:10 PM, Byung-Gon Chun <bgc...@gmail.com> wrote:
>
> Hi,
>
> 72 hours have passed and the vote for accepting Coral into the Apache
> Incubator has passed with:
>
> 9 binding "+1" votes, 1 non-binding "+1" vote, and no "-1" votes.
>
> Binding votes:
> Kevin A. McGrail
> Davor Bonaci
> Dave Fisher
> Hyunsik Choi
> Leif Hedstrom
> Jean-Baptiste Onofré
> Romain Manni-Bucau
> Mark Struberg
> Byung-Gon Chun
>
> Non-binding votes:
> Clebert Suconic
>
> Thanks to everyone who voted.
>
> On Thu, Feb 1, 2018 at 11:07 PM, Byung-Gon Chun <bgc...@gmail.com> wrote:
>
>> Hi all,
>>
>> I would like to start a VOTE to propose the Coral project as a podling
>> into the Apache Incubator.
>>
>> The ASF voting rules are described at
>> https://www.apache.org/foundation/voting.html
>>
>> A vote for accepting a new Apache Incubator podling is a majority vote for
>> which only Incubator PMC member votes are binding.
>>
>> This vote will run for at least 72 hours. Please VOTE as follows.
>> [] +1 Accept Coral into the Apache Incubator
>> [] +0 Abstain
>> [] -1 Do not accept Coral into the Apache Incubator because ...
>>
>> The proposal is listed below, but you can also access it on the wiki:
>> https://wiki.apache.org/incubator/CoralProposal
>>
>> = CoralProposal =
>>
>> == Abstract ==
>> Coral is a data processing system for flexible employment with different
>> execution scenarios for various deployment characteristics on clusters.
>>
>> == Proposal ==
>> Today, there is a wide variety of data processing systems with different
>> designs for better performance and datacenter efficiency. They include
>> processing data on specific resource environments and running jobs with
>> specific attributes. Although each system successfully solves the problems
>> it targets, most systems are designed in such a way that runtime behaviors
>> are built tightly inside the system core to hide the complexity of
>> distributed computing.
>> This makes it hard for a single system to support different
>> deployment characteristics with different runtime behaviors without
>> substantial effort.
>>
>> Coral is a data processing system that aims to flexibly control the runtime
>> behaviors of a job to adapt to varying deployment characteristics. Moreover,
>> it provides a means of extending the system’s capabilities and incorporating
>> the extensions into the flexible job execution.
>>
>> In order to be able to easily modify runtime behaviors to adapt to varying
>> deployment characteristics, Coral exposes runtime behaviors to be flexibly
>> configured and modified at both compile-time and runtime through a set of
>> high-level graph pass interfaces.
>>
>> We hope to contribute to the big data processing community by enabling more
>> flexibility and extensibility in job executions. Furthermore, we can benefit
>> more when we work together as a community to mature the system with more
>> use cases and understanding of diverse deployment characteristics. The
>> Apache Software Foundation is the perfect place to achieve these
>> aspirations.
>>
>> == Background ==
>> Many data processing systems have distinctive runtime behaviors optimized
>> and configured for specific deployment characteristics like different
>> resource environments and for handling special job attributes.
>>
>> For example, much research has been conducted to overcome the challenge of
>> running data processing jobs on cheap, unreliable transient resources.
>> Likewise, techniques for disaggregating different types of resources, like
>> memory, CPU and GPU, are being actively developed to use datacenter
>> resources more efficiently. Many researchers are also working to run data
>> processing jobs in even more diverse environments, such as across distant
>> datacenters.
>> Similarly, for special job attributes, many works take
>> different approaches, such as runtime optimization, to solve problems like
>> data skew, and to optimize systems for data processing jobs with small-scale
>> input data.
>>
>> Although each of the systems performs well with the jobs and in the
>> environments they target, they perform poorly with unconsidered cases, and
>> do not consider supporting multiple deployment characteristics on a single
>> system in their designs.
>>
>> For an application writer to optimize an application to perform well on a
>> certain system engraved with its underlying behaviors, it requires a deep
>> understanding of the system itself, which is an overhead that often requires
>> a lot of time and effort. Moreover, for a developer to modify such system
>> behaviors, it requires modifications of the system core, which requires an
>> even deeper understanding of the system itself.
>>
>> With this background, Coral is designed to represent all of its jobs as an
>> Intermediate Representation (IR) DAG. In the Coral compiler, user
>> applications from various programming models (e.g., Apache Beam) are
>> submitted, transformed to an IR DAG, and optimized/customized for the
>> deployment characteristics. In the IR DAG optimization phase, the DAG is
>> modified through a series of compiler “passes” which reshape or annotate the
>> DAG with an expression of the underlying runtime behaviors. The IR DAG is
>> then submitted as an execution plan for the Coral runtime. The runtime
>> includes the unmodified parts of data processing in the backbone which is
>> transparently integrated with configurable components exposed for further
>> extension.
>>
>> == Rationale ==
>> Coral’s vision lies in providing means for flexibly supporting a wide
>> variety of job execution scenarios for users while facilitating system
>> developers to extend the execution framework with various functionalities at
>> the same time.
>> The capabilities of the system can be extended as it grows to
>> meet a wider variety of execution scenarios. We require inputs from users
>> and developers from diverse domains in order to make it a more thriving and
>> useful project. The Apache Software Foundation provides the best tools and
>> community to support this vision.
>>
>> == Initial Goals ==
>> Initial goals will be to move the existing codebase to Apache and integrate
>> with the Apache development process. We further plan to develop our system
>> to meet the needs for more execution scenarios for a wider variety of
>> deployment characteristics.
>>
>> == Current Status ==
>> The Coral codebase is currently hosted in a repository at github.com. The
>> current version has been developed by system developers at Seoul National
>> University, Viva Republica, Samsung, and LG.
>>
>> == Meritocracy ==
>> We plan to strongly support meritocracy. We will discuss the requirements in
>> an open forum, and those that continuously contribute to Coral with the
>> passion to strengthen the system will be invited as committers. Contributors
>> that enrich Coral by providing various use cases, various implementations of
>> the configurable components including ideas for optimization techniques will
>> be especially welcome. Committers with a deep understanding of the system’s
>> technical aspects as a whole and its philosophy will definitely be voted
>> into the PMC. We will monitor community participation so that privileges can
>> be extended to those that contribute.
>>
>> == Community ==
>> We hope to expand our contribution community by becoming an Apache incubator
>> project. The contributions will come from both users and system developers
>> interested in flexibility and extensibility of job executions that Coral can
>> support. We expect users to mainly contribute to diversify the use cases and
>> deployment characteristics, and developers to contribute to implement them.
>>
>> == Alignment ==
>> Apache Spark is one of many popular data processing frameworks. The system
>> is designed towards optimizing jobs using RDDs in memory and many other
>> optimizations built tightly within the framework. In contrast to Spark,
>> Coral aims to provide more flexibility for job execution in an easy manner.
>>
>> Apache Tez enables developers to build complex task DAGs with control over
>> the control plane of job execution. In Coral, a high-level programming layer
>> (e.g., Apache Beam) is automatically converted to a basic IR DAG and can be
>> converted to any IR DAG through a series of easy, user-writable passes that
>> can both reshape and modify the annotation (of execution properties) of the
>> DAG. Moreover, Coral leaves more parts of the job execution configurable,
>> such as the scheduler and the data plane. As opposed to providing a set of
>> properties for solid optimization, Coral’s configurable parts can be easily
>> extended and explored by implementing the pre-defined interfaces. For
>> example, an arbitrary intermediate data store can be added.
>>
>> Coral currently supports Apache Beam programs and we are working on
>> supporting Apache Spark programs as well. Coral also utilizes Apache REEF
>> for container management, which allows Coral to run in Apache YARN and
>> Apache Mesos clusters. If necessary, we plan to contribute to and
>> collaborate with these other Apache projects for the benefit of all. We plan
>> to extend such integrations with more Apache software. The Apache Software
>> Foundation already hosts many major big-data systems, and we expect to help
>> further growth of the big-data community by having Coral within the Apache
>> Software Foundation.
>>
>> == Known Risks ==
>> === Orphaned Products ===
>> The risk of the Coral project being orphaned is minimal.
>> There is already
>> plenty of work that arduously supports different deployment characteristics,
>> and we propose a general way to implement them with flexible and extensible
>> configuration knobs. The domain of data processing is already of high
>> interest, and this domain is expected to evolve continuously with various
>> other purposes, such as resource disaggregation and using transient
>> resources for better datacenter resource utilization.
>>
>> === Inexperience with Open Source ===
>> The initial committers include PMC members and committers of other Apache
>> projects. They have experience with open source projects, starting from
>> their incubation to the top-level. They have been involved in the open
>> source development process, and are familiar with releasing code under an
>> open source license.
>>
>> === Homogeneous Developers ===
>> The initial set of committers is from a limited set of organizations, but we
>> expect to attract new contributors from diverse organizations and will thus
>> grow organically once approved for incubation. Our prior experience with
>> other open source projects will help various contributors to actively
>> participate in our project.
>>
>> === Reliance on Salaried Developers ===
>> Many developers are from Seoul National University. This is not applicable.
>>
>> === Relationships with Other Apache Products ===
>> Coral positions itself among multiple Apache products. It runs on Apache
>> REEF for container management. It also utilizes many useful development
>> tools including Apache Maven, Apache Log4J, and multiple Apache Commons
>> components. Coral supports the Apache Beam programming model for user
>> applications. We are currently working on supporting the Apache Spark
>> programming APIs as well.
>>
>> === An Excessive Fascination with the Apache Brand ===
>> We hope to make Coral a powerful system for data processing, meeting various
>> needs for different deployment characteristics, under a wider variety of
>> environments. We see the limitations of simply putting code on GitHub, and
>> we believe the Apache community will help the growth of Coral for the
>> project to become positively impactful and innovative open source
>> software. We believe Coral is a great fit for the Apache Software Foundation
>> due to the collaboration it aims to achieve from the big data processing
>> community.
>>
>> == Documentation ==
>> The current documentation for Coral is at https://snuspl.github.io/coral/.
>>
>> == Initial Source ==
>> The Coral codebase is currently hosted at https://github.com/snuspl/coral.
>>
>> == External Dependencies ==
>> To the best of our knowledge, all Coral dependencies are distributed under
>> Apache compatible licenses. Upon acceptance to the incubator, we would begin
>> a thorough analysis of all transitive dependencies to verify this fact and
>> further introduce license checking into the build and release process.
>>
>> == Cryptography ==
>> Not applicable.
>>
>> == Required Resources ==
>> === Mailing Lists ===
>> We will operate two mailing lists as follows:
>> * Coral PMC discussions: priv...@coral.incubator.apache.org
>> * Coral developers: d...@coral.incubator.apache.org
>>
>> === Git Repositories ===
>> Upon incubation: https://github.com/apache/incubator-coral.
>> After the incubation, we would like to move the existing repo
>> https://github.com/snuspl/coral to the Apache infrastructure.
>>
>> === Issue Tracking ===
>> Coral currently tracks its issues using the GitHub issue tracker:
>> https://github.com/snuspl/coral/issues. We plan to migrate to Apache JIRA.
>>
>> == Initial Committers ==
>> * Byung-Gon Chun
>> * Jeongyoon Eo
>> * Geon-Woo Kim
>> * Joo Yeon Kim
>> * Gyewon Lee
>> * Jung-Gil Lee
>> * Sanha Lee
>> * Wooyeon Lee
>> * Yunseong Lee
>> * JangHo Seo
>> * Won Wook Song
>> * Taegeon Um
>> * Youngseok Yang
>>
>> == Affiliations ==
>> * SNU (Seoul National University)
>>   * Byung-Gon Chun
>>   * Jeongyoon Eo
>>   * Geon-Woo Kim
>>   * Gyewon Lee
>>   * Sanha Lee
>>   * Wooyeon Lee
>>   * Yunseong Lee
>>   * JangHo Seo
>>   * Won Wook Song
>>   * Taegeon Um
>>   * Youngseok Yang
>>
>> * LG
>>   * Jung-Gil Lee
>>
>> * Samsung
>>   * Joo Yeon Kim
>>
>> * Viva Republica
>>   * Geon-Woo Kim
>>
>> == Sponsors ==
>> === Champions ===
>> Byung-Gon Chun
>>
>> === Mentors ===
>> * Hyunsik Choi
>> * Byung-Gon Chun
>> * Jean-Baptiste Onofré
>> * Markus Weimer
>> * Reynold Xin
>>
>> === Sponsoring Entity ===
>> The Apache Incubator
>>
>>
>> Thanks!
>> Byung-Gon Chun
>>
>
> --
> Byung-Gon Chun

Craig L Russell
Secretary, Apache Software Foundation
c...@apache.org http://db.apache.org/jdo