Done. Thanks! -Gon
On Mon, Feb 5, 2018 at 2:26 PM, Craig Russell <apache....@gmail.com> wrote: > Please add the proposal to the official incubator proposal wiki list > > https://wiki.apache.org/incubator/ProjectProposals > > Craig > > > On Feb 4, 2018, at 1:10 PM, Byung-Gon Chun <bgc...@gmail.com> wrote: > > > > Hi, > > > > 72 hours has passed and the vote for accepting Coral into the Apache > > Incubator has passed with: > > > > 9 binding "+1" votes, 1 non-binding "+1" votes, and no "-1” votes. > > > > Binding votes: > > Kevin A. McGrail > > Davor Bonaci > > Dave Fisher > > Hyunsik Choi > > Leif Hedstrom > > Jean-Baptiste Onofré > > Romain Manni-Bucau > > Mark Struberg > > Byung-Gon Chun > > > > Non-binding votes: > > Clebert Suconic > > > > Thanks to everyone who voted. > > > > On Thu, Feb 1, 2018 at 11:07 PM, Byung-Gon Chun <bgc...@gmail.com> > wrote: > > > >> Hi all, > >> > >> I would like to start a VOTE to propose the Coral project as a podling > >> into the Apache Incubator. > >> > >> The ASF voting rules are described at https://www.apache.org/ > foundation/ > >> voting.html > >> > >> A vote for accepting a new Apache Incubator podling is a majority vote > for > >> which only Incubator PMC member votes are binding. > >> > >> This vote will run for at least 72 hours. Please VOTE as follows. > >> [] +1 Accept Coral into the Apache Incubator > >> [] +0 Abstain > >> [] -1 Do not accept Coral into the Apache Incubator because ... > >> > >> The proposal is listed below, but you can also access it on the wiki: > >> https://wiki.apache.org/incubator/CoralProposal > >> > >> = CoralProposal = > >> > >> == Abstract == > >> Coral is a data processing system for flexible employment with > different execution scenarios for various deployment characteristics on > clusters. > >> > >> == Proposal == > >> Today, there is a wide variety of data processing systems with > different designs for better performance and datacenter efficiency. They > include processing data on specific resource environments and running jobs > with specific attributes. Although each system successfully solves the > problems it targets, most systems are designed in the way that runtime > behaviors are built tightly inside the system core to hide the complexity > of distributed computing. This makes it hard for a single system to support > different deployment characteristics with different runtime behaviors > without substantial effort. > >> > >> Coral is a data processing system that aims to flexibly control the > runtime behaviors of a job to adapt to varying deployment characteristics. > Moreover, it provides a means of extending the system’s capabilities and > incorporating the extensions to the flexible job execution. > >> > >> In order to be able to easily modify runtime behaviors to adapt to > varying deployment characteristics, Coral exposes runtime behaviors to be > flexibly configured and modified at both compile-time and runtime through a > set of high-level graph pass interfaces. > >> > >> We hope to contribute to the big data processing community by enabling > more flexibility and extensibility in job executions. Furthermore, we can > benefit more together as a community when we work together as a community > to mature the system with more use cases and understanding of diverse > deployment characteristics. The Apache Software Foundation is the perfect > place to achieve these aspirations. > >> > >> == Background == > >> Many data processing systems have distinctive runtime behaviors > optimized and configured for specific deployment characteristics like > different resource environments and for handling special job attributes. > >> > >> For example, much research have been conducted to overcome the > challenge of running data processing jobs on cheap, unreliable transient > resources. Likewise, techniques for disaggregating different types of > resources, like memory, CPU and GPU, are being actively developed to use > datacenter resources more efficiently. Many researchers are also working to > run data processing jobs in even more diverse environments, such as across > distant datacenters. Similarly, for special job attributes, many works take > different approaches, such as runtime optimization, to solve problems like > data skew, and to optimize systems for data processing jobs with > small-scale input data. > >> > >> Although each of the systems performs well with the jobs and in the > environments they target, they perform poorly with unconsidered cases, and > do not consider supporting multiple deployment characteristics on a single > system in their designs. > >> > >> For an application writer to optimize an application to perform well on > a certain system engraved with its underlying behaviors, it requires a deep > understanding of the system itself, which is an overhead that often > requires a lot of time and effort. Moreover, for a developer to modify such > system behaviors, it requires modifications of the system core, which > requires an even deeper understanding of the system itself. > >> > >> With this background, Coral is designed to represent all of its jobs as > an Intermediate Representation (IR) DAG. In the Coral compiler, user > applications from various programming models (ex. Apache Beam) are > submitted, transformed to an IR DAG, and optimized/customized for the > deployment characteristics. In the IR DAG optimization phase, the DAG is > modified through a series of compiler “passes” which reshape or annotate > the DAG with an expression of the underlying runtime behaviors. The IR DAG > is then submitted as an execution plan for the Coral runtime. The runtime > includes the unmodified parts of data processing in the backbone which is > transparently integrated with configurable components exposed for further > extension. > >> > >> == Rationale == > >> Coral’s vision lies in providing means for flexibly supporting a wide > variety of job execution scenarios for users while facilitating system > developers to extend the execution framework with various functionalities > at the same time. The capabilities of the system can be extended as it > grows to meet a more variety of execution scenarios. We require inputs from > users and developers from diverse domains in order to make it a more > thriving and useful project. The Apache Software Foundation provides the > best tools and community to support this vision. > >> > >> == Initial Goals == > >> Initial goals will be to move the existing codebase to Apache and > integrate with the Apache development process. We further plan to develop > our system to meet the needs for more execution scenarios for a more > variety of deployment characteristics. > >> > >> == Current Status == > >> Coral codebase is currently hosted in a repository at github.com. The > current version has been developed by system developers at Seoul National > University, Viva Republica, Samsung, and LG. > >> > >> == Meritocracy == > >> We plan to strongly support meritocracy. We will discuss the > requirements in an open forum, and those that continuously contribute to > Coral with the passion to strengthen the system will be invited as > committers. Contributors that enrich Coral by providing various use cases, > various implementations of the configurable components including ideas for > optimization techniques will be especially welcome. Committers with a deep > understanding of the system’s technical aspects as a whole and its > philosophy will definitely be voted as the PMC. We will monitor community > participation so that privileges can be extended to those that contribute. > >> > >> == Community == > >> We hope to expand our contribution community by becoming an Apache > incubator project. The contributions will come from both users and system > developers interested in flexibility and extensibility of job executions > that Coral can support. We expect users to mainly contribute to diversify > the use cases and deployment characteristics, and developers to contribute > to implement them. > >> > >> == Alignment == > >> Apache Spark is one of many popular data processing frameworks. The > system is designed towards optimizing jobs using RDDs in memory and many > other optimizations built tightly within the framework. In contrast to > Spark, Coral aims to provide more flexibility for job execution in an easy > manner. > >> > >> Apache Tez enables developers to build complex task DAGs with control > over the control plane of job execution. In Coral, a high-level programming > layer (ex. Apache Beam) is automatically converted to a basic IR DAG and > can be converted to any IR DAG through a series of easy user writable > passes, that can both reshape and modify the annotation (of execution > properties) of the DAG. Moreover, Coral leaves more parts of the job > execution configurable, such as the scheduler and the data plane. As > opposed to providing a set of properties for solid optimization, Coral’s > configurable parts can be easily extended and explored by implementing the > pre-defined interfaces. For example, an arbitrary intermediate data store > can be added. > >> > >> Coral currently supports Apache Beam programs and we are working on > supporting Apache Spark programs as well. Coral also utilizes Apache REEF > for container management, which allows Coral to run in Apache YARN and > Apache Mesos clusters. If necessary, we plan to contribute to and > collaborate with these other Apache projects for the benefit of all. We > plan to extend such integrations with more Apache softwares. Apache > software foundation already hosts many major big-data systems, and we > expect to help further growth of the big-data community by having Coral > within the Apache foundation. > >> > >> == Known Risks == > >> === Orphaned Products === > >> The risk of the Coral project being orphaned is minimal. There is > already plenty of work that arduously support different deployment > characteristics, and we propose a general way to implement them with > flexible and extensible configuration knobs. The domain of data processing > is already of high interest, and this domain is expected to evolve > continuously with various other purposes, such as resource disaggregation > and using transient resources for better datacenter resource utilization. > >> > >> === Inexperience with Open Source === > >> The initial committers include PMC members and committers of other > Apache projects. They have experience with open source projects, starting > from their incubation to the top-level. They have been involved in the open > source development process, and are familiar with releasing code under an > open source license. > >> > >> === Homogeneous Developers === > >> The initial set of committers is from a limited set of organizations, > but we expect to attract new contributors from diverse organizations and > will thus grow organically once approved for incubation. Our prior > experience with other open source projects will help various contributors > to actively participate in our project. > >> > >> === Reliance on Salaried Developers === > >> Many developers are from Seoul National University. This is not > applicable. > >> > >> === Relationships with Other Apache Products === > >> Coral positions itself among multiple Apache products. It runs on > Apache REEF for container management. It also utilizes many useful > development tools including Apache Maven, Apache Log4J, and multiple Apache > Commons components. Coral supports the Apache Beam programming model for > user applications. We are currently working on supporting the Apache Spark > programming APIs as well. > >> > >> === An Excessive Fascination with the Apache Brand === > >> We hope to make Coral a powerful system for data processing, meeting > various needs for different deployment characteristics, under a more > variety of environments. We see the limitations of simply putting code on > GitHub, and we believe the Apache community will help the growth of Coral > for the project to become a positively impactful and innovative open source > software. We believe Coral is a great fit for the Apache Software > Foundation due to the collaboration it aims to achieve from the big data > processing community. > >> > >> == Documentation == > >> The current documentation for Coral is at https://snuspl.github.io/ > coral/. > >> > >> == Initial Source == > >> The Coral codebase is currently hosted at https://github.com/snuspl/ > coral. > >> > >> == External Dependencies == > >> To the best of our knowledge, all Coral dependencies are distributed > under Apache compatible licenses. Upon acceptance to the incubator, we > would begin a thorough analysis of all transitive dependencies to verify > this fact and further introduce license checking into the build and release > process. > >> > >> == Cryptography == > >> Not applicable. > >> > >> == Required Resources == > >> === Mailing Lists === > >> We will operate two mailing lists as follows: > >> * Coral PMC discussions: priv...@coral.incubator.apache.org > >> * Coral developers: d...@coral.incubator.apache.org > >> > >> === Git Repositories === > >> Upon incubation: https://github.com/apache/incubator-coral. > >> After the incubation, we would like to move the existing repo > https://github.com/snuspl/coral to the Apache infrastructure > >> > >> === Issue Tracking === > >> Coral currently tracks its issues using the Github issue tracker: > https://github.com/snuspl/coral/issues. We plan to migrate to Apache JIRA. > >> > >> == Initial Committers == > >> * Byung-Gon Chun > >> * Jeongyoon Eo > >> * Geon-Woo Kim > >> * Joo Yeon Kim > >> * Gyewon Lee > >> * Jung-Gil Lee > >> * Sanha Lee > >> * Wooyeon Lee > >> * Yunseong Lee > >> * JangHo Seo > >> * Won Wook Song > >> * Taegeon Um > >> * Youngseok Yang > >> > >> == Affiliations == > >> * SNU (Seoul National University) > >> * Byung-Gon Chun > >> * Jeongyoon Eo > >> * Geon-Woo Kim > >> * Gyewon Lee > >> * Sanha Lee > >> * Wooyeon Lee > >> * Yunseong Lee > >> * JangHo Seo > >> * Won Wook Song > >> * Taegeon Um > >> * Youngseok Yang > >> > >> * LG > >> * Jung-Gil Lee > >> > >> * Samsung > >> * Joo Yeon Kim > >> > >> * Viva Republica > >> * Geon-Woo Kim > >> > >> == Sponsors == > >> === Champions === > >> Byung-Gon Chun > >> > >> === Mentors === > >> * Hyunsik Choi > >> * Byung-Gon Chun > >> * Jean-Baptiste Onofré > >> * Markus Weimer > >> * Reynold Xin > >> > >> === Sponsoring Entity === > >> The Apache Incubator > >> > >> > >> Thanks! > >> Byung-Gon Chun > >> > > > > > > > > -- > > Byung-Gon Chun > > Craig L Russell > Secretary, Apache Software Foundation > c...@apache.org http://db.apache.org/jdo > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org > For additional commands, e-mail: general-h...@incubator.apache.org > > -- Byung-Gon Chun