Thank you for all the information! It looks like Surf doesn't work. If possible, we'd like to keep Onyx. Another name we came up with is Coral.
Thanks!
-Gon

On Sun, Jan 28, 2018 at 4:21 AM, Leif Hedstrom <zw...@apache.org> wrote:

> Did we rule out Onyx for sure? Just because some other project might use
> it on say github doesn’t necessarily exclude us from having an Apache Onyx?
>
> FWIW, I agree that surf is too similar in pronunciation to Apache serf. :)
>
> Cheers,
>
> -- Leif
>
> > On Jan 27, 2018, at 07:31, Dave Fisher <dave2w...@comcast.net> wrote:
> >
> > Checking “Serf Software” which sounds the same.
> >
> > (1) there is already Apache Serf
> > (2) Serf is a product from Hashicorp at https://www.serf.io/. This would
> > definitely confuse as it is apparently comparable to ZooKeeper.
> >
> > Regards,
> > Dave
> >
> > Sent from my iPhone
> >
> >> On Jan 27, 2018, at 3:12 AM, sebb <seb...@gmail.com> wrote:
> >>
> >> A brief search for 'Surf Software' shows quite a few hits.
> >> I have not looked to see if they would be likely to be confused with
> >> this project or cause problems for others.
> >>
> >> But it looks as though there might be a problem:
> >> Surfer - Golden Software
> >> surf @ sourceforge
> >> Surf Software company
> >>
> >>> On 27 January 2018 at 08:03, Byung-Gon Chun <bgc...@gmail.com> wrote:
> >>> Since we cannot use the name Onyx, we would like to change the project
> >>> name to Surf.
> >>> I hope that this name works.
> >>>
> >>> -Gon
> >>>
> >>> ---
> >>> Byung-Gon Chun
> >>>
> >>>> On Sat, Jan 27, 2018 at 4:57 AM, Byung-Gon Chun <bgc...@gmail.com> wrote:
> >>>>
> >>>>> On Sat, Jan 27, 2018 at 4:09 AM, Davor Bonaci <da...@apache.org> wrote:
> >>>>>
> >>>>> Great work -- I think this technology has a lot of promise, and I'd
> >>>>> love to see its evolution inside the Foundation.
> >>>>>
> >>>> Thanks, Davor!
> >>>>
> >>>>> Parts of it, like the Onyx Intermediate Representation [1], overlap
> >>>>> with the work-in-progress inside the Apache Beam project
> >>>>> ("portability").
> >>>>> We'd love to work together on this -- would you be open to such
> >>>>> collaboration? If so, it may not be necessary to start from scratch,
> >>>>> and we could leverage the work already done.
> >>>>>
> >>>> Sure. We're open to collaboration.
> >>>>
> >>>>> Regarding the name, Onyx would likely have to be renamed, due to a
> >>>>> conflict with a related technology [2].
> >>>>>
> >>>> Thanks for pointing it out. It's difficult to come up with a good
> >>>> short name. :)
> >>>> Do you have any suggestion?
> >>>>
> >>>> Thanks!
> >>>> -Gon
> >>>>
> >>>> ---
> >>>> Byung-Gon Chun
> >>>>
> >>>>> Davor
> >>>>>
> >>>>> [1] https://snuspl.github.io/onyx/docs/ir/
> >>>>> [2] http://www.onyxplatform.org/
> >>>>>
> >>>>>> On Thu, Jan 25, 2018 at 3:28 PM, Byung-Gon Chun <bgc...@gmail.com> wrote:
> >>>>>>
> >>>>>> Dear Apache Incubator Community,
> >>>>>>
> >>>>>> Please accept the following proposal for presentation and discussion:
> >>>>>> https://wiki.apache.org/incubator/OnyxProposal
> >>>>>>
> >>>>>> Onyx is a data processing system that aims to flexibly control the
> >>>>>> runtime behaviors of a job to adapt to varying deployment
> >>>>>> characteristics (e.g., harnessing transient resources in datacenters,
> >>>>>> cross-datacenter deployment, changing runtime based on job
> >>>>>> characteristics, etc.). Onyx provides ways to extend the system’s
> >>>>>> capabilities and incorporate the extensions to the flexible job
> >>>>>> execution.
> >>>>>> Onyx translates a user program (e.g., Apache Beam, Apache Spark) into
> >>>>>> an Intermediate Representation (IR) DAG, which Onyx optimizes and
> >>>>>> deploys based on a deployment policy.
> >>>>>>
> >>>>>> I've attached the proposal below.
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Byung-Gon Chun
> >>>>>>
> >>>>>> = OnyxProposal =
> >>>>>>
> >>>>>> == Abstract ==
> >>>>>> Onyx is a data processing system for flexible employment with
> >>>>>> different execution scenarios for various deployment characteristics
> >>>>>> on clusters.
> >>>>>>
> >>>>>> == Proposal ==
> >>>>>> Today, there is a wide variety of data processing systems with
> >>>>>> different designs for better performance and datacenter efficiency.
> >>>>>> They include processing data on specific resource environments and
> >>>>>> running jobs with specific attributes. Although each system
> >>>>>> successfully solves the problems it targets, most systems are
> >>>>>> designed in such a way that runtime behaviors are built tightly
> >>>>>> inside the system core to hide the complexity of distributed
> >>>>>> computing. This makes it hard for a single system to support
> >>>>>> different deployment characteristics with different runtime
> >>>>>> behaviors without substantial effort.
> >>>>>>
> >>>>>> Onyx is a data processing system that aims to flexibly control the
> >>>>>> runtime behaviors of a job to adapt to varying deployment
> >>>>>> characteristics. Moreover, it provides a means of extending the
> >>>>>> system’s capabilities and incorporating the extensions to the
> >>>>>> flexible job execution.
> >>>>>>
> >>>>>> In order to be able to easily modify runtime behaviors to adapt to
> >>>>>> varying deployment characteristics, Onyx exposes runtime behaviors
> >>>>>> to be flexibly configured and modified at both compile-time and
> >>>>>> runtime through a set of high-level graph pass interfaces.
> >>>>>>
> >>>>>> We hope to contribute to the big data processing community by
> >>>>>> enabling more flexibility and extensibility in job executions.
> >>>>>> Furthermore, we can benefit more as a community when we work
> >>>>>> together to mature the system with more use cases and understanding
> >>>>>> of diverse deployment characteristics. The Apache Software
> >>>>>> Foundation is the perfect place to achieve these aspirations.
> >>>>>>
> >>>>>> == Background ==
> >>>>>> Many data processing systems have distinctive runtime behaviors
> >>>>>> optimized and configured for specific deployment characteristics
> >>>>>> like different resource environments and for handling special job
> >>>>>> attributes.
> >>>>>>
> >>>>>> For example, much research has been conducted to overcome the
> >>>>>> challenge of running data processing jobs on cheap, unreliable
> >>>>>> transient resources. Likewise, techniques for disaggregating
> >>>>>> different types of resources, like memory, CPU and GPU, are being
> >>>>>> actively developed to use datacenter resources more efficiently.
> >>>>>> Many researchers are also working to run data processing jobs in
> >>>>>> even more diverse environments, such as across distant datacenters.
> >>>>>> Similarly, for special job attributes, many works take different
> >>>>>> approaches, such as runtime optimization, to solve problems like
> >>>>>> data skew, and to optimize systems for data processing jobs with
> >>>>>> small-scale input data.
> >>>>>>
> >>>>>> Although each of the systems performs well with the jobs and in the
> >>>>>> environments they target, they perform poorly with unconsidered
> >>>>>> cases, and do not consider supporting multiple deployment
> >>>>>> characteristics on a single system in their designs.
> >>>>>>
> >>>>>> For an application writer to optimize an application to perform well
> >>>>>> on a certain system engraved with its underlying behaviors, it
> >>>>>> requires a deep understanding of the system itself, which is an
> >>>>>> overhead that often requires a lot of time and effort.
> >>>>>> Moreover, for a developer to modify such system behaviors, it
> >>>>>> requires modifications of the system core, which requires an even
> >>>>>> deeper understanding of the system itself.
> >>>>>>
> >>>>>> With this background, Onyx is designed to represent all of its jobs
> >>>>>> as an Intermediate Representation (IR) DAG. In the Onyx compiler,
> >>>>>> user applications from various programming models (e.g., Apache
> >>>>>> Beam) are submitted, transformed to an IR DAG, and
> >>>>>> optimized/customized for the deployment characteristics. In the IR
> >>>>>> DAG optimization phase, the DAG is modified through a series of
> >>>>>> compiler “passes” which reshape or annotate the DAG with an
> >>>>>> expression of the underlying runtime behaviors. The IR DAG is then
> >>>>>> submitted as an execution plan for the Onyx runtime. The runtime
> >>>>>> includes the unmodified parts of data processing in the backbone
> >>>>>> which is transparently integrated with configurable components
> >>>>>> exposed for further extension.
> >>>>>>
> >>>>>> == Rationale ==
> >>>>>> Onyx’s vision lies in providing means for flexibly supporting a wide
> >>>>>> variety of job execution scenarios for users while enabling system
> >>>>>> developers to extend the execution framework with various
> >>>>>> functionalities at the same time. The capabilities of the system can
> >>>>>> be extended as it grows to meet a wider variety of execution
> >>>>>> scenarios. We require inputs from users and developers from diverse
> >>>>>> domains in order to make it a more thriving and useful project. The
> >>>>>> Apache Software Foundation provides the best tools and community to
> >>>>>> support this vision.
> >>>>>>
> >>>>>> == Initial Goals ==
> >>>>>> Initial goals will be to move the existing codebase to Apache and
> >>>>>> integrate with the Apache development process.
> >>>>>> We further plan to develop our system to meet the needs for more
> >>>>>> execution scenarios for a wider variety of deployment
> >>>>>> characteristics.
> >>>>>>
> >>>>>> == Current Status ==
> >>>>>> The Onyx codebase is currently hosted in a repository at github.com.
> >>>>>> The current version has been developed by system developers at Seoul
> >>>>>> National University, Viva Republica, Samsung, and LG.
> >>>>>>
> >>>>>> == Meritocracy ==
> >>>>>> We plan to strongly support meritocracy. We will discuss the
> >>>>>> requirements in an open forum, and those that continuously
> >>>>>> contribute to Onyx with the passion to strengthen the system will be
> >>>>>> invited as committers. Contributors that enrich Onyx by providing
> >>>>>> various use cases and implementations of the configurable
> >>>>>> components, including ideas for optimization techniques, will be
> >>>>>> especially welcome. Committers with a deep understanding of the
> >>>>>> system’s technical aspects as a whole and its philosophy will
> >>>>>> definitely be voted into the PMC. We will monitor community
> >>>>>> participation so that privileges can be extended to those that
> >>>>>> contribute.
> >>>>>>
> >>>>>> == Community ==
> >>>>>> We hope to expand our contribution community by becoming an Apache
> >>>>>> incubator project. The contributions will come from both users and
> >>>>>> system developers interested in flexibility and extensibility of job
> >>>>>> executions that Onyx can support. We expect users to mainly
> >>>>>> contribute to diversify the use cases and deployment
> >>>>>> characteristics, and developers to contribute to implement them.
> >>>>>>
> >>>>>> == Alignment ==
> >>>>>> Apache Spark is one of many popular data processing frameworks. The
> >>>>>> system is designed towards optimizing jobs using RDDs in memory and
> >>>>>> many other optimizations built tightly within the framework.
> >>>>>> In contrast to Spark, Onyx aims to provide more flexibility for job
> >>>>>> execution in an easy manner.
> >>>>>>
> >>>>>> Apache Tez enables developers to build complex task DAGs with
> >>>>>> control over the control plane of job execution. In Onyx, a
> >>>>>> high-level programming layer (e.g., Apache Beam) is automatically
> >>>>>> converted to a basic IR DAG and can be converted to any IR DAG
> >>>>>> through a series of easy, user-writable passes that can both reshape
> >>>>>> and modify the annotation (of execution properties) of the DAG.
> >>>>>> Moreover, Onyx leaves more parts of the job execution configurable,
> >>>>>> such as the scheduler and the data plane. As opposed to providing a
> >>>>>> set of properties for solid optimization, Onyx’s configurable parts
> >>>>>> can be easily extended and explored by implementing the pre-defined
> >>>>>> interfaces. For example, an arbitrary intermediate data store can be
> >>>>>> added.
> >>>>>>
> >>>>>> Onyx currently supports Apache Beam programs and we are working on
> >>>>>> supporting Apache Spark programs as well. Onyx also utilizes Apache
> >>>>>> REEF for container management, which allows Onyx to run in Apache
> >>>>>> YARN and Apache Mesos clusters. If necessary, we plan to contribute
> >>>>>> to and collaborate with these other Apache projects for the benefit
> >>>>>> of all. We plan to extend such integrations with more Apache
> >>>>>> software. The Apache Software Foundation already hosts many major
> >>>>>> big-data systems, and we expect to help further growth of the
> >>>>>> big-data community by having Onyx within the Apache foundation.
> >>>>>>
> >>>>>> == Known Risks ==
> >>>>>> === Orphaned Products ===
> >>>>>> The risk of the Onyx project being orphaned is minimal. There is
> >>>>>> already plenty of work that arduously supports different deployment
> >>>>>> characteristics, and we propose a general way to implement them with
> >>>>>> flexible and extensible configuration knobs.
> >>>>>> The domain of data processing is already of high interest, and this
> >>>>>> domain is expected to evolve continuously with various other
> >>>>>> purposes, such as resource disaggregation and using transient
> >>>>>> resources for better datacenter resource utilization.
> >>>>>>
> >>>>>> === Inexperience with Open Source ===
> >>>>>> The initial committers include PMC members and committers of other
> >>>>>> Apache projects. They have experience with open source projects,
> >>>>>> from incubation to top-level status. They have been involved in the
> >>>>>> open source development process, and are familiar with releasing
> >>>>>> code under an open source license.
> >>>>>>
> >>>>>> === Homogeneous Developers ===
> >>>>>> The initial set of committers is from a limited set of
> >>>>>> organizations, but we expect to attract new contributors from
> >>>>>> diverse organizations and will thus grow organically once approved
> >>>>>> for incubation. Our prior experience with other open source projects
> >>>>>> will help various contributors to actively participate in our
> >>>>>> project.
> >>>>>>
> >>>>>> === Reliance on Salaried Developers ===
> >>>>>> Many developers are from Seoul National University. This is not
> >>>>>> applicable.
> >>>>>>
> >>>>>> === Relationships with Other Apache Products ===
> >>>>>> Onyx positions itself among multiple Apache products. It runs on
> >>>>>> Apache REEF for container management. It also utilizes many useful
> >>>>>> development tools including Apache Maven, Apache Log4J, and multiple
> >>>>>> Apache Commons components. Onyx supports the Apache Beam programming
> >>>>>> model for user applications. We are currently working on supporting
> >>>>>> the Apache Spark programming APIs as well.
> >>>>>>
> >>>>>> === An Excessive Fascination with the Apache Brand ===
> >>>>>> We hope to make Onyx a powerful system for data processing, meeting
> >>>>>> various needs for different deployment characteristics, under a
> >>>>>> wider variety of environments. We see the limitations of simply
> >>>>>> putting code on GitHub, and we believe the Apache community will
> >>>>>> help the growth of Onyx for the project to become a positively
> >>>>>> impactful and innovative open source project. We believe Onyx is a
> >>>>>> great fit for the Apache Software Foundation due to the
> >>>>>> collaboration it aims to achieve from the big data processing
> >>>>>> community.
> >>>>>>
> >>>>>> == Documentation ==
> >>>>>> The current documentation for Onyx is at
> >>>>>> https://snuspl.github.io/onyx/ .
> >>>>>>
> >>>>>> == Initial Source ==
> >>>>>> The Onyx codebase is currently hosted at
> >>>>>> https://github.com/snuspl/onyx .
> >>>>>>
> >>>>>> == External Dependencies ==
> >>>>>> To the best of our knowledge, all Onyx dependencies are distributed
> >>>>>> under Apache compatible licenses. Upon acceptance to the incubator,
> >>>>>> we would begin a thorough analysis of all transitive dependencies to
> >>>>>> verify this fact and further introduce license checking into the
> >>>>>> build and release process.
> >>>>>>
> >>>>>> == Cryptography ==
> >>>>>> Not applicable.
> >>>>>>
> >>>>>> == Required Resources ==
> >>>>>> === Mailing Lists ===
> >>>>>> We will operate two mailing lists as follows:
> >>>>>> * Onyx PMC discussions: priv...@onyx.incubator.apache.org
> >>>>>> * Onyx developers: d...@onyx.incubator.apache.org
> >>>>>>
> >>>>>> === Git Repositories ===
> >>>>>> Upon incubation: https://github.com/apache/incubator-onyx.
> >>>>>> After the incubation, we would like to move the existing repo
> >>>>>> https://github.com/snuspl/onyx to the Apache infrastructure.
> >>>>>>
> >>>>>> === Issue Tracking ===
> >>>>>> Onyx currently tracks its issues using the GitHub issue tracker:
> >>>>>> https://github.com/snuspl/onyx/issues. We plan to migrate to Apache
> >>>>>> JIRA.
> >>>>>>
> >>>>>> == Initial Committers ==
> >>>>>> * Byung-Gon Chun
> >>>>>> * Jeongyoon Eo
> >>>>>> * Geon-Woo Kim
> >>>>>> * Joo Yeon Kim
> >>>>>> * Gyewon Lee
> >>>>>> * Jung-Gil Lee
> >>>>>> * Sanha Lee
> >>>>>> * Wooyeon Lee
> >>>>>> * Yunseong Lee
> >>>>>> * JangHo Seo
> >>>>>> * Won Wook Song
> >>>>>> * Taegeon Um
> >>>>>> * Youngseok Yang
> >>>>>>
> >>>>>> == Affiliations ==
> >>>>>> * SNU (Seoul National University)
> >>>>>>   * Byung-Gon Chun
> >>>>>>   * Jeongyoon Eo
> >>>>>>   * Geon-Woo Kim
> >>>>>>   * Gyewon Lee
> >>>>>>   * Sanha Lee
> >>>>>>   * Wooyeon Lee
> >>>>>>   * Yunseong Lee
> >>>>>>   * JangHo Seo
> >>>>>>   * Won Wook Song
> >>>>>>   * Taegeon Um
> >>>>>>   * Youngseok Yang
> >>>>>>
> >>>>>> * LG
> >>>>>>   * Jung-Gil Lee
> >>>>>>
> >>>>>> * Samsung
> >>>>>>   * Joo Yeon Kim
> >>>>>>
> >>>>>> * Viva Republica
> >>>>>>   * Geon-Woo Kim
> >>>>>>
> >>>>>> == Sponsors ==
> >>>>>> === Champions ===
> >>>>>> Byung-Gon Chun
> >>>>>>
> >>>>>> === Mentors ===
> >>>>>> * Hyunsik Choi
> >>>>>> * Byung-Gon Chun
> >>>>>> * Markus Weimer
> >>>>>> * Reynold Xin
> >>>>>>
> >>>>>> === Sponsoring Entity ===
> >>>>>> The Apache Incubator
> >>>>>>
> >>>>>> --
> >>>>>> Byung-Gon Chun
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> >> For additional commands, e-mail: general-h...@incubator.apache.org

--
Byung-Gon Chun