On 25.02.2013 05:44, Arun C Murthy wrote:
> Thanks to all who voted. Obviously, I'm +1 (binding) on the proposal.
>
> With 14 +1s (10 binding) the vote passes.
>
> I'll start the work to get the podling started.
>
> thanks,
> Arun
>
> On Feb 19, 2013, at 8:26 PM, Arun C Murthy wrote:
>
>> Hi Folks,
>>
>> Thanks for participating in the discussion. I'd like to call a VOTE for
>> acceptance of Apache Tez into the Incubator. I'll let the vote run till into
>> this weekend (Sun 2/24 6pm PST).
>>
>> [ ] +1 Accept Apache Tez into the Incubator
>> [ ] +0 Don't care.
>> [ ] -1 Don't accept Apache Tez into the Incubator because...
>>
>> Full proposal is pasted at the bottom of this email, and the corresponding
>> wiki is http://wiki.apache.org/incubator/TezProposal.
>>
>> Only VOTEs from Incubator PMC members are binding, but all are welcome to
>> express their thoughts.
>>
>> Here's my +1 (binding).
>>
>> thanks,
>> Arun
>>
>> PS: From the initial discussion, the only changes are that I've added one
>> new mentor and 2 new committers. All the new additions come from the
>> non-major employer while we continue to strive to further diversify during
>> the incubation. Thanks.
>>
>> ----
>>
>> = Tez =
>>
>> == Abstract ==
>> Tez is an effort to develop a generic application framework which can be used
>> to process arbitrarily complex data-processing tasks and also a re-usable set
>> of data-processing primitives which can be used by other projects.
>>
>> == Proposal ==
>> Tez is a proposal to develop a generic application which can be used to
>> process complex data-processing task DAGs and runs natively on Apache Hadoop
>> YARN. YARN is a generic resource-management system on which currently
>> applications like MapReduce already exist. MapReduce is a specific, and
>> constrained, DAG - which is not optimal for several frameworks like Apache
>> Hive and Apache Pig.
>> Furthermore, we propose to develop a re-usable set of libraries of
>> data-processing primitives such as sorting, merging, data-shuffling,
>> intermediate data management etc. which are necessary for Tez and which we
>> envision can be used directly by other projects.
>>
>> == Background ==
>> Apache Hadoop MapReduce has emerged as the assembly-language on which other
>> frameworks like Apache Pig and Apache Hive have been built. However, it has
>> been well accepted that MapReduce produces very constrained task DAGs for
>> each job, which results in Apache Pig and Apache Hive requiring multiple
>> MapReduce jobs for several queries. By providing a more expressive DAG of
>> tasks for a job, Tez attempts to provide significantly enhanced
>> data-processing capabilities for projects like Apache Pig, Apache Hive,
>> Cascading etc.
>>
>> == Rationale ==
>> There is an important gap that Tez fills in the Apache Hadoop ecosystem:
>> allowing for more expressive task DAGs for data-processing applications such
>> as Apache Pig, Apache Hive, Cascading etc.
>>
>> With the emergence of Apache Hadoop YARN, there is a strong need for a
>> common DAG application which can then be shared by Apache Pig, Apache Hive,
>> Cascading etc.
>>
>> == Initial Goals ==
>> The initial goals for this project are to specify the detailed requirements
>> and architecture, and then develop the initial implementation including the
>> DAG ApplicationMaster to run natively inside Apache Hadoop YARN.
>>
>> == Current Status ==
>> Significant work has been completed to identify the initial requirements and
>> define the overall system architecture. There is a patch available in the
>> internal Hortonworks git repository which can act as the initial seed.
>>
>> === Meritocracy ===
>> We plan to invest in supporting a meritocracy. We will discuss the
>> requirements in an open forum.
>> Several companies have already expressed interest in this project, and we
>> intend to invite additional developers to participate. We will encourage and
>> monitor community participation so that privileges can be extended to those
>> that contribute.
>>
>> === Community ===
>> The need for a generic DAG application for data processing in the open
>> source is tremendous, so there is a potential for a very large community. We
>> believe that Tez's extensible architecture will further encourage community
>> participation. Also, related Apache projects (e.g., Pig, Hive) have very
>> large and active communities, and we expect that over time Tez will also
>> attract a large community.
>>
>> === Core Developers ===
>> The developers on the initial committers list include people very experienced
>> in the Apache Hadoop ecosystem:
>>
>> * Alan Gates <gates at apache dot org>
>> * Arun C Murthy <acmurthy at apache dot org>
>> * Ashutosh Chauhan <hashutosh at apache dot org>
>> * Bikas Saha <bikas at apache dot org>
>> * Chris Douglas <cdouglas at apache dot org>
>> * Daryn Sharp <daryn at apache dot org>
>> * Devaraj Das <ddas at apache dot org>
>> * Gopal Vijayaraghavan <gopal at hortonworks dot com>
>> * Gunther Hagleitner <ghagleitner at hortonworks dot com>
>> * Hitesh Shah <hitesh at apache dot org>
>> * Jason Lowe <jlowe at apache dot org>
>> * Jean Xu <jeanxu at facebook dot com>
>> * Jitendra Pandey <jitendra at apache dot org>
>> * Julien Le Dem <julien at apache dot org>
>> * Kevin Wilfong <kevinwilfong at apache dot org>
>> * Mike Liddell <mike dot lidell at microsoft dot com>
>> * Namit Jain <namit at apache dot org>
>> * Nathan Roberts <nroberts at yahoo dash inc dot com>
>> * Owen O'Malley <omalley at apache dot org>
>> * Robert Evans <bobby at apache dot org>
>> * Siddharth Seth <sseth at apache dot org>
>> * Tom White <tomwhite at apache dot org>
>> * Thomas Graves <tgraves at apache dot org>
>> * Vikram Dixit <vikram at apache dot org>
>> * Vinod Kumar Vavilapalli <vinodkv at apache dot org>
>> * William Graham <billgraham at apache dot org>
>>
>> We realize that though we have significant employer diversity already,
>> additional diversity is always better, and we will work
>> aggressively to recruit developers from additional companies.
>>
>> === Alignment ===
>> The initial committers strongly believe that a standard task DAG
>> application on Apache Hadoop YARN will gain broader adoption as an open
>> source, community driven project, where the community can contribute not only
>> to the core components, but also to a growing collection of applications
>> which will be based on top of Tez. Our hope is that the Apache Hive, Apache
>> Pig, Cascading and other communities will find tremendous value in Tez and
>> will adopt it en masse.
>>
>> == Known Risks ==
>>
>> === Orphaned Products ===
>> The contributors are leading users and vendors in the Apache Hadoop
>> ecosystem, with significant open source experience, so the risk of being
>> orphaned is relatively low. The project could be at risk if vendors decided
>> to change their strategies in the market. In such an event, the current
>> committers plan to continue working on the project on their own time, though
>> the progress will likely be slower. We plan to mitigate this risk by
>> recruiting additional committers.
>>
>> === Inexperience with Open Source ===
>> The initial committers include veteran Apache members (Committers, PMC
>> members and Apache Members) and other developers who have varying degrees of
>> experience with open source projects. All have been involved with source code
>> that has been released under an open source license, and several also have
>> experience developing code with an open source development process.
>>
>> === Homogenous Developers ===
>> The initial committers are employed by a number of companies, including
>> Cloudera, Facebook, Hortonworks, Microsoft, Twitter and Yahoo.
>> We are committed to recruiting additional committers from other companies
>> based on their contributions to the project even though we do have
>> significant diversity already.
>>
>> === Reliance on Salaried Developers ===
>> It is expected that Tez development will occur on both salaried time and on
>> volunteer time, after hours. The majority of initial committers are paid by
>> their employer to contribute to this project. However, they are all
>> passionate about the project, and we are confident that the project will
>> continue even if no salaried developers contribute to the project. We are
>> committed to recruiting additional committers including non-salaried
>> developers.
>>
>> === Relationships with Other Apache Products ===
>> As mentioned in the Alignment section, Tez is closely integrated with Hadoop,
>> Hive and Pig in numerous ways. We look forward to collaborating with
>> those communities, as well as other Apache communities.
>>
>> === An Excessive Fascination with the Apache Brand ===
>> Tez solves a real need for generic task DAG management in the Apache Hadoop
>> ecosystem, something which has been addressed in a very ad hoc manner so far
>> by multiple Apache projects. Our rationale for developing Tez as an Apache
>> project is detailed in the Rationale section. We believe that the Apache
>> brand and community process will help us attract more contributors to this
>> project, and help establish ubiquitous APIs.
>>
>> == Documentation ==
>> http://wiki.apache.org/incubator/TezProposal
>>
>> == Initial Source ==
>> Available as a patch.
>>
>> == Cryptography ==
>> Tez will eventually support encryption on the wire. This is not one of the
>> initial goals, and we do not expect Tez to be a controlled export item due to
>> the use of encryption.
>>
>> == Required Resources ==
>>
>> === Mailing List ===
>> * tez-private
>> * tez-dev
>> * tez-user
>>
>> === Subversion Directory ===
>> Git is the preferred source control system: git://git.apache.org/tez
>>
>> === Issue Tracking ===
>>
>> JIRA Tez (TEZ)
>>
>> == Initial Committers ==
>> * Alan Gates <gates at apache dot org>
>> * Arun C Murthy <acmurthy at apache dot org>
>> * Ashutosh Chauhan <hashutosh at apache dot org>
>> * Bikas Saha <bikas at apache dot org>
>> * Chris Douglas <cdouglas at apache dot org>
>> * Daryn Sharp <daryn at apache dot org>
>> * Devaraj Das <ddas at apache dot org>
>> * Gopal Vijayaraghavan <gopal at hortonworks dot com>
>> * Gunther Hagleitner <ghagleitner at hortonworks dot com>
>> * Hitesh Shah <hitesh at apache dot org>
>> * Jason Lowe <jlowe at apache dot org>
>> * Jean Xu <jeanxu at facebook dot com>
>> * Jitendra Pandey <jitendra at apache dot org>
>> * Julien Le Dem <julien at apache dot org>
>> * Kevin Wilfong <kevinwilfong at apache dot org>
>> * Mike Liddell <mike dot lidell at microsoft dot com>
>> * Namit Jain <namit at apache dot org>
>> * Nathan Roberts <nroberts at yahoo dash inc dot com>
>> * Owen O'Malley <omalley at apache dot org>
>> * Robert Evans <bobby at apache dot org>
>> * Siddharth Seth <sseth at apache dot org>
>> * Tom White <tomwhite at apache dot org>
>> * Thomas Graves <tgraves at apache dot org>
>> * Vikram Dixit <vikram at apache dot org>
>> * Vinod Kumar Vavilapalli <vinodkv at apache dot org>
>> * William Graham <billgraham at apache dot org>
>>
>> == Affiliations ==
>> The initial committers are employees of Cloudera, Facebook, Hortonworks,
>> Microsoft, Twitter and Yahoo Inc.
>>
>> * Alan Gates - Hortonworks
>> * Arun C Murthy - Hortonworks
>> * Ashutosh Chauhan - Hortonworks
>> * Bikas Saha - Hortonworks
>> * Chris Douglas - Microsoft
>> * Daryn Sharp - Yahoo
>> * Devaraj Das - Hortonworks
>> * Gopal Vijayaraghavan - Hortonworks
>> * Gunther Hagleitner - Hortonworks
>> * Hitesh Shah - Hortonworks
>> * Jason Lowe - Yahoo
>> * Jean Xu - Facebook
>> * Jitendra Pandey - Hortonworks
>> * Julien Le Dem - Twitter
>> * Kevin Wilfong - Facebook
>> * Mike Liddell - Microsoft
>> * Namit Jain - Facebook
>> * Nathan Roberts - Yahoo
>> * Owen O'Malley - Hortonworks
>> * Robert Evans - Yahoo
>> * Siddharth Seth - Hortonworks
>> * Tom White - Cloudera
>> * Thomas Graves - Yahoo
>> * Vikram Dixit - Hortonworks
>> * Vinod Kumar Vavilapalli - Hortonworks
>> * William Graham - Twitter
>>
>> The nominated mentors are employees of Hortonworks, LinkedIn,
>> NASA JPL and Microsoft.
>>
>> * Alan Gates - Hortonworks
>> * Arun C Murthy - Hortonworks
>> * Chris Douglas - Microsoft
>> * Chris Mattmann - NASA JPL
>> * Jakob Homan - LinkedIn
>> * Owen O'Malley - Hortonworks
>>
>> == Sponsors ==
>>
>> === Champion ===
>> Arun C Murthy <acmurthy at apache dot org>
>>
>> === Nominated Mentors ===
>> * Alan Gates <gates at apache dot org> - Architect at Hortonworks.
>> Committer for Pig.
>> * Arun C Murthy <acmurthy at apache dot org> - Architect at Hortonworks.
>> Committer for Hadoop.
>> * Chris Douglas <cdouglas at apache dot org> - Sr. Research Engineer at
>> Microsoft. Committer for Hadoop.
>> * Chris Mattmann <mattmann at apache dot org> - Sr. Computer Scientist, NASA
>> JPL. Committer for Nutch, OODT and Tika.
>> * Jakob Homan <jghoman at apache dot org> - Sr. Software Engineer,
>> LinkedIn. Committer for Hadoop, Kafka, Giraph.
>> * Owen O'Malley <omalley at apache dot org> - Architect at Hortonworks.
>> Committer for Hadoop, Ambari.
>>
>> === Sponsoring Entity ===
>> Incubator
>>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/