Hi Chris, could you re-send the tallied VOTE result with the subject prefixed with [RESULT]?
- Henry

On Wed, May 21, 2014 at 3:56 PM, Chris Aniszczyk <caniszc...@gmail.com> wrote:

> With 18 +1 votes (and 10+ as binding votes), I'll consider this vote a
> success.
>
> I'll proceed with the next steps.
>
> Thank you!
>
> On Sun, May 18, 2014 at 3:57 PM, Todd Lipcon <t...@cloudera.com> wrote:
>
>> +1 from me (the proposed Champion)
>>
>> -Todd
>>
>> On Sun, May 18, 2014 at 2:15 PM, Chris Aniszczyk <caniszc...@gmail.com> wrote:
>>
>> > Based on the results of the discussion thread:
>> >
>> > http://mail-archives.apache.org/mod_mbox/incubator-general/201405.mbox/%3CCAJg1wMRGhLu4P7LeVQB%2B5K0C-fr-pw2448uj%3D6-3zHag4F1EbA%40mail.gmail.com%3E
>> >
>> > I would like to call a vote on accepting Parquet into the incubator.
>> > https://wiki.apache.org/incubator/ParquetProposal
>> >
>> > [ ] +1 Accept Parquet into the Incubator
>> > [ ] +0 Indifferent to the acceptance of Parquet
>> > [ ] -1 Do not accept Parquet because ...
>> >
>> > The vote will be open until Thursday May 22nd 18:00 UTC.
>> >
>> > = Parquet Proposal =
>> >
>> > == Abstract ==
>> > Parquet is a columnar storage format for Hadoop.
>> >
>> > == Proposal ==
>> >
>> > We created Parquet to make the advantages of compressed, efficient
>> > columnar data representation available to any project in the Hadoop
>> > ecosystem, regardless of the choice of data processing framework,
>> > data model, or programming language.
>> >
>> > == Background ==
>> >
>> > Parquet is built from the ground up with complex nested data
>> > structures in mind, and uses the repetition/definition level approach
>> > to encoding such data structures, as popularized by Google Dremel
>> > (https://blog.twitter.com/2013/dremel-made-simple-with-parquet). We
>> > believe this approach is superior to simple flattening of nested name
>> > spaces.
>> >
>> > Parquet is built to support very efficient compression and encoding
>> > schemes. Parquet allows compression schemes to be specified on a
>> > per-column level, and is future-proofed to allow adding more encodings
>> > as they are invented and implemented. We separate the concepts of
>> > encoding and compression, allowing parquet consumers to implement
>> > operators that work directly on encoded data without paying
>> > decompression and decoding penalty when possible.
>> >
>> > == Rationale ==
>> >
>> > Parquet is built to be used by anyone. We believe that an efficient,
>> > well-implemented columnar storage substrate should be useful to all
>> > frameworks without the cost of extensive and difficult to set up
>> > dependencies.
>> >
>> > Furthermore, the rapid growth of Parquet community is empowered by
>> > open source. We believe the Apache foundation is a great fit as the
>> > long-term home for Parquet, as it provides an established process for
>> > community-driven development and decision making by consensus. This is
>> > exactly the model we want for future Parquet development.
>> >
>> > == Initial Goals ==
>> >
>> > * Move the existing codebase to Apache
>> > * Integrate with the Apache development process
>> > * Ensure all dependencies are compliant with Apache License version 2.0
>> > * Incremental development and releases per Apache guidelines
>> >
>> > == Current Status ==
>> >
>> > Parquet has undergone 2 major releases:
>> > https://github.com/Parquet/parquet-format/releases of the core format
>> > and 22 releases: https://github.com/Parquet/parquet-mr/releases of the
>> > supporting set of Java libraries.
>> >
>> > The Parquet source is currently hosted at GitHub, which will seed the
>> > Apache git repository.
>> >
>> > === Meritocracy ===
>> >
>> > We plan to invest in supporting a meritocracy. We will discuss the
>> > requirements in an open forum. Several companies have already
>> > expressed interest in this project, and we intend to invite additional
>> > developers to participate. We will encourage and monitor community
>> > participation so that privileges can be extended to those that
>> > contribute.
>> >
>> > === Community ===
>> >
>> > There is a large need for an advanced columnar storage format for
>> > Hadoop. Parquet is being used in production by many organizations (see
>> > https://github.com/Parquet/parquet-mr/blob/master/PoweredBy.md)
>> >
>> > * Cloudera: https://twitter.com/HenryR/statuses/324222874011451392
>> > * Criteo: https://twitter.com/julsimon/statuses/312114074911666177
>> > * Salesforce: https://twitter.com/TwitterOSS/statuses/392734610116726784
>> > * Stripe: https://twitter.com/avibryant/statuses/391339949250715648
>> > * Twitter: https://twitter.com/J_/statuses/315844725611581441
>> >
>> > By bringing Parquet into Apache, we believe that the community will
>> > grow even bigger.
>> >
>> > === Core Developers ===
>> >
>> > Parquet was initially developed as a collaboration between Twitter,
>> > Cloudera and Criteo.
>> >
>> > See
>> >
>> > https://blog.twitter.com/2013/announcing-parquet-10-columnar-storage-for-hadoop
>> >
>> > === Alignment ===
>> >
>> > We believe that having Parquet at Apache will help further the growth
>> > of the big-data community, as it will encourage cooperation within the
>> > greater ecosystem of projects spawned by Apache Hadoop. The alignment
>> > is also beneficial to other Apache communities (such as Hadoop, Hive,
>> > Avro).
>> >
>> > == Known Risks ==
>> >
>> > === Orphaned Products ===
>> >
>> > The risk of the Parquet project being abandoned is minimal. There are
>> > many organizations using Parquet in production, including Twitter,
>> > Cloudera, Stripe, and Salesforce
>> > (http://blog.cloudera.com/blog/2013/10/parquet-at-salesforce-com/).
>> >
>> > === Inexperience with Open Source ===
>> >
>> > Parquet has existed as a healthy open source for one year. During that
>> > time, we have curated an open-source community successfully,
>> > attracting over 40 contributors (see
>> > https://github.com/Parquet/parquet-mr/graphs/contributors) from a
>> > diverse group of companies.
>> > Several of the core contributors to the project are deeply familiar
>> > with OSS and Apache specifically: Julien Le Dem was until recently the
>> > PMC Chair for Apache Pig, and Dmitriy Ryaboy, Aniket Mokashi, and
>> > Jonathan Coveney are also Apache Pig committers with contributions to
>> > several other Apache projects. Todd Lipcon and Tom White are
>> > committers to Apache Hadoop and multiple other related projects. Brock
>> > Noland is a Hive committer.
>> >
>> > === Homogenous Developers ===
>> >
>> > The initial committers come from a number of companies and countries.
>> > Parquet has an active community of developers, and we are committed to
>> > recruiting additional committers based on their contributions to the
>> > project. The java library component alone has contributions from 31
>> > individual github accounts, 14 of which contributed over 1000 lines of
>> > code.
>> >
>> > === Reliance on Salaried Developers ===
>> >
>> > It is expected that Parquet development will occur on both salaried
>> > time and on volunteer time, after hours. The majority of initial
>> > committers are paid by their employers to contribute to this project.
>> > However, they are all passionate about the project, and we are
>> > confident that the project will continue even if no salaried
>> > developers contribute to the project. As evidence of this statement,
>> > we present the GitHub punchcard (see
>> > https://github.com/Parquet/parquet-mr/graphs/punch-card) showing that
>> > a lot of activity happens on weekends. We are committed to recruiting
>> > additional committers including non-salaried developers.
>> >
>> > === Relationships with Other Apache Products ===
>> >
>> > As mentioned in the Alignment section, Parquet is closely related to
>> > Hadoop. It provides an API that allowed it to be easily integrated
>> > with many other apache projects: Pig, Hive, Avro, Thrift, Spark,
>> > Drill, Crunch, Tajo. Some of the features it provides are similar to
>> > the ORC file format which is part of the Hive project. However Parquet
>> > focused on being framework agnostic and language independent and has
>> > been really successful to that end. On top of the Apache projects
>> > mentioned above, Parquet is also integrated with other open source
>> > projects, including Protocol Buffers, Cloudera Impala or Scrooge. We
>> > look forward to continue collaborating with those communities, as well
>> > as other Apache communities.
>> >
>> > === An Excessive Fascination with the Apache Brand ===
>> >
>> > Parquet is an already healthy and well known open source project. This
>> > proposal is not for the purpose of generating publicity. Rather, the
>> > primary benefits to joining Apache are those outlined in the Rationale
>> > section.
>> >
>> > == Documentation ==
>> >
>> > Documentation is currently located as README markdown files:
>> >
>> > * https://github.com/Parquet/parquet-format
>> > * https://github.com/Parquet/parquet-mr
>> >
>> > == Source and Intellectual Property Submission Plan ==
>> >
>> > The Parquet codebase is currently hosted on Github:
>> > https://github.com/Parquet.
>> >
>> > These are the codebases that we would migrate to the Apache foundation.
>> >
>> > == External Dependencies ==
>> >
>> > * Junit: EPL
>> > * Apache Commons: ALv2
>> > * Apache Thrift: ALv2
>> > * Apache Maven: ALv2
>> > * Apache Avro: ALv2
>> > * Apache Hadoop: ALv2
>> > * Google Guava: ALv2
>> > * Google Protobuf: New BSD License
>> >
>> > == Cryptography ==
>> >
>> > We do not expect Parquet to be a controlled export item due to the use
>> > of encryption.
>> >
>> > == Required Resources ==
>> >
>> > === Mailing lists ===
>> >
>> > * priv...@parquet.incubator.apache.org
>> > * comm...@parquet.incubator.apache.org
>> > * d...@parquet.incubator.apache.org
>> >
>> > == Subversion Directory ==
>> >
>> > Git is the preferred source control system:
>> >
>> > * git://git.apache.org/parquet-format
>> > * git://git.apache.org/parquet-mr
>> >
>> > == Issue Tracking ==
>> >
>> > We'd like to keep using the Git review and issue tracking tools.
>> > Controlling Pull requests closing through git commit messages in
>> > git.apache.org
>> >
>> > == Initial Committers ==
>> >
>> > * Aniket Mokashi <aniket...@gmail.com>
>> > * Brock Noland <br...@apache.org>
>> > * Chris Aniszczyk <caniszc...@gmail.com>
>> > * Dmitriy Ryaboy <dvrya...@apache.org>
>> > * Jake Farrell <jfarr...@apache.org>
>> > * Jonathan Coveney <jcove...@gmail.com>
>> > * Julien Le Dem <jul...@apache.org>
>> > * Lukas Nalezenec <lukas.naleze...@gmail.com>
>> > * Marcel Kornacker <mar...@cloudera.com>
>> > * Mickael Lacour
>> > * Nong Li <n...@cloudera.com>
>> > * Remy Pecqueur
>> > * Ryan Blue <b...@cloudera.com>
>> > * Tianshuo Deng <dengtians...@gmail.com>
>> > * Tom White <tomwh...@apache.org>
>> > * Wesley Peck
>> >
>> > == Affiliations ==
>> >
>> > * Aniket Mokashi - Twitter
>> > * Brock Noland - Cloudera
>> > * Chris Aniszczyk - Twitter
>> > * Dmitriy Ryaboy - Twitter
>> > * Jake Farrell
>> > * Jonathan Coveney - Twitter
>> > * Julien Le Dem - Twitter
>> > * Lukas Nalezenec
>> > * Marcel Kornacker - Cloudera
>> > * Mickael Lacour - Criteo
>> > * Nong Li - Cloudera
>> > * Remy Pecqueur - Criteo
>> > * Ryan Blue - Cloudera
>> > * Tianshuo Deng - Twitter
>> > * Tom White - Cloudera
>> > * Wesley Peck - ARRIS, Inc.
>> >
>> > == Sponsors ==
>> >
>> > === Champion ===
>> >
>> > * Todd Lipcon
>> >
>> > === Nominated Mentors ===
>> >
>> > * Tom White
>> > * Chris Mattmann
>> > * Jake Farrell
>> > * Roman Shaposhnik
>> >
>> > === Sponsoring Entity ===
>> >
>> > The Apache Incubator
>> >
>> > --
>> > Cheers,
>> >
>> > Chris Aniszczyk
>> > http://aniszczyk.org
>> > +1 512 961 6719
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>
> --
> Cheers,
>
> Chris Aniszczyk
> http://aniszczyk.org
> +1 512 961 6719

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org