+1 (non-binding)

Terence
On Fri, Mar 4, 2016 at 1:13 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> +1 (binding)
>
> Regards
> JB
>
> On 03/04/2016 02:29 AM, Poorna Chandra wrote:
>
>> Hi All,
>>
>> Tephra proposal was sent out for discussion last week. The proposal is
>> available at https://wiki.apache.org/incubator/TephraProposal
>>
>> Please vote to accept Tephra into the Apache Incubator. The vote will be
>> open for the next 72 hours.
>>
>> [ ] +1 Accept Tephra as an Apache Incubator podling.
>> [ ] +0 Abstain.
>> [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
>>
>> Thanks,
>> Poorna.
>>
>> ------
>>
>> = Abstract =
>>
>> Tephra is a system for providing globally consistent transactions on
>> top of Apache HBase and other storage engines.
>>
>> = Proposal =
>>
>> Tephra is a transaction engine for distributed data stores like Apache
>> HBase. It provides ACID semantics for concurrent data operations that
>> span over region boundaries in HBase using Optimistic Concurrency
>> Control.
>>
>> = Background =
>>
>> HBase provides strong consistency with row- or region-level ACID
>> operations. However, it sacrifices cross-region and cross-table
>> consistency in favor of scalability. This trade-off requires application
>> developers to handle the complexity of ensuring consistency when their
>> modifications span region boundaries. By providing support for global
>> transactions that span regions, tables, or multiple RPCs,
>> Tephra simplifies application development on top of HBase, without a
>> significant impact on performance or scalability for many workloads.
>>
>> Tephra leverages HBase’s native data versioning to provide
>> multi-versioned concurrency control (MVCC) for transactional reads and
>> writes. With MVCC capability, each transaction sees its own consistent
>> “snapshot” of data, providing snapshot isolation of concurrent
>> transactions. MVCC along with conflict detection and handling enables
>> Optimistic Concurrency Control.
>>
>> Tephra consists of three main components:
>>  * Transaction Server – maintains global view of transaction state,
>>    assigns new transaction IDs and performs conflict detection;
>>  * Transaction Client – coordinates start, commit, and rollback of
>>    transactions; and
>>  * Transaction Processor Coprocessor – applies filtering to the data
>>    read (based on a given transaction’s state) and cleans up any data
>>    from old (no longer visible) transactions.
>>
>> Although Tephra only supports HBase now, it can be extended to support
>> transactions on any store that has multi-versioning and rollback
>> support. The transactions can span over multiple stores and storage
>> paradigms.
>>
>> = Rationale =
>>
>> Tephra has simple abstractions which can be used by an application to
>> add transaction support over HBase. By abstracting away transaction
>> handling using Tephra, the application is freed of
>> transaction logic, and the application developer can focus on the use
>> case. Also, Tephra can be extended to support transactions on data
>> sources other than HBase.
>>
>> By making Tephra an Apache open source project, we believe that there
>> will be wider adoption and more opportunities for Tephra to be integrated
>> into other Apache projects.
>>
>> = Current Status =
>>
>> Tephra was built at Cask Data Inc. initially as part of
>> open-source framework Cask Data Application Platform (CDAP)
>> [[http://cdap.io/]].
>> It was later converted into an independent open source project with
>> Apache 2.0 License [[https://github.com/caskdata/tephra]].
>>
>> Tephra is used in CDAP as the transaction engine. As part of CDAP, Tephra
>> has been deployed at multiple companies.
>>
>> Apache Phoenix is using Tephra as transaction engine in the next release.
>>
>> == Meritocracy ==
>>
>> Our intent with this incubator proposal is to start building a diverse
>> developer community around Tephra following the Apache meritocracy model.
>> Since Tephra was initially developed in early 2013, we have had fast
>> adoption and contributions within Cask Data. We are looking forward to
>> new contributors. We wish to build a community based on Apache's
>> meritocracy principles, working with those who contribute significantly to
>> the project and welcoming them to be committers both during the incubation
>> process and beyond.
>>
>> == Community ==
>>
>> Core developers of Tephra are at Cask Data. Recently the developer
>> community has expanded to include folks from Apache Phoenix. We hope to
>> extend our contributor base significantly and we will invite all who are
>> interested in working on distributed transaction engine.
>>
>> == Core Developers ==
>>
>> A few engineers from Cask Data and outside have developed Tephra:
>> Andreas Neumann, Terence Yim, Gary Helmling, Andrew Purtell and
>> Poorna Chandra.
>>
>> == Alignment ==
>>
>> The ASF is the natural choice to host the Tephra project as its goal of
>> encouraging community-driven open source projects fits with our vision for
>> Tephra.
>>
>> Additionally, many other projects with which we are familiar and expect
>> Tephra to integrate with, such as Phoenix, Zookeeper, HDFS, log4j, and
>> others mentioned in the External Dependencies section are Apache
>> projects, and Tephra will benefit by close proximity to them.
>>
>> = Known Risks =
>>
>> == Orphaned Products ==
>>
>> There is very little risk of Tephra being orphaned, as it is a key part of
>> Cask Data’s products. The core Tephra developers plan to continue to work
>> on Tephra, and Cask Data has funding in place to support their efforts
>> going forward.
>> Also with Phoenix using Tephra for transactions, Phoenix developers are
>> keen on contributing to Tephra.
>>
>> == Inexperience with Open Source ==
>>
>> Several of the core developers have experience with open source
>> development. Andreas Neumann is an Apache committer for Oozie and Twill.
>> Terence Yim is an Apache committer for Helix and Twill. Poorna Chandra
>> is an Apache committer for Twill. Gary Helmling is a committer for
>> Apache Twill and a committer and PMC member for Apache HBase.
>> James Taylor is PMC chair for Apache Phoenix, PMC member of Apache
>> Calcite, and an IPMC member.
>>
>> == Homogeneous Developers ==
>>
>> The current core developers are all Cask Data employees. However, we
>> intend to establish a developer community that includes independent and
>> corporate contributors. We are encouraging new contributors via our
>> mailing lists, public presentations, and personal contacts, and we will
>> continue to do so.
>>
>> Apache Phoenix developers have already contributed several patches to
>> Tephra, and have expressed interest in becoming long term contributors.
>>
>> == Reliance on Salaried Developers ==
>>
>> Currently, these developers are paid to work on Tephra. Once the project
>> has built a community, we expect to attract committers, developers and
>> community other than the current core developers. However, because Cask
>> Data products use Tephra internally, the reliance on salaried developers
>> is unlikely to change, at least in the near term.
>>
>> == Relationships with Other Apache Products ==
>>
>> Tephra is deeply integrated with Apache projects. Tephra provides
>> transactions over Apache HBase, and uses Apache Twill and Apache
>> Zookeeper for coordination.
>> A number of other Apache projects are Tephra dependencies, and are
>> listed in the External Dependencies section.
>>
>> In addition, Apache Phoenix is using Tephra as the transaction engine.
>>
>> == An Excessive Fascination with the Apache Brand ==
>>
>> While we respect the reputation of the Apache brand and have no doubt that
>> it will attract contributors and users, our interest is primarily to give
>> Tephra a solid home as an open source project following an established
>> development model.
>> We have also given additional reasons in the Rationale
>> and Alignment sections.
>>
>> = Documentation =
>>
>> The current documentation for Tephra is at
>> https://github.com/caskdata/tephra.
>>
>> = Initial Source =
>>
>> Tephra codebase is currently hosted at
>> https://github.com/caskdata/tephra.
>>
>> = Source and Intellectual Property Submission Plan =
>>
>> Tephra codebase is currently licensed under Apache 2.0 license.
>> Cask Data owns the trademark for "Tephra". As part of the incubation
>> process Cask Data will transfer the trademark to Apache Foundation.
>>
>> = External Dependencies =
>>
>> The dependencies all have Apache-compatible licenses:
>>  * dropwizard metrics (Apache 2.0)
>>  * fastutil (Apache 2.0)
>>  * gson (Apache 2.0)
>>  * guava-libraries (Apache 2.0)
>>  * guice (Apache 2.0)
>>  * hadoop (Apache 2.0)
>>  * hbase (Apache 2.0)
>>  * hdfs (Apache 2.0)
>>  * junit (EPL v1.0)
>>  * logback (EPL v1.0)
>>  * slf4j (MIT)
>>  * thrift (Apache 2.0)
>>  * twill (Apache 2.0)
>>  * zookeeper (Apache 2.0)
>>
>> = Cryptography =
>>
>> Tephra does not use cryptography itself, however it can run on secure
>> Hadoop, which uses Kerberos.
>>
>> = Required Resources =
>>
>> == Mailing Lists ==
>>
>>  * tephra-private for private PMC discussions (with moderated
>>    subscriptions)
>>  * tephra-dev for technical discussions among contributors
>>  * tephra-commits for notification about commits
>>
>> == Subversion Directory ==
>>
>> Git is the preferred source control system: git://git.apache.org/tephra
>>
>> == Issue Tracking ==
>>
>> JIRA Tephra (TEPHRA)
>>
>> == Other Resources ==
>>
>> The existing code already has unit tests, so we would like a Hudson
>> instance to run them whenever a new patch is submitted. This can be added
>> after project creation.
>>
>> = Initial Committers =
>>
>>  * Andreas Neumann <anew at apache dot org>
>>  * Terence Yim <chtyim at apache dot org>
>>  * Poorna Chandra <poorna at apache dot org>
>>  * Gokul Gunasekaran <gokul at cask dot co>
>>  * James Taylor <jamestaylor at apache dot org>
>>  * Thomas D'Silva <tdsilva at apache dot org>
>>  * Gary Helmling <garyh at apache dot org>
>>
>> = Affiliations =
>>
>>  * Andreas Neumann (Cask Data)
>>  * Terence Yim (Cask Data)
>>  * Poorna Chandra (Cask Data)
>>  * Gokul Gunasekaran (Cask Data)
>>  * James Taylor (Salesforce.com)
>>  * Thomas D'Silva (Salesforce.com)
>>  * Gary Helmling (Facebook)
>>
>> = Sponsors =
>>
>> == Champion ==
>>
>> James Taylor <jamestaylor at apache dot org> (V.P., Apache Phoenix)
>>
>> == Nominated Mentors ==
>>
>>  * James Taylor <jamestaylor at apache dot org>
>>  * Lars Hofhansl <larsh at apache dot org>
>>  * Andrew Purtell <apurtell at apache dot org>
>>  * Alan Gates <gates at apache dot org>
>>  * Henry Saputra <hsaputra at apache dot org>
>>
>> == Sponsoring Entity ==
>>
>> We are requesting that the Incubator sponsor this project.
>>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>