This looks pretty good. I can also join as a mentor.

Enis

On Sun, May 10, 2015 at 8:06 PM, Stack <st...@duboce.net> wrote:

> On Sun, May 10, 2015 at 7:13 PM, Konstantin Boudnik <c...@apache.org> wrote:
>
> > I think it'd be great to have a SQL platform for Hadoop
> >
> > +1
> >
> > I am mentoring 4 projects at the moment, but if you need a 1/2 time mentor - count me in ;)
> >
> > Cos
>
> We'll take you up on your kind offer if we can't get someone less loaded.
>
> Thanks Cos,
> St.Ack
>
> > On Fri, May 08, 2015 at 02:59PM, Stack wrote:
> > > I would like to start up a discussion on Trafodion joining the ASF as an incubating project.
> > >
> > > Trafodion is a webscale SQL-on-Hadoop solution that enables transactional or operational workloads on Hadoop.
> > >
> > > The proposal is available on the wiki here:
> > > https://wiki.apache.org/incubator/TrafodionProposal#preview
> > >
> > > The proposal text is also attached to the end of this email.
> > >
> > > Trafodion is a rich, storied SQL engine that has recently been ported to run on HBase and Hadoop. I think it would make for a fine addition to the Apache family of projects. It would be good to hear what others think.
> > >
> > > Thank you in advance for giving the proposal a read.
> > >
> > > Yours,
> > > St.Ack
> > >
> > >
> > > Trafodion Apache Incubator Proposal
> > >
> > > Abstract
> > >
> > > Trafodion is a webscale SQL-on-Hadoop solution enabling transactional or operational workloads on Hadoop.
> > >
> > > Proposal
> > >
> > > Apache Trafodion builds on the scalability, elasticity, and flexibility of Hadoop. Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop.
> > > Key features of Apache Trafodion include:
> > >
> > > * Full-functioned ANSI SQL language support
> > > * JDBC/ODBC connectivity for Linux/Windows clients
> > > * Distributed ACID transaction protection across multiple statements, tables and rows
> > > * Performance improvements for OLTP workloads with compile-time and run-time optimizations
> > > * Support for large data sets using a parallel-aware query optimizer
> > > * ANSI SQL security and data integrity constraints including referential integrity
> > >
> > > Hewlett-Packard Company submits this proposal to donate its Apache License, Version 2.0 open source project known as Trafodion, its source code, documentation, and web site content to the Apache Software Foundation in order to build an open source community.
> > >
> > > Background
> > >
> > > Trafodion is an open source project sponsored by HP, incubated at HP Labs and HP-IT, to develop an enterprise-class SQL-on-Hadoop solution targeting big data transactional or operational workloads. HP publicly announced the open source project and uploaded the source code to GitHub in June 2014.
> > >
> > > The SQL compiler, optimizer and executor components of Trafodion have a rich heritage. Under development since 1993, they were released as commercial closed source software in various flavors such as HP NonStop SQL/MX and HP Neoview. NonStop SQL/MX was designed for online transaction processing on HP’s NonStop (formerly Tandem) fault-tolerant servers and is known for its high availability, scalability, and performance. Hundreds of companies and thousands of servers are running mission-critical applications today on NonStop SQL/MX. In addition, many of these components run inside HP today as the core of its Enterprise Data Warehouse (EDW), managing over a PB of data.
> > >
> > > Starting in 2013, the software was modified to run on HBase and a new distributed transaction manager was written to run as an HBase co-processor.
> > >
> > > Unlike most NoSQL and other SQL-on-Hadoop open source projects, Trafodion provides comprehensive ANSI SQL language support including full-functioned data definition (DDL), data manipulation (DML), transaction control (TCL) and database utility support.
> > >
> > > Trafodion provides comprehensive and standard SQL data manipulation support including SELECT, INSERT, UPDATE, DELETE, and UPSERT/MERGE syntax with language options including join variants, unions, where predicates, aggregations (group by and having), sort ordering, sampling, correlated and nested sub-queries, cursors, and many SQL functions.
> > >
> > > Utilities are provided for updating the table statistics that the optimizer uses to cost plan alternatives (i.e. selectivity/cardinality estimates), for displaying the chosen SQL execution plan, plan shaping, backing up and restoring the database, data loading and unloading, and a command line utility for interfacing with the database engine.
> > >
> > > Explicit control statements are provided to allow applications to define transaction boundaries and to abort transactions when warranted, including BEGIN WORK, COMMIT WORK, ROLLBACK WORK and SET TRANSACTION.
> > >
> > > Trafodion supports ANSI’s grant/revoke semantics to define user and role privileges in terms of managing and accessing the database objects.
> > >
> > > Rationale
> > >
> > > The name “Trafodion” (the Welsh word for transactions, pronounced “Tra-vod-eee-on”) was chosen specifically to emphasize the differentiation that Trafodion provides in closing a critical gap in the Hadoop ecosystem.
> > > Trafodion builds on the scalability, elasticity, and flexibility of Hadoop.
> > > Trafodion extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop.
> > >
> > > Current Status
> > >
> > > HP released the Trafodion code under the Apache License, Version 2, in June of 2014. Since that time, we have had one major release in January 2015 and one minor release in April 2015. The focus of these releases has been on getting our base functionality, including security, working on top of Apache HBase, as well as improving performance, availability and scalability, and integrating better with HBase.
> > >
> > > Meritocracy
> > >
> > > We want to build a diverse developer community, based on the Apache Way, around Trafodion. To help developers become contributors, we have documentation on the wiki about the architecture, the source tree structure, and an example enhancement. We plan to publish our project backlog to the community, specifically highlighting areas where developers new to Trafodion may best start contributing, such as extending the database functionality with User Defined Routines (UDRs) and integrating with other Apache projects in the Hadoop ecosystem.
> > >
> > > Community
> > >
> > > We have already begun building a community, but at this time the community consists only of Trafodion developers (all HP employees) and prospective users. We have participated in and hosted HBase Meetups and intend to ramp up our community building efforts.
> > >
> > > The Trafodion project has seen interest in China, where HP has conducted proofs of concept with multiple companies and expects to see some of its first commercial deployments. To help recruit contributors and users in China, members of the team are translating Trafodion wiki content into Mandarin.
> > >
> > > Core Developers
> > >
> > > The core developers are very experienced in database and transaction monitor technology, with many having spent more than 20 years working in this space.
> > >
> > > Alignment
> > >
> > > Apache Trafodion relies on Apache HBase as its storage engine. The development team has collaborated with and gained valuable advice from working with the Apache HBase core developers. Apache Trafodion has federation capabilities as well, and can query Trafodion tables stored in HBase, native HBase tables, and Apache Hive tables.
> > >
> > > Known Risks
> > >
> > > Orphaned Products
> > >
> > > HP Labs and HP-IT have been incubating Trafodion development for almost two years. This is part of HP’s strategy to leverage its investment in database software and bring software to market as open source and is similar to HP’s efforts with OpenStack. Trafodion builds on HP’s equity investment in the Hadoop ecosystem and its efforts to monetize Hadoop through hardware, software, and services. HP wants Trafodion to be successful, as HP will offer a commercially supported distribution of Trafodion.
> > >
> > > Inexperience with Open Source
> > >
> > > We have been working with open source software in building closed source software for well over two decades. To help transition to doing open source development, the development team received guidance and best practices from HP developers working on OpenStack open source projects, many of whom have experience working on Apache and other open source projects as well. Since releasing Trafodion as an open source project in June of 2014, the committers and contributors have moved forward using open source development processes and tools for bug tracking and design blueprints and Jenkins for continuous integration.
> > > As part of the incubation process, we recognize we may need to change some of our development processes/tools and conduct our discussions using Apache mailing lists.
> > >
> > > Homogeneous Developers
> > >
> > > Since the initial development of Trafodion has been supported by HP, all of the current developers are HP employees. Through the support of the Apache incubation project, we aim to expand the list of developers and gain contributors from related SQL-on-Hadoop projects and the Apache HBase project. Trafodion developers are experienced with distributed development processes, being primarily based in Palo Alto, CA; Austin, TX; and Shanghai, China. Trafodion is written in C++ and Java.
> > >
> > > Reliance on Salaried Developers
> > >
> > > Currently all of the developers working on the project are paid by their employer to work on the project. These developers will work on the open source project as well as work on the commercially supported distribution of Trafodion that HP will offer.
> > >
> > > Relationship with Other Apache Products
> > >
> > > Trafodion is built upon Apache HBase and extends it to support ACID transactions with HBase co-processors for distributed transaction management and recovery. Trafodion envisions future collaborations with the Apache HBase project on performance optimizations, such as in the areas of mixed workload support, High Availability, etc. It also provides transactional support for, and querying of, native HBase tables.
> > >
> > > Trafodion uses Apache ZooKeeper to coordinate and manage the distribution of connection services across the cluster for load-balancing and high availability reconnection purposes in the event a Trafodion process should fail.
> > >
> > > Trafodion also envisions working with the Apache Ambari project on enabling better Trafodion manageability. While Ambari focuses on system and component level performance metrics, Trafodion manageability will focus in a complementary way on database workload monitoring and performance analytics with capabilities more geared towards database administrators.
> > >
> > > There are alternative open source projects that are providing SQL-on-Hadoop capabilities, such as Apache Hive, Apache Drill, and Apache Phoenix. These are more focused on reporting and analytics across data structures supported on HDFS. In comparison to all of these technologies, Trafodion provides a very complete implementation of ANSI SQL, one of the most sophisticated optimizers for such workloads, a completely parallel data flow architecture that does not materialize intermediate results unless necessary, full ACID transactional support, ANSI GRANT/REVOKE security, and other capabilities that would take decades to build in these products. On the other hand, Trafodion is currently focused only on HBase and querying Hive, whereas Hive and Drill provide access to other data formats in HDFS.
> > >
> > > An Excessive Fascination with the Apache Brand
> > >
> > > We understand the reputation and value of the Apache brand, and no doubt believe that it will help us attract contributors and users. Our primary goal is to follow a proven, open source development and community building model that will make Trafodion successful and enable better collaboration with other Apache projects in the Hadoop ecosystem. We also understand the rules and guidelines about the use of the Apache brand and intend to follow them.
> > >
> > > Documentation
> > >
> > > Documentation and technical details on Trafodion can be found at:
> > > http://www.trafodion.org/
> > >
> > > Initial Source
> > >
> > > The source is available today in a public GitHub repository:
> > > https://github.com/trafodion/trafodion.
> > >
> > > Source and Intellectual Property Submission Plan
> > >
> > > The source code has already been released under the Apache License, Version 2. The manuals have been released in Adobe PDF format. As part of the submission process, the source for the manuals will be converted from a proprietary DocBook XML format to AsciiDoc.
> > >
> > > External Dependencies
> > >
> > > Two dependencies do not have Apache-compatible licenses and will be addressed as we enter incubation. One dependency is log4cpp, which is licensed under the LGPL. A compatible alternative might be Apache incubator project log4cxx. The other dependency is unixODBC, which is used as the ODBC driver manager. We will look into how Apache Hive manages being able to use this incompatible software and do likewise. All other dependencies have Apache-compatible licenses, including Apache 2.0, MIT/X11, MIT, and BSD.
> > >
> > > Cryptography
> > >
> > > Trafodion does not contain any cryptographic code. It does call cryptographic libraries: OpenSSL for C++ code and Java Cryptography Extension (JCE) for Java code.
> > >
> > > Required Resources
> > >
> > > Mailing Lists
> > >
> > > priv...@trafodion.incubator.apache.org
> > > d...@trafodion.incubator.apache.org
> > > comm...@trafodion.incubator.apache.org
> > >
> > > Git Repository
> > >
> > > https://git-wip-us.apache.org/repos/asf/incubator-trafodion.git
> > >
> > > Issue Tracking
> > >
> > > JIRA: JIRA Trafodion (Trafodion)
> > >
> > >
> > > Initial Committers and Affiliation
> > >
> > > Dave Birdsall, Hewlett-Packard Company, Dave.Birdsall<AT>hp<DOT>com
> > > Matt Brown, Hewlett-Packard Company, mattbrown<AT>hp<DOT>com
> > > Tharak Capirala, Hewlett-Packard Company, Tharak.Capirala<AT>hp<DOT>com
> > > Alice Chen, Hewlett-Packard Company, Alice.Chen<AT>hp<DOT>com
> > > John DeRoo, Hewlett-Packard Company, John.Deroo<AT>hp<DOT>com
> > > Roberta Marton, Hewlett-Packard Company, Roberta.Marton<AT>hp<DOT>com
> > > Amanda Moran, Hewlett-Packard Company, Amanda.Kay.Moran<AT>hp<DOT>com
> > > Suresh Subbiah, Hewlett-Packard Company, Suresh.Subbiah<AT>hp<DOT>com
> > > Sandhya Sundaresan, Hewlett-Packard Company, Sandhya.Sundaresan<AT>hp<DOT>com
> > >
> > > Sponsors
> > >
> > > Champion
> > >
> > > Michael Stack, Stack<AT>apache<DOT>org
> > >
> > > Nominated Mentors
> > >
> > > Michael Stack, Stack<AT>apache<DOT>org
> > > Roman Shaposhnik, rshaposhnik<AT>pivotal<DOT>io
> > >
> > > We are seeking additional mentors.
> > >
> > > Sponsoring Entity
> > >
> > > Apache Incubator PMC
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@incubator.apache.org