Ah sorry, I did not mean "asking to update", I meant "proposing to update".
Thanks,

- Henry

On Fri, May 2, 2014 at 11:20 AM, Henry Saputra <henry.sapu...@gmail.com> wrote:
> Hi Ashutosh,
>
> Since there was a question/comment about relationship with Apache
> MetaModel, I am asking to update the proposal to include this
> discussion in either "Relationships with Other Apache Products" or
> "Alignment" section before going for a VOTE.
>
> Apache Slider did the same thing with relation to Apache Twill and
> Apache Helix projects.
>
> Thanks,
>
> - Henry
>
> On Wed, Apr 30, 2014 at 3:21 PM, Ashutosh Chauhan <hashut...@apache.org>
> wrote:
>> I would like to propose Optiq as an Apache Incubator project. I have
>> posted the proposal to https://wiki.apache.org/incubator/OptiqProposal and
>> posted the text of the proposal below.
>>
>> Ashutosh.
>>
>> = Optiq =
>> == Abstract ==
>>
>> Optiq is a framework that allows efficient translation of queries involving
>> heterogeneous and federated data.
>>
>> == Proposal ==
>>
>> Optiq is a highly customizable engine for parsing and planning queries on
>> data in a wide variety of formats. It allows database-like access, and in
>> particular a SQL interface and advanced query optimization, for data not
>> residing in a traditional database.
>>
>> == Background ==
>>
>> Databases were traditionally engineered in a monolithic stack, providing a
>> data storage format, data processing algorithms, query parser, query
>> planner, built-in functions, metadata repository and connectivity layer.
>> They innovate in some areas but rarely in all.
>>
>> Modern data management systems are decomposing that stack into separate
>> components, separating data, processing engine, metadata, and query
>> language support. They are highly heterogeneous, with data in multiple
>> locations and formats, caching and redundant data, different workloads, and
>> processing occurring in different engines.
>>
>> Query planning (sometimes called query optimization) has always been a key
>> function of a DBMS, because it allows the implementors to introduce new
>> query-processing algorithms, and allows data administrators to re-organize
>> the data without affecting applications built on that data. In a
>> componentized system, the query planner integrates the components (data
>> formats, engines, algorithms) without introducing unnecessary coupling or
>> performance tradeoffs.
>>
>> But building a query planner is hard; many systems muddle along without a
>> planner, and indeed a SQL interface, until the demand from their customers
>> is overwhelming.
>>
>> There is an opportunity to make this process more efficient by creating a
>> re-usable framework.
>>
>> == Rationale ==
>>
>> Optiq allows database-like access, and in particular a SQL interface and
>> advanced query optimization, for data not residing in a traditional
>> database. It is complementary to many current Hadoop and NoSQL systems,
>> which have innovative and performant storage and runtime systems but lack a
>> SQL interface and intelligent query translation.
>>
>> Optiq is already in use by several projects, including Apache Drill, Apache
>> Hive and Cascading Lingual, and commercial products.
>>
>> Optiq's architecture consists of:
>>
>> * An extensible relational algebra.
>> * SPIs (service-provider interfaces) for metadata (schemas and tables),
>> planner rules, statistics, cost-estimates, user-defined functions.
>> * Built-in sets of rules for logical transformations and common data-sources.
>> * Two query planning engines driven by rules, statistics, etc. One engine is
>> cost-based, the other rule-based.
>> * Optional SQL parser, validator and translator to relational algebra.
>> * Optional JDBC driver.
>>
>> == Initial Goals ==
>>
>> The initial goals are to move the existing codebase to Apache and
>> integrate with the Apache development process.
>> Once this is accomplished,
>> we plan for incremental development and releases that follow the Apache
>> guidelines.
>>
>> As we move the code into the org.apache namespace, we will restructure
>> components as necessary to allow clients to use just the components of
>> Optiq that they need.
>>
>> A version 1.0 release, including pre-built binaries, will foster wider
>> adoption.
>>
>> == Current Status ==
>>
>> Optiq has had over a dozen minor releases over the last 18 months. Its core
>> SQL parser and validator, and its planning engine and core rules, are
>> mature and robust and are the basis for several production systems; but
>> other components and SPIs are still undergoing rapid evolution.
>>
>> === Meritocracy ===
>>
>> We plan to invest in supporting a meritocracy. We will discuss the
>> requirements in an open forum. We encourage the companies and projects
>> using Optiq to discuss their requirements in an open forum and to
>> participate in development. We will encourage and monitor community
>> participation so that privileges can be extended to those that contribute.
>>
>> Optiq's pluggable architecture encourages developers to contribute
>> extensions such as adapters for data sources, new planning rules, and
>> better statistics and cost-estimation functions. We look forward to
>> fostering a rich ecosystem of extensions.
>>
>> === Community ===
>>
>> Building a data management system requires a high degree of technical
>> skill, and correspondingly, the community of developers directly using
>> Optiq is potentially fairly small, albeit highly technical and engaged. But
>> we also expect engagement from members of the communities of projects that
>> use Optiq, such as Drill and Hive. And we intend to structure Optiq so that
>> it can be used for lighter weight applications, such as providing a SQL and
>> JDBC interface to a NoSQL system.
>>
>> === Core Developers ===
>>
>> The developers on the initial committers list are all experienced open
>> source developers, and are actively using Optiq in their projects.
>>
>> * Julian Hyde is lead developer of Mondrian, an open source OLAP engine,
>> and an Apache Drill committer.
>> * Chris Wensel is lead developer of Cascading, and of Lingual, the SQL
>> interface to Cascading built using Optiq.
>> * Jacques Nadeau is lead developer of Apache Drill, which uses Optiq.
>>
>> In addition, there are several regular contributors who we hope will
>> graduate to committers during the incubation process.
>>
>> We realize that additional employer diversity is needed, and we will work
>> aggressively to recruit developers from additional companies.
>>
>> === Alignment ===
>>
>> Apache, and in particular the ecosystem surrounding Hadoop, contains
>> several projects for building data management systems that leverage each
>> other's capabilities. Optiq is a natural fit for that ecosystem, and will
>> help foster projects meeting new challenges.
>>
>> Optiq is already used by Apache Hive and Apache Drill; Optiq embeds Apache
>> Spark as an optional engine; we are in discussion with Apache Phoenix about
>> integrating JDBC and query planning.
>>
>> == Known Risks ==
>>
>> === Orphaned Products ===
>>
>> Optiq is already a key component in three independent projects, each backed
>> by a different company, so the risk of being orphaned is relatively low. We
>> plan to mitigate this risk by recruiting additional committers, and
>> promoting Optiq's adoption as a framework by other projects.
>>
>> === Inexperience with Open Source ===
>>
>> The initial committers are all Apache members, some of whom have several
>> years in the Apache Hadoop community. The founder of the project, Julian
>> Hyde, has been a founder and key developer in open source projects for over
>> ten years.
>>
>> === Homogenous Developers ===
>>
>> The initial committers are employed by a number of companies, including
>> Concurrent, Hortonworks, MapR Technologies and Salesforce.com. We are
>> committed to recruiting additional committers from outside these companies.
>>
>> === Reliance on Salaried Developers ===
>>
>> Like most open source projects, Optiq receives substantial support from
>> salaried developers. This is to be expected given that it is a highly
>> technical framework. However, they are all passionate about the project,
>> and we are confident that the project will continue even if no salaried
>> developers contribute to the project. As a framework, the project
>> encourages the involvement of members of other projects, and of academic
>> researchers. We are committed to recruiting additional committers including
>> non-salaried developers.
>>
>> === Relationships with Other Apache Products ===
>>
>> As mentioned in the Alignment section, Optiq is being used by Apache Hive
>> and Apache Drill, and has adapters for Apache Phoenix and Apache Spark.
>> Optiq often operates on data in a Hadoop environment, so collaboration with
>> other Hadoop projects is desirable and highly likely.
>>
>> === An Excessive Fascination with the Apache Brand ===
>>
>> Optiq solves a real problem, as evidenced by its take-up by other projects.
>> This proposal is not for the purpose of generating publicity. Rather, the
>> primary benefits to joining Apache are those outlined in the Rationale
>> section.
>>
>> == Documentation ==
>>
>> Additional documentation for Optiq may be found on its github site:
>>
>> * [[https://github.com/julianhyde/optiq/blob/master/README.md|Overview]]
>> * [[https://github.com/julianhyde/optiq-csv/blob/master/TUTORIAL.md|Tutorial]]
>> * [[https://github.com/julianhyde/optiq/blob/master/HOWTO.md|HOWTO]]
>> * [[https://github.com/julianhyde/optiq/blob/master/REFERENCE.md|Reference guide]]
>>
>> === Presentation: ===
>>
>> * [[https://github.com/julianhyde/share/blob/master/slides/optiq-richrelevance-2013.pdf?raw=true|SQL on Big Data using Optiq]]
>>
>> == Initial Source ==
>>
>> The initial codebase resides in three projects, all hosted on github:
>>
>> * https://github.com/julianhyde/optiq
>> * https://github.com/julianhyde/optiq-csv
>> * https://github.com/julianhyde/linq4j
>>
>> === Source and Intellectual Property Submission Plan ===
>>
>> The initial codebase is already distributed under the Apache 2.0 License.
>> The owners of the IP have indicated willingness to sign the SGA.
>>
>> === External Dependencies ===
>>
>> Optiq and Linq4j have the following external dependencies.
>>
>> * Java 1.6, 1.7 or 1.8
>> * Apache Maven, Commons
>> * JavaCC (BSD license)
>> * Sqlline 1.1.6 (BSD license)
>> * Junit 4.11 (EPL)
>> * Janino (BSD license)
>> * Guava (Apache 2.0 license)
>> * Eigenbase-resgen, eigenbase-xom, eigenbase-properties (Apache 2.0
>> license)
>>
>> Some of Optiq's adapters (optiq-csv, optiq-mongodb, optiq-spark,
>> optiq-splunk) are currently developed alongside core Optiq, and have the
>> following additional dependencies:
>>
>> * Open CSV 2.3 (Apache 2.0 license)
>> * Apache Incubator Spark
>> * Mongo Java driver (Apache 2.0 license)
>>
>> Upon acceptance to the incubator, we would begin a thorough analysis of all
>> transitive dependencies to verify this information and introduce license
>> checking into the build and release process by integrating with Apache Rat.
>>
>> === Cryptography ===
>>
>> Optiq will eventually support encryption on the wire. This is not one of
>> the initial goals, and we do not expect Optiq to be a controlled export
>> item due to the use of encryption.
>>
>> == Required Resources ==
>>
>> === Mailing Lists ===
>>
>> * priv...@optiq.incubator.apache.org
>> * d...@optiq.incubator.apache.org (will be migrated from
>> optiq-...@googlegroups.com)
>> * comm...@optiq.incubator.apache.org
>>
>> === Source control ===
>>
>> The Optiq team would like to use git for source control, due to our current
>> use of git/github. We request a writeable git repo
>> git://git.apache.org/incubator-optiq, and mirroring to be set up to github
>> through INFRA.
>>
>> === Issue Tracking ===
>>
>> Optiq currently uses the github issue tracking system associated with its
>> github repo: https://github.com/julianhyde/optiq/issues. We will migrate to
>> the Apache JIRA: http://issues.apache.org/jira/browse/OPTIQ.
>>
>> == Initial Committers ==
>>
>> * Julian Hyde (jhyde at apache dot org)
>> * Jacques Nadeau (jacques at apache dot org)
>> * James R. Taylor (jamestaylor at apache dot org)
>> * Chris Wensel (cwensel at apache dot org)
>>
>> === Affiliations ===
>>
>> The initial committers are employees of Concurrent, Hortonworks, MapR and
>> Salesforce.com.
>>
>> * Julian Hyde (Hortonworks)
>> * Jacques Nadeau (MapR Technologies)
>> * James R. Taylor (Salesforce.com)
>> * Chris Wensel (Concurrent)
>>
>> == Sponsors ==
>>
>> === Champion ===
>>
>> * Ashutosh Chauhan (hashutosh at apache dot org)
>>
>> === Nominated Mentors ===
>>
>> * Ted Dunning (tdunning at apache dot org) - Chief Application Architect
>> at MapR Technologies; committer for Lucene, Mahout and ZooKeeper.
>> * Alan Gates (gates at apache dot org) - Architect at Hortonworks;
>> committer for Pig, Hive and others.
>> * Steven Noels (stevenn at apache dot org) - Chief Technical Officer at
>> NGDATA; committer for Cocoon and Forrest, mentor for Phoenix.
>>
>> === Sponsoring Entity ===
>>
>> The Apache Incubator.