Thank you Sharad!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Sharad Agarwal <sha...@apache.org>
Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>, "sha...@apache.org" <sha...@apache.org>
Date: Friday, September 19, 2014 8:59 PM
To: "general@incubator.apache.org" <general@incubator.apache.org>
Subject: Re: [PROPOSAL] Grill as new Incubator project

>Chris,
>Multi-dimensional here is in the context of OLAP cube ->
>http://en.wikipedia.org/wiki/OLAP_cube
>The Grill data model consists of a set of measures which can be analysed on
>different dimensions.
>For remote sensing, data can be modelled as a cube -> measurements on various
>sets of attributes (dimensions) as Facts; and time and space can be thought
>of as dimensions.
>Yes, it supports numerical data.
>
>
>Ted,
>Both are in the same general area, but I think there is very little chance of
>confusion as clearly their propositions are completely different. And both
>words are simple and widely used nouns.
>We liked the name Grill as it is simple to spell and pronounce, and in some
>way conveys the project's meaning -> to question intensely.
>
>Thanks,
>Sharad
>
>On Sat, Sep 20, 2014 at 12:11 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
>> There is a strong phonetic similarity to Apache Drill, a project in the
>> same general domain.
>>
>> Is the Grill name already baked in (pun intended)?
>>
>>
>> On Fri, Sep 19, 2014 at 7:24 AM, Mattmann, Chris A (3980) <
>> chris.a.mattm...@jpl.nasa.gov> wrote:
>>
>> > Thank you Sharad. So I could use this system for remote sensing
>> > data, like 3-dimension (time, space, and measurement) type of cubes?
>> > Does it support numerical data well?
>> >
>> > Sorry for so many questions just excited :)
>> >
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > Chris Mattmann, Ph.D.
>> > Chief Architect
>> > Instrument Software and Science Data Systems Section (398)
>> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> > Office: 168-519, Mailstop: 168-527
>> > Email: chris.a.mattm...@nasa.gov
>> > WWW: http://sunset.usc.edu/~mattmann/
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > Adjunct Associate Professor, Computer Science Department
>> > University of Southern California, Los Angeles, CA 90089 USA
>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> >
>> >
>> > -----Original Message-----
>> > From: Sharad Agarwal <sha...@apache.org>
>> > Reply-To: "sha...@apache.org" <sha...@apache.org>
>> > Date: Friday, September 19, 2014 4:06 AM
>> > To: Chris Mattmann <chris.a.mattm...@jpl.nasa.gov>
>> > Cc: "general@incubator.apache.org" <general@incubator.apache.org>
>> > Subject: Re: [PROPOSAL] Grill as new Incubator project
>> >
>> > >Chris, Thanks for your comments.
>> > >
>> > >
>> > >The differences that I see are:
>> > >- SciDB exposes Array Data model and Array Query Language (AQL). Grill
>> > >data model is based on OLAP Fact and Dimensions. Grill exposes SQL like
>> > >language (a subset of Hive QL) that works on *logical* entities (facts,
>> > >dimensions)
>> > >
>> > >
>> > >- The goal of Grill is not to build a new query execution database, but
>> > >to unify them by having a central metadata catalog, and provide a Cube
>> > >abstraction layer on top of it.
>> > >
>> > >
>> > >Thanks,
>> > >Sharad
>> > >
>> > >
>> > >On Fri, Sep 19, 2014 at 9:34 AM, Mattmann, Chris A (3980)
>> > ><chris.a.mattm...@jpl.nasa.gov> wrote:
>> > >
>> > >This sounds super cool!
>> > >
>> > >How does this relate to SciDB? is it trying to do a similar thing?
>> > >
>> > >Cheers,
>> > >Chris
>> > >
>> > >
>> > >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > >Chris Mattmann, Ph.D.
>> > >Chief Architect
>> > >Instrument Software and Science Data Systems Section (398)
>> > >NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> > >Office: 168-519, Mailstop: 168-527
>> > >Email: chris.a.mattm...@nasa.gov
>> > >WWW: http://sunset.usc.edu/~mattmann/
>> > >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > >Adjunct Associate Professor, Computer Science Department
>> > >University of Southern California, Los Angeles, CA 90089 USA
>> > >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> > >
>> > >
>> > >-----Original Message-----
>> > >From: Sharad Agarwal <sha...@apache.org>
>> > >Reply-To: "general@incubator.apache.org" <general@incubator.apache.org>,
>> > >"sha...@apache.org" <sha...@apache.org>
>> > >Date: Thursday, September 18, 2014 8:54 PM
>> > >To: "general@incubator.apache.org" <general@incubator.apache.org>
>> > >Subject: [PROPOSAL] Grill as new Incubator project
>> > >
>> > >>Grill Proposal
>> > >>==========
>> > >>
>> > >># Abstract
>> > >>
>> > >>Grill is a platform that enables multi-dimensional queries in a unified
>> > >>way over datasets stored in multiple warehouses. Grill integrates Apache
>> > >>Hive with other data warehouses by tiering them together to form logical
>> > >>data cubes.
>> > >>
>> > >>
>> > >># Proposal
>> > >>
>> > >>Grill provides a unified Cube abstraction for data stored in different
>> > >>stores. Grill tiers multiple data warehouses for unified representation
>> > >>and efficient access. It provides SQL-like Cube query language to query
>> > >>and describe data sets organized in data cubes. It enables users to run
>> > >>queries against Facts and Dimensions that can span multiple physical
>> > >>tables stored in different stores.
>> > >>
>> > >>The primary use cases that Grill aims to solve:
>> > >>- Facilitate analytical queries by providing the OLAP like Cube
>> > >>abstraction
>> > >>- Data Discovery by providing single metadata layer for data stored in
>> > >>different stores
>> > >>- Unified access to data by integrating Hive with other traditional data
>> > >>warehouses
>> > >>
>> > >>
>> > >># Background
>> > >>
>> > >>Apache Hive is a data warehouse that facilitates querying and managing
>> > >>large datasets stored in distributed storage systems like HDFS. It
>> > >>provides SQL like language called HiveQL aka HQL. Apache Hive is a widely
>> > >>used platform in various organizations for doing adhoc analytical queries.
>> > >>In a typical Data warehouse scenario, the data is multi-dimensional and
>> > >>organized into Facts and Dimensions to form Data Cubes. Grill provides
>> > >>this logical layer to enable querying and manage data as Cubes.
>> > >>The Grill project is actively being developed at InMobi to provide the
>> > >>higher level of analytical abstraction to query data stored in different
>> > >>storages including Hive and beyond seamlessly.
>> > >>
>> > >>
>> > >># Rationale
>> > >>
>> > >>The Grill project aims to ease the analytical querying capabilities and
>> > >>cut the data-silos by providing a single view of data across multiple
>> > >>data stores.
>> > >>Conceiving data as a cube with hierarchical dimensions leads to
>> > >>conceptually straightforward operations to facilitate analysis.
>> > >>Integrating Apache Hive with other traditional warehouses provides the
>> > >>opportunity to optimize on the query execution cost by tiering the data
>> > >>across multiple warehouses. Grill provides
>> > >>- Access to data Cubes via Cube Query language similar to HiveQL.
>> > >>- Driver based architecture to allow for plugging systems like Hive and
>> > >>other warehouses such as columnar data RDBMS.
>> > >>- Cost based engine selection that provides optimal use of resources by
>> > >>selecting the best execution engine for a given query.
>> > >>
>> > >>In a typical Data warehouse, data is organized in Cubes with multiple
>> > >>dimensions and measures. This facilitates the analysis by conceiving the
>> > >>data in terms of Facts and Dimensions instead of physical tables. Grill
>> > >>aims to provide this logical Cube abstraction on Data warehouses like
>> > >>Hive and other traditional warehouses.
>> > >>
>> > >>
>> > >># Initial Goals
>> > >>
>> > >>- Donate the Grill source code and documentation to Apache Software
>> > >>Foundation
>> > >>- Build a user and developer community
>> > >>- Support Hive and other Columnar data warehouses
>> > >>- Support full query life cycle management
>> > >>- Add authentication for querying cubes
>> > >>- Provide detailed query statistics
>> > >>
>> > >>
>> > >># Long Term Goals
>> > >>
>> > >>Here are some longer-term capabilities that would be added to Grill
>> > >>- Add authorization for managing and querying Cubes
>> > >>- Provide REST and CLI for full Admin controls
>> > >>- Capability to schedule queries
>> > >>- Query caching
>> > >>- Integrate with Apache Spark. Creating Spark RDD from Grill query
>> > >>- Integrate with Apache Optiq
>> > >>
>> > >>
>> > >># Current Status
>> > >>
>> > >>The project is actively developed at InMobi. The first version is
>> > >>deployed at InMobi 4 months back. This version allows querying dimension
>> > >>and fact data stored in Hive over CLI. The source code and documentation
>> > >>is hosted at GitHub.
>> > >>
>> > >>## Meritocracy
>> > >>
>> > >>We intend to build a diverse developer and user community for the project
>> > >>following the Apache meritocracy model. We want to encourage contributors
>> > >>from multiple organizations, provide plenty of support to new developers
>> > >>and welcome them to be committers.
>> > >>
>> > >>## Community
>> > >>
>> > >>Currently the project is being developed at InMobi. We hope to extend our
>> > >>contributor and user base significantly in the future and build a solid
>> > >>open source community around Grill.
>> > >>Core Developers
>> > >>Grill is currently being developed by Amareshwari Sriramadasu, Sharad
>> > >>Agarwal and Jaideep Dhok from InMobi, and Sreekanth Ramakrishnan who is
>> > >>currently employed by SoftwareAG. Raghavendra Singh from InMobi has built
>> > >>the QA automation for Grill.
>> > >>
>> > >>## Alignment
>> > >>
>> > >>The ASF is a natural home to Grill as it is for Apache Hadoop, Apache
>> > >>Hive, Apache Spark and other emerging projects in Big Data space.
>> > >>We believe in any enterprise, multiple data warehouses will co-exist, as
>> > >>not all workloads are cost effective to run on a single one. Apache Hive
>> > >>is one of the crucial data warehouses along with upcoming projects like
>> > >>Apache Spark in the Hadoop ecosystem. Grill will benefit from working in
>> > >>close proximity with these projects.
>> > >>The traditional Columnar data warehouses complement Apache Hive as
>> > >>certain workloads continue to be cost effective to run in traditional
>> > >>columnar data warehouses. Having multiple data warehouses leads to data
>> > >>silos that Grill aims to cut within the enterprise and provide a holistic
>> > >>unified access to data.
>> > >>
>> > >>
>> > >># Known Risks
>> > >>
>> > >>## Orphaned products & Reliance on Salaried Developers
>> > >>
>> > >>There is little risk of Grill getting orphaned, as Grill is a key part of
>> > >>the Data Platform stack at InMobi. The core Grill developers plan to work
>> > >>on it full-time. We think Grill will bring value in the Big Data space
>> > >>and we plan to grow the community of users and contributors.
>> > >>
>> > >>## Inexperience with Open Source
>> > >>
>> > >>All the core developers have long and significant experience in Apache
>> > >>projects and Hadoop ecosystem. Amareshwari Sriramadasu has long standing
>> > >>contributions to Apache Hadoop MapReduce and Apache Hive, she being PMC
>> > >>member of Hadoop and a committer of Hive. Sharad Agarwal is a PMC member
>> > >>of Hadoop and contributed to Hadoop YARN and Hadoop MapReduce. Srikanth
>> > >>Sundarrajan is a PMC member of Apache Falcon. Sreekanth Ramakrishnan is
>> > >>committer of Apache Hadoop. Jaideep Dhok has contributed patches to
>> > >>Apache Hive. Gunther is a PMC member of Apache Hive. Vikram is a
>> > >>committer of Apache Hive.
>> > >>
>> > >>## Homogeneous Developers
>> > >>
>> > >>The initial developers are employed by Hortonworks, InMobi and
>> > >>SoftwareAG. We are committed to recruiting additional committers from
>> > >>other companies based on their contribution to the project.
>> > >>
>> > >>## Reliance on Salaried Developers
>> > >>
>> > >>The majority of initial committers are paid by their employer to
>> > >>contribute to the project and few are contributing in their spare time.
>> > >>Once the project has a community built, we are committed to recruit
>> > >>committers and developers from outside the current core developers.
>> > >>
>> > >>## Relationships with Other Apache Products
>> > >>
>> > >>Grill is deeply integrated with other Apache projects. Grill uses and
>> > >>extends Apache Hive HCatalog to store and manage the Data cubes. It uses
>> > >>HDFS and Hive session management libraries. Grill has the driver-based
>> > >>architecture that allows for adding multiple execution drivers. Apart
>> > >>from integrating Apache Hive, it can be integrated with Apache Spark over
>> > >>Spark SQL or Shark, Apache Drill, Apache Tajo and Apache Phoenix.
>> > >>In future we want to use Apache Optiq in Grill for query optimization and
>> > >>cost based driver selection.
>> > >>
>> > >>## An Excessive Fascination with the Apache Brand
>> > >>
>> > >>The project is conceived from beginning to be in line with the Apache
>> > >>philosophy. As the core developers have good experience with Apache, the
>> > >>source code organization, build, review and commit process are highly
>> > >>influenced by Apache. We believe that Apache will be a solid home for
>> > >>Grill to grow and build the open source community. We have also described
>> > >>the reasons in the Rationale and Alignment sections.
>> > >>
>> > >>
>> > >># Documentation
>> > >>
>> > >>http://inmobi.github.io/grill/
>> > >>
>> > >>
>> > >># Initial Source
>> > >>
>> > >>The source is currently in github repository at:
>> > >>https://github.com/inmobi/grill
>> > >>
>> > >>
>> > >># Source and Intellectual Property Submission Plan
>> > >>
>> > >>The complete Grill code is already under Apache Software License 2.
>> > >>
>> > >>
>> > >># External Dependencies
>> > >>
>> > >>The dependencies all have Apache compatible licenses. These include
>> > >>Apache 2.0, BSD, MIT, EPL and CDDL licensed dependencies.
>> > >>
>> > >>
>> > >># Cryptography
>> > >>
>> > >>None
>> > >>
>> > >>
>> > >># Required Resources
>> > >>
>> > >>## Mailing lists
>> > >>
>> > >>grill-dev AT incubator DOT apache DOT org
>> > >>grill-commits AT incubator DOT apache DOT org
>> > >>grill-private AT incubator DOT apache DOT org
>> > >>
>> > >>## Subversion Directory
>> > >>
>> > >>Git is the preferred source control system:
>> > >>git://git.apache.org/incubator-grill <http://git.apache.org/incubator-grill>
>> > >>
>> > >>## Issue Tracking
>> > >>
>> > >>JIRA Grill (GRILL)
>> > >>
>> > >>
>> > >># Initial Committers
>> > >>
>> > >>Amareshwari Sriramadasu (amareshwari AT apache DOT org)
>> > >>Gunther Hagleitner (gunther AT apache DOT org)
>> > >>Jaideep Dhok (jaideep.dhok AT Inmobi DOT com)
>> > >>Raghavendra Singh (raghavendra.singh AT Inmobi DOT com)
>> > >>Sharad Agarwal (sharad AT apache DOT org)
>> > >>Sreekanth Ramakrishnan (sreekanth AT apache DOT org)
>> > >>Srikanth Sundarrajan (sriksun AT apache DOT org)
>> > >>Suma Shivaprasad (suma.shivaprasad AT Inmobi DOT com)
>> > >>Vikram Dixit (vikram AT apache DOT org)
>> > >>
>> > >>
>> > >># Affiliations
>> > >>
>> > >>Amareshwari SR (InMobi)
>> > >>Gunther Hagleitner (Hortonworks)
>> > >>Jaideep Dhok (InMobi)
>> > >>Raghavendra Singh (InMobi)
>> > >>Sharad Agarwal (InMobi)
>> > >>Sreekanth Ramakrishnan (SoftwareAG)
>> > >>Srikanth Sundarrajan (InMobi)
>> > >>Suma Shivaprasad (InMobi)
>> > >>Vikram Dixit (Hortonworks)
>> > >>
>> > >>
>> > >># Sponsors
>> > >>
>> > >>## Champion
>> > >>
>> > >>Vinod K <vinodkv AT apache DOT org> (Apache Member)
>> > >>
>> > >>## Nominated Mentors
>> > >>
>> > >>Chris Douglas (Microsoft)
>> > >>Jacob Homan (Microsoft)
>> > >>Jean Baptiste Onofre (Talend)
>> > >>Vinod K (Hortonworks)
>> > >>
>> > >>## Sponsoring Entity
>> > >>
>> > >>Incubator PMC
>> > >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe,
e-mail: general-unsubscr...@incubator.apache.org
>> > For additional commands, e-mail: general-h...@incubator.apache.org
>> >
>>