+1 (non-binding)

On Wed, Apr 26, 2017 at 11:13 PM Jeff Feng <jeff.f...@gmail.com> wrote:
> Hello everyone,
>
> Thank you for checking out our proposal on Superset and for your
> consideration for the Apache Incubator. So far, I believe we have 8
> binding votes and 2 non-binding votes.
>
> As Taylor mentioned earlier, we made a minor update to the wording in the
> "Source and Intellectual Property Submission Plan" section based on a
> suggestion by John Ament. The update was to help confirm the previously
> unstated assumption that we will submit an SGA. I have copied the updated
> proposal from the wiki to the email below and highlighted (in yellow) the
> new sentence in the document.
>
> Folks on the cc line who have already voted, please let us know if the
> change impacts your vote.
>
> Thank you all,
> Jeff
>
>
>
> = Superset =
>
> == Abstract ==
> Superset is an enterprise-ready web application for data exploration, data
> visualization and dashboarding.
>
> == Proposal ==
> Superset is business intelligence (BI) software that helps modern
> organizations visualize and interact with their data. Superset enables
> users to explore data from a variety of databases, assemble beautiful
> dashboards and share their findings. Superset works neatly with all modern
> SQL-speaking databases, and integrates with Druid.io to provide real-time,
> interactive, blazing fast data access to large datasets.
>
> == Background ==
> Data is mission critical. To succeed in this era, organizations need to
> provide low-friction, intuitive and interactive access to data. It is
> paramount for knowledge workers to be capable of answering their own
> questions by querying, exploring and visualizing data.
>
> The entire business intelligence industry has pivoted from a model of
> centralized top-down platforms driven by IT organizations to self-service
> analytics and agile workflows by any user. This shift unblocks centralized
> service bottlenecks for creating data visualizations while also creating an
> environment that is iterative and fast-moving.
> This means that business intelligence software must also be easy and
> delightful to use. Self-service analytics doesn’t mean that admin and
> governance features are not needed. Modern BI tools provide fine-grained
> access controls and auditing capabilities to understand how data is being
> used. Superset is a solution that delivers on all of these vectors.
>
> The technology stack is also constantly morphing - vendors are struggling
> to provide cheap, quick and easy solutions to access data. Business
> intelligence users are finding existing solutions lacking as these software
> products either disregard or react slowly to recent game-changing
> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> React.js and iPython’s Jupyter for instance.
>
> == Rationale ==
> Business intelligence is more relevant today than at any other point in
> history. Organizations are currently very limited in options for open
> source data visualization solutions, especially solutions that are both
> self-service and enterprise-ready. Every company informing their decisions
> with data needs a BI tool.
>
> We believe that Superset will be a strong complement to existing Apache
> Software Foundation technologies by offering scalable user interactions to
> distributed storage and computation solutions. Users will often find that
> Superset can act as a catalyst for tooling that can visualize the byproduct
> of data and computation infrastructure.
>
> Superset has many key design elements that help fill a gap in current
> solutions for organizations:
>  * Easy, low-friction access to data through a simple, web-based data
> exploration interface. Composing charts and dashboards is intuitive.
> Eliminating the need to write code or SQL empowers anyone to use it.
>  * Access to a wide array of rich, interactive data visualization types.
>  * Enterprise-ready: Integration with different authentication mechanisms
> and granular permissions centered around actions and data access.
>  * Realtime & fast: Superset provides realtime analytics at the speed of
> thought on very large datasets when integrated with Druid.io.
>  * Broad data access: Consume data out of any SQL-speaking relational
> database.
>  * Extensible: Can be extended to talk to many NoSQL databases like Apache
> Drill, Elastic Search, and other popular database engines.
>  * Fast loading dashboards with configurable web-scale caching.
>  * Plug-in framework that enables organizations to build custom analytical
> applications with new UI/UX interfaces.
>  * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> with more flexibility. SQL Lab integrates with the visualization engine
> seamlessly.
>
> == Initial Goals ==
> The initial goals of the Superset project are several-fold:
>  * Move the existing codebase to Apache and integrate with the Apache
> development process.
>  * Redesign the user interface and interaction model for creating
> visualizations/dashboards and connecting to data sources.
>  * Build robust support for security and governance of the tool, including
> popular authorization modules (including Apache Ranger and Apache Sentry)
> and a more sophisticated permissions system.
>  * Grow the extensibility of the project both in terms of enhanced
> connectivity to NoSQL-based data sources and creating a plug-in framework
> that enables organizations to build custom analytical applications which
> require a new UI/UX.
>
> == Current Status ==
> By many standards, Superset is already a successful open source project. As
> of March 2017, Superset is officially used in production at about a dozen
> companies, has received contributions from over one hundred contributors on
> Github, and has 1500+ forks and 12k+ stars.
>
> Sizeable companies like Airbnb, Yahoo!
> and Hortonworks have made significant contributions, and expressed their
> commitment to the project. The product is feature complete and has been
> viable for months. It already serves as the main interface for consuming
> data at many companies of different sizes.
>
> While the product is usable, there’s room for improvement across the board,
> starting with providing a smoother user experience around content creation,
> making sure all features work out-of-the-box on more platforms and
> databases, providing better user training guides and videos, having a
> predictable release process, and increasing the overall quality of the
> Superset releases.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have expressed interest in
> this project, and we intend to invite additional developers to participate.
> We will encourage and monitor community participation so that privileges
> can be extended to those that contribute.
>
> === Community ===
> The need for an enterprise-ready data visualization and exploration
> platform in the open source community is tremendous. While Superset is
> fairly well known, recognized and used within the Druid.io community,
> adoption is currently limited outside of that niche. There is a huge
> opportunity to grow the community to hundreds if not thousands of
> organizations, and we are hoping that embracing “the Apache way” will
> accelerate the growth of our community.
>
> We have already been active in seeking and inviting contributions, and are
> planning to scale the project by investing time and growing the support
> structure to grow the community.
>
> === Core Developers ===
> The initial committers for Superset include experienced full stack,
> front-end and data engineers:
>  * Maxime Beauchemin (Airbnb)
>  * Alanna Scott (Airbnb)
>  * Bogdan Kyryliuk (Airbnb)
>  * Vera Liu (Airbnb)
>  * Jeff Feng (Airbnb)
>  * Ashutosh Chauhan (Hortonworks)
>  * Nishant Bangarwa (Hortonworks)
>  * Slim Bouguerra (Hortonworks)
>  * Priyank Shah (Hortonworks)
>  * Sriharsha Chintalapani (Hortonworks)
>  * Daniel Dai (Hortonworks)
>
> We realize that additional employer diversity is needed, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a system for interactive
> visualization of data will gain broader adoption as an open source,
> community-driven project, where the community can contribute not only to
> the core components, but also to a growing collection of connectors and
> visualizations, and to improving integration with all potential data
> sources. Superset already integrates closely with Apache Hive, the Hive
> metastore, as well as most SQL-speaking databases found in modern data
> ecosystems.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Superset is a vital component for visualizing, accessing and
> democratizing data at Airbnb. Also at Hortonworks, Superset is a core
> component of the DataFlow product offering. Thus, the risk of the project
> being orphaned is relatively low. The project could be at risk if Airbnb
> changes their approach for democratizing data or if Hortonworks changes
> their strategy in the market. In such an event, the committers plan to
> continue working on the project on their own time, though the progress
> will likely be slower. We plan to mitigate this risk by recruiting
> additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (committers and PPMC
> members) and other developers who have varying degrees of experience with
> open source projects. All have been involved with source code that has been
> released under an open source license, and several also have experience
> developing code with an open source development process.
>
> === Homogeneous Developers ===
> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> committed to recruiting additional committers from other companies.
>
> === Reliance on Salaried Developers ===
> It is expected that Superset development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers are
> paid by their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> To the knowledge of the Initial Committers, there are no direct competitors
> to Superset within the Apache Software Foundation. That said, Apache
> Zeppelin is an indirect competitor, but it solves a different use case.
>
> Apache Zeppelin is a web-based notebook that enables interactive data
> analytics. It enables the creation of beautiful data-driven, interactive
> and collaborative documents with SQL, Scala and more. Although a user can
> create data visualizations using this project, it leverages a
> notebook-style user interface and it is geared towards the Spark community
> where Scala and SQL co-exist.
>
> We look forward to collaborating with those communities, as well as other
> Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Superset is solving two huge challenges:
>  * The challenge of enabling every knowledge worker to make data-informed
> decisions, particularly those who are not deeply skilled at writing SQL.
>  * The challenge of visualizing huge amounts of data interactively and in
> real-time.
>
> Superset was first developed as a data visualization solution for Druid.io
> as a way to visualize billions of rows of data. Since then, usage of
> Superset has expanded to address data visualization use cases across
> SQL-speaking data sources as well.
>
> Our rationale for developing Superset as an Apache project is detailed in
> the Rationale section. We believe that the Apache brand and community
> process will help us attract more contributors to this project, and help
> grow the footprint of the project through usage at other organizations and
> within other applications. Establishing consensus among users and
> developers will result in a more valuable tool for everyone.
>
> == Documentation ==
> References to further reading material:
>  * [[http://airbnb.io/superset/|Superset Documentation]]
>  * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>  * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>
> == Initial Source ==
> The origin of the proposed code base can be found at
> https://github.com/airbnb/superset. The code base is primarily in Python.
>
> == Source and Intellectual Property Submission Plan ==
> Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
> incubator. We do not expect any complications for the submission of the
> Superset code base. Our code is already in Github and there is only a
> single code base.
>
> == External Dependencies ==
> List of Python packages, from the Python Package Index (Pypi):
>
>  * boto3
>  * celery
>  * cryptography
>  * flask-appbuilder
>  * flask-cache
>  * flask-migrate
>  * flask-script
>  * flask-sqlalchemy
>  * flask-testing
>  * humanize
>  * gunicorn
>  * markdown
>  * pandas
>  * parsedatetime
>  * pydruid
>  * PyHive
>  * python-dateutil
>  * requests
>  * simplejson
>  * six
>  * sqlalchemy
>  * sqlalchemy-utils
>  * sqlparse
>  * thrift
>  * thrift-sasl
>  * werkzeug
>
> List of Javascript packages, from NPM:
>  * autobind-decorator
>  * bootstrap
>  * bootstrap-datepicker
>  * brace
>  * brfs
>  * cal-heatmap
>  * classnames
>  * d3
>  * d3-cloud
>  * d3-sankey
>  * d3-scale
>  * d3-tip
>  * datamaps
>  * datatables-bootstrap3-plugin
>  * datatables.net-bs
>  * font-awesome
>  * gridster
>  * immutability-helper
>  * immutable
>  * jquery
>  * lodash.throttle
>  * mapbox-gl
>  * moment
>  * moments
>  * mustache
>  * nvd3
>  * react
>  * react-ace
>  * react-bootstrap
>  * react-bootstrap-table
>  * react-dom
>  * react-draggable
>  * react-gravatar
>  * react-grid-layout
>  * react-map-gl
>  * react-redux
>  * react-resizable
>  * react-select
>  * react-syntax-highlighter
>  * reactable
>  * redux
>  * redux-localstorage
>  * redux-thunk
>  * shortid
>  * style-loader
>  * supercluster
>  * topojson
>  * victory
>  * viewport-mercator-project
>
> == Cryptography ==
> The proposal does not include cryptographic code.
>
> == Required Resources ==
>
> === Mailing List ===
> There is a current mailing list as a Google Group “airbnb_superset” that we
> are planning on deprecating as the Apache.org lists become ready to serve
> our community.
>
>  * superset-private
>  * superset-dev
>  * superset-user
>
> === Subversion Directory ===
> Git is the preferred source control system.
> http://svn.apache.org/repos/asf/incubator/superset
>
> == Git Repository ==
> Git is the preferred source control system; we’re assuming
> https://github.com/apache/incubator-superset based on the naming scheme.
>
> == Issue Tracking ==
> JIRA Superset (SUPERSET). If possible, we’d like to use Github issues & PRs
> to manage our project as much as possible. It’s been said that there are
> ways to keep Github’s issues in sync with Jira, allowing us to get the best
> of both worlds. If that is not possible, we will comply and use Jira.
>
> == Other Resources ==
> We currently use a set of Github integrated services that are free to the
> open source community, like Travis-ci, Code Climate, Coveralls,
> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> these services as they allow us to scale contributions and optimize our
> development flows. These services require some elevated rights on the
> Github repository in order to set up or tune, and we would like the
> committers to have the required rights.
>
>
> == Initial Committers ==
>
>  * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
>  * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
>  * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
>  * Vera Liu <vera....@airbnb.com> - Committer
>  * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
>  * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
>  * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
>  * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
>  * Priyank Shah <ps...@hortonworks.com> - Committer
>  * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
>  * Daniel Dai <da...@apache.org> - Champion & Committer
>  * Luke Han <luke....@apache.org> - Mentor
>
> == Affiliations ==
> The initial committers are employees of Airbnb Inc. and Hortonworks.
>
> == Sponsors ==
>
> === Champion ===
> Daniel Dai <da...@apache.org>
>
> === Nominated Mentors ===
>  * Ashutosh Chauhan <hashut...@apache.org>
>  * Luke Han <luke....@apache.org>
>
> === Sponsoring Entity ===
> Incubator PMC
>
>
> On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <edwardy...@apache.org> wrote:
>
> > +1 binding
> >
> > On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
> > <naresh.agar...@gmail.com> wrote:
> > > +1 (non-binding).
> > >
> > > Thanks
> > > Naresh Agarwal
> > >
> > > On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> > >
> > >> +1 (binding)
> > >>
> > >> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <joe.w...@gmail.com> wrote:
> > >>
> > >> > +1 (binding)
> > >> >
> > >> > On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
> > >> > <jiten...@hortonworks.com> wrote:
> > >> > > +1 (binding)
> > >> > >
> > >> > > On 4/25/17, 1:27 PM, "Julian Hyde" <jh...@apache.org> wrote:
> > >> > >
> > >> > > +1 binding
> > >> > >
> > >> > > > On Apr 25, 2017, at 12:48 PM, moon soo Lee <m...@apache.org> wrote:
> > >> > > >
> > >> > > > +1 (non-binding)
> > >> > > >
> > >> > > > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <hashut...@apache.org> wrote:
> > >> > > >
> > >> > > >> +1 (binding)
> > >> > > >>
> > >> > > >> Thanks,
> > >> > > >> Ashutosh
> > >> > > >>
> > >> > > >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <luke...@gmail.com> wrote:
> > >> > > >>
> > >> > > >>> +1 binding
> > >> > > >>>
> > >> > > >>> Love to see Superset to be new incubator project.
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> Best Regards!
> > >> > > >>> --------------------- > > >> > > >>> > > >> > > >>> Luke Han > > >> > > >>> > > >> > > >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng < > > >> jeff.f...@gmail.com> > > >> > wrote: > > >> > > >>> > > >> > > >>>> Dear Apache Incubator Community, > > >> > > >>>> > > >> > > >>>> We have updated the Superset proposal > > >> > > >>>> <https://wiki.apache.org/incubator/SupersetProposal> > > (copied > > >> > below) for > > >> > > >>>> > > >> > > >>>> Apache Incubation with an additional mentor (Luke Han - > > >> > > >>>> luke....@apache.org), > > >> > > >>>> and would like to start a vote thread for acceptance into > > the > > >> > incubator. > > >> > > >>>> > > >> > > >>>> Our team is excited to share Superset with the Apache > > >> community > > >> > and we > > >> > > >>>> hope > > >> > > >>>> for the your continued support! > > >> > > >>>> > > >> > > >>>> Cheers, > > >> > > >>>> Jeff & the Superset Team > > >> > > >>>> > > >> > > >>>> > > >> > > >>>> > > >> > > >>>> > > >> > > >>>> = Superset = > > >> > > >>>> > > >> > > >>>> == Abstract == > > >> > > >>>> Superset is an enterprise-ready web application for data > > >> > exploration, > > >> > > >> data > > >> > > >>>> visualization and dashboarding. > > >> > > >>>> > > >> > > >>>> == Proposal == > > >> > > >>>> Superset is business intelligence (BI) software that > helps > > >> > modern > > >> > > >>>> organizations visualize and interact with their data. > > Superset > > >> > enables > > >> > > >>>> users explore data from a variety of databases, assemble > > >> > beautiful > > >> > > >>>> dashboards and share their findings. Superset works > neatly > > >> > with all > > >> > > >>>> modern > > >> > > >>>> SQL-speaking databases, and integrates with Druid.io to > > >> provide > > >> > > >> real-time, > > >> > > >>>> interactive, blazing fast data access to large datasets. > > >> > > >>>> > > >> > > >>>> == Background == > > >> > > >>>> Data is mission critical. 
To succeed in this era, > > >> organizations > > >> > need to > > >> > > >>>> provide low-friction, intuitive and interactive access to > > >> data. > > >> > It is > > >> > > >>>> paramount for knowledge workers to be capable of > answering > > >> > their own > > >> > > >>>> questions by querying, exploring and visualizing data. > > >> > > >>>> > > >> > > >>>> The entire business intelligence industry has pivoted > from > > a > > >> > model of > > >> > > >>>> centralized top-down platforms driven by IT organizations > > to > > >> > > >> self-service > > >> > > >>>> analytics and agile workflows by any user. This shift > > >> unblocks > > >> > > >>>> centralized > > >> > > >>>> service bottlenecks for creating data visualizations > while > > >> also > > >> > creating > > >> > > >>>> an > > >> > > >>>> environment that is iterative and fast-moving. This > means > > >> that > > >> > business > > >> > > >>>> intelligence software must also be easy and delightful to > > use. > > >> > > >>>> Self-service analytics doesn’t mean that admin and > > governance > > >> > features > > >> > > >> are > > >> > > >>>> not needed. > > >> > > >>>> Modern BI tools provide fine-grain access controls and > > >> auditing > > >> > > >>>> capabilities to understand how data is being used. > > Superset > > >> is > > >> > a > > >> > > >> solution > > >> > > >>>> that delivers on all of these vectors. > > >> > > >>>> > > >> > > >>>> The technology stack is also constantly morphing - > vendors > > are > > >> > > >> struggling > > >> > > >>>> to provide cheap, quick and easy solutions to access > data. > > >> > Business > > >> > > >>>> intelligence users are finding existing solutions lacking > > as > > >> > these > > >> > > >>>> software > > >> > > >>>> products either disregard or react slowly to recent > > >> > game-changing > > >> > > >>>> technologies like Druid.io, PrestoDB, Apache Drill, > Apache > > >> > Kylin, d3.js, > > >> > > >>>> React.js and iPython’s Jupyter for instance. 
> > >> > > >>>> > > >> > > >>>> == Rationale == > > >> > > >>>> Business intelligence is more relevant today than at any > > other > > >> > point in > > >> > > >>>> history. Organizations are currently very limited in > > options > > >> > for open > > >> > > >>>> source data visualization solutions, especially solutions > > that > > >> > are both > > >> > > >>>> self-service and enterprise-ready. Every company > informing > > >> > their > > >> > > >>>> decisions > > >> > > >>>> with data needs a BI tool. > > >> > > >>>> > > >> > > >>>> We believe that Superset will be a strong compliment to > > >> > existing Apache > > >> > > >>>> Software Foundation technologies by offering scalable > user > > >> > interactions > > >> > > >> to > > >> > > >>>> distributed storage and computation solutions. Users > will > > >> > often find > > >> > > >> that > > >> > > >>>> Superset can act as a catalyst for tooling that can > > visualize > > >> > the > > >> > > >>>> byproduct > > >> > > >>>> of data and computation infrastructure. > > >> > > >>>> > > >> > > >>>> Superset has many key design elements that help fill a > gap > > in > > >> > current > > >> > > >>>> solutions for organizations: > > >> > > >>>> * Easy, low friction access to data through a simple, > > >> web-based > > >> > data > > >> > > >>>> exploration interface. Composing charts and dashboards > are > > >> > intuitive. > > >> > > >>>> Eliminating the need to write code or SQL empowers anyone > > to > > >> > use it. > > >> > > >>>> * Access to a wide array of rich, interactive data > > >> > visualization types. > > >> > > >>>> * Enterprise-ready: Integration with different > > authentication > > >> > > >> mechanisms > > >> > > >>>> and granular permissions centered around actions and data > > >> > access. > > >> > > >>>> * Realtime & fast: Superset provides realtime analytics > at > > the > > >> > speed of > > >> > > >>>> thought on very large datasets when integrated with > > Druid.io. 
> > >> > > >>>> * Broad data access: Consume data out of any SQL-speaking > > >> > relational > > >> > > >>>> database. > > >> > > >>>> * Extensible: Can be extended to talk to many noSQL > > databases > > >> > like > > >> > > >> Apache > > >> > > >>>> Drill, Elastic Search, and other popular database > engines. > > >> > > >>>> * Fast loading dashboards with configurable web-scale > > caching. > > >> > > >>>> * Plug-in framework that enables organizations to build > > custom > > >> > > >> analytical > > >> > > >>>> applications with new UI/UX interfaces. > > >> > > >>>> * SQL Lab, a state-of-the-art SQL IDE that empowers > > >> > SQL-speaking users > > >> > > >>>> with more flexibility. SQL Lab integrates with the > > >> > visualization engine > > >> > > >>>> seamlessly. > > >> > > >>>> > > >> > > >>>> == Initial Goals == > > >> > > >>>> The initial goals of the Superset project are > several-fold: > > >> > > >>>> * Move the existing codebase to Apache and integrate with > > the > > >> > Apache > > >> > > >>>> development process. > > >> > > >>>> * Redesign the user interface and interaction model for > > >> creating > > >> > > >>>> visualizations/dashboards and connecting to data sources > > >> > > >>>> * Build robust support for security and governance of the > > tool > > >> > > >> including > > >> > > >>>> popular authorization modules (including Apache Ranger > and > > >> > Apache > > >> > > >> Sentry) > > >> > > >>>> and a more sophisticated permissions system > > >> > > >>>> * Grow the extensibility of the project both in terms of > > >> > enhanced > > >> > > >>>> connectivity to NoSQL-based data sources and creating a > > >> plug-in > > >> > > >> framework > > >> > > >>>> that enables organizations to build custom analytical > > >> > applications which > > >> > > >>>> require a new UI/UX > > >> > > >>>> > > >> > > >>>> == Current Status == > > >> > > >>>> By many standards, Superset is already a successful open > > >> source > > >> > project. 
> > >> > > >>>> As > > >> > > >>>> of March 2017, Superset is officially used in production > at > > >> > about a > > >> > > >> dozen > > >> > > >>>> companies, has received contributions from over one > hundred > > >> > contributors > > >> > > >>>> on > > >> > > >>>> Github, 1500+ forks, and 12k+ stars. > > >> > > >>>> > > >> > > >>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks > have > > >> made > > >> > > >>>> significant contributions, and expressed their commitment > > to > > >> the > > >> > > >> project. > > >> > > >>>> The product is feature complete and has been viable for > > >> months. > > >> > It > > >> > > >> already > > >> > > >>>> serves as the main interface for consuming data at many > > >> > companies of > > >> > > >>>> different sizes. > > >> > > >>>> > > >> > > >>>> While the product is usable, there’s room for improvement > > >> > across the > > >> > > >>>> board, > > >> > > >>>> starting with providing a smoother user experience around > > >> > content > > >> > > >>>> creation, > > >> > > >>>> making sure all features work out-of-the-box on more > > platforms > > >> > and > > >> > > >>>> databases, providing better user training guides and > > videos, > > >> > having a > > >> > > >>>> predictable release process, and increasing the overall > > >> quality > > >> > of the > > >> > > >>>> Superset releases. > > >> > > >>>> > > >> > > >>>> === Meritocracy === > > >> > > >>>> We plan to invest in supporting a meritocracy. We will > > discuss > > >> > the > > >> > > >>>> requirements in an open forum. Several companies have > > >> expressed > > >> > interest > > >> > > >>>> in > > >> > > >>>> this project, and we intend to invite additional > > developers to > > >> > > >>>> participate. > > >> > > >>>> We will encourage and monitor community participation so > > that > > >> > privileges > > >> > > >>>> can be extended to those that contribute. 
> > >> > > >>>> > > >> > > >>>> === Community === > > >> > > >>>> The need for an enterprise-ready data visualization and > > >> > exploration > > >> > > >>>> platform in the open source community is tremendous. > While > > >> > Superset is > > >> > > >>>> fairly well known, recognized and used within the > Druid.io > > >> > community, > > >> > > >>>> adoption is currently limited outside of that niche. > There > > is > > >> a > > >> > huge > > >> > > >>>> opportunity to grow the community to hundreds if not > > thousands > > >> > of > > >> > > >>>> organizations, and we are hoping that embracing “the > Apache > > >> > way” will > > >> > > >>>> accelerate the growth of our community. > > >> > > >>>> > > >> > > >>>> We have already been active at seeking and inviting > > >> > contributions, and > > >> > > >> are > > >> > > >>>> planning to scale the project by investing time and > growing > > >> the > > >> > support > > >> > > >>>> structure to grow the community. > > >> > > >>>> > > >> > > >>>> === Core Developers === > > >> > > >>>> The initial committers for Superset include experienced > > full > > >> > stack, > > >> > > >>>> front-end and data engineers: > > >> > > >>>> * Maxime Beauchemin (Airbnb) > > >> > > >>>> * Alanna Scott (Airbnb) > > >> > > >>>> * Bogdan Kyryliuk (Airbnb) > > >> > > >>>> * Vera Liu (Airbnb) > > >> > > >>>> * Jeff Feng (Airbnb) > > >> > > >>>> * Ashutosh Chauhan (Hortonworks) > > >> > > >>>> * Nishant Bangarwa (Hortonworks) > > >> > > >>>> * Slim Bouguerra (Hortonworks) > > >> > > >>>> * Priyank Shah (Hortonworks) > > >> > > >>>> * Sriharsha Chintalapani (Hortonworks) > > >> > > >>>> * Daniel Dai (Hortonworks) > > >> > > >>>> > > >> > > >>>> We realize that additional employer diversity is needed, > > and > > >> we > > >> > will > > >> > > >> work > > >> > > >>>> aggressively to recruit developers from additional > > companies. 
> > >> > > >>>> > > >> > > >>>> === Alignment === > > >> > > >>>> The initial committers strongly believe that a system for > > >> > interactive > > >> > > >>>> visualization of data will gain broader adoption as an > open > > >> > source, > > >> > > >>>> community driven project, where the community can > > contribute > > >> > not only to > > >> > > >>>> the core components, but also to a growing collection of > > >> > connectors, > > >> > > >>>> visualizations and improving integration a all potential > > data > > >> > sources. > > >> > > >>>> Superset already integrates closely with Apache Hive, the > > Hive > > >> > > >> metastore, > > >> > > >>>> as well as most SQL-speaking databases found in modern > data > > >> > ecosystems. > > >> > > >>>> > > >> > > >>>> == Known Risks == > > >> > > >>>> > > >> > > >>>> === Orphaned Products === > > >> > > >>>> Superset is a vital component for both visualizing, > > accessing > > >> > and > > >> > > >>>> democratizing data at Airbnb. Also at Hortonworks, > > Superset > > >> is > > >> > a core > > >> > > >>>> component of the DataFlow product offering. Thus, the > > risk of > > >> > the > > >> > > >> project > > >> > > >>>> being orphaned is relatively low. The project could be > at > > >> risk > > >> > if > > >> > > >> Airbnb > > >> > > >>>> changes their approach for democratizing data or if > > >> Hortonworks > > >> > changes > > >> > > >>>> their strategy in the market. In such an event, the > > >> committers > > >> > plan to > > >> > > >>>> continue working on the project on their own time, > thought > > the > > >> > progress > > >> > > >>>> will likely be slower. We plan to mitigate this risk by > > >> > recruiting > > >> > > >>>> additional committers. 
> > >> > > >>>>
> > >> > > >>>> === Inexperience with Open Source ===
> > >> > > >>>> The initial committers include veteran Apache members (committers and PPMC
> > >> > > >>>> members) and other developers who have varying degrees of experience with
> > >> > > >>>> open source projects. All have been involved with source code that has been
> > >> > > >>>> released under an open source license, and several also have experience
> > >> > > >>>> developing code with an open source development process.
> > >> > > >>>>
> > >> > > >>>> === Homogenous Developers ===
> > >> > > >>>> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> > >> > > >>>> committed to recruiting additional committers from other companies.
> > >> > > >>>>
> > >> > > >>>> === Reliance on Salaried Developers ===
> > >> > > >>>> It is expected that Superset development will occur on both salaried time
> > >> > > >>>> and on volunteer time, after hours. The majority of initial committers are
> > >> > > >>>> paid by their employer to contribute to this project. However, they are all
> > >> > > >>>> passionate about the project, and we are confident that the project will
> > >> > > >>>> continue even if no salaried developers contribute to the project. We are
> > >> > > >>>> committed to recruiting additional committers including non-salaried
> > >> > > >>>> developers.
> > >> > > >>>>
> > >> > > >>>> === Relationships with Other Apache Products ===
> > >> > > >>>> To the knowledge of the Initial Committers, there are no direct competitors
> > >> > > >>>> to Superset within the Apache Software Foundation.
> > >> > > >>>> That said, Apache Zeppelin is an indirect competitor, but it addresses a
> > >> > > >>>> different use case.
> > >> > > >>>>
> > >> > > >>>> Apache Zeppelin is a web-based notebook that enables interactive data
> > >> > > >>>> analytics. It enables the creation of beautiful data-driven, interactive
> > >> > > >>>> and collaborative documents with SQL, Scala and more. Although a user can
> > >> > > >>>> create data visualizations using this project, it uses a notebook-style
> > >> > > >>>> user interface and is geared towards the Spark community, where
> > >> > > >>>> Scala and SQL co-exist.
> > >> > > >>>>
> > >> > > >>>> We look forward to collaborating with those communities, as well as other
> > >> > > >>>> Apache communities.
> > >> > > >>>>
> > >> > > >>>> === An Excessive Fascination with the Apache Brand ===
> > >> > > >>>> Superset is solving two huge challenges:
> > >> > > >>>> * The challenge of enabling every knowledge worker to make data-informed
> > >> > > >>>>   decisions, particularly those who are not deeply skilled at writing SQL.
> > >> > > >>>> * The challenge of visualizing huge amounts of data interactively and in
> > >> > > >>>>   real-time.
> > >> > > >>>>
> > >> > > >>>> Superset was first developed as a data visualization solution for Druid.io
> > >> > > >>>> as a way to visualize billions of rows of data. Since then, usage of
> > >> > > >>>> Superset has expanded to address data visualization use cases across
> > >> > > >>>> SQL-speaking data sources as well.
> > >> > > >>>>
> > >> > > >>>> Our rationale for developing Superset as an Apache project is detailed in
> > >> > > >>>> the Rationale Section.
> > >> > > >>>> We believe that the Apache brand and community
> > >> > > >>>> process will help us attract more contributors to this project, and help
> > >> > > >>>> grow the footprint of the project through usage at other organizations and
> > >> > > >>>> within other applications. Establishing consensus among users and
> > >> > > >>>> developers will result in a more valuable tool for everyone.
> > >> > > >>>>
> > >> > > >>>> == Documentation ==
> > >> > > >>>> References to further reading material:
> > >> > > >>>> * [[http://airbnb.io/superset/|Superset Documentation]]
> > >> > > >>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
> > >> > > >>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
> > >> > > >>>>
> > >> > > >>>> == Initial Source ==
> > >> > > >>>> The origin of the proposed code base can be found at
> > >> > > >>>> https://github.com/airbnb/superset. The code base is primarily in
> > >> > > >>>> Python.
> > >> > > >>>>
> > >> > > >>>> == Source and Intellectual Property Submission Plan ==
> > >> > > >>>> We do not expect any complications for the submission of the Superset code
> > >> > > >>>> base. Our code is already in Github and there is only a single code base.
> > >> > > >>>>
> > >> > > >>>> == External Dependencies ==
> > >> > > >>>> List of Python packages, from the Python Package Index (Pypi):
> > >> > > >>>>
> > >> > > >>>> * boto3
> > >> > > >>>> * celery
> > >> > > >>>> * cryptography
> > >> > > >>>> * flask-appbuilder
> > >> > > >>>> * flask-cache
> > >> > > >>>> * flask-migrate
> > >> > > >>>> * flask-script
> > >> > > >>>> * flask-sqlalchemy
> > >> > > >>>> * flask-testing
> > >> > > >>>> * humanize
> > >> > > >>>> * gunicorn
> > >> > > >>>> * markdown
> > >> > > >>>> * pandas
> > >> > > >>>> * parsedatetime
> > >> > > >>>> * pydruid
> > >> > > >>>> * PyHive
> > >> > > >>>> * python-dateutil
> > >> > > >>>> * requests
> > >> > > >>>> * simplejson
> > >> > > >>>> * six
> > >> > > >>>> * sqlalchemy
> > >> > > >>>> * sqlalchemy-utils
> > >> > > >>>> * sqlparse
> > >> > > >>>> * thrift
> > >> > > >>>> * thrift-sasl
> > >> > > >>>> * werkzeug
> > >> > > >>>>
> > >> > > >>>> List of Javascript packages, from NPM:
> > >> > > >>>> * autobind-decorator
> > >> > > >>>> * bootstrap
> > >> > > >>>> * bootstrap-datepicker
> > >> > > >>>> * brace
> > >> > > >>>> * brfs
> > >> > > >>>> * cal-heatmap
> > >> > > >>>> * classnames
> > >> > > >>>> * d3
> > >> > > >>>> * d3-cloud
> > >> > > >>>> * d3-sankey
> > >> > > >>>> * d3-scale
> > >> > > >>>> * d3-tip
> > >> > > >>>> * datamaps
> > >> > > >>>> * datatables-bootstrap3-plugin
> > >> > > >>>> * datatables.net-bs
> > >> > > >>>> * font-awesome
> > >> > > >>>> * gridster
> > >> > > >>>> * immutability-helper
> > >> > > >>>> * immutable
> > >> > > >>>> * jquery
> > >> > > >>>> * lodash.throttle
> > >> > > >>>> * mapbox-gl
> > >> > > >>>> * moment
> > >> > > >>>> * moments
> > >> > > >>>> * mustache
> > >> > > >>>> * nvd3
> > >> > > >>>> * react
> > >> > > >>>> * react-ace
> > >> > > >>>> * react-bootstrap
> > >> > > >>>> * react-bootstrap-table
> > >> > > >>>> * react-dom
> > >> > > >>>> * react-draggable
> > >> > > >>>> * react-gravatar
> > >> > > >>>> * react-grid-layout
> > >> > > >>>> * react-map-gl
> > >> > > >>>> * react-redux
> > >> > > >>>> * react-resizable
> > >> > > >>>> * react-select
> > >> > > >>>> * react-syntax-highlighter
> > >> > > >>>> * reactable
> > >> > > >>>> * redux
> > >> > > >>>> * redux-localstorage
> > >> > > >>>> * redux-thunk
> > >> > > >>>> * shortid
> > >> > > >>>> * style-loader
> > >> > > >>>> * supercluster
> > >> > > >>>> * topojson
> > >> > > >>>> * victory
> > >> > > >>>> * viewport-mercator-project
> > >> > > >>>>
> > >> > > >>>> == Cryptography ==
> > >> > > >>>> The proposal does not include cryptographic code.
> > >> > > >>>>
> > >> > > >>>> == Required Resources ==
> > >> > > >>>>
> > >> > > >>>> === Mailing List ===
> > >> > > >>>> There is a current mailing list, the Google Group “airbnb_superset”, that
> > >> > > >>>> we plan to deprecate once the Apache.org lists are ready to serve our
> > >> > > >>>> community.
> > >> > > >>>>
> > >> > > >>>> * superset-private
> > >> > > >>>> * superset-dev
> > >> > > >>>> * superset-user
> > >> > > >>>>
> > >> > > >>>> === Subversion Directory ===
> > >> > > >>>> Git is the preferred source control system.
> > >> > > >>>> http://svn.apache.org/repos/asf/incubator/superset
> > >> > > >>>>
> > >> > > >>>> == Git Repository ==
> > >> > > >>>> Git is the preferred source control system; we’re assuming
> > >> > > >>>> https://github.com/apache/incubator-superset based on the naming scheme.
> > >> > > >>>>
> > >> > > >>>> == Issue Tracking ==
> > >> > > >>>> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to manage
> > >> > > >>>> our project as much as possible. It’s been said that there are ways to
> > >> > > >>>> keep Github’s issues in sync with Jira, allowing us to get the best of
> > >> > > >>>> both worlds. If that is not possible, we will comply and use Jira.
> > >> > > >>>>
> > >> > > >>>> == Other Resources ==
> > >> > > >>>> We currently use a set of Github integrated services that are free to the
> > >> > > >>>> open source community, like Travis-ci, Code Climate, Coveralls,
> > >> > > >>>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> > >> > > >>>> these services as they allow us to scale contributions and optimize our
> > >> > > >>>> development flows. These services require some elevated rights on the
> > >> > > >>>> Github repository in order to set up or tune and we would like for the
> > >> > > >>>> committers to have the required rights.
> > >> > > >>>>
> > >> > > >>>>
> > >> > > >>>> == Initial Committers ==
> > >> > > >>>>
> > >> > > >>>> * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
> > >> > > >>>> * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
> > >> > > >>>> * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
> > >> > > >>>> * Vera Liu <vera....@airbnb.com> - Committer
> > >> > > >>>> * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
> > >> > > >>>> * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
> > >> > > >>>> * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
> > >> > > >>>> * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
> > >> > > >>>> * Priyank Shah <ps...@hortonworks.com> - Committer
> > >> > > >>>> * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
> > >> > > >>>> * Daniel Dai <da...@apache.org> - Champion & Committer
> > >> > > >>>> * Luke Han <luke....@apache.org> - Mentor
> > >> > > >>>>
> > >> > > >>>> == Affiliations ==
> > >> > > >>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
> > >> > > >>>>
> > >> > > >>>> == Sponsors ==
> > >> > > >>>>
> > >> > > >>>> === Champion ===
> > >> > > >>>> Daniel Dai <da...@apache.org>
> > >> > > >>>>
> > >> > > >>>> === Nominated Mentors ===
> > >> > > >>>> * Ashutosh Chauhan <hashut...@apache.org>
> > >> > > >>>> * Luke Han <luke....@apache.org>
> > >> > > >>>>
> > >> > > >>>> === Sponsoring Entity ===
> > >> > > >>>> Incubator PMC
> > >> > > >>>>
> > >> > >
> > >> > > ---------------------------------------------------------------------
> > >> > > To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> > >> > > For additional commands, e-mail: general-help@incubator.apache.org
> > >> >
> >
> > --
> > Best Regards, Edward J. Yoon