+1 (binding) And an offer to mentor if you would like/want another.
> On Apr 27, 2017, at 1:02 PM, Felix Cheung <felixche...@apache.org> wrote:
>
> +1 (nonbinding)
>
> On Wed, Apr 26, 2017 at 11:13 PM Jeff Feng <jeff.f...@gmail.com> wrote:
>
>> Hello everyone,
>>
>> Thank you for checking out our proposal on Superset and for your
>> consideration for the Apache Incubator. So far, I believe we have 8
>> binding votes and 2 non-binding votes.
>>
>> As Taylor mentioned earlier, we made a minor update to the wording in the
>> "Source and Intellectual Property Submission Plan" section based on a
>> suggestion by John Ament. The update was to help confirm the previously
>> unstated assumption that we will submit an SGA. I have copied the updated
>> proposal from the wiki to the email below and highlighted (in yellow) the
>> new sentence in the document.
>>
>> Folks on the cc line who have already voted, please let us know if the
>> change impacts your vote.
>>
>> Thank you all,
>> Jeff
>>
>>
>>
>> = Superset =
>>
>> == Abstract ==
>> Superset is an enterprise-ready web application for data exploration, data
>> visualization and dashboarding.
>>
>> == Proposal ==
>> Superset is business intelligence (BI) software that helps modern
>> organizations visualize and interact with their data. Superset enables
>> users to explore data from a variety of databases, assemble beautiful
>> dashboards and share their findings. Superset works neatly with all modern
>> SQL-speaking databases, and integrates with Druid.io to provide real-time,
>> interactive, blazing fast data access to large datasets.
>>
>> == Background ==
>> Data is mission critical. To succeed in this era, organizations need to
>> provide low-friction, intuitive and interactive access to data. It is
>> paramount for knowledge workers to be capable of answering their own
>> questions by querying, exploring and visualizing data.
>>
>> The entire business intelligence industry has pivoted from a model of
>> centralized top-down platforms driven by IT organizations to self-service
>> analytics and agile workflows by any user. This shift unblocks centralized
>> service bottlenecks for creating data visualizations while also creating an
>> environment that is iterative and fast-moving. This means that business
>> intelligence software must also be easy and delightful to use.
>> Self-service analytics doesn’t mean that admin and governance features are
>> not needed.
>> Modern BI tools provide fine-grained access controls and auditing
>> capabilities to understand how data is being used. Superset is a solution
>> that delivers on all of these vectors.
>>
>> The technology stack is also constantly morphing - vendors are struggling
>> to provide cheap, quick and easy solutions to access data. Business
>> intelligence users are finding existing solutions lacking as these software
>> products either disregard or react slowly to recent game-changing
>> technologies such as Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
>> React.js and iPython’s Jupyter.
>>
>> == Rationale ==
>> Business intelligence is more relevant today than at any other point in
>> history. Organizations are currently very limited in options for open
>> source data visualization solutions, especially solutions that are both
>> self-service and enterprise-ready. Every company informing their decisions
>> with data needs a BI tool.
>>
>> We believe that Superset will be a strong complement to existing Apache
>> Software Foundation technologies by offering scalable user interactions to
>> distributed storage and computation solutions. Users will often find that
>> Superset can act as a catalyst for tooling that can visualize the byproduct
>> of data and computation infrastructure.
>>
>> Superset has many key design elements that help fill a gap in current
>> solutions for organizations:
>> * Easy, low-friction access to data through a simple, web-based data
>> exploration interface. Composing charts and dashboards is intuitive.
>> Eliminating the need to write code or SQL empowers anyone to use it.
>> * Access to a wide array of rich, interactive data visualization types.
>> * Enterprise-ready: Integration with different authentication mechanisms
>> and granular permissions centered around actions and data access.
>> * Realtime & fast: Superset provides realtime analytics at the speed of
>> thought on very large datasets when integrated with Druid.io.
>> * Broad data access: Consume data out of any SQL-speaking relational
>> database.
>> * Extensible: Can be extended to talk to many NoSQL databases like Apache
>> Drill, Elasticsearch, and other popular database engines.
>> * Fast loading dashboards with configurable web-scale caching.
>> * Plug-in framework that enables organizations to build custom analytical
>> applications with new UI/UX interfaces.
>> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
>> with more flexibility. SQL Lab integrates with the visualization engine
>> seamlessly.
>>
>> == Initial Goals ==
>> The initial goals of the Superset project are several-fold:
>> * Move the existing codebase to Apache and integrate with the Apache
>> development process.
>> * Redesign the user interface and interaction model for creating
>> visualizations/dashboards and connecting to data sources
>> * Build robust support for security and governance of the tool, including
>> popular authorization modules (such as Apache Ranger and Apache Sentry)
>> and a more sophisticated permissions system
>> * Grow the extensibility of the project both in terms of enhanced
>> connectivity to NoSQL-based data sources and creating a plug-in framework
>> that enables organizations to build custom analytical applications which
>> require a new UI/UX
>>
>> == Current Status ==
>> By many standards, Superset is already a successful open source project. As
>> of March 2017, Superset is officially used in production at about a dozen
>> companies and has received contributions from over one hundred contributors
>> on GitHub, along with 1500+ forks and 12k+ stars.
>>
>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
>> significant contributions, and expressed their commitment to the project.
>> The product is feature complete and has been viable for months. It already
>> serves as the main interface for consuming data at many companies of
>> different sizes.
>>
>> While the product is usable, there’s room for improvement across the board,
>> starting with providing a smoother user experience around content creation,
>> making sure all features work out-of-the-box on more platforms and
>> databases, providing better user training guides and videos, having a
>> predictable release process, and increasing the overall quality of the
>> Superset releases.
>>
>> === Meritocracy ===
>> We plan to invest in supporting a meritocracy. We will discuss the
>> requirements in an open forum. Several companies have expressed interest in
>> this project, and we intend to invite additional developers to participate.
>> We will encourage and monitor community participation so that privileges
>> can be extended to those that contribute.
>>
>> === Community ===
>> The need for an enterprise-ready data visualization and exploration
>> platform in the open source community is tremendous. While Superset is
>> fairly well known, recognized and used within the Druid.io community,
>> adoption is currently limited outside of that niche. There is a huge
>> opportunity to grow the community to hundreds if not thousands of
>> organizations, and we are hoping that embracing “the Apache way” will
>> accelerate the growth of our community.
>>
>> We have already been active in seeking and inviting contributions, and are
>> planning to scale the project by investing time and growing the support
>> structure to grow the community.
>>
>> === Core Developers ===
>> The initial committers for Superset include experienced full stack,
>> front-end and data engineers:
>> * Maxime Beauchemin (Airbnb)
>> * Alanna Scott (Airbnb)
>> * Bogdan Kyryliuk (Airbnb)
>> * Vera Liu (Airbnb)
>> * Jeff Feng (Airbnb)
>> * Ashutosh Chauhan (Hortonworks)
>> * Nishant Bangarwa (Hortonworks)
>> * Slim Bouguerra (Hortonworks)
>> * Priyank Shah (Hortonworks)
>> * Sriharsha Chintalapani (Hortonworks)
>> * Daniel Dai (Hortonworks)
>>
>> We realize that additional employer diversity is needed, and we will work
>> aggressively to recruit developers from additional companies.
>>
>> === Alignment ===
>> The initial committers strongly believe that a system for interactive
>> visualization of data will gain broader adoption as an open source,
>> community driven project, where the community can contribute not only to
>> the core components, but also to a growing collection of connectors and
>> visualizations, and to improving integration with all potential data sources.
>> Superset already integrates closely with Apache Hive, the Hive metastore,
>> as well as most SQL-speaking databases found in modern data ecosystems.
>>
>> == Known Risks ==
>>
>> === Orphaned Products ===
>> Superset is a vital component for visualizing, accessing and
>> democratizing data at Airbnb. Also at Hortonworks, Superset is a core
>> component of the DataFlow product offering. Thus, the risk of the project
>> being orphaned is relatively low. The project could be at risk if Airbnb
>> changes their approach for democratizing data or if Hortonworks changes
>> their strategy in the market. In such an event, the committers plan to
>> continue working on the project on their own time, though the progress
>> will likely be slower. We plan to mitigate this risk by recruiting
>> additional committers.
>>
>> === Inexperience with Open Source ===
>> The initial committers include veteran Apache members (committers and PPMC
>> members) and other developers who have varying degrees of experience with
>> open source projects. All have been involved with source code that has been
>> released under an open source license, and several also have experience
>> developing code with an open source development process.
>>
>> === Homogenous Developers ===
>> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
>> committed to recruiting additional committers from other companies.
>>
>> === Reliance on Salaried Developers ===
>> It is expected that Superset development will occur on both salaried time
>> and on volunteer time, after hours. The majority of initial committers are
>> paid by their employer to contribute to this project. However, they are all
>> passionate about the project, and we are confident that the project will
>> continue even if no salaried developers contribute to the project. We are
>> committed to recruiting additional committers including non-salaried
>> developers.
>>
>> === Relationships with Other Apache Products ===
>> To the knowledge of the Initial Committers, there are no direct competitors
>> to Superset within the Apache Software Foundation. That said, Apache
>> Zeppelin is an indirect competitor, but it solves a different use case.
>>
>> Apache Zeppelin is a web-based notebook that enables interactive data
>> analytics. It enables the creation of beautiful data-driven, interactive
>> and collaborative documents with SQL, Scala and more. Although a user can
>> create data visualizations using this project, it leverages a notebook
>> style user interface and it is geared towards the Spark community where
>> Scala and SQL co-exist.
>>
>> We look forward to collaborating with those communities, as well as other
>> Apache communities.
>>
>> === An Excessive Fascination with the Apache Brand ===
>> Superset is solving two huge challenges:
>> * The challenge of enabling every knowledge worker to make data informed
>> decisions, particularly those who are not deeply skilled at writing SQL.
>> * The challenge of visualizing huge amounts of data interactively and in
>> real-time.
>>
>> Superset was first developed as a data visualization solution for Druid.io
>> as a way to visualize billions of rows of data. Since then, usage of
>> Superset has expanded to address data visualization use cases across
>> SQL-speaking data sources as well.
>>
>> Our rationale for developing Superset as an Apache project is detailed in
>> the Rationale Section. We believe that the Apache brand and community
>> process will help us attract more contributors to this project, and help
>> grow the footprint of the project through usage at other organizations and
>> within other applications. Establishing consensus among users and
>> developers will result in a more valuable tool for everyone.
>>
>> == Documentation ==
>> References to further reading material:
>> * [[http://airbnb.io/superset/|Superset Documentation]]
>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>>
>> == Initial Source ==
>> The origin of the proposed code base can be found at
>> https://github.com/airbnb/superset. The code base is primarily in Python.
>>
>> == Source and Intellectual Property Submission Plan ==
>> Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
>> incubator. We do not expect any complications for the submission of the
>> Superset code base. Our code is already on GitHub and there is only a
>> single code base.
>>
>> == External Dependencies ==
>> List of Python packages, from the Python Package Index (PyPI):
>>
>> * boto3
>> * celery
>> * cryptography
>> * flask-appbuilder
>> * flask-cache
>> * flask-migrate
>> * flask-script
>> * flask-sqlalchemy
>> * flask-testing
>> * humanize
>> * gunicorn
>> * markdown
>> * pandas
>> * parsedatetime
>> * pydruid
>> * PyHive
>> * python-dateutil
>> * requests
>> * simplejson
>> * six
>> * sqlalchemy
>> * sqlalchemy-utils
>> * sqlparse
>> * thrift
>> * thrift-sasl
>> * werkzeug
>>
>> List of Javascript packages, from NPM:
>> * autobind-decorator
>> * bootstrap
>> * bootstrap-datepicker
>> * brace
>> * brfs
>> * cal-heatmap
>> * classnames
>> * d3
>> * d3-cloud
>> * d3-sankey
>> * d3-scale
>> * d3-tip
>> * datamaps
>> * datatables-bootstrap3-plugin
>> * datatables.net-bs
>> * font-awesome
>> * gridster
>> * immutability-helper
>> * immutable
>> * jquery
>> * lodash.throttle
>> * mapbox-gl
>> * moment
>> * moments
>> * mustache
>> * nvd3
>> * react
>> * react-ace
>> * react-bootstrap
>> * react-bootstrap-table
>> * react-dom
>> * react-draggable
>> * react-gravatar
>> * react-grid-layout
>> * react-map-gl
>> * react-redux
>> * react-resizable
>> * react-select
>> * react-syntax-highlighter
>> * reactable
>> * redux
>> * redux-localstorage
>> * redux-thunk
>> * shortid
>> * style-loader
>> * supercluster
>> * topojson
>> * victory
>> * viewport-mercator-project
>>
>> == Cryptography ==
>> The proposal does not include cryptographic code.
>>
>> == Required Resources ==
>>
>> === Mailing List ===
>> There is a current mailing list as a Google Group “airbnb_superset” that we
>> are planning on deprecating once the Apache.org lists are ready to serve our
>> community.
>>
>> * superset-private
>> * superset-dev
>> * superset-user
>>
>> === Subversion Directory ===
>> Git is the preferred source control system.
>> http://svn.apache.org/repos/asf/incubator/superset
>>
>> == Git Repository ==
>> Git is the preferred source control system; we’re assuming
>> https://github.com/apache/incubator-superset based on the naming scheme.
>>
>> == Issue Tracking ==
>> JIRA Superset (SUPERSET). We’d like to use GitHub issues & PRs to manage
>> our project as much as possible. It’s been said that there are
>> ways to keep GitHub’s issues in sync with Jira, allowing us to get the best
>> of both worlds. If that is not possible, we will comply by using Jira.
>>
>> == Other Resources ==
>> We currently use a set of GitHub integrated services that are free to the
>> open source community, like Travis-ci, Code Climate, Coveralls,
>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
>> these services as they allow us to scale contributions and optimize our
>> development flows. These services require some elevated rights on the
>> GitHub repository in order to set up or tune and we would like for the
>> committers to have the required rights.
>>
>>
>> == Initial Committers ==
>>
>> * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
>> * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
>> * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
>> * Vera Liu <vera....@airbnb.com> - Committer
>> * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
>> * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
>> * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
>> * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
>> * Priyank Shah <ps...@hortonworks.com> - Committer
>> * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
>> * Daniel Dai <da...@apache.org> - Champion & Committer
>> * Luke Han <luke....@apache.org> - Mentor
>>
>> == Affiliations ==
>> The initial committers are employees of Airbnb Inc. and Hortonworks.
>>
>> == Sponsors ==
>>
>> === Champion ===
>> Daniel Dai <da...@apache.org>
>>
>> === Nominated Mentors ===
>> * Ashutosh Chauhan <hashut...@apache.org>
>> * Luke Han <luke....@apache.org>
>>
>> === Sponsoring Entity ===
>> Incubator PMC
>>
>>
>> On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <edwardy...@apache.org>
>> wrote:
>>
>>> +1 binding
>>>
>>> On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
>>> <naresh.agar...@gmail.com> wrote:
>>>> +1 (non-binding).
>>>>
>>>> Thanks
>>>> Naresh Agarwal
>>>>
>>>> On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <ted.dunn...@gmail.com>
>>>> wrote:
>>>>
>>>>> +1 (binding)
>>>>>
>>>>> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <joe.w...@gmail.com> wrote:
>>>>>
>>>>>> +1 (binding)
>>>>>>
>>>>>> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
>>>>>> <jiten...@hortonworks.com> wrote:
>>>>>>> +1 (binding)
>>>>>>>
>>>>>>> On 4/25/17, 1:27 PM, "Julian Hyde" <jh...@apache.org> wrote:
>>>>>>>
>>>>>>> +1 binding
>>>>>>>
>>>>>>>> On Apr 25, 2017, at 12:48 PM, moon soo Lee <m...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> +1 (non-binding)
>>>>>>>>
>>>>>>>> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan
>>>>>>>> <hashut...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> +1 (binding)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Ashutosh
>>>>>>>>>
>>>>>>>>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <luke...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> +1 binding
>>>>>>>>>>
>>>>>>>>>> Love to see Superset become a new incubator project.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Best Regards!
>>>>>>>>>> ---------------------
>>>>>>>>>>
>>>>>>>>>> Luke Han
>>>>>>>>>>
>>>>>>>>>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <jeff.f...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear Apache Incubator Community,
>>>>>>>>>>>
>>>>>>>>>>> We have updated the Superset proposal
>>>>>>>>>>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below) for
>>>>>>>>>>> Apache Incubation with an additional mentor (Luke Han -
>>>>>>>>>>> luke....@apache.org), and would like to start a vote thread for
>>>>>>>>>>> acceptance into the incubator.
>>>>>>>>>>>
>>>>>>>>>>> Our team is excited to share Superset with the Apache community and we
>>>>>>>>>>> hope for your continued support!
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Jeff & the Superset Team
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> = Superset =
>>>>>>>>>>>
>>>>>>>>>>> == Abstract ==
>>>>>>>>>>> Superset is an enterprise-ready web application for data exploration, data
>>>>>>>>>>> visualization and dashboarding.
>>>>>>>>>>>
>>>>>>>>>>> == Proposal ==
>>>>>>>>>>> Superset is business intelligence (BI) software that helps modern
>>>>>>>>>>> organizations visualize and interact with their data. Superset enables
>>>>>>>>>>> users to explore data from a variety of databases, assemble beautiful
>>>>>>>>>>> dashboards and share their findings. Superset works neatly with all modern
>>>>>>>>>>> SQL-speaking databases, and integrates with Druid.io to provide real-time,
>>>>>>>>>>> interactive, blazing fast data access to large datasets.
>>>>>>>>>>>
>>>>>>>>>>> == Background ==
>>>>>>>>>>> Data is mission critical. To succeed in this era, organizations need to
>>>>>>>>>>> provide low-friction, intuitive and interactive access to data. It is
>>>>>>>>>>> paramount for knowledge workers to be capable of answering their own
>>>>>>>>>>> questions by querying, exploring and visualizing data.
>>>>>>>>>>>
>>>>>>>>>>> The entire business intelligence industry has pivoted from a model of
>>>>>>>>>>> centralized top-down platforms driven by IT organizations to self-service
>>>>>>>>>>> analytics and agile workflows by any user. This shift unblocks centralized
>>>>>>>>>>> service bottlenecks for creating data visualizations while also creating an
>>>>>>>>>>> environment that is iterative and fast-moving. This means that business
>>>>>>>>>>> intelligence software must also be easy and delightful to use.
>>>>>>>>>>> Self-service analytics doesn’t mean that admin and governance features are
>>>>>>>>>>> not needed.
>>>>>>>>>>> Modern BI tools provide fine-grained access controls and auditing
>>>>>>>>>>> capabilities to understand how data is being used. Superset is a solution
>>>>>>>>>>> that delivers on all of these vectors.
>>>>>>>>>>>
>>>>>>>>>>> The technology stack is also constantly morphing - vendors are struggling
>>>>>>>>>>> to provide cheap, quick and easy solutions to access data. Business
>>>>>>>>>>> intelligence users are finding existing solutions lacking as these software
>>>>>>>>>>> products either disregard or react slowly to recent game-changing
>>>>>>>>>>> technologies such as Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
>>>>>>>>>>> React.js and iPython’s Jupyter.
>>>>>>>>>>>
>>>>>>>>>>> == Rationale ==
>>>>>>>>>>> Business intelligence is more relevant today than at any other point in
>>>>>>>>>>> history. Organizations are currently very limited in options for open
>>>>>>>>>>> source data visualization solutions, especially solutions that are both
>>>>>>>>>>> self-service and enterprise-ready. Every company informing their decisions
>>>>>>>>>>> with data needs a BI tool.
>>>>>>>>>>>
>>>>>>>>>>> We believe that Superset will be a strong complement to existing Apache
>>>>>>>>>>> Software Foundation technologies by offering scalable user interactions to
>>>>>>>>>>> distributed storage and computation solutions. Users will often find that
>>>>>>>>>>> Superset can act as a catalyst for tooling that can visualize the byproduct
>>>>>>>>>>> of data and computation infrastructure.
>>>>>>>>>>>
>>>>>>>>>>> Superset has many key design elements that help fill a gap in current
>>>>>>>>>>> solutions for organizations:
>>>>>>>>>>> * Easy, low-friction access to data through a simple, web-based data
>>>>>>>>>>> exploration interface. Composing charts and dashboards is intuitive.
>>>>>>>>>>> Eliminating the need to write code or SQL empowers anyone to use it.
>>>>>>>>>>> * Access to a wide array of rich, interactive data visualization types.
>>>>>>>>>>> * Enterprise-ready: Integration with different authentication mechanisms
>>>>>>>>>>> and granular permissions centered around actions and data access.
>>>>>>>>>>> * Realtime & fast: Superset provides realtime analytics at the speed of
>>>>>>>>>>> thought on very large datasets when integrated with Druid.io.
>>>>>>>>>>> * Broad data access: Consume data out of any SQL-speaking relational
>>>>>>>>>>> database.
>>>>>>>>>>> * Extensible: Can be extended to talk to many NoSQL databases like Apache
>>>>>>>>>>> Drill, Elasticsearch, and other popular database engines.
>>>>>>>>>>> * Fast loading dashboards with configurable web-scale caching.
>>>>>>>>>>> * Plug-in framework that enables organizations to build custom analytical
>>>>>>>>>>> applications with new UI/UX interfaces.
>>>>>>>>>>> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
>>>>>>>>>>> with more flexibility. SQL Lab integrates with the visualization engine
>>>>>>>>>>> seamlessly.
>>>>>>>>>>>
>>>>>>>>>>> == Initial Goals ==
>>>>>>>>>>> The initial goals of the Superset project are several-fold:
>>>>>>>>>>> * Move the existing codebase to Apache and integrate with the Apache
>>>>>>>>>>> development process.
>>>>>>>>>>> * Redesign the user interface and interaction model for creating
>>>>>>>>>>> visualizations/dashboards and connecting to data sources
>>>>>>>>>>> * Build robust support for security and governance of the tool, including
>>>>>>>>>>> popular authorization modules (such as Apache Ranger and Apache Sentry)
>>>>>>>>>>> and a more sophisticated permissions system
>>>>>>>>>>> * Grow the extensibility of the project both in terms of enhanced
>>>>>>>>>>> connectivity to NoSQL-based data sources and creating a plug-in framework
>>>>>>>>>>> that enables organizations to build custom analytical applications which
>>>>>>>>>>> require a new UI/UX
>>>>>>>>>>>
>>>>>>>>>>> == Current Status ==
>>>>>>>>>>> By many standards, Superset is already a successful open source project. As
>>>>>>>>>>> of March 2017, Superset is officially used in production at about a dozen
>>>>>>>>>>> companies and has received contributions from over one hundred contributors
>>>>>>>>>>> on GitHub, along with 1500+ forks and 12k+ stars.
>>>>>>>>>>>
>>>>>>>>>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
>>>>>>>>>>> significant contributions, and expressed their commitment to the project.
>>>>>>>>>>> The product is feature complete and has been viable for months. It already
>>>>>>>>>>> serves as the main interface for consuming data at many companies of
>>>>>>>>>>> different sizes.
>>>>>>>>>>>
>>>>>>>>>>> While the product is usable, there’s room for improvement across the board,
>>>>>>>>>>> starting with providing a smoother user experience around content creation,
>>>>>>>>>>> making sure all features work out-of-the-box on more platforms and
>>>>>>>>>>> databases, providing better user training guides and videos, having a
>>>>>>>>>>> predictable release process, and increasing the overall quality of the
>>>>>>>>>>> Superset releases.
>>>>>>>>>>>
>>>>>>>>>>> === Meritocracy ===
>>>>>>>>>>> We plan to invest in supporting a meritocracy. We will discuss the
>>>>>>>>>>> requirements in an open forum. Several companies have expressed interest in
>>>>>>>>>>> this project, and we intend to invite additional developers to participate.
>>>>>>>>>>> We will encourage and monitor community participation so that privileges
>>>>>>>>>>> can be extended to those that contribute.
>>>>>>>>>>>
>>>>>>>>>>> === Community ===
>>>>>>>>>>> The need for an enterprise-ready data visualization and exploration
>>>>>>>>>>> platform in the open source community is tremendous. While Superset is
>>>>>>>>>>> fairly well known, recognized and used within the Druid.io community,
>>>>>>>>>>> adoption is currently limited outside of that niche. There is a huge
>>>>>>>>>>> opportunity to grow the community to hundreds if not thousands of
>>>>>>>>>>> organizations, and we are hoping that embracing “the Apache way” will
>>>>>>>>>>> accelerate the growth of our community.
>>>>>>>>>>>
>>>>>>>>>>> We have already been active in seeking and inviting contributions, and are
>>>>>>>>>>> planning to scale the project by investing time and growing the support
>>>>>>>>>>> structure to grow the community.
>>>>>>>>>>>
>>>>>>>>>>> === Core Developers ===
>>>>>>>>>>> The initial committers for Superset include experienced full stack,
>>>>>>>>>>> front-end and data engineers:
>>>>>>>>>>> * Maxime Beauchemin (Airbnb)
>>>>>>>>>>> * Alanna Scott (Airbnb)
>>>>>>>>>>> * Bogdan Kyryliuk (Airbnb)
>>>>>>>>>>> * Vera Liu (Airbnb)
>>>>>>>>>>> * Jeff Feng (Airbnb)
>>>>>>>>>>> * Ashutosh Chauhan (Hortonworks)
>>>>>>>>>>> * Nishant Bangarwa (Hortonworks)
>>>>>>>>>>> * Slim Bouguerra (Hortonworks)
>>>>>>>>>>> * Priyank Shah (Hortonworks)
>>>>>>>>>>> * Sriharsha Chintalapani (Hortonworks)
>>>>>>>>>>> * Daniel Dai (Hortonworks)
>>>>>>>>>>>
>>>>>>>>>>> We realize that additional employer diversity is needed, and we will work
>>>>>>>>>>> aggressively to recruit developers from additional companies.
>>>>>>>>>>>
>>>>>>>>>>> === Alignment ===
>>>>>>>>>>> The initial committers strongly believe that a system for interactive
>>>>>>>>>>> visualization of data will gain broader adoption as an open source,
>>>>>>>>>>> community driven project, where the community can contribute not only to
>>>>>>>>>>> the core components, but also to a growing collection of connectors and
>>>>>>>>>>> visualizations, and to improving integration with all potential data sources.
>>>>>>>>>>> Superset already integrates closely with Apache Hive, the Hive metastore,
>>>>>>>>>>> as well as most SQL-speaking databases found in modern data ecosystems.
>>>>>>>>>>>
>>>>>>>>>>> == Known Risks ==
>>>>>>>>>>>
>>>>>>>>>>> === Orphaned Products ===
>>>>>>>>>>> Superset is a vital component for visualizing, accessing and
>>>>>>>>>>> democratizing data at Airbnb. Also at Hortonworks, Superset is a core
>>>>>>>>>>> component of the DataFlow product offering. Thus, the risk of the project
>>>>>>>>>>> being orphaned is relatively low. The project could be at risk if Airbnb
>>>>>>>>>>> changes their approach for democratizing data or if Hortonworks changes
>>>>>>>>>>> their strategy in the market. In such an event, the committers plan to
>>>>>>>>>>> continue working on the project on their own time, though the progress
>>>>>>>>>>> will likely be slower. We plan to mitigate this risk by recruiting
>>>>>>>>>>> additional committers.
>>>>>>>>>>>
>>>>>>>>>>> === Inexperience with Open Source ===
>>>>>>>>>>> The initial committers include veteran Apache members (committers and PPMC
>>>>>>>>>>> members) and other developers who have varying degrees of experience with
>>>>>>>>>>> open source projects. All have been involved with source code that has been
>>>>>>>>>>> released under an open source license, and several also have experience
>>>>>>>>>>> developing code with an open source development process.
>>>>>>>>>>>
>>>>>>>>>>> === Homogenous Developers ===
>>>>>>>>>>> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
>>>>>>>>>>> committed to recruiting additional committers from other companies.
>>>>>>>>>>>
>>>>>>>>>>> === Reliance on Salaried Developers ===
>>>>>>>>>>> It is expected that Superset development will occur on both salaried time
>>>>>>>>>>> and on volunteer time, after hours. The majority of initial committers are
>>>>>>>>>>> paid by their employer to contribute to this project. However, they are all
>>>>>>>>>>> passionate about the project, and we are confident that the project will
>>>>>>>>>>> continue even if no salaried developers contribute to the project. We are
>>>>>>>>>>> committed to recruiting additional committers including non-salaried
>>>>>>>>>>> developers.
>>>>>>>>>>>
>>>>>>>>>>> === Relationships with Other Apache Products ===
>>>>>>>>>>> To the knowledge of the Initial Committers, there are no direct
>>>>>>>>>>> competitors to Superset within the Apache Software Foundation. That
>>>>>>>>>>> said, Apache Zeppelin is an indirect competitor, but it solves a
>>>>>>>>>>> different use case.
>>>>>>>>>>>
>>>>>>>>>>> Apache Zeppelin is a web-based notebook that enables interactive data
>>>>>>>>>>> analytics. It enables the creation of beautiful data-driven,
>>>>>>>>>>> interactive and collaborative documents with SQL, Scala and more.
>>>>>>>>>>> Although a user can create data visualizations using this project, it
>>>>>>>>>>> leverages a notebook-style user interface and it is geared towards the
>>>>>>>>>>> Spark community, where Scala and SQL co-exist.
>>>>>>>>>>>
>>>>>>>>>>> We look forward to collaborating with those communities, as well as
>>>>>>>>>>> other Apache communities.
>>>>>>>>>>>
>>>>>>>>>>> === An Excessive Fascination with the Apache Brand ===
>>>>>>>>>>> Superset is solving two huge challenges:
>>>>>>>>>>> * The challenge of enabling every knowledge worker to make
>>>>>>>>>>> data-informed decisions, particularly those who are not deeply skilled
>>>>>>>>>>> at writing SQL.
>>>>>>>>>>> * The challenge of visualizing huge amounts of data interactively and
>>>>>>>>>>> in real time.
>>>>>>>>>>>
>>>>>>>>>>> Superset was first developed as a data visualization solution for
>>>>>>>>>>> Druid.io, as a way to visualize billions of rows of data. Since then,
>>>>>>>>>>> usage of Superset has expanded to address data visualization use cases
>>>>>>>>>>> across SQL-speaking data sources as well.
>>>>>>>>>>>
>>>>>>>>>>> Our rationale for developing Superset as an Apache project is detailed
>>>>>>>>>>> in the Rationale Section.
>>>>>>>>>>> We believe that the Apache brand and community process will help us
>>>>>>>>>>> attract more contributors to this project, and help grow the footprint
>>>>>>>>>>> of the project through usage at other organizations and within other
>>>>>>>>>>> applications. Establishing consensus among users and developers will
>>>>>>>>>>> result in a more valuable tool for everyone.
>>>>>>>>>>>
>>>>>>>>>>> == Documentation ==
>>>>>>>>>>> References to further reading material:
>>>>>>>>>>> * [[http://airbnb.io/superset/|Superset Documentation]]
>>>>>>>>>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>>>>>>>>>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>>>>>>>>>>>
>>>>>>>>>>> == Initial Source ==
>>>>>>>>>>> The origin of the proposed code base can be found at
>>>>>>>>>>> https://github.com/airbnb/superset. The code base is primarily in
>>>>>>>>>>> Python.
>>>>>>>>>>>
>>>>>>>>>>> == Source and Intellectual Property Submission Plan ==
>>>>>>>>>>> We do not expect any complications for the submission of the Superset
>>>>>>>>>>> code base. Our code is already in GitHub and there is only a single
>>>>>>>>>>> code base.
>>>>>>>>>>>
>>>>>>>>>>> == External Dependencies ==
>>>>>>>>>>> List of Python packages, from the Python Package Index (PyPI):
>>>>>>>>>>>
>>>>>>>>>>> * boto3
>>>>>>>>>>> * celery
>>>>>>>>>>> * cryptography
>>>>>>>>>>> * flask-appbuilder
>>>>>>>>>>> * flask-cache
>>>>>>>>>>> * flask-migrate
>>>>>>>>>>> * flask-script
>>>>>>>>>>> * flask-sqlalchemy
>>>>>>>>>>> * flask-testing
>>>>>>>>>>> * humanize
>>>>>>>>>>> * gunicorn
>>>>>>>>>>> * markdown
>>>>>>>>>>> * pandas
>>>>>>>>>>> * parsedatetime
>>>>>>>>>>> * pydruid
>>>>>>>>>>> * PyHive
>>>>>>>>>>> * python-dateutil
>>>>>>>>>>> * requests
>>>>>>>>>>> * simplejson
>>>>>>>>>>> * six
>>>>>>>>>>> * sqlalchemy
>>>>>>>>>>> * sqlalchemy-utils
>>>>>>>>>>> * sqlparse
>>>>>>>>>>> * thrift
>>>>>>>>>>> * thrift-sasl
>>>>>>>>>>> * werkzeug
>>>>>>>>>>>
>>>>>>>>>>> List of JavaScript packages, from NPM:
>>>>>>>>>>> * autobind-decorator
>>>>>>>>>>> * bootstrap
>>>>>>>>>>> * bootstrap-datepicker
>>>>>>>>>>> * brace
>>>>>>>>>>> * brfs
>>>>>>>>>>> * cal-heatmap
>>>>>>>>>>> * classnames
>>>>>>>>>>> * d3
>>>>>>>>>>> * d3-cloud
>>>>>>>>>>> * d3-sankey
>>>>>>>>>>> * d3-scale
>>>>>>>>>>> * d3-tip
>>>>>>>>>>> * datamaps
>>>>>>>>>>> * datatables-bootstrap3-plugin
>>>>>>>>>>> * datatables.net-bs
>>>>>>>>>>> * font-awesome
>>>>>>>>>>> * gridster
>>>>>>>>>>> * immutability-helper
>>>>>>>>>>> * immutable
>>>>>>>>>>> * jquery
>>>>>>>>>>> * lodash.throttle
>>>>>>>>>>> * mapbox-gl
>>>>>>>>>>> * moment
>>>>>>>>>>> * moments
>>>>>>>>>>> * mustache
>>>>>>>>>>> * nvd3
>>>>>>>>>>> * react
>>>>>>>>>>> * react-ace
>>>>>>>>>>> * react-bootstrap
>>>>>>>>>>> * react-bootstrap-table
>>>>>>>>>>> * react-dom
>>>>>>>>>>> * react-draggable
>>>>>>>>>>> * react-gravatar
>>>>>>>>>>> * react-grid-layout
>>>>>>>>>>> * react-map-gl
>>>>>>>>>>> * react-redux
>>>>>>>>>>> * react-resizable
>>>>>>>>>>> * react-select
>>>>>>>>>>> * react-syntax-highlighter
>>>>>>>>>>> * reactable
>>>>>>>>>>> * redux
>>>>>>>>>>> * redux-localstorage
>>>>>>>>>>> * redux-thunk
>>>>>>>>>>> * shortid
>>>>>>>>>>> * style-loader
>>>>>>>>>>> * supercluster
>>>>>>>>>>> * topojson
>>>>>>>>>>> * victory
>>>>>>>>>>> * viewport-mercator-project
>>>>>>>>>>>
>>>>>>>>>>> == Cryptography ==
>>>>>>>>>>> The proposal does not include cryptographic code.
>>>>>>>>>>>
>>>>>>>>>>> == Required Resources ==
>>>>>>>>>>>
>>>>>>>>>>> === Mailing List ===
>>>>>>>>>>> There is currently a mailing list, the Google Group “airbnb_superset”,
>>>>>>>>>>> which we plan to deprecate once the Apache.org lists are ready to serve
>>>>>>>>>>> our community.
>>>>>>>>>>>
>>>>>>>>>>> * superset-private
>>>>>>>>>>> * superset-dev
>>>>>>>>>>> * superset-user
>>>>>>>>>>>
>>>>>>>>>>> === Subversion Directory ===
>>>>>>>>>>> Git is the preferred source control system.
>>>>>>>>>>> http://svn.apache.org/repos/asf/incubator/superset
>>>>>>>>>>>
>>>>>>>>>>> == Git Repository ==
>>>>>>>>>>> Git is the preferred source control system; we’re assuming
>>>>>>>>>>> https://github.com/apache/incubator-superset based on the naming
>>>>>>>>>>> scheme.
>>>>>>>>>>>
>>>>>>>>>>> == Issue Tracking ==
>>>>>>>>>>> JIRA Superset (SUPERSET). We’d like to use GitHub issues & PRs to
>>>>>>>>>>> manage our project as much as possible. It’s been said that there are
>>>>>>>>>>> ways to keep GitHub’s issues in sync with Jira, allowing us to get the
>>>>>>>>>>> best of both worlds. If that is not possible, we will comply and use
>>>>>>>>>>> Jira.
>>>>>>>>>>>
>>>>>>>>>>> == Other Resources ==
>>>>>>>>>>> We currently use a set of GitHub-integrated services that are free to
>>>>>>>>>>> the open source community, like Travis-ci, Code Climate, Coveralls,
>>>>>>>>>>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep
>>>>>>>>>>> using these services as they allow us to scale contributions and
>>>>>>>>>>> optimize our development flows.
>>>>>>>>>>> These services require some elevated rights on the GitHub repository
>>>>>>>>>>> to set up or tune, and we would like the committers to have the
>>>>>>>>>>> required rights.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> == Initial Committers ==
>>>>>>>>>>>
>>>>>>>>>>> * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
>>>>>>>>>>> * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
>>>>>>>>>>> * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
>>>>>>>>>>> * Vera Liu <vera....@airbnb.com> - Committer
>>>>>>>>>>> * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
>>>>>>>>>>> * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
>>>>>>>>>>> * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
>>>>>>>>>>> * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
>>>>>>>>>>> * Priyank Shah <ps...@hortonworks.com> - Committer
>>>>>>>>>>> * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
>>>>>>>>>>> * Daniel Dai <da...@apache.org> - Champion & Committer
>>>>>>>>>>> * Luke Han <luke....@apache.org> - Mentor
>>>>>>>>>>>
>>>>>>>>>>> == Affiliations ==
>>>>>>>>>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
>>>>>>>>>>>
>>>>>>>>>>> == Sponsors ==
>>>>>>>>>>>
>>>>>>>>>>> === Champion ===
>>>>>>>>>>> Daniel Dai <da...@apache.org>
>>>>>>>>>>>
>>>>>>>>>>> === Nominated Mentors ===
>>>>>>>>>>> * Ashutosh Chauhan <hashut...@apache.org>
>>>>>>>>>>> * Luke Han <luke....@apache.org>
>>>>>>>>>>>
>>>>>>>>>>> === Sponsoring Entity ===
>>>>>>>>>>> Incubator PMC
>>>>>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>>>>>>> For additional commands, e-mail: general-help@incubator.apache.org
>>>
>>> --
>>> Best Regards, Edward J. Yoon