Re-affirming my vote as well: +1 (binding)
On Thu, Apr 27, 2017 at 10:45 AM, Julian Hyde <jh...@apache.org> wrote:
> Re-affirming my vote:
>
> +1 (binding)
>
> > On Apr 26, 2017, at 11:12 PM, Jeff Feng <jeff.f...@gmail.com> wrote:
> >
> > Hello everyone,
> >
> > Thank you for checking out our proposal on Superset and for your
> > consideration for the Apache Incubator. So far, I believe we have 8
> > binding votes and 2 non-binding votes.
> >
> > As Taylor mentioned earlier, we made a minor update to the wording in the
> > "Source and Intellectual Property Submission Plan" section based on a
> > suggestion by John Ament. The update was to help confirm the previously
> > unstated assumption that we will submit an SGA. I have copied the updated
> > proposal from the wiki to the email below and highlighted (in yellow) the
> > new sentence in the document.
> >
> > Folks on the cc line who have already voted, please let us know if the
> > change impacts your vote.
> >
> > Thank you all,
> > Jeff
> >
> > = Superset =
> >
> > == Abstract ==
> > Superset is an enterprise-ready web application for data exploration,
> > data visualization and dashboarding.
> >
> > == Proposal ==
> > Superset is business intelligence (BI) software that helps modern
> > organizations visualize and interact with their data. Superset enables
> > users to explore data from a variety of databases, assemble beautiful
> > dashboards and share their findings. Superset works neatly with all
> > modern SQL-speaking databases, and integrates with Druid.io to provide
> > real-time, interactive, blazing fast data access to large datasets.
> >
> > == Background ==
> > Data is mission critical. To succeed in this era, organizations need to
> > provide low-friction, intuitive and interactive access to data. It is
> > paramount for knowledge workers to be capable of answering their own
> > questions by querying, exploring and visualizing data.
> >
> > The entire business intelligence industry has pivoted from a model of
> > centralized top-down platforms driven by IT organizations to self-service
> > analytics and agile workflows by any user. This shift unblocks centralized
> > service bottlenecks for creating data visualizations while also creating
> > an environment that is iterative and fast-moving. This means that business
> > intelligence software must also be easy and delightful to use.
> > Self-service analytics doesn’t mean that admin and governance features
> > are not needed. Modern BI tools provide fine-grained access controls and
> > auditing capabilities to understand how data is being used. Superset is a
> > solution that delivers on all of these vectors.
> >
> > The technology stack is also constantly morphing - vendors are struggling
> > to provide cheap, quick and easy solutions to access data. Business
> > intelligence users are finding existing solutions lacking as these
> > software products either disregard or react slowly to recent game-changing
> > technologies such as Druid.io, PrestoDB, Apache Drill, Apache Kylin,
> > d3.js, React.js and iPython’s Jupyter.
> >
> > == Rationale ==
> > Business intelligence is more relevant today than at any other point in
> > history. Organizations are currently very limited in options for open
> > source data visualization solutions, especially solutions that are both
> > self-service and enterprise-ready. Every company informing their
> > decisions with data needs a BI tool.
> >
> > We believe that Superset will be a strong complement to existing Apache
> > Software Foundation technologies by offering scalable user interactions
> > to distributed storage and computation solutions. Users will often find
> > that Superset can act as a catalyst for tooling that can visualize the
> > byproduct of data and computation infrastructure.
> >
> > Superset has many key design elements that help fill a gap in current
> > solutions for organizations:
> > * Easy, low friction access to data through a simple, web-based data
> > exploration interface. Composing charts and dashboards is intuitive.
> > Eliminating the need to write code or SQL empowers anyone to use it.
> > * Access to a wide array of rich, interactive data visualization types.
> > * Enterprise-ready: Integration with different authentication mechanisms
> > and granular permissions centered around actions and data access.
> > * Realtime & fast: Superset provides realtime analytics at the speed of
> > thought on very large datasets when integrated with Druid.io.
> > * Broad data access: Consume data out of any SQL-speaking relational
> > database.
> > * Extensible: Can be extended to talk to many NoSQL databases like Apache
> > Drill, Elasticsearch, and other popular database engines.
> > * Fast loading dashboards with configurable web-scale caching.
> > * Plug-in framework that enables organizations to build custom analytical
> > applications with new UI/UX interfaces.
> > * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> > with more flexibility. SQL Lab integrates with the visualization engine
> > seamlessly.
> >
> > == Initial Goals ==
> > The initial goals of the Superset project are several-fold:
> > * Move the existing codebase to Apache and integrate with the Apache
> > development process.
> > * Redesign the user interface and interaction model for creating
> > visualizations/dashboards and connecting to data sources
> > * Build robust support for security and governance of the tool including
> > popular authorization modules (including Apache Ranger and Apache Sentry)
> > and a more sophisticated permissions system
> > * Grow the extensibility of the project both in terms of enhanced
> > connectivity to NoSQL-based data sources and creating a plug-in framework
> > that enables organizations to build custom analytical applications which
> > require a new UI/UX
> >
> > == Current Status ==
> > By many standards, Superset is already a successful open source project.
> > As of March 2017, Superset is officially used in production at about a
> > dozen companies, has received contributions from over one hundred
> > contributors on Github, and has 1500+ forks and 12k+ stars.
> >
> > Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> > significant contributions, and expressed their commitment to the project.
> > The product is feature complete and has been viable for months. It
> > already serves as the main interface for consuming data at many companies
> > of different sizes.
> >
> > While the product is usable, there’s room for improvement across the
> > board, starting with providing a smoother user experience around content
> > creation, making sure all features work out-of-the-box on more platforms
> > and databases, providing better user training guides and videos, having a
> > predictable release process, and increasing the overall quality of the
> > Superset releases.
> >
> > === Meritocracy ===
> > We plan to invest in supporting a meritocracy. We will discuss the
> > requirements in an open forum. Several companies have expressed interest
> > in this project, and we intend to invite additional developers to
> > participate.
> > We will encourage and monitor community participation so that privileges
> > can be extended to those that contribute.
> >
> > === Community ===
> > The need for an enterprise-ready data visualization and exploration
> > platform in the open source community is tremendous. While Superset is
> > fairly well known, recognized and used within the Druid.io community,
> > adoption is currently limited outside of that niche. There is a huge
> > opportunity to grow the community to hundreds if not thousands of
> > organizations, and we are hoping that embracing “the Apache way” will
> > accelerate the growth of our community.
> >
> > We have already been active in seeking and inviting contributions, and
> > are planning to scale the project by investing time and building the
> > support structure needed to grow the community.
> >
> > === Core Developers ===
> > The initial committers for Superset include experienced full stack,
> > front-end and data engineers:
> > * Maxime Beauchemin (Airbnb)
> > * Alanna Scott (Airbnb)
> > * Bogdan Kyryliuk (Airbnb)
> > * Vera Liu (Airbnb)
> > * Jeff Feng (Airbnb)
> > * Ashutosh Chauhan (Hortonworks)
> > * Nishant Bangarwa (Hortonworks)
> > * Slim Bouguerra (Hortonworks)
> > * Priyank Shah (Hortonworks)
> > * Sriharsha Chintalapani (Hortonworks)
> > * Daniel Dai (Hortonworks)
> >
> > We realize that additional employer diversity is needed, and we will work
> > aggressively to recruit developers from additional companies.
> >
> > === Alignment ===
> > The initial committers strongly believe that a system for interactive
> > visualization of data will gain broader adoption as an open source,
> > community driven project, where the community can contribute not only to
> > the core components, but also to a growing collection of connectors and
> > visualizations, and to improving integration with all potential data
> > sources.
> > Superset already integrates closely with Apache Hive, the Hive metastore,
> > as well as most SQL-speaking databases found in modern data ecosystems.
> >
> > == Known Risks ==
> >
> > === Orphaned Products ===
> > Superset is a vital component for visualizing, accessing and
> > democratizing data at Airbnb. At Hortonworks, Superset is also a core
> > component of the DataFlow product offering. Thus, the risk of the project
> > being orphaned is relatively low. The project could be at risk if Airbnb
> > changes their approach for democratizing data or if Hortonworks changes
> > their strategy in the market. In such an event, the committers plan to
> > continue working on the project on their own time, though progress will
> > likely be slower. We plan to mitigate this risk by recruiting additional
> > committers.
> >
> > === Inexperience with Open Source ===
> > The initial committers include veteran Apache members (committers and
> > PPMC members) and other developers who have varying degrees of experience
> > with open source projects. All have been involved with source code that
> > has been released under an open source license, and several also have
> > experience developing code with an open source development process.
> >
> > === Homogenous Developers ===
> > The initial committers are employed by Airbnb Inc. and Hortonworks. We
> > are committed to recruiting additional committers from other companies.
> >
> > === Reliance on Salaried Developers ===
> > It is expected that Superset development will occur on both salaried time
> > and on volunteer time, after hours. The majority of initial committers
> > are paid by their employer to contribute to this project. However, they
> > are all passionate about the project, and we are confident that the
> > project will continue even if no salaried developers contribute to the
> > project. We are committed to recruiting additional committers including
> > non-salaried developers.
> >
> > === Relationships with Other Apache Products ===
> > To the knowledge of the Initial Committers, there are no direct
> > competitors to Superset within the Apache Software Foundation. That said,
> > Apache Zeppelin is an indirect competitor, though it addresses a
> > different use case.
> >
> > Apache Zeppelin is a web-based notebook that enables interactive data
> > analytics. It enables the creation of beautiful data-driven, interactive
> > and collaborative documents with SQL, Scala and more. Although a user can
> > create data visualizations using this project, it uses a notebook-style
> > user interface and is geared towards the Spark community, where Scala and
> > SQL co-exist.
> >
> > We look forward to collaborating with those communities, as well as other
> > Apache communities.
> >
> > === An Excessive Fascination with the Apache Brand ===
> > Superset is solving two huge challenges:
> > * The challenge of enabling every knowledge worker to make data-informed
> > decisions, particularly those who are not deeply skilled at writing SQL.
> > * The challenge of visualizing huge amounts of data interactively and in
> > real-time.
> >
> > Superset was first developed as a data visualization solution for
> > Druid.io as a way to visualize billions of rows of data. Since then,
> > usage of Superset has expanded to address data visualization use cases
> > across SQL-speaking data sources as well.
> >
> > Our rationale for developing Superset as an Apache project is detailed in
> > the Rationale Section. We believe that the Apache brand and community
> > process will help us attract more contributors to this project, and help
> > grow the footprint of the project through usage at other organizations
> > and within other applications. Establishing consensus among users and
> > developers will result in a more valuable tool for everyone.
> >
> > == Documentation ==
> > References to further reading material:
> > * [[http://airbnb.io/superset/|Superset Documentation]]
> > * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
> > * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
> >
> > == Initial Source ==
> > The origin of the proposed code base can be found at
> > https://github.com/airbnb/superset. The code base is primarily in Python.
> >
> > == Source and Intellectual Property Submission Plan ==
> > Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
> > incubator. We do not expect any complications for the submission of the
> > Superset code base. Our code is already in Github and there is only a
> > single code base.
> >
> > == External Dependencies ==
> > List of Python packages, from the Python Package Index (PyPI):
> >
> > * boto3
> > * celery
> > * cryptography
> > * flask-appbuilder
> > * flask-cache
> > * flask-migrate
> > * flask-script
> > * flask-sqlalchemy
> > * flask-testing
> > * humanize
> > * gunicorn
> > * markdown
> > * pandas
> > * parsedatetime
> > * pydruid
> > * PyHive
> > * python-dateutil
> > * requests
> > * simplejson
> > * six
> > * sqlalchemy
> > * sqlalchemy-utils
> > * sqlparse
> > * thrift
> > * thrift-sasl
> > * werkzeug
> >
> > List of Javascript packages, from NPM:
> > * autobind-decorator
> > * bootstrap
> > * bootstrap-datepicker
> > * brace
> > * brfs
> > * cal-heatmap
> > * classnames
> > * d3
> > * d3-cloud
> > * d3-sankey
> > * d3-scale
> > * d3-tip
> > * datamaps
> > * datatables-bootstrap3-plugin
> > * datatables.net-bs
> > * font-awesome
> > * gridster
> > * immutability-helper
> > * immutable
> > * jquery
> > * lodash.throttle
> > * mapbox-gl
> > * moment
> > * moments
> > * mustache
> > * nvd3
> > * react
> > * react-ace
> > * react-bootstrap
> > * react-bootstrap-table
> > * react-dom
> > * react-draggable
> > * react-gravatar
> > * react-grid-layout
> > * react-map-gl
> > * react-redux
> > * react-resizable
> > * react-select
> > * react-syntax-highlighter
> > * reactable
> > * redux
> > * redux-localstorage
> > * redux-thunk
> > * shortid
> > * style-loader
> > * supercluster
> > * topojson
> > * victory
> > * viewport-mercator-project
> >
> > == Cryptography ==
> > The proposal does not include cryptographic code.
> >
> > == Required Resources ==
> >
> > === Mailing List ===
> > There is currently a mailing list, the Google Group “airbnb_superset”,
> > which we plan to deprecate once the Apache.org lists are ready to serve
> > our community.
> >
> > * superset-private
> > * superset-dev
> > * superset-user
> >
> > === Subversion Directory ===
> > Git is the preferred source control system.
> > http://svn.apache.org/repos/asf/incubator/superset
> >
> > == Git Repository ==
> > Git is the preferred source control system; we’re assuming
> > https://github.com/apache/incubator-superset based on the naming scheme.
> >
> > == Issue Tracking ==
> > JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to manage
> > our project as much as possible. It’s been said that there are ways to
> > keep Github’s issues in sync with Jira, allowing us to get the best of
> > both worlds. If that is not possible, we will comply and use Jira.
> >
> > == Other Resources ==
> > We currently use a set of Github integrated services that are free to the
> > open source community, like Travis-ci, Code Climate, Coveralls,
> > Landscape.io, Requires.io, david-dm and Gitter. We would like to keep
> > using these services as they allow us to scale contributions and optimize
> > our development flows. These services require some elevated rights on the
> > Github repository in order to set up or tune and we would like for the
> > committers to have the required rights.
> >
> > == Initial Committers ==
> >
> > * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
> > * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
> > * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
> > * Vera Liu <vera....@airbnb.com> - Committer
> > * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
> > * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
> > * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
> > * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
> > * Priyank Shah <ps...@hortonworks.com> - Committer
> > * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
> > * Daniel Dai <da...@apache.org> - Champion & Committer
> > * Luke Han <luke....@apache.org> - Mentor
> >
> > == Affiliations ==
> > The initial committers are employees of Airbnb Inc. and Hortonworks.
> >
> > == Sponsors ==
> >
> > === Champion ===
> > Daniel Dai <da...@apache.org>
> >
> > === Nominated Mentors ===
> > * Ashutosh Chauhan <hashut...@apache.org>
> > * Luke Han <luke....@apache.org>
> >
> > === Sponsoring Entity ===
> > Incubator PMC
> >
> > On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <edwardy...@apache.org> wrote:
> >
> >> +1 binding
> >>
> >> On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal <naresh.agar...@gmail.com> wrote:
> >>> +1 (non-binding).
> >>>
> >>> Thanks
> >>> Naresh Agarwal
> >>>
> >>> On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> >>>
> >>>> +1 (binding)
> >>>>
> >>>> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <joe.w...@gmail.com> wrote:
> >>>>
> >>>>> +1 (binding)
> >>>>>
> >>>>> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey <jiten...@hortonworks.com> wrote:
> >>>>>> +1 (binding)
> >>>>>>
> >>>>>> On 4/25/17, 1:27 PM, "Julian Hyde" <jh...@apache.org> wrote:
> >>>>>>
> >>>>>> +1 binding
> >>>>>>
> >>>>>>> On Apr 25, 2017, at 12:48 PM, moon soo Lee <m...@apache.org> wrote:
> >>>>>>>
> >>>>>>> +1 (non-binding)
> >>>>>>>
> >>>>>>> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <hashut...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> +1 (binding)
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Ashutosh
> >>>>>>>>
> >>>>>>>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <luke...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> +1 binding
> >>>>>>>>>
> >>>>>>>>> Love to see Superset to be new incubator project.
> >>>>>>>>>
> >>>>>>>>> Best Regards!
> >>>>>>>>> --------------------- > >>>>>>>>> > >>>>>>>>> Luke Han > >>>>>>>>> > >>>>>>>>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng < > >>>> jeff.f...@gmail.com> > >>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Dear Apache Incubator Community, > >>>>>>>>>> > >>>>>>>>>> We have updated the Superset proposal > >>>>>>>>>> <https://wiki.apache.org/incubator/SupersetProposal> > >> (copied > >>>>> below) for > >>>>>>>>>> > >>>>>>>>>> Apache Incubation with an additional mentor (Luke Han - > >>>>>>>>>> luke....@apache.org), > >>>>>>>>>> and would like to start a vote thread for acceptance into > >> the > >>>>> incubator. > >>>>>>>>>> > >>>>>>>>>> Our team is excited to share Superset with the Apache > >>>> community > >>>>> and we > >>>>>>>>>> hope > >>>>>>>>>> for the your continued support! > >>>>>>>>>> > >>>>>>>>>> Cheers, > >>>>>>>>>> Jeff & the Superset Team > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> = Superset = > >>>>>>>>>> > >>>>>>>>>> == Abstract == > >>>>>>>>>> Superset is an enterprise-ready web application for data > >>>>> exploration, > >>>>>>>> data > >>>>>>>>>> visualization and dashboarding. > >>>>>>>>>> > >>>>>>>>>> == Proposal == > >>>>>>>>>> Superset is business intelligence (BI) software that helps > >>>>> modern > >>>>>>>>>> organizations visualize and interact with their data. > >> Superset > >>>>> enables > >>>>>>>>>> users explore data from a variety of databases, assemble > >>>>> beautiful > >>>>>>>>>> dashboards and share their findings. Superset works neatly > >>>>> with all > >>>>>>>>>> modern > >>>>>>>>>> SQL-speaking databases, and integrates with Druid.io to > >>>> provide > >>>>>>>> real-time, > >>>>>>>>>> interactive, blazing fast data access to large datasets. > >>>>>>>>>> > >>>>>>>>>> == Background == > >>>>>>>>>> Data is mission critical. To succeed in this era, > >>>> organizations > >>>>> need to > >>>>>>>>>> provide low-friction, intuitive and interactive access to > >>>> data. 
> >>>>> It is > >>>>>>>>>> paramount for knowledge workers to be capable of answering > >>>>> their own > >>>>>>>>>> questions by querying, exploring and visualizing data. > >>>>>>>>>> > >>>>>>>>>> The entire business intelligence industry has pivoted from > >> a > >>>>> model of > >>>>>>>>>> centralized top-down platforms driven by IT organizations > >> to > >>>>>>>> self-service > >>>>>>>>>> analytics and agile workflows by any user. This shift > >>>> unblocks > >>>>>>>>>> centralized > >>>>>>>>>> service bottlenecks for creating data visualizations while > >>>> also > >>>>> creating > >>>>>>>>>> an > >>>>>>>>>> environment that is iterative and fast-moving. This means > >>>> that > >>>>> business > >>>>>>>>>> intelligence software must also be easy and delightful to > >> use. > >>>>>>>>>> Self-service analytics doesn’t mean that admin and > >> governance > >>>>> features > >>>>>>>> are > >>>>>>>>>> not needed. > >>>>>>>>>> Modern BI tools provide fine-grain access controls and > >>>> auditing > >>>>>>>>>> capabilities to understand how data is being used. > >> Superset > >>>> is > >>>>> a > >>>>>>>> solution > >>>>>>>>>> that delivers on all of these vectors. > >>>>>>>>>> > >>>>>>>>>> The technology stack is also constantly morphing - vendors > >> are > >>>>>>>> struggling > >>>>>>>>>> to provide cheap, quick and easy solutions to access data. > >>>>> Business > >>>>>>>>>> intelligence users are finding existing solutions lacking > >> as > >>>>> these > >>>>>>>>>> software > >>>>>>>>>> products either disregard or react slowly to recent > >>>>> game-changing > >>>>>>>>>> technologies like Druid.io, PrestoDB, Apache Drill, Apache > >>>>> Kylin, d3.js, > >>>>>>>>>> React.js and iPython’s Jupyter for instance. > >>>>>>>>>> > >>>>>>>>>> == Rationale == > >>>>>>>>>> Business intelligence is more relevant today than at any > >> other > >>>>> point in > >>>>>>>>>> history. 
Organizations are currently very limited in > >> options > >>>>> for open > >>>>>>>>>> source data visualization solutions, especially solutions > >> that > >>>>> are both > >>>>>>>>>> self-service and enterprise-ready. Every company informing > >>>>> their > >>>>>>>>>> decisions > >>>>>>>>>> with data needs a BI tool. > >>>>>>>>>> > >>>>>>>>>> We believe that Superset will be a strong compliment to > >>>>> existing Apache > >>>>>>>>>> Software Foundation technologies by offering scalable user > >>>>> interactions > >>>>>>>> to > >>>>>>>>>> distributed storage and computation solutions. Users will > >>>>> often find > >>>>>>>> that > >>>>>>>>>> Superset can act as a catalyst for tooling that can > >> visualize > >>>>> the > >>>>>>>>>> byproduct > >>>>>>>>>> of data and computation infrastructure. > >>>>>>>>>> > >>>>>>>>>> Superset has many key design elements that help fill a gap > >> in > >>>>> current > >>>>>>>>>> solutions for organizations: > >>>>>>>>>> * Easy, low friction access to data through a simple, > >>>> web-based > >>>>> data > >>>>>>>>>> exploration interface. Composing charts and dashboards are > >>>>> intuitive. > >>>>>>>>>> Eliminating the need to write code or SQL empowers anyone > >> to > >>>>> use it. > >>>>>>>>>> * Access to a wide array of rich, interactive data > >>>>> visualization types. > >>>>>>>>>> * Enterprise-ready: Integration with different > >> authentication > >>>>>>>> mechanisms > >>>>>>>>>> and granular permissions centered around actions and data > >>>>> access. > >>>>>>>>>> * Realtime & fast: Superset provides realtime analytics at > >> the > >>>>> speed of > >>>>>>>>>> thought on very large datasets when integrated with > >> Druid.io. > >>>>>>>>>> * Broad data access: Consume data out of any SQL-speaking > >>>>> relational > >>>>>>>>>> database. > >>>>>>>>>> * Extensible: Can be extended to talk to many noSQL > >> databases > >>>>> like > >>>>>>>> Apache > >>>>>>>>>> Drill, Elastic Search, and other popular database engines. 
> >>>>>>>>>> * Fast loading dashboards with configurable web-scale > >> caching. > >>>>>>>>>> * Plug-in framework that enables organizations to build > >> custom > >>>>>>>> analytical > >>>>>>>>>> applications with new UI/UX interfaces. > >>>>>>>>>> * SQL Lab, a state-of-the-art SQL IDE that empowers > >>>>> SQL-speaking users > >>>>>>>>>> with more flexibility. SQL Lab integrates with the > >>>>> visualization engine > >>>>>>>>>> seamlessly. > >>>>>>>>>> > >>>>>>>>>> == Initial Goals == > >>>>>>>>>> The initial goals of the Superset project are several-fold: > >>>>>>>>>> * Move the existing codebase to Apache and integrate with > >> the > >>>>> Apache > >>>>>>>>>> development process. > >>>>>>>>>> * Redesign the user interface and interaction model for > >>>> creating > >>>>>>>>>> visualizations/dashboards and connecting to data sources > >>>>>>>>>> * Build robust support for security and governance of the > >> tool > >>>>>>>> including > >>>>>>>>>> popular authorization modules (including Apache Ranger and > >>>>> Apache > >>>>>>>> Sentry) > >>>>>>>>>> and a more sophisticated permissions system > >>>>>>>>>> * Grow the extensibility of the project both in terms of > >>>>> enhanced > >>>>>>>>>> connectivity to NoSQL-based data sources and creating a > >>>> plug-in > >>>>>>>> framework > >>>>>>>>>> that enables organizations to build custom analytical > >>>>> applications which > >>>>>>>>>> require a new UI/UX > >>>>>>>>>> > >>>>>>>>>> == Current Status == > >>>>>>>>>> By many standards, Superset is already a successful open > >>>> source > >>>>> project. > >>>>>>>>>> As > >>>>>>>>>> of March 2017, Superset is officially used in production at > >>>>> about a > >>>>>>>> dozen > >>>>>>>>>> companies, has received contributions from over one hundred > >>>>> contributors > >>>>>>>>>> on > >>>>>>>>>> Github, 1500+ forks, and 12k+ stars. > >>>>>>>>>> > >>>>>>>>>> Sizeable companies like Airbnb, Yahoo! 
and Hortonworks have > >>>> made > >>>>>>>>>> significant contributions, and expressed their commitment > >> to > >>>> the > >>>>>>>> project. > >>>>>>>>>> The product is feature complete and has been viable for > >>>> months. > >>>>> It > >>>>>>>> already > >>>>>>>>>> serves as the main interface for consuming data at many > >>>>> companies of > >>>>>>>>>> different sizes. > >>>>>>>>>> > >>>>>>>>>> While the product is usable, there’s room for improvement > >>>>> across the > >>>>>>>>>> board, > >>>>>>>>>> starting with providing a smoother user experience around > >>>>> content > >>>>>>>>>> creation, > >>>>>>>>>> making sure all features work out-of-the-box on more > >> platforms > >>>>> and > >>>>>>>>>> databases, providing better user training guides and > >> videos, > >>>>> having a > >>>>>>>>>> predictable release process, and increasing the overall > >>>> quality > >>>>> of the > >>>>>>>>>> Superset releases. > >>>>>>>>>> > >>>>>>>>>> === Meritocracy === > >>>>>>>>>> We plan to invest in supporting a meritocracy. We will > >> discuss > >>>>> the > >>>>>>>>>> requirements in an open forum. Several companies have > >>>> expressed > >>>>> interest > >>>>>>>>>> in > >>>>>>>>>> this project, and we intend to invite additional > >> developers to > >>>>>>>>>> participate. > >>>>>>>>>> We will encourage and monitor community participation so > >> that > >>>>> privileges > >>>>>>>>>> can be extended to those that contribute. > >>>>>>>>>> > >>>>>>>>>> === Community === > >>>>>>>>>> The need for an enterprise-ready data visualization and > >>>>> exploration > >>>>>>>>>> platform in the open source community is tremendous. While > >>>>> Superset is > >>>>>>>>>> fairly well known, recognized and used within the Druid.io > >>>>> community, > >>>>>>>>>> adoption is currently limited outside of that niche. 
There > >> is > >>>> a > >>>>> huge > >>>>>>>>>> opportunity to grow the community to hundreds if not > >> thousands > >>>>> of > >>>>>>>>>> organizations, and we are hoping that embracing “the Apache > >>>>> way” will > >>>>>>>>>> accelerate the growth of our community. > >>>>>>>>>> > >>>>>>>>>> We have already been active at seeking and inviting > >>>>> contributions, and > >>>>>>>> are > >>>>>>>>>> planning to scale the project by investing time and growing > >>>> the > >>>>> support > >>>>>>>>>> structure to grow the community. > >>>>>>>>>> > >>>>>>>>>> === Core Developers === > >>>>>>>>>> The initial committers for Superset include experienced > >> full > >>>>> stack, > >>>>>>>>>> front-end and data engineers: > >>>>>>>>>> * Maxime Beauchemin (Airbnb) > >>>>>>>>>> * Alanna Scott (Airbnb) > >>>>>>>>>> * Bogdan Kyryliuk (Airbnb) > >>>>>>>>>> * Vera Liu (Airbnb) > >>>>>>>>>> * Jeff Feng (Airbnb) > >>>>>>>>>> * Ashutosh Chauhan (Hortonworks) > >>>>>>>>>> * Nishant Bangarwa (Hortonworks) > >>>>>>>>>> * Slim Bouguerra (Hortonworks) > >>>>>>>>>> * Priyank Shah (Hortonworks) > >>>>>>>>>> * Sriharsha Chintalapani (Hortonworks) > >>>>>>>>>> * Daniel Dai (Hortonworks) > >>>>>>>>>> > >>>>>>>>>> We realize that additional employer diversity is needed, > >> and > >>>> we > >>>>> will > >>>>>>>> work > >>>>>>>>>> aggressively to recruit developers from additional > >> companies. > >>>>>>>>>> > >>>>>>>>>> === Alignment === > >>>>>>>>>> The initial committers strongly believe that a system for > >>>>> interactive > >>>>>>>>>> visualization of data will gain broader adoption as an open > >>>>> source, > >>>>>>>>>> community driven project, where the community can > >> contribute > >>>>> not only to > >>>>>>>>>> the core components, but also to a growing collection of > >>>>> connectors, > >>>>>>>>>> visualizations and improving integration a all potential > >> data > >>>>> sources. 
> >>>>>>>>>> Superset already integrates closely with Apache Hive, the > >> Hive > >>>>>>>> metastore, > >>>>>>>>>> as well as most SQL-speaking databases found in modern data > >>>>> ecosystems. > >>>>>>>>>> > >>>>>>>>>> == Known Risks == > >>>>>>>>>> > >>>>>>>>>> === Orphaned Products === > >>>>>>>>>> Superset is a vital component for both visualizing, > >> accessing > >>>>> and > >>>>>>>>>> democratizing data at Airbnb. Also at Hortonworks, > >> Superset > >>>> is > >>>>> a core > >>>>>>>>>> component of the DataFlow product offering. Thus, the > >> risk of > >>>>> the > >>>>>>>> project > >>>>>>>>>> being orphaned is relatively low. The project could be at > >>>> risk > >>>>> if > >>>>>>>> Airbnb > >>>>>>>>>> changes their approach for democratizing data or if > >>>> Hortonworks > >>>>> changes > >>>>>>>>>> their strategy in the market. In such an event, the > >>>> committers > >>>>> plan to > >>>>>>>>>> continue working on the project on their own time, thought > >> the > >>>>> progress > >>>>>>>>>> will likely be slower. We plan to mitigate this risk by > >>>>> recruiting > >>>>>>>>>> additional committers. > >>>>>>>>>> > >>>>>>>>>> === Inexperience with Open Source === > >>>>>>>>>> The initial committers include veteran Apache members > >>>>> (committers and > >>>>>>>> PPMC > >>>>>>>>>> members) and other developers who have varying degrees of > >>>>> experience > >>>>>>>> with > >>>>>>>>>> open source projects. All have been involved with source > >> code > >>>>> that has > >>>>>>>>>> been > >>>>>>>>>> released under an open source license, and several also > >> have > >>>>> experience > >>>>>>>>>> developing code with an open source development process. > >>>>>>>>>> > >>>>>>>>>> === Homogenous Developers === > >>>>>>>>>> The initial committers are employed by Airbnb Inc. and > >>>>> Hortonworks. We > >>>>>>>> are > >>>>>>>>>> committed to recruiting additional committers from other > >>>>> companies. 
> >>>>>>>>>>
> >>>>>>>>>> === Reliance on Salaried Developers ===
> >>>>>>>>>> It is expected that Superset development will occur on both salaried time and on volunteer time, after hours. The majority of initial committers are paid by their employer to contribute to this project. However, they are all passionate about the project, and we are confident that the project will continue even if no salaried developers contribute to the project. We are committed to recruiting additional committers including non-salaried developers.
> >>>>>>>>>>
> >>>>>>>>>> === Relationships with Other Apache Products ===
> >>>>>>>>>> To the knowledge of the Initial Committers, there are no direct competitors to Superset within the Apache Software Foundation. That said, Apache Zeppelin is an indirect competitor, but it solves a different use case.
> >>>>>>>>>>
> >>>>>>>>>> Apache Zeppelin is a web-based notebook that enables interactive data analytics. It enables the creation of beautiful data-driven, interactive and collaborative documents with SQL, Scala and more. Although a user can create data visualizations using this project, it uses a notebook-style user interface and is geared towards the Spark community, where Scala and SQL co-exist.
> >>>>>>>>>>
> >>>>>>>>>> We look forward to collaborating with those communities, as well as other Apache communities.
> >>>>>>>>>>
> >>>>>>>>>> === An Excessive Fascination with the Apache Brand ===
> >>>>>>>>>> Superset is solving two huge challenges:
> >>>>>>>>>> * The challenge of enabling every knowledge worker to make data-informed decisions, particularly those who are not deeply skilled at writing SQL.
> >>>>>>>>>> * The challenge of visualizing huge amounts of data interactively and in real-time.
> >>>>>>>>>>
> >>>>>>>>>> Superset was first developed as a data visualization solution for Druid.io as a way to visualize billions of rows of data. Since then, usage of Superset has expanded to address data visualization use cases across SQL-speaking data sources as well.
> >>>>>>>>>>
> >>>>>>>>>> Our rationale for developing Superset as an Apache project is detailed in the Rationale section. We believe that the Apache brand and community process will help us attract more contributors to this project, and help grow the footprint of the project through usage at other organizations and within other applications. Establishing consensus among users and developers will result in a more valuable tool for everyone.
> >>>>>>>>>>
> >>>>>>>>>> == Documentation ==
> >>>>>>>>>> References to further reading material:
> >>>>>>>>>> * [[http://airbnb.io/superset/|Superset Documentation]]
> >>>>>>>>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
> >>>>>>>>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
> >>>>>>>>>>
> >>>>>>>>>> == Initial Source ==
> >>>>>>>>>> The origin of the proposed code base can be found at https://github.com/airbnb/superset. The code base is primarily in Python.
> >>>>>>>>>>
> >>>>>>>>>> == Source and Intellectual Property Submission Plan ==
> >>>>>>>>>> We do not expect any complications for the submission of the Superset code base. Our code is already in GitHub and there is only a single code base.
> >>>>>>>>>>
> >>>>>>>>>> == External Dependencies ==
> >>>>>>>>>> List of Python packages, from the Python Package Index (PyPI):
> >>>>>>>>>>
> >>>>>>>>>> * boto3
> >>>>>>>>>> * celery
> >>>>>>>>>> * cryptography
> >>>>>>>>>> * flask-appbuilder
> >>>>>>>>>> * flask-cache
> >>>>>>>>>> * flask-migrate
> >>>>>>>>>> * flask-script
> >>>>>>>>>> * flask-sqlalchemy
> >>>>>>>>>> * flask-testing
> >>>>>>>>>> * humanize
> >>>>>>>>>> * gunicorn
> >>>>>>>>>> * markdown
> >>>>>>>>>> * pandas
> >>>>>>>>>> * parsedatetime
> >>>>>>>>>> * pydruid
> >>>>>>>>>> * PyHive
> >>>>>>>>>> * python-dateutil
> >>>>>>>>>> * requests
> >>>>>>>>>> * simplejson
> >>>>>>>>>> * six
> >>>>>>>>>> * sqlalchemy
> >>>>>>>>>> * sqlalchemy-utils
> >>>>>>>>>> * sqlparse
> >>>>>>>>>> * thrift
> >>>>>>>>>> * thrift-sasl
> >>>>>>>>>> * werkzeug
> >>>>>>>>>>
> >>>>>>>>>> List of JavaScript packages, from npm:
> >>>>>>>>>> * autobind-decorator
> >>>>>>>>>> * bootstrap
> >>>>>>>>>> * bootstrap-datepicker
> >>>>>>>>>> * brace
> >>>>>>>>>> * brfs
> >>>>>>>>>> * cal-heatmap
> >>>>>>>>>> * classnames
> >>>>>>>>>> * d3
> >>>>>>>>>> * d3-cloud
> >>>>>>>>>> * d3-sankey
> >>>>>>>>>> * d3-scale
> >>>>>>>>>> * d3-tip
> >>>>>>>>>> * datamaps
> >>>>>>>>>> * datatables-bootstrap3-plugin
> >>>>>>>>>> * datatables.net-bs
> >>>>>>>>>> * font-awesome
> >>>>>>>>>> * gridster
> >>>>>>>>>> * immutability-helper
> >>>>>>>>>> * immutable
> >>>>>>>>>> * jquery
> >>>>>>>>>> * lodash.throttle
> >>>>>>>>>> * mapbox-gl
> >>>>>>>>>> * moment
> >>>>>>>>>> * moments
> >>>>>>>>>> * mustache
> >>>>>>>>>> * nvd3
> >>>>>>>>>> * react
> >>>>>>>>>> * react-ace
> >>>>>>>>>> * react-bootstrap
> >>>>>>>>>> * react-bootstrap-table
> >>>>>>>>>> * react-dom
> >>>>>>>>>> * react-draggable
> >>>>>>>>>> * react-gravatar
> >>>>>>>>>> * react-grid-layout
> >>>>>>>>>> * react-map-gl
> >>>>>>>>>> * react-redux
> >>>>>>>>>> * react-resizable
> >>>>>>>>>> * react-select
> >>>>>>>>>> * react-syntax-highlighter
> >>>>>>>>>> * reactable
> >>>>>>>>>> * redux
> >>>>>>>>>> * redux-localstorage
> >>>>>>>>>> * redux-thunk
> >>>>>>>>>> * shortid
> >>>>>>>>>> * style-loader
> >>>>>>>>>> * supercluster
> >>>>>>>>>> * topojson
> >>>>>>>>>> * victory
> >>>>>>>>>> * viewport-mercator-project
> >>>>>>>>>>
> >>>>>>>>>> == Cryptography ==
> >>>>>>>>>> The proposal does not include cryptographic code.
> >>>>>>>>>>
> >>>>>>>>>> == Required Resources ==
> >>>>>>>>>>
> >>>>>>>>>> === Mailing List ===
> >>>>>>>>>> There is a current mailing list as a Google Group “airbnb_superset” that we are planning on deprecating once the Apache.org lists are ready to serve our community.
> >>>>>>>>>>
> >>>>>>>>>> * superset-private
> >>>>>>>>>> * superset-dev
> >>>>>>>>>> * superset-user
> >>>>>>>>>>
> >>>>>>>>>> === Subversion Directory ===
> >>>>>>>>>> Git is the preferred source control system.
> >>>>>>>>>> http://svn.apache.org/repos/asf/incubator/superset
> >>>>>>>>>>
> >>>>>>>>>> == Git Repository ==
> >>>>>>>>>> Git is the preferred source control system; we’re assuming https://github.com/apache/incubator-superset based on the naming scheme.
> >>>>>>>>>>
> >>>>>>>>>> == Issue Tracking ==
> >>>>>>>>>> JIRA Superset (SUPERSET).
> >>>>>>>>>> If possible, we’d like to use GitHub issues & PRs to manage our project. It’s been said that there are ways to keep GitHub’s issues in sync with Jira, allowing us to get the best of both worlds. If that is not possible, we will comply by using Jira.
> >>>>>>>>>>
> >>>>>>>>>> == Other Resources ==
> >>>>>>>>>> We currently use a set of GitHub integrated services that are free to the open source community, like Travis-ci, Code Climate, Coveralls, Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using these services as they allow us to scale contributions and optimize our development flows. These services require some elevated rights on the GitHub repository in order to set up or tune and we would like for the committers to have the required rights.
> >>>>>>>>>>
> >>>>>>>>>> == Initial Committers ==
> >>>>>>>>>>
> >>>>>>>>>> * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
> >>>>>>>>>> * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
> >>>>>>>>>> * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
> >>>>>>>>>> * Vera Liu <vera....@airbnb.com> - Committer
> >>>>>>>>>> * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
> >>>>>>>>>> * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
> >>>>>>>>>> * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
> >>>>>>>>>> * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
> >>>>>>>>>> * Priyank Shah <ps...@hortonworks.com> - Committer
> >>>>>>>>>> * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
> >>>>>>>>>> * Daniel Dai <da...@apache.org> - Champion & Committer
> >>>>>>>>>> * Luke Han <luke....@apache.org> - Mentor
> >>>>>>>>>>
> >>>>>>>>>> == Affiliations ==
> >>>>>>>>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
> >>>>>>>>>>
> >>>>>>>>>> == Sponsors ==
> >>>>>>>>>>
> >>>>>>>>>> === Champion ===
> >>>>>>>>>> Daniel Dai <da...@apache.org>
> >>>>>>>>>>
> >>>>>>>>>> === Nominated Mentors ===
> >>>>>>>>>> * Ashutosh Chauhan <hashut...@apache.org>
> >>>>>>>>>> * Luke Han <luke....@apache.org>
> >>>>>>>>>>
> >>>>>>>>>> === Sponsoring Entity ===
> >>>>>>>>>> Incubator PMC
> >>>>>>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
> >>>>>> For additional commands, e-mail: general-help@incubator.apache.org
> >>>>>>
> >> --
> >> Best Regards, Edward J. Yoon