Reaffirming my vote: +1 (binding)
> On Apr 26, 2017, at 11:12 PM, Jeff Feng <jeff.f...@gmail.com> wrote:
>
> Hello everyone,
>
> Thank you for checking out our proposal on Superset and for your
> consideration for the Apache Incubator. So far, I believe we have 8
> binding votes and 2 non-binding votes.
>
> As Taylor mentioned earlier, we made a minor update to the wording in the
> "Source and Intellectual Property Submission Plan" section based on a
> suggestion by John Ament. The update was to help confirm the previously
> unstated assumption that we will submit an SGA. I have copied the updated
> proposal from the wiki to the email below and highlighted (in yellow) the
> new sentence in the document.
>
> Folks on the cc line who have already voted, please let us know if the
> change impacts your vote.
>
> Thank you all,
> Jeff
>
>
>
> = Superset =
>
> == Abstract ==
> Superset is an enterprise-ready web application for data exploration, data
> visualization and dashboarding.
>
> == Proposal ==
> Superset is business intelligence (BI) software that helps modern
> organizations visualize and interact with their data. Superset enables
> users to explore data from a variety of databases, assemble beautiful
> dashboards and share their findings. Superset works neatly with all modern
> SQL-speaking databases, and integrates with Druid.io to provide real-time,
> interactive, blazing-fast data access to large datasets.
>
> == Background ==
> Data is mission critical. To succeed in this era, organizations need to
> provide low-friction, intuitive and interactive access to data. It is
> paramount for knowledge workers to be capable of answering their own
> questions by querying, exploring and visualizing data.
>
> The entire business intelligence industry has pivoted from a model of
> centralized top-down platforms driven by IT organizations to self-service
> analytics and agile workflows by any user.
> This shift unblocks centralized
> service bottlenecks for creating data visualizations while also creating an
> environment that is iterative and fast-moving. This means that business
> intelligence software must also be easy and delightful to use.
> Self-service analytics doesn’t mean that admin and governance features are
> not needed.
> Modern BI tools provide fine-grained access controls and auditing
> capabilities to understand how data is being used. Superset is a solution
> that delivers on all of these vectors.
>
> The technology stack is also constantly morphing - vendors are struggling
> to provide cheap, quick and easy solutions to access data. Business
> intelligence users are finding existing solutions lacking as these software
> products either disregard or react slowly to recent game-changing
> technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js,
> React.js and iPython’s Jupyter.
>
> == Rationale ==
> Business intelligence is more relevant today than at any other point in
> history. Organizations are currently very limited in options for open
> source data visualization solutions, especially solutions that are both
> self-service and enterprise-ready. Every company informing their decisions
> with data needs a BI tool.
>
> We believe that Superset will be a strong complement to existing Apache
> Software Foundation technologies by offering scalable user interactions to
> distributed storage and computation solutions. Users will often find that
> Superset can act as a catalyst for tooling that can visualize the byproduct
> of data and computation infrastructure.
>
> Superset has many key design elements that help fill a gap in current
> solutions for organizations:
> * Easy, low-friction access to data through a simple, web-based data
> exploration interface. Composing charts and dashboards is intuitive.
> Eliminating the need to write code or SQL empowers anyone to use it.
> * Access to a wide array of rich, interactive data visualization types.
> * Enterprise-ready: Integration with different authentication mechanisms
> and granular permissions centered around actions and data access.
> * Realtime & fast: Superset provides realtime analytics at the speed of
> thought on very large datasets when integrated with Druid.io.
> * Broad data access: Consume data out of any SQL-speaking relational
> database.
> * Extensible: Can be extended to talk to many NoSQL databases like Apache
> Drill, Elasticsearch, and other popular database engines.
> * Fast loading dashboards with configurable web-scale caching.
> * Plug-in framework that enables organizations to build custom analytical
> applications with new UI/UX interfaces.
> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> with more flexibility. SQL Lab integrates with the visualization engine
> seamlessly.
>
> == Initial Goals ==
> The initial goals of the Superset project are several-fold:
> * Move the existing codebase to Apache and integrate with the Apache
> development process.
> * Redesign the user interface and interaction model for creating
> visualizations/dashboards and connecting to data sources
> * Build robust support for security and governance of the tool including
> popular authorization modules (including Apache Ranger and Apache Sentry)
> and a more sophisticated permissions system
> * Grow the extensibility of the project both in terms of enhanced
> connectivity to NoSQL-based data sources and creating a plug-in framework
> that enables organizations to build custom analytical applications which
> require a new UI/UX
>
> == Current Status ==
> By many standards, Superset is already a successful open source project. As
> of March 2017, Superset is officially used in production at about a dozen
> companies, has received contributions from over one hundred contributors on
> GitHub, and has 1500+ forks and 12k+ stars.
>
> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made
> significant contributions, and expressed their commitment to the project.
> The product is feature complete and has been viable for months. It already
> serves as the main interface for consuming data at many companies of
> different sizes.
>
> While the product is usable, there’s room for improvement across the board,
> starting with providing a smoother user experience around content creation,
> making sure all features work out-of-the-box on more platforms and
> databases, providing better user training guides and videos, having a
> predictable release process, and increasing the overall quality of the
> Superset releases.
>
> === Meritocracy ===
> We plan to invest in supporting a meritocracy. We will discuss the
> requirements in an open forum. Several companies have expressed interest in
> this project, and we intend to invite additional developers to participate.
> We will encourage and monitor community participation so that privileges
> can be extended to those who contribute.
>
> === Community ===
> The need for an enterprise-ready data visualization and exploration
> platform in the open source community is tremendous. While Superset is
> fairly well known, recognized and used within the Druid.io community,
> adoption is currently limited outside of that niche. There is a huge
> opportunity to grow the community to hundreds if not thousands of
> organizations, and we are hoping that embracing “the Apache way” will
> accelerate the growth of our community.
>
> We have already been active in seeking and inviting contributions, and are
> planning to scale the project by investing time and growing the support
> structure to grow the community.
>
> === Core Developers ===
> The initial committers for Superset include experienced full-stack,
> front-end and data engineers:
> * Maxime Beauchemin (Airbnb)
> * Alanna Scott (Airbnb)
> * Bogdan Kyryliuk (Airbnb)
> * Vera Liu (Airbnb)
> * Jeff Feng (Airbnb)
> * Ashutosh Chauhan (Hortonworks)
> * Nishant Bangarwa (Hortonworks)
> * Slim Bouguerra (Hortonworks)
> * Priyank Shah (Hortonworks)
> * Sriharsha Chintalapani (Hortonworks)
> * Daniel Dai (Hortonworks)
>
> We realize that additional employer diversity is needed, and we will work
> aggressively to recruit developers from additional companies.
>
> === Alignment ===
> The initial committers strongly believe that a system for interactive
> visualization of data will gain broader adoption as an open source,
> community-driven project, where the community can contribute not only to
> the core components, but also to a growing collection of connectors and
> visualizations and to improving integration with all potential data sources.
> Superset already integrates closely with Apache Hive, the Hive metastore,
> as well as most SQL-speaking databases found in modern data ecosystems.
>
> == Known Risks ==
>
> === Orphaned Products ===
> Superset is a vital component for visualizing, accessing and
> democratizing data at Airbnb. At Hortonworks, Superset is also a core
> component of the DataFlow product offering. Thus, the risk of the project
> being orphaned is relatively low. The project could be at risk if Airbnb
> changes their approach for democratizing data or if Hortonworks changes
> their strategy in the market. In such an event, the committers plan to
> continue working on the project on their own time, though progress
> will likely be slower. We plan to mitigate this risk by recruiting
> additional committers.
>
> === Inexperience with Open Source ===
> The initial committers include veteran Apache members (committers and PPMC
> members) and other developers who have varying degrees of experience with
> open source projects. All have been involved with source code that has been
> released under an open source license, and several also have experience
> developing code with an open source development process.
>
> === Homogenous Developers ===
> The initial committers are employed by Airbnb Inc. and Hortonworks. We are
> committed to recruiting additional committers from other companies.
>
> === Reliance on Salaried Developers ===
> It is expected that Superset development will occur on both salaried time
> and on volunteer time, after hours. The majority of initial committers are
> paid by their employer to contribute to this project. However, they are all
> passionate about the project, and we are confident that the project will
> continue even if no salaried developers contribute to the project. We are
> committed to recruiting additional committers including non-salaried
> developers.
>
> === Relationships with Other Apache Products ===
> To the knowledge of the Initial Committers, there are no direct competitors
> to Superset within the Apache Software Foundation. That said, Apache
> Zeppelin is an indirect competitor, but it solves a different use case.
>
> Apache Zeppelin is a web-based notebook that enables interactive data
> analytics. It enables the creation of beautiful data-driven, interactive
> and collaborative documents with SQL, Scala and more. Although a user can
> create data visualizations using this project, it leverages a
> notebook-style user interface and it is geared towards the Spark community
> where Scala and SQL co-exist.
>
> We look forward to collaborating with those communities, as well as other
> Apache communities.
>
> === An Excessive Fascination with the Apache Brand ===
> Superset is solving two huge challenges:
> * The challenge of enabling every knowledge worker to make data-informed
> decisions, particularly those who are not deeply skilled at writing SQL.
> * The challenge of visualizing huge amounts of data interactively and in
> real time.
>
> Superset was first developed as a data visualization solution for Druid.io
> as a way to visualize billions of rows of data. Since then, usage of
> Superset has expanded to address data visualization use cases across
> SQL-speaking data sources as well.
>
> Our rationale for developing Superset as an Apache project is detailed in
> the Rationale Section. We believe that the Apache brand and community
> process will help us attract more contributors to this project, and help
> grow the footprint of the project through usage at other organizations and
> within other applications. Establishing consensus among users and
> developers will result in a more valuable tool for everyone.
>
> == Documentation ==
> References to further reading material:
> * [[http://airbnb.io/superset/|Superset Documentation]]
> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>
> == Initial Source ==
> The origin of the proposed code base can be found at
> https://github.com/airbnb/superset. The code base is primarily in Python.
>
> == Source and Intellectual Property Submission Plan ==
> Airbnb will submit a Software Grant Agreement (SGA) as Superset joins the
> incubator. We do not expect any complications for the submission of the
> Superset code base. Our code is already on GitHub and there is only a
> single code base.
>
> == External Dependencies ==
> List of Python packages, from the Python Package Index (PyPI):
>
> * boto3
> * celery
> * cryptography
> * flask-appbuilder
> * flask-cache
> * flask-migrate
> * flask-script
> * flask-sqlalchemy
> * flask-testing
> * humanize
> * gunicorn
> * markdown
> * pandas
> * parsedatetime
> * pydruid
> * PyHive
> * python-dateutil
> * requests
> * simplejson
> * six
> * sqlalchemy
> * sqlalchemy-utils
> * sqlparse
> * thrift
> * thrift-sasl
> * werkzeug
>
> List of JavaScript packages, from NPM:
> * autobind-decorator
> * bootstrap
> * bootstrap-datepicker
> * brace
> * brfs
> * cal-heatmap
> * classnames
> * d3
> * d3-cloud
> * d3-sankey
> * d3-scale
> * d3-tip
> * datamaps
> * datatables-bootstrap3-plugin
> * datatables.net-bs
> * font-awesome
> * gridster
> * immutability-helper
> * immutable
> * jquery
> * lodash.throttle
> * mapbox-gl
> * moment
> * moments
> * mustache
> * nvd3
> * react
> * react-ace
> * react-bootstrap
> * react-bootstrap-table
> * react-dom
> * react-draggable
> * react-gravatar
> * react-grid-layout
> * react-map-gl
> * react-redux
> * react-resizable
> * react-select
> * react-syntax-highlighter
> * reactable
> * redux
> * redux-localstorage
> * redux-thunk
> * shortid
> * style-loader
> * supercluster
> * topojson
> * victory
> * viewport-mercator-project
>
> == Cryptography ==
> The proposal does not include cryptographic code.
>
> == Required Resources ==
>
> === Mailing List ===
> There is a current mailing list as a Google Group “airbnb_superset” that we
> are planning on deprecating as the Apache.org lists become ready to serve
> our community.
>
> * superset-private
> * superset-dev
> * superset-user
>
> === Subversion Directory ===
> Git is the preferred source control system.
> http://svn.apache.org/repos/asf/incubator/superset
>
> == Git Repository ==
> Git is the preferred source control system; we’re assuming
> https://github.com/apache/incubator-superset based on the naming scheme.
>
> == Issue Tracking ==
> JIRA Superset (SUPERSET). We’d like to use GitHub issues & PRs
> to manage our project as much as possible. It’s been said that there are
> ways to keep GitHub’s issues in sync with Jira, allowing us to get the best
> of both worlds. If that is not possible, we will comply by using Jira.
>
> == Other Resources ==
> We currently use a set of GitHub-integrated services that are free to the
> open source community, like Travis-ci, Code Climate, Coveralls,
> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using
> these services as they allow us to scale contributions and optimize our
> development flows. These services require some elevated rights on the
> GitHub repository in order to set up or tune and we would like for the
> committers to have the required rights.
>
>
> == Initial Committers ==
>
> * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
> * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
> * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
> * Vera Liu <vera....@airbnb.com> - Committer
> * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
> * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
> * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
> * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
> * Priyank Shah <ps...@hortonworks.com> - Committer
> * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
> * Daniel Dai <da...@apache.org> - Champion & Committer
> * Luke Han <luke....@apache.org> - Mentor
>
> == Affiliations ==
> The initial committers are employees of Airbnb Inc. and Hortonworks.
>
> == Sponsors ==
>
> === Champion ===
> Daniel Dai <da...@apache.org>
>
> === Nominated Mentors ===
> * Ashutosh Chauhan <hashut...@apache.org>
> * Luke Han <luke....@apache.org>
>
> === Sponsoring Entity ===
> Incubator PMC
>
>
> On Wed, Apr 26, 2017 at 6:31 PM, Edward J. Yoon <edwardy...@apache.org> wrote:
>
>> +1 binding
>>
>> On Thu, Apr 27, 2017 at 10:29 AM, Naresh Agarwal
>> <naresh.agar...@gmail.com> wrote:
>>> +1 (non-binding).
>>>
>>> Thanks
>>> Naresh Agarwal
>>>
>>> On Thu, Apr 27, 2017 at 5:06 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>
>>>> +1 (binding)
>>>>
>>>> On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <joe.w...@gmail.com> wrote:
>>>>
>>>>> +1 (binding)
>>>>>
>>>>> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
>>>>> <jiten...@hortonworks.com> wrote:
>>>>>> +1 (binding)
>>>>>>
>>>>>> On 4/25/17, 1:27 PM, "Julian Hyde" <jh...@apache.org> wrote:
>>>>>>
>>>>>> +1 binding
>>>>>>
>>>>>>> On Apr 25, 2017, at 12:48 PM, moon soo Lee <m...@apache.org> wrote:
>>>>>>>
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <hashut...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> +1 (binding)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Ashutosh
>>>>>>>>
>>>>>>>> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <luke...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> +1 binding
>>>>>>>>>
>>>>>>>>> Love to see Superset to be new incubator project.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best Regards!
>>>>>>>>> --------------------- >>>>>>>>> >>>>>>>>> Luke Han >>>>>>>>> >>>>>>>>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng < >>>> jeff.f...@gmail.com> >>>>> wrote: >>>>>>>>> >>>>>>>>>> Dear Apache Incubator Community, >>>>>>>>>> >>>>>>>>>> We have updated the Superset proposal >>>>>>>>>> <https://wiki.apache.org/incubator/SupersetProposal> >> (copied >>>>> below) for >>>>>>>>>> >>>>>>>>>> Apache Incubation with an additional mentor (Luke Han - >>>>>>>>>> luke....@apache.org), >>>>>>>>>> and would like to start a vote thread for acceptance into >> the >>>>> incubator. >>>>>>>>>> >>>>>>>>>> Our team is excited to share Superset with the Apache >>>> community >>>>> and we >>>>>>>>>> hope >>>>>>>>>> for the your continued support! >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Jeff & the Superset Team >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> = Superset = >>>>>>>>>> >>>>>>>>>> == Abstract == >>>>>>>>>> Superset is an enterprise-ready web application for data >>>>> exploration, >>>>>>>> data >>>>>>>>>> visualization and dashboarding. >>>>>>>>>> >>>>>>>>>> == Proposal == >>>>>>>>>> Superset is business intelligence (BI) software that helps >>>>> modern >>>>>>>>>> organizations visualize and interact with their data. >> Superset >>>>> enables >>>>>>>>>> users explore data from a variety of databases, assemble >>>>> beautiful >>>>>>>>>> dashboards and share their findings. Superset works neatly >>>>> with all >>>>>>>>>> modern >>>>>>>>>> SQL-speaking databases, and integrates with Druid.io to >>>> provide >>>>>>>> real-time, >>>>>>>>>> interactive, blazing fast data access to large datasets. >>>>>>>>>> >>>>>>>>>> == Background == >>>>>>>>>> Data is mission critical. To succeed in this era, >>>> organizations >>>>> need to >>>>>>>>>> provide low-friction, intuitive and interactive access to >>>> data. >>>>> It is >>>>>>>>>> paramount for knowledge workers to be capable of answering >>>>> their own >>>>>>>>>> questions by querying, exploring and visualizing data. 
>>>>>>>>>> >>>>>>>>>> The entire business intelligence industry has pivoted from >> a >>>>> model of >>>>>>>>>> centralized top-down platforms driven by IT organizations >> to >>>>>>>> self-service >>>>>>>>>> analytics and agile workflows by any user. This shift >>>> unblocks >>>>>>>>>> centralized >>>>>>>>>> service bottlenecks for creating data visualizations while >>>> also >>>>> creating >>>>>>>>>> an >>>>>>>>>> environment that is iterative and fast-moving. This means >>>> that >>>>> business >>>>>>>>>> intelligence software must also be easy and delightful to >> use. >>>>>>>>>> Self-service analytics doesn’t mean that admin and >> governance >>>>> features >>>>>>>> are >>>>>>>>>> not needed. >>>>>>>>>> Modern BI tools provide fine-grain access controls and >>>> auditing >>>>>>>>>> capabilities to understand how data is being used. >> Superset >>>> is >>>>> a >>>>>>>> solution >>>>>>>>>> that delivers on all of these vectors. >>>>>>>>>> >>>>>>>>>> The technology stack is also constantly morphing - vendors >> are >>>>>>>> struggling >>>>>>>>>> to provide cheap, quick and easy solutions to access data. >>>>> Business >>>>>>>>>> intelligence users are finding existing solutions lacking >> as >>>>> these >>>>>>>>>> software >>>>>>>>>> products either disregard or react slowly to recent >>>>> game-changing >>>>>>>>>> technologies like Druid.io, PrestoDB, Apache Drill, Apache >>>>> Kylin, d3.js, >>>>>>>>>> React.js and iPython’s Jupyter for instance. >>>>>>>>>> >>>>>>>>>> == Rationale == >>>>>>>>>> Business intelligence is more relevant today than at any >> other >>>>> point in >>>>>>>>>> history. Organizations are currently very limited in >> options >>>>> for open >>>>>>>>>> source data visualization solutions, especially solutions >> that >>>>> are both >>>>>>>>>> self-service and enterprise-ready. Every company informing >>>>> their >>>>>>>>>> decisions >>>>>>>>>> with data needs a BI tool. 
>>>>>>>>>> >>>>>>>>>> We believe that Superset will be a strong compliment to >>>>> existing Apache >>>>>>>>>> Software Foundation technologies by offering scalable user >>>>> interactions >>>>>>>> to >>>>>>>>>> distributed storage and computation solutions. Users will >>>>> often find >>>>>>>> that >>>>>>>>>> Superset can act as a catalyst for tooling that can >> visualize >>>>> the >>>>>>>>>> byproduct >>>>>>>>>> of data and computation infrastructure. >>>>>>>>>> >>>>>>>>>> Superset has many key design elements that help fill a gap >> in >>>>> current >>>>>>>>>> solutions for organizations: >>>>>>>>>> * Easy, low friction access to data through a simple, >>>> web-based >>>>> data >>>>>>>>>> exploration interface. Composing charts and dashboards are >>>>> intuitive. >>>>>>>>>> Eliminating the need to write code or SQL empowers anyone >> to >>>>> use it. >>>>>>>>>> * Access to a wide array of rich, interactive data >>>>> visualization types. >>>>>>>>>> * Enterprise-ready: Integration with different >> authentication >>>>>>>> mechanisms >>>>>>>>>> and granular permissions centered around actions and data >>>>> access. >>>>>>>>>> * Realtime & fast: Superset provides realtime analytics at >> the >>>>> speed of >>>>>>>>>> thought on very large datasets when integrated with >> Druid.io. >>>>>>>>>> * Broad data access: Consume data out of any SQL-speaking >>>>> relational >>>>>>>>>> database. >>>>>>>>>> * Extensible: Can be extended to talk to many noSQL >> databases >>>>> like >>>>>>>> Apache >>>>>>>>>> Drill, Elastic Search, and other popular database engines. >>>>>>>>>> * Fast loading dashboards with configurable web-scale >> caching. >>>>>>>>>> * Plug-in framework that enables organizations to build >> custom >>>>>>>> analytical >>>>>>>>>> applications with new UI/UX interfaces. >>>>>>>>>> * SQL Lab, a state-of-the-art SQL IDE that empowers >>>>> SQL-speaking users >>>>>>>>>> with more flexibility. 
SQL Lab integrates with the >>>>> visualization engine >>>>>>>>>> seamlessly. >>>>>>>>>> >>>>>>>>>> == Initial Goals == >>>>>>>>>> The initial goals of the Superset project are several-fold: >>>>>>>>>> * Move the existing codebase to Apache and integrate with >> the >>>>> Apache >>>>>>>>>> development process. >>>>>>>>>> * Redesign the user interface and interaction model for >>>> creating >>>>>>>>>> visualizations/dashboards and connecting to data sources >>>>>>>>>> * Build robust support for security and governance of the >> tool >>>>>>>> including >>>>>>>>>> popular authorization modules (including Apache Ranger and >>>>> Apache >>>>>>>> Sentry) >>>>>>>>>> and a more sophisticated permissions system >>>>>>>>>> * Grow the extensibility of the project both in terms of >>>>> enhanced >>>>>>>>>> connectivity to NoSQL-based data sources and creating a >>>> plug-in >>>>>>>> framework >>>>>>>>>> that enables organizations to build custom analytical >>>>> applications which >>>>>>>>>> require a new UI/UX >>>>>>>>>> >>>>>>>>>> == Current Status == >>>>>>>>>> By many standards, Superset is already a successful open >>>> source >>>>> project. >>>>>>>>>> As >>>>>>>>>> of March 2017, Superset is officially used in production at >>>>> about a >>>>>>>> dozen >>>>>>>>>> companies, has received contributions from over one hundred >>>>> contributors >>>>>>>>>> on >>>>>>>>>> Github, 1500+ forks, and 12k+ stars. >>>>>>>>>> >>>>>>>>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have >>>> made >>>>>>>>>> significant contributions, and expressed their commitment >> to >>>> the >>>>>>>> project. >>>>>>>>>> The product is feature complete and has been viable for >>>> months. >>>>> It >>>>>>>> already >>>>>>>>>> serves as the main interface for consuming data at many >>>>> companies of >>>>>>>>>> different sizes. 
>>>>>>>>>> >>>>>>>>>> While the product is usable, there’s room for improvement >>>>> across the >>>>>>>>>> board, >>>>>>>>>> starting with providing a smoother user experience around >>>>> content >>>>>>>>>> creation, >>>>>>>>>> making sure all features work out-of-the-box on more >> platforms >>>>> and >>>>>>>>>> databases, providing better user training guides and >> videos, >>>>> having a >>>>>>>>>> predictable release process, and increasing the overall >>>> quality >>>>> of the >>>>>>>>>> Superset releases. >>>>>>>>>> >>>>>>>>>> === Meritocracy === >>>>>>>>>> We plan to invest in supporting a meritocracy. We will >> discuss >>>>> the >>>>>>>>>> requirements in an open forum. Several companies have >>>> expressed >>>>> interest >>>>>>>>>> in >>>>>>>>>> this project, and we intend to invite additional >> developers to >>>>>>>>>> participate. >>>>>>>>>> We will encourage and monitor community participation so >> that >>>>> privileges >>>>>>>>>> can be extended to those that contribute. >>>>>>>>>> >>>>>>>>>> === Community === >>>>>>>>>> The need for an enterprise-ready data visualization and >>>>> exploration >>>>>>>>>> platform in the open source community is tremendous. While >>>>> Superset is >>>>>>>>>> fairly well known, recognized and used within the Druid.io >>>>> community, >>>>>>>>>> adoption is currently limited outside of that niche. There >> is >>>> a >>>>> huge >>>>>>>>>> opportunity to grow the community to hundreds if not >> thousands >>>>> of >>>>>>>>>> organizations, and we are hoping that embracing “the Apache >>>>> way” will >>>>>>>>>> accelerate the growth of our community. >>>>>>>>>> >>>>>>>>>> We have already been active at seeking and inviting >>>>> contributions, and >>>>>>>> are >>>>>>>>>> planning to scale the project by investing time and growing >>>> the >>>>> support >>>>>>>>>> structure to grow the community. 
>>>>>>>>>> >>>>>>>>>> === Core Developers === >>>>>>>>>> The initial committers for Superset include experienced >> full >>>>> stack, >>>>>>>>>> front-end and data engineers: >>>>>>>>>> * Maxime Beauchemin (Airbnb) >>>>>>>>>> * Alanna Scott (Airbnb) >>>>>>>>>> * Bogdan Kyryliuk (Airbnb) >>>>>>>>>> * Vera Liu (Airbnb) >>>>>>>>>> * Jeff Feng (Airbnb) >>>>>>>>>> * Ashutosh Chauhan (Hortonworks) >>>>>>>>>> * Nishant Bangarwa (Hortonworks) >>>>>>>>>> * Slim Bouguerra (Hortonworks) >>>>>>>>>> * Priyank Shah (Hortonworks) >>>>>>>>>> * Sriharsha Chintalapani (Hortonworks) >>>>>>>>>> * Daniel Dai (Hortonworks) >>>>>>>>>> >>>>>>>>>> We realize that additional employer diversity is needed, >> and >>>> we >>>>> will >>>>>>>> work >>>>>>>>>> aggressively to recruit developers from additional >> companies. >>>>>>>>>> >>>>>>>>>> === Alignment === >>>>>>>>>> The initial committers strongly believe that a system for >>>>> interactive >>>>>>>>>> visualization of data will gain broader adoption as an open >>>>> source, >>>>>>>>>> community driven project, where the community can >> contribute >>>>> not only to >>>>>>>>>> the core components, but also to a growing collection of >>>>> connectors, >>>>>>>>>> visualizations and improving integration a all potential >> data >>>>> sources. >>>>>>>>>> Superset already integrates closely with Apache Hive, the >> Hive >>>>>>>> metastore, >>>>>>>>>> as well as most SQL-speaking databases found in modern data >>>>> ecosystems. >>>>>>>>>> >>>>>>>>>> == Known Risks == >>>>>>>>>> >>>>>>>>>> === Orphaned Products === >>>>>>>>>> Superset is a vital component for both visualizing, >> accessing >>>>> and >>>>>>>>>> democratizing data at Airbnb. Also at Hortonworks, >> Superset >>>> is >>>>> a core >>>>>>>>>> component of the DataFlow product offering. Thus, the >> risk of >>>>> the >>>>>>>> project >>>>>>>>>> being orphaned is relatively low. 
The project could be at >>>> risk >>>>> if >>>>>>>> Airbnb >>>>>>>>>> changes their approach for democratizing data or if >>>> Hortonworks >>>>> changes >>>>>>>>>> their strategy in the market. In such an event, the >>>> committers >>>>> plan to >>>>>>>>>> continue working on the project on their own time, thought >> the >>>>> progress >>>>>>>>>> will likely be slower. We plan to mitigate this risk by >>>>> recruiting >>>>>>>>>> additional committers. >>>>>>>>>> >>>>>>>>>> === Inexperience with Open Source === >>>>>>>>>> The initial committers include veteran Apache members >>>>> (committers and >>>>>>>> PPMC >>>>>>>>>> members) and other developers who have varying degrees of >>>>> experience >>>>>>>> with >>>>>>>>>> open source projects. All have been involved with source >> code >>>>> that has >>>>>>>>>> been >>>>>>>>>> released under an open source license, and several also >> have >>>>> experience >>>>>>>>>> developing code with an open source development process. >>>>>>>>>> >>>>>>>>>> === Homogenous Developers === >>>>>>>>>> The initial committers are employed by Airbnb Inc. and >>>>> Hortonworks. We >>>>>>>> are >>>>>>>>>> committed to recruiting additional committers from other >>>>> companies. >>>>>>>>>> >>>>>>>>>> === Reliance on Salaried Developers === >>>>>>>>>> It is expected that Superset development will occur on both >>>>> salaried >>>>>>>> time >>>>>>>>>> and on volunteer time, after hours. The majority of initial >>>>> committers >>>>>>>> are >>>>>>>>>> paid by their employer to contribute to this project. >> However, >>>>> they are >>>>>>>>>> all >>>>>>>>>> passionate about the project, and we are confident that the >>>>> project will >>>>>>>>>> continue even if no salaried developers contribute to the >>>>> project. We >>>>>>>> are >>>>>>>>>> committed to recruiting additional committers including >>>>> non-salaried >>>>>>>>>> developers. 
>>>>>>>>>> >>>>>>>>>> === Relationships with Other Apache Products === >>>>>>>>>> To the knowledge of the Initial Committers, there are no >>>> direct >>>>>>>>>> competitors >>>>>>>>>> to Superset within the Apache Software Foundation. That >> said, >>>>> Apache >>>>>>>>>> Zeppelin is an indirect competitor, but it solves a >> different >>>>> use case. >>>>>>>>>> >>>>>>>>>> Apache Zeppelin is a web-based notebook that enables >>>>> interactive data >>>>>>>>>> analytics. It enables the creation of beautiful >> data-driven, >>>>> interactive >>>>>>>>>> and collaborative documents with SQL, Scala and more. >>>> Although >>>>> a user >>>>>>>> can >>>>>>>>>> create data visualizations using this project, it >> leverages a >>>>> notebook >>>>>>>>>> style user interfaces and it is geared towards the Spark >>>>> community where >>>>>>>>>> Scala and SQL co-exist >>>>>>>>>> >>>>>>>>>> We look forward to collaborating with those communities, as >>>>> well as >>>>>>>> other >>>>>>>>>> Apache communities. >>>>>>>>>> >>>>>>>>>> === An Excessive Fascination with the Apache Brand === >>>>>>>>>> Superset is solving two huge challenges: >>>>>>>>>> The challenge of enabling every knowledge worker to make >> data >>>>> informed >>>>>>>>>> decisions, particularly those who are not deeply skilled at >>>>> writing SQL. >>>>>>>>>> The challenge of visualizing huge amounts of data >>>> interactively >>>>> and in >>>>>>>>>> real-time >>>>>>>>>> >>>>>>>>>> Superset was first developed as a data visualization >> solution >>>>> for >>>>>>>> Druid.io >>>>>>>>>> as a way to visualize billions of rows of data. Since >> then, >>>>> usage of >>>>>>>>>> Superset has expanded to address data visualization use >> cases >>>>> across SQL >>>>>>>>>> speaking data sources as well. >>>>>>>>>> >>>>>>>>>> Our rationale for developing Superset as an Apache project >> is >>>>> detailed >>>>>>>> in >>>>>>>>>> the Rationale Section. 
>>>>>>>>>> We believe that the Apache brand and community process will help us attract more contributors to this project, and help grow the footprint of the project through usage at other organizations and within other applications. Establishing consensus among users and developers will result in a more valuable tool for everyone.
>>>>>>>>>>
>>>>>>>>>> == Documentation ==
>>>>>>>>>> References to further reading material:
>>>>>>>>>> * [[http://airbnb.io/superset/|Superset Documentation]]
>>>>>>>>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
>>>>>>>>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
>>>>>>>>>>
>>>>>>>>>> == Initial Source ==
>>>>>>>>>> The origin of the proposed code base can be found at https://github.com/airbnb/superset. The code base is primarily in Python.
>>>>>>>>>>
>>>>>>>>>> == Source and Intellectual Property Submission Plan ==
>>>>>>>>>> We do not expect any complications for the submission of the Superset code base. Our code is already in Github and there is only a single code base.
>>>>>>>>>>
>>>>>>>>>> == External Dependencies ==
>>>>>>>>>> List of Python packages, from the Python Package Index (Pypi):
>>>>>>>>>>
>>>>>>>>>> * boto3
>>>>>>>>>> * celery
>>>>>>>>>> * cryptography
>>>>>>>>>> * flask-appbuilder
>>>>>>>>>> * flask-cache
>>>>>>>>>> * flask-migrate
>>>>>>>>>> * flask-script
>>>>>>>>>> * flask-sqlalchemy
>>>>>>>>>> * flask-testing
>>>>>>>>>> * humanize
>>>>>>>>>> * gunicorn
>>>>>>>>>> * markdown
>>>>>>>>>> * pandas
>>>>>>>>>> * parsedatetime
>>>>>>>>>> * pydruid
>>>>>>>>>> * PyHive
>>>>>>>>>> * python-dateutil
>>>>>>>>>> * requests
>>>>>>>>>> * simplejson
>>>>>>>>>> * six
>>>>>>>>>> * sqlalchemy
>>>>>>>>>> * sqlalchemy-utils
>>>>>>>>>> * sqlparse
>>>>>>>>>> * thrift
>>>>>>>>>> * thrift-sasl
>>>>>>>>>> * werkzeug
>>>>>>>>>>
>>>>>>>>>> List of Javascript packages, from NPM:
>>>>>>>>>> * autobind-decorator
>>>>>>>>>> * bootstrap
>>>>>>>>>> * bootstrap-datepicker
>>>>>>>>>> * brace
>>>>>>>>>> * brfs
>>>>>>>>>> * cal-heatmap
>>>>>>>>>> * classnames
>>>>>>>>>> * d3
>>>>>>>>>> * d3-cloud
>>>>>>>>>> * d3-sankey
>>>>>>>>>> * d3-scale
>>>>>>>>>> * d3-tip
>>>>>>>>>> * datamaps
>>>>>>>>>> * datatables-bootstrap3-plugin
>>>>>>>>>> * datatables.net-bs
>>>>>>>>>> * font-awesome
>>>>>>>>>> * gridster
>>>>>>>>>> * immutability-helper
>>>>>>>>>> * immutable
>>>>>>>>>> * jquery
>>>>>>>>>> * lodash.throttle
>>>>>>>>>> * mapbox-gl
>>>>>>>>>> * moment
>>>>>>>>>> * moments
>>>>>>>>>> * mustache
>>>>>>>>>> * nvd3
>>>>>>>>>> * react
>>>>>>>>>> * react-ace
>>>>>>>>>> * react-bootstrap
>>>>>>>>>> * react-bootstrap-table
>>>>>>>>>> * react-dom
>>>>>>>>>> * react-draggable
>>>>>>>>>> * react-gravatar
>>>>>>>>>> * react-grid-layout
>>>>>>>>>> * react-map-gl
>>>>>>>>>> * react-redux
>>>>>>>>>> * react-resizable
>>>>>>>>>> * react-select
>>>>>>>>>> * react-syntax-highlighter
>>>>>>>>>> * reactable
>>>>>>>>>> * redux
>>>>>>>>>> * redux-localstorage
>>>>>>>>>> * redux-thunk
>>>>>>>>>> * shortid
>>>>>>>>>> * style-loader
>>>>>>>>>> * supercluster
>>>>>>>>>> * topojson
>>>>>>>>>> * victory
>>>>>>>>>> * viewport-mercator-project
>>>>>>>>>>
>>>>>>>>>> == Cryptography ==
>>>>>>>>>> The proposal does not include cryptographic code.
>>>>>>>>>>
>>>>>>>>>> == Required Resources ==
>>>>>>>>>>
>>>>>>>>>> === Mailing List ===
>>>>>>>>>> There is a current mailing list, the Google Group “airbnb_superset”, that we plan to deprecate once the Apache.org lists are ready to serve our community.
>>>>>>>>>>
>>>>>>>>>> * superset-private
>>>>>>>>>> * superset-dev
>>>>>>>>>> * superset-user
>>>>>>>>>>
>>>>>>>>>> === Subversion Directory ===
>>>>>>>>>> Git is the preferred source control system.
>>>>>>>>>> http://svn.apache.org/repos/asf/incubator/superset
>>>>>>>>>>
>>>>>>>>>> == Git Repository ==
>>>>>>>>>> Git is the preferred source control system; we’re assuming https://github.com/apache/incubator-superset based on the naming scheme.
>>>>>>>>>>
>>>>>>>>>> == Issue Tracking ==
>>>>>>>>>> JIRA Superset (SUPERSET). If possible, we’d like to use Github issues & PRs to manage our project. It’s been said that there are ways to keep Github’s issues in sync with Jira, allowing us to get the best of both worlds. If that is not possible, we will comply by using Jira.
>>>>>>>>>>
>>>>>>>>>> == Other Resources ==
>>>>>>>>>> We currently use a set of Github integrated services that are free to the open source community, like Travis-ci, Code Climate, Coveralls, Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using these services as they allow us to scale contributions and optimize our development flows.
>>>>>>>>>> These services require some elevated rights on the Github repository in order to set up or tune, and we would like the committers to have the required rights.
>>>>>>>>>>
>>>>>>>>>> == Initial Committers ==
>>>>>>>>>>
>>>>>>>>>> * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
>>>>>>>>>> * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
>>>>>>>>>> * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
>>>>>>>>>> * Vera Liu <vera....@airbnb.com> - Committer
>>>>>>>>>> * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
>>>>>>>>>> * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
>>>>>>>>>> * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
>>>>>>>>>> * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
>>>>>>>>>> * Priyank Shah <ps...@hortonworks.com> - Committer
>>>>>>>>>> * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
>>>>>>>>>> * Daniel Dai <da...@apache.org> - Champion & Committer
>>>>>>>>>> * Luke Han <luke....@apache.org> - Mentor
>>>>>>>>>>
>>>>>>>>>> == Affiliations ==
>>>>>>>>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
>>>>>>>>>>
>>>>>>>>>> == Sponsors ==
>>>>>>>>>>
>>>>>>>>>> === Champion ===
>>>>>>>>>> Daniel Dai <da...@apache.org>
>>>>>>>>>>
>>>>>>>>>> === Nominated Mentors ===
>>>>>>>>>> * Ashutosh Chauhan <hashut...@apache.org>
>>>>>>>>>> * Luke Han <luke....@apache.org>
>>>>>>>>>>
>>>>>>>>>> === Sponsoring Entity ===
>>>>>>>>>> Incubator PMC
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: general-unsubscribe@incubator.apache.org
>>>>>> For additional commands, e-mail: general-help@incubator.apache.org
>>
>> --
>> Best Regards, Edward J. Yoon