+1 (binding)
On Tue, Apr 25, 2017 at 1:58 PM, Joe Witt <joe.w...@gmail.com> wrote:
> +1 (binding)
>
> On Tue, Apr 25, 2017 at 4:52 PM, Jitendra Pandey
> <jiten...@hortonworks.com> wrote:
> > +1 (binding)
> >
> > On 4/25/17, 1:27 PM, "Julian Hyde" <jh...@apache.org> wrote:
> >
> > +1 binding
> >
> > > On Apr 25, 2017, at 12:48 PM, moon soo Lee <m...@apache.org> wrote:
> > >
> > > +1 (non-binding)
> > >
> > > On Tue, Apr 25, 2017 at 11:49 AM Ashutosh Chauhan <hashut...@apache.org>
> > > wrote:
> > >
> > >> +1 (binding)
> > >>
> > >> Thanks,
> > >> Ashutosh
> > >>
> > >> On Mon, Apr 24, 2017 at 5:45 AM, Luke Han <luke...@gmail.com> wrote:
> > >>
> > >>> +1 binding
> > >>>
> > >>> Love to see Superset become a new incubator project.
> > >>>
> > >>>
> > >>> Best Regards!
> > >>> ---------------------
> > >>>
> > >>> Luke Han
> > >>>
> > >>> On Sun, Apr 23, 2017 at 10:53 PM, Jeff Feng <jeff.f...@gmail.com> wrote:
> > >>>
> > >>>> Dear Apache Incubator Community,
> > >>>>
> > >>>> We have updated the Superset proposal
> > >>>> <https://wiki.apache.org/incubator/SupersetProposal> (copied below)
> > >>>> for Apache Incubation with an additional mentor (Luke Han -
> > >>>> luke....@apache.org), and would like to start a vote thread for
> > >>>> acceptance into the incubator.
> > >>>>
> > >>>> Our team is excited to share Superset with the Apache community and
> > >>>> we hope for your continued support!
> > >>>>
> > >>>> Cheers,
> > >>>> Jeff & the Superset Team
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> = Superset =
> > >>>>
> > >>>> == Abstract ==
> > >>>> Superset is an enterprise-ready web application for data exploration,
> > >>>> data visualization and dashboarding.
> > >>>>
> > >>>> == Proposal ==
> > >>>> Superset is business intelligence (BI) software that helps modern
> > >>>> organizations visualize and interact with their data.
> > >>>> Superset enables users to explore data from a variety of databases,
> > >>>> assemble beautiful dashboards and share their findings. Superset
> > >>>> works neatly with all modern SQL-speaking databases, and integrates
> > >>>> with Druid.io to provide real-time, interactive, blazing fast data
> > >>>> access to large datasets.
> > >>>>
> > >>>> == Background ==
> > >>>> Data is mission critical. To succeed in this era, organizations need
> > >>>> to provide low-friction, intuitive and interactive access to data. It
> > >>>> is paramount for knowledge workers to be capable of answering their
> > >>>> own questions by querying, exploring and visualizing data.
> > >>>>
> > >>>> The entire business intelligence industry has pivoted from a model of
> > >>>> centralized top-down platforms driven by IT organizations to
> > >>>> self-service analytics and agile workflows by any user. This shift
> > >>>> unblocks centralized service bottlenecks for creating data
> > >>>> visualizations while also creating an environment that is iterative
> > >>>> and fast-moving. This means that business intelligence software must
> > >>>> also be easy and delightful to use. Self-service analytics doesn’t
> > >>>> mean that admin and governance features are not needed.
> > >>>> Modern BI tools provide fine-grained access controls and auditing
> > >>>> capabilities to understand how data is being used. Superset is a
> > >>>> solution that delivers on all of these vectors.
> > >>>>
> > >>>> The technology stack is also constantly morphing - vendors are
> > >>>> struggling to provide cheap, quick and easy solutions to access data.
> > >>>> Business intelligence users are finding existing solutions lacking as
> > >>>> these software products either disregard or react slowly to recent
> > >>>> game-changing technologies such as Druid.io, PrestoDB, Apache Drill,
> > >>>> Apache Kylin, d3.js, React.js and iPython’s Jupyter.
> > >>>>
> > >>>> == Rationale ==
> > >>>> Business intelligence is more relevant today than at any other point
> > >>>> in history. Organizations are currently very limited in options for
> > >>>> open source data visualization solutions, especially solutions that
> > >>>> are both self-service and enterprise-ready. Every company informing
> > >>>> their decisions with data needs a BI tool.
> > >>>>
> > >>>> We believe that Superset will be a strong complement to existing
> > >>>> Apache Software Foundation technologies by offering scalable user
> > >>>> interactions to distributed storage and computation solutions. Users
> > >>>> will often find that Superset can act as a catalyst for tooling that
> > >>>> can visualize the byproduct of data and computation infrastructure.
> > >>>>
> > >>>> Superset has many key design elements that help fill a gap in current
> > >>>> solutions for organizations:
> > >>>> * Easy, low friction access to data through a simple, web-based data
> > >>>> exploration interface. Composing charts and dashboards is intuitive.
> > >>>> Eliminating the need to write code or SQL empowers anyone to use it.
> > >>>> * Access to a wide array of rich, interactive data visualization
> > >>>> types.
> > >>>> * Enterprise-ready: Integration with different authentication
> > >>>> mechanisms and granular permissions centered around actions and data
> > >>>> access.
> > >>>> * Realtime & fast: Superset provides realtime analytics at the speed
> > >>>> of thought on very large datasets when integrated with Druid.io.
> > >>>> * Broad data access: Consume data out of any SQL-speaking relational
> > >>>> database.
> > >>>> * Extensible: Can be extended to talk to many NoSQL databases like
> > >>>> Apache Drill, Elastic Search, and other popular database engines.
> > >>>> * Fast loading dashboards with configurable web-scale caching.
> > >>>> * Plug-in framework that enables organizations to build custom
> > >>>> analytical applications with new UI/UX interfaces.
> > >>>> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users
> > >>>> with more flexibility. SQL Lab integrates with the visualization
> > >>>> engine seamlessly.
> > >>>>
> > >>>> == Initial Goals ==
> > >>>> The initial goals of the Superset project are several-fold:
> > >>>> * Move the existing codebase to Apache and integrate with the Apache
> > >>>> development process.
> > >>>> * Redesign the user interface and interaction model for creating
> > >>>> visualizations/dashboards and connecting to data sources.
> > >>>> * Build robust support for security and governance of the tool,
> > >>>> including popular authorization modules (such as Apache Ranger and
> > >>>> Apache Sentry) and a more sophisticated permissions system.
> > >>>> * Grow the extensibility of the project, both in terms of enhanced
> > >>>> connectivity to NoSQL-based data sources and by creating a plug-in
> > >>>> framework that enables organizations to build custom analytical
> > >>>> applications which require a new UI/UX.
> > >>>>
> > >>>> == Current Status ==
> > >>>> By many standards, Superset is already a successful open source
> > >>>> project. As of March 2017, Superset is officially used in production
> > >>>> at about a dozen companies, has received contributions from over one
> > >>>> hundred contributors on Github, and has 1500+ forks and 12k+ stars.
> > >>>>
> > >>>> Sizeable companies like Airbnb, Yahoo!
> > >>>> and Hortonworks have made significant contributions, and expressed
> > >>>> their commitment to the project. The product is feature complete and
> > >>>> has been viable for months. It already serves as the main interface
> > >>>> for consuming data at many companies of different sizes.
> > >>>>
> > >>>> While the product is usable, there’s room for improvement across the
> > >>>> board, starting with providing a smoother user experience around
> > >>>> content creation, making sure all features work out-of-the-box on
> > >>>> more platforms and databases, providing better user training guides
> > >>>> and videos, having a predictable release process, and increasing the
> > >>>> overall quality of the Superset releases.
> > >>>>
> > >>>> === Meritocracy ===
> > >>>> We plan to invest in supporting a meritocracy. We will discuss the
> > >>>> requirements in an open forum. Several companies have expressed
> > >>>> interest in this project, and we intend to invite additional
> > >>>> developers to participate. We will encourage and monitor community
> > >>>> participation so that privileges can be extended to those that
> > >>>> contribute.
> > >>>>
> > >>>> === Community ===
> > >>>> The need for an enterprise-ready data visualization and exploration
> > >>>> platform in the open source community is tremendous. While Superset
> > >>>> is fairly well known, recognized and used within the Druid.io
> > >>>> community, adoption is currently limited outside of that niche. There
> > >>>> is a huge opportunity to grow the community to hundreds if not
> > >>>> thousands of organizations, and we are hoping that embracing “the
> > >>>> Apache way” will accelerate the growth of our community.
> > >>>>
> > >>>> We have already been active in seeking and inviting contributions,
> > >>>> and are planning to scale the project by investing time and growing
> > >>>> the support structure to grow the community.
> > >>>>
> > >>>> === Core Developers ===
> > >>>> The initial committers for Superset include experienced full stack,
> > >>>> front-end and data engineers:
> > >>>> * Maxime Beauchemin (Airbnb)
> > >>>> * Alanna Scott (Airbnb)
> > >>>> * Bogdan Kyryliuk (Airbnb)
> > >>>> * Vera Liu (Airbnb)
> > >>>> * Jeff Feng (Airbnb)
> > >>>> * Ashutosh Chauhan (Hortonworks)
> > >>>> * Nishant Bangarwa (Hortonworks)
> > >>>> * Slim Bouguerra (Hortonworks)
> > >>>> * Priyank Shah (Hortonworks)
> > >>>> * Sriharsha Chintalapani (Hortonworks)
> > >>>> * Daniel Dai (Hortonworks)
> > >>>>
> > >>>> We realize that additional employer diversity is needed, and we will
> > >>>> work aggressively to recruit developers from additional companies.
> > >>>>
> > >>>> === Alignment ===
> > >>>> The initial committers strongly believe that a system for interactive
> > >>>> visualization of data will gain broader adoption as an open source,
> > >>>> community driven project, where the community can contribute not only
> > >>>> to the core components, but also to a growing collection of
> > >>>> connectors and visualizations, and to improved integration with all
> > >>>> potential data sources. Superset already integrates closely with
> > >>>> Apache Hive, the Hive metastore, as well as most SQL-speaking
> > >>>> databases found in modern data ecosystems.
> > >>>>
> > >>>> == Known Risks ==
> > >>>>
> > >>>> === Orphaned Products ===
> > >>>> Superset is a vital component for visualizing, accessing and
> > >>>> democratizing data at Airbnb. Also at Hortonworks, Superset is a core
> > >>>> component of the DataFlow product offering. Thus, the risk of the
> > >>>> project being orphaned is relatively low.
> > >>>> The project could be at risk if Airbnb changes their approach for
> > >>>> democratizing data or if Hortonworks changes their strategy in the
> > >>>> market. In such an event, the committers plan to continue working on
> > >>>> the project on their own time, though progress will likely be slower.
> > >>>> We plan to mitigate this risk by recruiting additional committers.
> > >>>>
> > >>>> === Inexperience with Open Source ===
> > >>>> The initial committers include veteran Apache members (committers and
> > >>>> PPMC members) and other developers who have varying degrees of
> > >>>> experience with open source projects. All have been involved with
> > >>>> source code that has been released under an open source license, and
> > >>>> several also have experience developing code with an open source
> > >>>> development process.
> > >>>>
> > >>>> === Homogenous Developers ===
> > >>>> The initial committers are employed by Airbnb Inc. and Hortonworks.
> > >>>> We are committed to recruiting additional committers from other
> > >>>> companies.
> > >>>>
> > >>>> === Reliance on Salaried Developers ===
> > >>>> It is expected that Superset development will occur on both salaried
> > >>>> time and on volunteer time, after hours. The majority of initial
> > >>>> committers are paid by their employer to contribute to this project.
> > >>>> However, they are all passionate about the project, and we are
> > >>>> confident that the project will continue even if no salaried
> > >>>> developers contribute to the project. We are committed to recruiting
> > >>>> additional committers including non-salaried developers.
> > >>>>
> > >>>> === Relationships with Other Apache Products ===
> > >>>> To the knowledge of the Initial Committers, there are no direct
> > >>>> competitors to Superset within the Apache Software Foundation.
> > >>>> That said, Apache Zeppelin is an indirect competitor, though it
> > >>>> addresses a different use case.
> > >>>>
> > >>>> Apache Zeppelin is a web-based notebook that enables interactive data
> > >>>> analytics. It enables the creation of beautiful data-driven,
> > >>>> interactive and collaborative documents with SQL, Scala and more.
> > >>>> Although a user can create data visualizations using this project, it
> > >>>> uses a notebook-style user interface and is geared towards the Spark
> > >>>> community, where Scala and SQL co-exist.
> > >>>>
> > >>>> We look forward to collaborating with those communities, as well as
> > >>>> other Apache communities.
> > >>>>
> > >>>> === An Excessive Fascination with the Apache Brand ===
> > >>>> Superset is solving two huge challenges:
> > >>>> * The challenge of enabling every knowledge worker to make
> > >>>> data-informed decisions, particularly those who are not deeply
> > >>>> skilled at writing SQL.
> > >>>> * The challenge of visualizing huge amounts of data interactively and
> > >>>> in real time.
> > >>>>
> > >>>> Superset was first developed as a data visualization solution for
> > >>>> Druid.io, as a way to visualize billions of rows of data. Since then,
> > >>>> usage of Superset has expanded to address data visualization use
> > >>>> cases across SQL-speaking data sources as well.
> > >>>>
> > >>>> Our rationale for developing Superset as an Apache project is
> > >>>> detailed in the Rationale Section. We believe that the Apache brand
> > >>>> and community process will help us attract more contributors to this
> > >>>> project, and help grow the footprint of the project through usage at
> > >>>> other organizations and within other applications. Establishing
> > >>>> consensus among users and developers will result in a more valuable
> > >>>> tool for everyone.
> > >>>>
> > >>>> == Documentation ==
> > >>>> References to further reading material:
> > >>>> * [[http://airbnb.io/superset/|Superset Documentation]]
> > >>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
> > >>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
> > >>>>
> > >>>> == Initial Source ==
> > >>>> The origin of the proposed code base can be found at
> > >>>> https://github.com/airbnb/superset. The code base is primarily in
> > >>>> Python.
> > >>>>
> > >>>> == Source and Intellectual Property Submission Plan ==
> > >>>> We do not expect any complications for the submission of the Superset
> > >>>> code base. Our code is already in Github and there is only a single
> > >>>> code base.
> > >>>>
> > >>>> == External Dependencies ==
> > >>>> List of Python packages, from the Python Package Index (Pypi):
> > >>>>
> > >>>> * boto3
> > >>>> * celery
> > >>>> * cryptography
> > >>>> * flask-appbuilder
> > >>>> * flask-cache
> > >>>> * flask-migrate
> > >>>> * flask-script
> > >>>> * flask-sqlalchemy
> > >>>> * flask-testing
> > >>>> * humanize
> > >>>> * gunicorn
> > >>>> * markdown
> > >>>> * pandas
> > >>>> * parsedatetime
> > >>>> * pydruid
> > >>>> * PyHive
> > >>>> * python-dateutil
> > >>>> * requests
> > >>>> * simplejson
> > >>>> * six
> > >>>> * sqlalchemy
> > >>>> * sqlalchemy-utils
> > >>>> * sqlparse
> > >>>> * thrift
> > >>>> * thrift-sasl
> > >>>> * werkzeug
> > >>>>
> > >>>> List of Javascript packages, from NPM:
> > >>>> * autobind-decorator
> > >>>> * bootstrap
> > >>>> * bootstrap-datepicker
> > >>>> * brace
> > >>>> * brfs
> > >>>> * cal-heatmap
> > >>>> * classnames
> > >>>> * d3
> > >>>> * d3-cloud
> > >>>> * d3-sankey
> > >>>> * d3-scale
> > >>>> * d3-tip
> > >>>> * datamaps
> > >>>> * datatables-bootstrap3-plugin
> > >>>> * datatables.net-bs
> > >>>> * font-awesome
> > >>>> * gridster
> > >>>> * immutability-helper
> > >>>> * immutable
> > >>>> * jquery
> > >>>> * lodash.throttle
> > >>>> * mapbox-gl
> > >>>> * moment
> > >>>> * moments
> > >>>> * mustache
> > >>>> * nvd3
> > >>>> * react
> > >>>> * react-ace
> > >>>> * react-bootstrap
> > >>>> * react-bootstrap-table
> > >>>> * react-dom
> > >>>> * react-draggable
> > >>>> * react-gravatar
> > >>>> * react-grid-layout
> > >>>> * react-map-gl
> > >>>> * react-redux
> > >>>> * react-resizable
> > >>>> * react-select
> > >>>> * react-syntax-highlighter
> > >>>> * reactable
> > >>>> * redux
> > >>>> * redux-localstorage
> > >>>> * redux-thunk
> > >>>> * shortid
> > >>>> * style-loader
> > >>>> * supercluster
> > >>>> * topojson
> > >>>> * victory
> > >>>> * viewport-mercator-project
> > >>>>
> > >>>> == Cryptography ==
> > >>>> The proposal does not include cryptographic code.
> > >>>>
> > >>>> == Required Resources ==
> > >>>>
> > >>>> === Mailing List ===
> > >>>> There is a current mailing list, the Google Group “airbnb_superset”,
> > >>>> that we plan to deprecate once the Apache.org lists are ready to
> > >>>> serve our community.
> > >>>>
> > >>>> * superset-private
> > >>>> * superset-dev
> > >>>> * superset-user
> > >>>>
> > >>>> === Subversion Directory ===
> > >>>> Git is the preferred source control system.
> > >>>> http://svn.apache.org/repos/asf/incubator/superset
> > >>>>
> > >>>> == Git Repository ==
> > >>>> Git is the preferred source control system; we’re assuming
> > >>>> https://github.com/apache/incubator-superset based on the naming
> > >>>> scheme.
> > >>>>
> > >>>> == Issue Tracking ==
> > >>>> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to
> > >>>> manage our project as much as possible. It’s been said that there are
> > >>>> ways to keep Github’s issues in sync with Jira, allowing us to get
> > >>>> the best of both worlds. If that is not possible, we will comply with
> > >>>> using Jira.
> > >>>>
> > >>>> == Other Resources ==
> > >>>> We currently use a set of Github integrated services that are free to
> > >>>> the open source community, like Travis-ci, Code Climate, Coveralls,
> > >>>> Landscape.io, Requires.io, david-dm and Gitter. We would like to keep
> > >>>> using these services as they allow us to scale contributions and
> > >>>> optimize our development flows. These services require some elevated
> > >>>> rights on the Github repository in order to set up or tune, and we
> > >>>> would like the committers to have the required rights.
> > >>>>
> > >>>>
> > >>>> == Initial Committers ==
> > >>>>
> > >>>> * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PPMC & Committer
> > >>>> * Alanna Scott <alanna.sc...@airbnb.com> - PPMC & Committer
> > >>>> * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PPMC & Committer
> > >>>> * Vera Liu <vera....@airbnb.com> - Committer
> > >>>> * Jeff Feng <jeff.f...@airbnb.com> - PPMC & Committer
> > >>>> * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
> > >>>> * Nishant Bangarwa <nbanga...@hortonworks.com> - PPMC & Committer
> > >>>> * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
> > >>>> * Priyank Shah <ps...@hortonworks.com> - Committer
> > >>>> * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
> > >>>> * Daniel Dai <da...@apache.org> - Champion & Committer
> > >>>> * Luke Han <luke....@apache.org> - Mentor
> > >>>>
> > >>>> == Affiliations ==
> > >>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
> > >>>>
> > >>>> == Sponsors ==
> > >>>>
> > >>>> === Champion ===
> > >>>> Daniel Dai <da...@apache.org>
> > >>>>
> > >>>> === Nominated Mentors ===
> > >>>> * Ashutosh Chauhan <hashut...@apache.org>
> > >>>> * Luke Han <luke....@apache.org>
> > >>>>
> > >>>> === Sponsoring Entity ===
> > >>>> Incubator PMC
> > >>>>
> > >>>
> > >>
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@incubator.apache.org