Happy to help too.
Best Regards!
---------------------
Luke Han

On Fri, Apr 14, 2017 at 1:17 AM, Debo Dutta (dedutta) <dedu...@cisco.com> wrote:

> happy to help
>
> debo
>
> On 4/13/17, 9:59 AM, "Maxime Beauchemin" <maximebeauche...@gmail.com> wrote:
>
> > Hi Jean-Baptiste,
> >
> > We are indeed looking for more mentors.
> >
> > Should I update the wiki and replace all references to PMC by PPMC?
> >
> > Thanks,
> >
> > Max
> >
> > On Wed, Apr 12, 2017 at 12:51 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> >
> > > Hi Maxime,
> > >
> > > The proposal looks interesting.
> > >
> > > Just a note, it's PPMC (not PMC) during incubation.
> > >
> > > Are you seeking other mentors (I see you only have one mentor and one champion for now)?
> > >
> > > Regards
> > > JB
> > >
> > > On 04/12/2017 09:41 PM, Maxime Beauchemin wrote:
> > >
> >> Hi all,
> >>
> >> We would love feedback on the proposal. Do the veterans on this mailing list think that the proposal is ready for a vote?
> >>
> >> Thanks,
> >>
> >> Max
> >>
> >> On Tue, Apr 4, 2017 at 5:26 PM, Luke Han <luke...@gmail.com> wrote:
> >>
> >>> Hi Jeff,
> >>> This is a great project which has been mentioned many times in the community. It looks cool and fun for data work.
> >>>
> >>> Thanks for proposing Superset as an Apache Incubator project; please let me know if there's anything I could help with.
> >>>
> >>> Thanks.
> >>> Luke
> >>>
> >>> Best Regards!
> >>> ---------------------
> >>> Luke Han
> >>>
> >>> On Sun, Apr 2, 2017 at 7:45 AM, Jeff Feng <jeff.f...@airbnb.com.invalid> wrote:
> >>>
> >>>> Dear Apache Incubator Community,
> >>>>
> >>>> We are excited to share our proposal for discussion and feedback for entering Apache Incubation. Superset is an enterprise-ready web application for data exploration, data visualization and dashboarding.
> >>>>
> >>>> Our Incubation proposal is at the following Wiki as well as copied in the email below:
> >>>>
> >>>> https://wiki.apache.org/incubator/SupersetProposal
> >>>>
> >>>> We have an active Superset community including 400+ members and nearly 200 topics. The Google Group can be found below. We plan to move the discussion to the ASF:
> >>>>
> >>>> https://groups.google.com/forum/#!forum/airbnb_superset
> >>>>
> >>>> Thank you, and we look forward to the discussion!
> >>>>
> >>>> Jeff, Max & Alanna
> >>>>
> >>>>
> >>>> = Superset =
> >>>>
> >>>> == Abstract ==
> >>>>
> >>>> Superset is an enterprise-ready web application for data exploration, data visualization and dashboarding.
> >>>>
> >>>> == Proposal ==
> >>>>
> >>>> Superset is business intelligence (BI) software that helps modern organizations visualize and interact with their data. Superset enables users to explore data from a variety of databases, assemble beautiful dashboards and share their findings. Superset works neatly with all modern SQL-speaking databases, and integrates with Druid.io to provide real-time, interactive, blazing fast data access to large datasets.
> >>>>
> >>>> == Background ==
> >>>>
> >>>> Data is mission critical. To succeed in this era, organizations need to provide low-friction, intuitive and interactive access to data. It is paramount for knowledge workers to be capable of answering their own questions by querying, exploring and visualizing data.
> >>>>
> >>>> The entire business intelligence industry has pivoted from a model of centralized top-down platforms driven by IT organizations to self-service analytics and agile workflows by any user.
> >>>> This shift unblocks centralized service bottlenecks for creating data visualizations while also creating an environment that is iterative and fast-moving. This means that business intelligence software must also be easy and delightful to use. Self-service analytics doesn’t mean that admin and governance features are not needed.
> >>>>
> >>>> Modern BI tools provide fine-grained access controls and auditing capabilities to understand how data is being used. Superset is a solution that delivers on all of these vectors.
> >>>>
> >>>> The technology stack is also constantly morphing - vendors are struggling to provide cheap, quick and easy solutions to access data. Business intelligence users are finding existing solutions lacking as these software products either disregard or react slowly to recent game-changing technologies like Druid.io, PrestoDB, Apache Drill, Apache Kylin, d3.js, React.js and iPython’s Jupyter for instance.
> >>>>
> >>>> == Rationale ==
> >>>>
> >>>> Business intelligence is more relevant today than at any other point in history. Organizations are currently very limited in options for open source data visualization solutions, especially solutions that are both self-service and enterprise-ready. Every company informing their decisions with data needs a BI tool.
> >>>>
> >>>> We believe that Superset will be a strong complement to existing Apache Software Foundation technologies by offering scalable user interactions to distributed storage and computation solutions. Users will often find that Superset can act as a catalyst for tooling that can visualize the byproduct of data and computation infrastructure.
> >>>>
> >>>> Superset has many key design elements that help fill a gap in current solutions for organizations:
> >>>>
> >>>> * Easy, low friction access to data through a simple, web-based data exploration interface. Composing charts and dashboards is intuitive. Eliminating the need to write code or SQL empowers anyone to use it.
> >>>>
> >>>> * Access to a wide array of rich, interactive data visualization types.
> >>>>
> >>>> * Enterprise-ready: Integration with different authentication mechanisms and granular permissions centered around actions and data access.
> >>>>
> >>>> * Realtime & fast: Superset provides realtime analytics at the speed of thought on very large datasets when integrated with Druid.io.
> >>>>
> >>>> * Broad data access: Consume data out of any SQL-speaking relational database.
> >>>>
> >>>> * Extensible: Can be extended to talk to many NoSQL databases like Apache Drill, Elasticsearch, and other popular database engines.
> >>>>
> >>>> * Fast loading dashboards with configurable web-scale caching.
> >>>>
> >>>> * Plug-in framework that enables organizations to build custom analytical applications with new UI/UX interfaces.
> >>>>
> >>>> * SQL Lab, a state-of-the-art SQL IDE that empowers SQL-speaking users with more flexibility. SQL Lab integrates with the visualization engine seamlessly.
> >>>>
> >>>> == Initial Goals ==
> >>>>
> >>>> The initial goals of the Superset project are several-fold:
> >>>>
> >>>> Move the existing codebase to Apache and integrate with the Apache development process.
> >>>>
> >>>> Redesign the user interface and interaction model for creating visualizations/dashboards and connecting to data sources.
> >>>>
> >>>> Build robust support for security and governance of the tool including popular authorization modules (including Apache Ranger and Apache Sentry) and a more sophisticated permissions system.
> >>>>
> >>>> Grow the extensibility of the project both in terms of enhanced connectivity to NoSQL-based data sources and creating a plug-in framework that enables organizations to build custom analytical applications which require a new UI/UX.
> >>>>
> >>>> == Current Status ==
> >>>>
> >>>> By many standards, Superset is already a successful open source project. As of March 2017, Superset is officially used in production at about a dozen companies, has received contributions from over one hundred contributors on Github, and has 1500+ forks and 12k+ stars.
> >>>>
> >>>> Sizeable companies like Airbnb, Yahoo! and Hortonworks have made significant contributions, and expressed their commitment to the project. The product is feature complete and has been viable for months. It already serves as the main interface for consuming data at many companies of different sizes.
> >>>>
> >>>> While the product is usable, there’s room for improvement across the board, starting with providing a smoother user experience around content creation, making sure all features work out-of-the-box on more platforms and databases, providing better user training guides and videos, having a predictable release process, and increasing the overall quality of the Superset releases.
> >>>>
> >>>> === Meritocracy ===
> >>>>
> >>>> We plan to invest in supporting a meritocracy.
> >>>> We will discuss the requirements in an open forum. Several companies have expressed interest in this project, and we intend to invite additional developers to participate. We will encourage and monitor community participation so that privileges can be extended to those that contribute.
> >>>>
> >>>> === Community ===
> >>>>
> >>>> The need for an enterprise-ready data visualization and exploration platform in the open source community is tremendous. While Superset is fairly well known, recognized and used within the Druid.io community, adoption is currently limited outside of that niche. There is a huge opportunity to grow the community to hundreds if not thousands of organizations, and we are hoping that embracing “the Apache way” will accelerate the growth of our community.
> >>>>
> >>>> We have already been active in seeking and inviting contributions, and are planning to scale the project by investing time and growing the support structure to grow the community.
> >>>>
> >>>> === Core Developers ===
> >>>>
> >>>> The initial committers for Superset include experienced full stack, front-end and data engineers:
> >>>>
> >>>> * Maxime Beauchemin (Airbnb)
> >>>> * Alanna Scott (Airbnb)
> >>>> * Bogdan Kyryliuk (Airbnb)
> >>>> * Vera Liu (Airbnb)
> >>>> * Jeff Feng (Airbnb)
> >>>> * Ashutosh Chauhan (Hortonworks)
> >>>> * Nishant Bangarwa (Hortonworks)
> >>>> * Slim Bouguerra (Hortonworks)
> >>>> * Priyank Shah (Hortonworks)
> >>>> * Sriharsha Chintalapani (Hortonworks)
> >>>> * Daniel Dai (Hortonworks)
> >>>>
> >>>> We realize that additional employer diversity is needed, and we will work aggressively to recruit developers from additional companies.
> >>>>
> >>>> === Alignment ===
> >>>>
> >>>> The initial committers strongly believe that a system for interactive visualization of data will gain broader adoption as an open source, community driven project, where the community can contribute not only to the core components, but also to a growing collection of connectors, visualizations and improved integration with all potential data sources. Superset already integrates closely with Apache Hive, the Hive metastore, as well as most SQL-speaking databases found in modern data ecosystems.
> >>>>
> >>>> == Known Risks ==
> >>>>
> >>>> === Orphaned Products ===
> >>>>
> >>>> Superset is a vital component for visualizing, accessing and democratizing data at Airbnb. Also at Hortonworks, Superset is a core component of the DataFlow product offering. Thus, the risk of the project being orphaned is relatively low. The project could be at risk if Airbnb changes their approach for democratizing data or if Hortonworks changes their strategy in the market. In such an event, the committers plan to continue working on the project on their own time, though the progress will likely be slower. We plan to mitigate this risk by recruiting additional committers.
> >>>>
> >>>> === Inexperience with Open Source ===
> >>>>
> >>>> The initial committers include veteran Apache members (committers and PMC members) and other developers who have varying degrees of experience with open source projects. All have been involved with source code that has been released under an open source license, and several also have experience developing code with an open source development process.
> >>>>
> >>>> === Homogenous Developers ===
> >>>>
> >>>> The initial committers are employed by Airbnb Inc. and Hortonworks.
> >>>> We are committed to recruiting additional committers from other companies.
> >>>>
> >>>> === Reliance on Salaried Developers ===
> >>>>
> >>>> It is expected that Superset development will occur on both salaried time and on volunteer time, after hours. The majority of initial committers are paid by their employer to contribute to this project. However, they are all passionate about the project, and we are confident that the project will continue even if no salaried developers contribute to the project. We are committed to recruiting additional committers including non-salaried developers.
> >>>>
> >>>> === Relationships with Other Apache Products ===
> >>>>
> >>>> To the knowledge of the Initial Committers, there are no direct competitors to Superset within the Apache Software Foundation. That said, Apache Zeppelin is an indirect competitor, but it solves a different use case.
> >>>>
> >>>> Apache Zeppelin is a web-based notebook that enables interactive data analytics. It enables the creation of beautiful data-driven, interactive and collaborative documents with SQL, Scala and more. Although a user can create data visualizations using this project, it leverages a notebook-style user interface and it is geared towards the Spark community where Scala and SQL co-exist.
> >>>>
> >>>> We look forward to collaborating with those communities, as well as other Apache communities.
> >>>>
> >>>> === An Excessive Fascination with the Apache Brand ===
> >>>>
> >>>> Superset is solving two huge challenges:
> >>>>
> >>>> The challenge of enabling every knowledge worker to make data informed decisions, particularly those who are not deeply skilled at writing SQL.
> >>>>
> >>>> The challenge of visualizing huge amounts of data interactively and in real-time.
> >>>>
> >>>> Superset was first developed as a data visualization solution for Druid.io as a way to visualize billions of rows of data. Since then, usage of Superset has expanded to address data visualization use cases across SQL-speaking data sources as well.
> >>>>
> >>>> Our rationale for developing Superset as an Apache project is detailed in the Rationale Section. We believe that the Apache brand and community process will help us attract more contributors to this project, and help grow the footprint of the project through usage at other organizations and within other applications. Establishing consensus among users and developers will result in a more valuable tool for everyone.
> >>>>
> >>>> == Documentation ==
> >>>>
> >>>> References to further reading material:
> >>>>
> >>>> * [[http://airbnb.io/superset/|Superset Documentation]]
> >>>>
> >>>> * [[https://medium.com/airbnb-engineering/caravel-airbnb-s-data-exploration-platform-15a72aa610e5#.npqmmbu25|Blog Post: Superset: Airbnb’s Data Exploration Platform]]
> >>>>
> >>>> * [[https://medium.com/airbnb-engineering/superset-scaling-data-access-and-visual-insights-at-airbnb-3ce3e9b88a7f#.a505zvb1t|Blog Post: Superset: Scaling Data Access & Visual Insights at Airbnb]]
> >>>>
> >>>> == Initial Source ==
> >>>>
> >>>> The origin of the proposed code base can be found at https://github.com/airbnb/superset. The code base is primarily in Python.
> >>>>
> >>>> == Source and Intellectual Property Submission Plan ==
> >>>>
> >>>> We do not expect any complications for the submission of the Superset code base. Our code is already in Github and there is only a single code base.
> >>>>
> >>>> == External Dependencies ==
> >>>>
> >>>> List of Python packages, from the Python Package Index (PyPI):
> >>>>
> >>>> * boto3
> >>>> * celery
> >>>> * cryptography
> >>>> * flask-appbuilder
> >>>> * flask-cache
> >>>> * flask-migrate
> >>>> * flask-script
> >>>> * flask-sqlalchemy
> >>>> * flask-testing
> >>>> * humanize
> >>>> * gunicorn
> >>>> * markdown
> >>>> * pandas
> >>>> * parsedatetime
> >>>> * pydruid
> >>>> * PyHive
> >>>> * python-dateutil
> >>>> * requests
> >>>> * simplejson
> >>>> * six
> >>>> * sqlalchemy
> >>>> * sqlalchemy-utils
> >>>> * sqlparse
> >>>> * thrift
> >>>> * thrift-sasl
> >>>> * werkzeug
> >>>>
> >>>> List of JavaScript packages, from NPM:
> >>>>
> >>>> * autobind-decorator
> >>>> * bootstrap
> >>>> * bootstrap-datepicker
> >>>> * brace
> >>>> * brfs
> >>>> * cal-heatmap
> >>>> * classnames
> >>>> * d3
> >>>> * d3-cloud
> >>>> * d3-sankey
> >>>> * d3-scale
> >>>> * d3-tip
> >>>> * datamaps
> >>>> * datatables-bootstrap3-plugin
> >>>> * datatables.net-bs
> >>>> * font-awesome
> >>>> * gridster
> >>>> * immutability-helper
> >>>> * immutable
> >>>> * jquery
> >>>> * lodash.throttle
> >>>> * mapbox-gl
> >>>> * moment
> >>>> * moments
> >>>> * mustache
> >>>> * nvd3
> >>>> * react
> >>>> * react-ace
> >>>> * react-bootstrap
> >>>> * react-bootstrap-table
> >>>> * react-dom
> >>>> * react-draggable
> >>>> * react-gravatar
> >>>> * react-grid-layout
> >>>> * react-map-gl
> >>>> * react-redux
> >>>> * react-resizable
> >>>> * react-select
> >>>> * react-syntax-highlighter
> >>>> * reactable
> >>>> * redux
> >>>> * redux-localstorage
> >>>> * redux-thunk
> >>>> * shortid
> >>>> * style-loader
> >>>> * supercluster
> >>>> * topojson
> >>>> * victory
> >>>> * viewport-mercator-project
> >>>>
> >>>> == Cryptography ==
> >>>>
> >>>> The proposal does not include cryptographic code.
> >>>>
> >>>> == Required Resources ==
> >>>>
> >>>> === Mailing List ===
> >>>>
> >>>> There is a current mailing list as a Google Group “airbnb_superset” that we are planning on deprecating as the Apache.org lists become ready to serve our community.
> >>>>
> >>>> * superset-private
> >>>> * superset-dev
> >>>> * superset-user
> >>>>
> >>>> === Subversion Directory ===
> >>>>
> >>>> Git is the preferred source control system.
> >>>> http://svn.apache.org/repos/asf/incubator/superset
> >>>>
> >>>> == Git Repository ==
> >>>>
> >>>> Git is the preferred source control system; we’re assuming https://github.com/apache/incubator-superset based on the naming scheme.
> >>>>
> >>>> == Issue Tracking ==
> >>>>
> >>>> JIRA Superset (SUPERSET). We’d like to use Github issues & PRs to manage our project as much as possible. It’s been said that there are ways to keep Github’s issues in sync with Jira, allowing us to get the best of both worlds. If that is not possible, we will comply and use Jira.
> >>>>
> >>>> == Other Resources ==
> >>>>
> >>>> We currently use a set of Github integrated services that are free to the open source community, like Travis-ci, Code Climate, Coveralls, Landscape.io, Requires.io, david-dm and Gitter. We would like to keep using these services as they allow us to scale contributions and optimize our development flows.
> >>>> These services require some elevated rights on the Github repository in order to set up or tune and we would like for the committers to have the required rights.
> >>>>
> >>>> == Initial Committers ==
> >>>>
> >>>> * Maxime Beauchemin <maxime.beauche...@airbnb.com> - PMC & Committer
> >>>> * Alanna Scott <alanna.sc...@airbnb.com> - PMC & Committer
> >>>> * Bogdan Kyryliuk <b.kyryl...@gmail.com> - PMC & Committer
> >>>> * Vera Liu <vera....@airbnb.com> - Committer
> >>>> * Jeff Feng <jeff.f...@airbnb.com> - PMC & Committer
> >>>> * Ashutosh Chauhan <hashut...@apache.org> - Mentor & Committer
> >>>> * Nishant Bangarwa <nbanga...@hortonworks.com> - PMC & Committer
> >>>> * Slim Bouguerra <sbougue...@hortonworks.com> - Committer
> >>>> * Priyank Shah <ps...@hortonworks.com> - Committer
> >>>> * Harsha Chintalapani <schintalap...@hortonworks.com> - Committer
> >>>> * Daniel Dai <da...@apache.org> - Champion & Committer
> >>>>
> >>>> == Affiliations ==
> >>>>
> >>>> The initial committers are employees of Airbnb Inc. and Hortonworks.
> >>>>
> >>>> == Sponsors ==
> >>>>
> >>>> === Champion ===
> >>>>
> >>>> Daniel Dai <da...@apache.org>
> >>>>
> >>>> === Nominated Mentors ===
> >>>>
> >>>> Ashutosh Chauhan <hashut...@apache.org>
> >>>>
> >>>> === Sponsoring Entity ===
> >>>>
> >>>> Incubator PMC
> >>>>
> >>>> --
> >>>> *Jeff Feng*
> >>>> Product Manager
> >>>> m: (949)-610-5108
> >>>> twitter: @jtfeng
> >
> > --
> > Jean-Baptiste Onofré
> > jbono...@apache.org
> > http://blog.nanthrax.net
> > Talend - http://www.talend.com
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@incubator.apache.org