It’s great to see such interest, and I’m sure the rest of the podling would agree that the more the better. I also agree with Suneel: people who know PIO should be given a short time to get organized before we do the desired expansion. There will be lots of room to contribute in any case. For instance, try creating a template; there is no better way to learn the project.

On May 19, 2016, at 9:16 PM, Suneel Marthi <smar...@apache.org> wrote:

I definitely have concerns about too many folks becoming initial committers and bringing their own corporate agendas to this project.

I suggest that first we vote PIO into the incubator, then bring in those less experienced with the project. We have a good start with people who have worked on the project from several orgs. Let us get organized first and then bring in new people.

I sincerely feel that this is getting real murky with too many cooks with their own agendas. The fewer external integration points to PIO, the better the project would evolve.

My 2 cents.

On Thu, May 19, 2016 at 9:03 PM, Andrew Purtell <apurt...@apache.org> wrote:

> Hi Nick,
>
> Unless there are any concerns or objections, I will add you and Mr. Dusenberry to the proposal as initial committers tomorrow.
>
> Everyone,
>
> As it seems that discussion has died down I plan to start a VOTE thread on this coming Monday.
>
> Thank you for the comment and attention thus far.
>
>
> On Tue, May 17, 2016 at 12:58 PM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>
>> Hi there
>>
>> I'm glad to see the proposal to incubate PredictionIO. In my previous life as a startup co-founder, I kept a close eye on the project, and it would be fantastic to see it become an Apache incubating project!
>>
>> The folks working on Apache Spark and Apache SystemML (incubating) here at IBM are excited about the possibilities for integrating PredictionIO and SystemML (Mike Dusenberry is a committer on that project), as well as further improving Spark integration (I'm a PMC member on that project).
>>
>> Mike and I, together with Luciano (who is a mentor on this proposal) would like to volunteer our services as initial committers, if that is agreeable.
>>
>> Kind regards
>> Nick
>> mln...@apache.org
>>
>>> ---------- Forwarded message ----------
>>> From: Andrew Purtell <apurt...@apache.org>
>>> To: "general@incubator.apache.org" <general@incubator.apache.org>
>>> Cc:
>>> Date: Fri, 13 May 2016 13:41:38 -0700
>>> Subject: [DISCUSS] PredictionIO incubation proposal
>>>
>>> Greetings,
>>>
>>> It is my pleasure to propose the PredictionIO project for incubation at the Apache Software Foundation.
>>>
>>> PredictionIO is a popular open source Machine Learning Server built on top of a state-of-the-art open source stack, including several Apache technologies, that enables developers to manage and deploy production-ready predictive services for various kinds of machine learning tasks, with more than 400 production deployments around the world and a growing contributor community.
>>>
>>> The text of the proposal is included below and is also available at https://wiki.apache.org/incubator/PredictionIO
>>>
>>> Best regards,
>>> Andrew Purtell
>>>
>>>
>>> = PredictionIO Proposal =
>>>
>>> === Abstract ===
>>> PredictionIO is an open source Machine Learning Server built on top of a state-of-the-art open source stack, that enables developers to manage and deploy production-ready predictive services for various kinds of machine learning tasks.
>>>
>>> === Proposal ===
>>> The PredictionIO platform consists of the following components:
>>>
>>> * PredictionIO framework - provides the machine learning stack for building, evaluating and deploying engines with machine learning algorithms. It uses Apache Spark for processing.
>>>
>>> * Event Server - the machine learning analytics layer for unifying events from multiple platforms. It can use Apache HBase or any JDBC backends as its data store.
>>>
>>> The PredictionIO community also maintains a Template Gallery, a place to publish and download (free or proprietary) engine templates for different types of machine learning applications, which is a complementary part of the project. At this point we exclude the Template Gallery from the proposal, as it has a separate set of contributors and we’re not familiar with an Apache approved mechanism to maintain such a gallery.
>>>
>>> You can find the Template Gallery at https://templates.prediction.io/
>>>
>>> === Background ===
>>> PredictionIO was started with a mission to democratize and bring machine learning to the masses.
>>>
>>> Machine learning has traditionally been a luxury for big companies like Google, Facebook, and Netflix. There are ML libraries and tools lying around the internet, but the effort of putting them all together as a production-ready infrastructure is a very resource-intensive task that is out of reach for most individuals or small businesses.
>>>
>>> PredictionIO is a production-ready, full stack machine learning system that allows organizations of any scale to quickly deploy machine learning capabilities. It comes with official and community-contributed machine learning engine templates that are easy to customize.
>>>
>>> === Rationale ===
>>> As the usage of and number of contributors to PredictionIO have grown bigger and more diverse, we have sought an independent framework for the project to keep thriving. We believe the Apache foundation is a great fit. Joining Apache would ensure that tried and true processes and procedures are in place for the growing number of organizations interested in contributing to PredictionIO. PredictionIO is also a good fit for the Apache foundation.
>>> PredictionIO was built on top of several Apache projects (HBase, Spark, Hadoop).
>>> We are familiar with the Apache process and believe that the democratic and meritocratic nature of the foundation aligns with the project goals.
>>>
>>> === Initial Goals ===
>>> The initial milestones will be to move the existing codebase to Apache and integrate with the Apache development process. Once this is accomplished, we plan for incremental development and releases that follow the Apache guidelines, as well as growing our developer and user communities.
>>>
>>> === Current Status ===
>>> PredictionIO has undergone nine minor releases and many patches. PredictionIO is being used in production by Salesforce.com as well as many other organizations and apps. The PredictionIO codebase is currently hosted at GitHub, which will form the basis of the Apache git repository.
>>>
>>> ==== Meritocracy ====
>>> We plan to invest in supporting a meritocracy. We will discuss the requirements in an open forum. We intend to invite additional developers to participate. We will encourage and monitor community participation so that privileges can be extended to those that contribute.
>>>
>>> ==== Community ====
>>> Acceptance into the Apache foundation would bolster the already strong user and developer community around PredictionIO. That community includes many contributors from various other companies, and an active mailing list composed of hundreds of users.
>>>
>>> ==== Core Developers ====
>>> The core developers of our project are listed in our contributors and initial PPMC below. Though many are employed at Salesforce.com, there are also engineers from ActionML, and independent developers.
>>>
>>> === Alignment ===
>>> The ASF is the natural choice to host the PredictionIO project as its goal is democratizing Machine Learning by making it more easily accessible to every user/developer. PredictionIO is built on top of several top level Apache projects as outlined above.
>>>
>>> === Known Risks ===
>>>
>>> ==== Orphaned products ====
>>> PredictionIO has a solid and growing community. It is deployed on production environments by companies of all sizes to run various kinds of predictive engines.
>>>
>>> In addition to the community contribution to the PredictionIO framework, the community is also actively contributing new engines to the Template Gallery as well as SDKs and documentation for the project. Salesforce is committed to utilizing and advancing the PredictionIO code base and supporting its user community.
>>>
>>> ==== Inexperience with Open Source ====
>>> PredictionIO has existed as a healthy open source project for almost two years and is the most starred Scala project on GitHub. All of the proposed committers have contributed to ASF and Linux Foundation open source projects. Several current committers on Apache projects and Apache Members are involved in this proposal and intend to provide mentorship.
>>>
>>> ==== Homogeneous Developers ====
>>> The initial list of committers includes developers from several institutions, including Salesforce, ActionML, Channel4, USC as well as unaffiliated developers.
>>>
>>> ==== Reliance on Salaried Developers ====
>>> Like most open source projects, PredictionIO receives substantial support from salaried developers. PredictionIO development is partially supported by Salesforce.com, but there are many contributors from various other companies, and an active mailing list composed of hundreds of users. We will continue our efforts to ensure stewardship of the project to be independent of salaried developers by meritocratically promoting those contributors to committers.
>>>
>>> ==== Relationships with Other Apache Products ====
>>> PredictionIO relies heavily on top level Apache projects such as Apache Spark, HBase and Hadoop.
>>> However, it brings distinct functionality rather than just an abstraction - Machine Learning in a plug-and-play fashion.
>>>
>>> Compared to Apache Mahout, which focuses on the development of a wide variety of algorithms, PredictionIO offers a platform to manage the whole machine learning workflow, including data collection, data preparation, modeling, deployment and management of predictive services in production environments.
>>>
>>> ==== An Excessive Fascination with the Apache Brand ====
>>> PredictionIO is already a widely known open source project. This proposal is not for the purpose of generating publicity. Rather, the primary benefits to joining Apache are those outlined in the Rationale section.
>>>
>>> === Documentation ===
>>> PredictionIO boasts rich and live documentation, which is included in the code repo (docs/manual directory), built with Middleman, and publicly hosted at https://docs.prediction.io
>>>
>>> === Initial Source and Intellectual Property Submission Plan ===
>>> Currently, the PredictionIO codebase is distributed under the Apache 2.0 License and hosted on GitHub: https://github.com/PredictionIO/PredictionIO
>>>
>>> === External Dependencies ===
>>> PredictionIO has the following external dependencies:
>>> * Apache Hadoop 2.4.0 (optional, required only if YARN and HDFS are needed)
>>> * Apache Spark 1.3.0 for Hadoop 2.4
>>> * Java SE Development Kit 8
>>> * and one of the following sets:
>>>   * PostgreSQL 9.1
>>>   or
>>>   * MySQL 5.1
>>>   or
>>>   * Apache HBase 0.98.6
>>>   * Elasticsearch 1.4.0
>>>
>>> Upon acceptance to the incubator, we would begin a thorough analysis of all transitive dependencies to verify this information and introduce license checking into the build and release process by integrating with Apache RAT.
>>>
>>> === Cryptography ===
>>> PredictionIO does not include cryptographic code.
>>> We utilize standard JCE and JSSE APIs provided by the Java Runtime Environment.
>>>
>>> === Required Resources ===
>>> We request that the following resources be created for the project to use:
>>>
>>> ==== Mailing lists ====
>>>
>>> predictionio-priv...@incubator.apache.org (with moderated subscriptions)
>>>
>>> predictionio-dev
>>>
>>> predictionio-user
>>>
>>> predictionio-commits
>>>
>>> We will migrate the existing PredictionIO mailing lists.
>>>
>>> ==== Git repository ====
>>> The PredictionIO team would like to use Git for source control, due to our current use of GitHub.
>>>
>>> git://git.apache.org/incubator-predictionio
>>>
>>> ==== Documentation ====
>>> https://predictionio.incubator.apache.org/docs/
>>>
>>> ==== JIRA instance ====
>>> PredictionIO currently uses the GitHub issue tracking system associated with its repository: https://github.com/PredictionIO/PredictionIO/issues. We will migrate to Apache JIRA.
>>>
>>> JIRA PREDICTIONIO
>>> https://issues.apache.org/jira/browse/PREDICTIONIO
>>>
>>> ==== Other Resources ====
>>> * TravisCI for builds and test running.
>>>
>>> * PredictionIO's documentation, included in the code repo (docs/manual directory), is built with Middleman and publicly hosted at https://docs.prediction.io
>>>
>>> * A blog to drive adoption and excitement at https://blog.prediction.io
>>>
>>> === Initial Committers ===
>>>
>>> * Pat Ferrell
>>> * Tamas Jambor
>>> * Justin Yip
>>> * Xusen Yin
>>> * Lee Moon Soo
>>> * Donald Szeto
>>> * Kenneth Chan
>>> * Tom Chan
>>> * Simon Chan
>>> * Marco Vivero
>>> * Matthew Tovbin
>>> * Yevgeny Khodorkovsky
>>> * Felipe Oliveira
>>> * Vitaly Gordon
>>>
>>> === Affiliations ===
>>>
>>> * Pat Ferrell - ActionML
>>> * Tamas Jambor - Channel4
>>> * Justin Yip - independent
>>> * Xusen Yin - USC
>>> * Lee Moon Soo - NFLabs
>>> * Donald Szeto - Salesforce
>>> * Kenneth Chan - Salesforce
>>> * Tom Chan - Salesforce
>>> * Simon Chan - Salesforce
>>> * Marco Vivero - Salesforce
>>> * Matthew Tovbin - Salesforce
>>> * Yevgeny Khodorkovsky - Salesforce
>>> * Felipe Oliveira - Salesforce
>>> * Vitaly Gordon - Salesforce
>>>
>>> === Sponsors ===
>>>
>>> ==== Champion ====
>>>
>>> Andrew Purtell <apurtell at apache dot org>
>>>
>>> ==== Nominated Mentors ====
>>>
>>> * Andrew Purtell <apurtell at apache dot org>
>>> * James Taylor <jtaylor at apache dot org>
>>> * Lars Hofhansl <larsh at apache dot org>
>>> * Suneel Marthi <smarthi at apache dot org>
>>> * Xiangrui Meng <meng at apache dot org>
>>> * Luciano Resende <lresende at apache dot org>
>>>
>>> ==== Sponsoring Entity ====
>>>
>>> Apache Incubator PMC
>>>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org