Folks,

Thanks for the great conversation around bringing Rya into the incubator.
If there are no other questions, and if people are happy with the answers
so far, I will announce the vote later today.

Cheers,
Adam

On Tue, Sep 8, 2015 at 9:06 PM, Adina Crainiceanu <ad...@usna.edu> wrote:

> Rob,
>
> Thank you very much for your comments.
>
> > As someone already involved with other open source RDF projects both
> > inside and outside Apache it would be nice to add to the Relationships
> > with Other Apache Products a bit about what (if anything) you expect the
> > relationship to other RDF related Apache projects (Jena, Clerezza,
> > Stanbol, Marmotta, Commons RDF (incubating)) to be?
>
> The Jena API or the Commons RDF API could become the RDF API used by Rya,
> but no such decision has been made. Clerezza is database/triple store
> agnostic, and as such could be complementary to Rya. Stanbol focuses on
> providing semantic services, while Rya focuses on providing a distributed
> triple store solution, with support for SPARQL and OWL reasoning. Marmotta
> provides an implementation of a Linked Data Platform, and overlaps with
> Rya in some of its goals and functionality (RDF triple store, SPARQL
> support, among others). There are many opportunities for collaboration
> with these projects and we are looking forward to such collaboration.
>
> > Apache is about community over code and so never mandates any particular
> > technical choices (that is always up to the individual communities) but
> > it would be useful to understand if you see any overlap with existing
> > projects or any collaboration opportunities. The latter are particularly
> > interesting because one way you can help grow a new community is by
> > attracting interested users in pre-existing communities who want to work
> > on the specific problems you are aiming to tackle where their existing
> > options don't address the problems while your approach does.
>
> There are indeed many opportunities for collaboration with the other
> projects.
>
> > It would also be nice to see some discussion in that section about things
> > like versioning of your major dependencies.
> > In particular you build on Accumulo so do you require specific
> > version(s) thereof (since they appear to maintain 3 release lines
> > currently) or simply require a version with a specific subset of
> > Accumulo functionality? How (if at all) does this translate into risks
> > in terms of adoption, community traction etc e.g. what happens if you
> > rely on version X and the Accumulo community abandons that in favour of
> > version Y or if you rely on a specific experimental feature that never
> > makes it into Accumulo releases?
>
> Rya is built on top of Accumulo, and uses features standard in all current
> versions of Accumulo. We are not relying on any experimental feature. As
> the Rya community evolves, we expect Rya to change to take advantage of
> new/improved features in Accumulo or the other dependencies, if those lead
> to an improvement in Rya.
>
> > Also it would be nice if the external dependencies section properly
> > linked to relevant web pages as right now it has several dead links and
> > in some cases outdated naming. For example by Open RDF I assume you mean
> > OpenRDF Sesame which now lives at rdf4j.org (though I'll admit to not
> > understanding what I'm supposed to call it anymore either!)
>
> We have fixed the links.
>
> > Similarly the documentation section mentions papers but doesn't provide
> > links, while both can be found online easily enough it would be nice to
> > add the links in
>
> We added the links.
>
> Thank you very much,
> Adina
>
> > >We would like to start a discussion on accepting Rya, a scalable RDF
> > >data management system built on top of Accumulo, into the Apache
> > >Incubator.
> > >
> > >The proposal is available online at
> > >https://wiki.apache.org/incubator/RyaProposal and also at the end of
> > >this email.
> > >
> > >We are looking for additional mentors to help us with the project. Any
> > >advice and help will be appreciated.
> > >
> > >Thank you very much,
> > >Adina
> > >
> > >
> > >= Rya Proposal =
> > >
> > >== Abstract ==
> > >
> > >Rya (pronounced "ree-uh" /rēə/) is a cloud-based RDF triple store that
> > >supports SPARQL queries.
> > >
> > >== Proposal ==
> > >
> > >Rya is a scalable RDF data management system built on top of Accumulo.
> > >Rya uses novel storage methods, indexing schemes, and query processing
> > >techniques that scale to billions of triples across multiple nodes. Rya
> > >provides fast and easy access to the data through SPARQL, a conventional
> > >query mechanism for RDF data.
> > >
> > >== Background ==
> > >
> > >RDF is a World Wide Web Consortium (W3C) standard used in describing
> > >resources on the Web. The smallest data unit is a triple consisting of
> > >subject, predicate, and object. Using this framework, it is very easy to
> > >describe any resource, not just Web related. For example, if you want to
> > >say that Alice is a professor, you can represent this as an RDF triple
> > >like (Alice, rdf:type, Professor). In general, RDF is an open world
> > >framework that allows anyone to make any statement about any resource,
> > >which makes it a popular choice for expressing a large variety of data.
> > >
> > >RDF is used in conjunction with the Web Ontology Language (OWL). OWL is
> > >a framework for describing models or ontologies for RDF. It defines
> > >concepts, relationships, and/or structure of RDF documents. These models
> > >can be used to 'reason/infer' information about entities within a given
> > >domain. For example, you can express that a Professor is a subclass of
> > >Faculty, (Professor, rdfs:subClassOf, Faculty), and knowing that
> > >(Alice, rdf:type, Professor), it can be inferred that
> > >(Alice, rdf:type, Faculty).
> > >
> > >SPARQL is an RDF query language. Similar to SQL, SPARQL has SELECT and
> > >WHERE clauses; however, it is based on querying and retrieving RDF
> > >triples.
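The subclass inference described in the Background section above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Rya stores or reasons over triples: the function names `infer_types` and `match` are hypothetical, and triples are modelled as simple tuples of strings rather than real RDF terms.

```python
# Triples are (subject, predicate, object) tuples, as in the Background text.
triples = {
    ("Alice", "rdf:type", "Professor"),
    ("Professor", "rdfs:subClassOf", "Faculty"),
}


def infer_types(triples):
    """Apply the rule (x rdf:type C) and (C rdfs:subClassOf D) => (x rdf:type D),
    repeating until no new triple is produced. Hypothetical helper."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for (s, p, o) in inferred if p == "rdfs:subClassOf"]
        typed = [(s, o) for (s, p, o) in inferred if p == "rdf:type"]
        for x, c in typed:
            for sub, sup in subclass:
                if c == sub and (x, "rdf:type", sup) not in inferred:
                    inferred.add((x, "rdf:type", sup))
                    changed = True
    return inferred


def match(triples, pattern):
    """Return the triples matching a pattern; None acts as a wildcard,
    loosely like a variable in a SPARQL WHERE clause. Hypothetical helper."""
    return sorted(
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    )


closure = infer_types(triples)
# Roughly: SELECT ?who WHERE { ?who rdf:type Faculty }
faculty = [s for (s, _, _) in match(closure, (None, "rdf:type", "Faculty"))]
print(faculty)  # ['Alice']
```

Alice is never stated to be Faculty in the input; she appears in the result only because the subclass rule was applied before the pattern was matched.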
> > >
> > >Work on Rya, a large scale distributed system for storing and querying
> > >RDF data, started in 2010.
> > >
> > >== Rationale ==
> > >
> > >With the increase in data size, there is a need for scalable systems for
> > >storing and retrieving RDF data in a cluster of nodes. We believe that
> > >Rya can fulfil that role. We expect that communities within government,
> > >health care, finance, and others who generate large amounts of RDF data
> > >will be most interested in this project.
> > >
> > >From its inception, the project operated with an Apache-style license,
> > >but it was open mostly to US government-related projects. We believe
> > >that having the project and the development open for all will benefit
> > >both the project and the interested communities.
> > >
> > >== Current Status ==
> > >
> > >The project source code and documentation are currently hosted in a
> > >private repository on GitHub. New users are added to the repository upon
> > >request.
> > >
> > >=== Meritocracy ===
> > >
> > >Meritocracy is the model that we currently follow, and we want to build
> > >a larger and more diverse developer community by becoming an Apache
> > >project.
> > >
> > >=== Community ===
> > >
> > >Rya has been building a community of users and developers for the past
> > >3 years. There is currently an active workgroup with monthly meetings
> > >and the number of participants in the meeting is increasing.
> > >
> > >=== Core Developers ===
> > >
> > >The core developers are a diverse group of people who are either
> > >government employees or former / current government contractors from
> > >different companies.
> > >
> > >=== Alignment ===
> > >
> > >Rya is built on top of Accumulo, an Apache project.
> > >
> > >== Known Risks ==
> > >
> > >=== Orphaned Products ===
> > >
> > >There is a very small risk of becoming orphaned.
> > >The current contributors are strongly committed to the project, there
> > >is a large enough number of developers interested in contributing to
> > >the project, and we believe that the support for the project will
> > >continue to grow from the interested communities.
> > >
> > >=== Inexperience with Open Source ===
> > >
> > >The initial committers have various degrees of experience with open
> > >source projects - from very new to experienced. This project was open
> > >source within government from the beginning. We do not expect to have
> > >difficulties in operating under Apache's development process.
> > >
> > >=== Homogenous Developers ===
> > >
> > >The current list of developers forms a heterogeneous group, with people
> > >from academia, government, and industry, collaborating from distributed
> > >geographic locations. We aim to expand the list of contributors with the
> > >help of the Apache incubation process.
> > >
> > >=== Reliance on Salaried Developers ===
> > >
> > >Many but not all of the developers working on the project are salaried
> > >employees, paid to work on this project. They will continue to
> > >contribute to the open source project. Some of the initial committers
> > >have continued as volunteers even though they are no longer employed to
> > >work on this project, and they plan to continue supporting the project.
> > >
> > >=== Relationships with Other Apache Products ===
> > >
> > >Rya uses Apache Accumulo, Hadoop, ZooKeeper, and Maven.
> > >
> > >=== Apache Brand ===
> > >
> > >Rya has generated interest in the government. It also generated interest
> > >within academia and industry. We believe that everyone could benefit
> > >from having Rya as an open source project. Due to its strong ties to
> > >Accumulo, an Apache project, and due to the values of the Apache
> > >Foundation, we believe that the Apache Incubator is the right place for
> > >Rya.
> > >
> > >== Documentation ==
> > >
> > >Two peer-reviewed publications [1,2] about Rya were published in 2012
> > >and 2015. More documentation is available in the code.
> > >
> > >[1] Roshan Punnoose, Adina Crainiceanu, David Rapp. Rya: A Scalable RDF
> > >Triple Store for the Clouds. Proceedings of the 1st International
> > >Workshop on Cloud Intelligence, Pages 4:1-4:8, August 2012
> > >
> > >[2] Roshan Punnoose, Adina Crainiceanu, David Rapp. SPARQL in the Clouds
> > >Using Rya. Information Systems, Volume 48, Pages 181-195, March 2015
> > >(Available online 23 July 2013)
> > >
> > >== Initial Source ==
> > >
> > >The code is currently available in a private GitHub repository.
> > >https://github.com/LAS-NCSU/rya
> > >
> > >== Source and Intellectual Property Submission Plan ==
> > >
> > >The source code has been released under the Apache License, Version 2.
> > >The software grant and CCLAs have been submitted. ICLAs for initial
> > >committers have been submitted or are in progress.
> > >
> > >== External Dependencies ==
> > >
> > > * Open RDF (BSD license)
> > > * GeoMesa (Apache License, Version 2.0)
> > > * Accumulo (Apache License, Version 2.0)
> > > * Hadoop (Apache License, Version 2.0)
> > > * TinkerPop (Apache License, Version 2.0)
> > > * IndexingSail (Apache License, Version 2.0)
> > >
> > >== Cryptography ==
> > >
> > >The proposal does not involve any cryptographic code.
> > >
> > >== Required Resources ==
> > >
> > >=== Mailing lists ===
> > >
> > > * priv...@rya.incubator.apache.org
> > > * d...@rya.incubator.apache.org
> > > * comm...@rya.incubator.apache.org
> > >
> > >=== Git Repository ===
> > >
> > >https://git-wip-us.apache.org/repos/asf/incubator-rya.git
> > >
> > >=== Issue Tracking ===
> > >
> > >JIRA Rya
> > >
> > >== Initial Committers ==
> > >
> > > * Roshan Punnoose, roshanp at gmail dot com
> > > * David Rapp, dnrapp at ncsu dot edu
> > > * Adina Crainiceanu, adinancr at gmail dot com
> > > * Aaron Mihalik, aaron.mihalik at gmail dot com
> > > * Puja Valiyil, pujav65 at gmail dot com
> > > * Jennifer Brown, jennifer.brown at parsons dot com
> > > * Steve Wagner, steve.r.wagner at gmail dot com
> > >
> > >== Affiliations ==
> > >
> > > * Roshan Punnoose, Enlighten IT Consulting
> > > * David Rapp, North Carolina State University
> > > * Adina Crainiceanu, US Naval Academy
> > > * Aaron Mihalik, Parsons
> > > * Puja Valiyil, Parsons
> > > * Jennifer Brown, Parsons
> > > * Steve Wagner, Enlighten IT Consulting
> > >
> > >== Sponsors ==
> > >
> > >=== Champion ===
> > >
> > >Adam Fuchs, ASF Member, afuchs at apache dot org
> > >
> > >=== Nominated Mentors ===
> > >
> > >Josh Elser josh dot elser at gmail dot com
> > >
> > >We are seeking additional mentors
> > >
> > >=== Sponsoring Entity ===
> > >
> > >Apache Incubator
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@incubator.apache.org
> >
>
> --
> Dr. Adina Crainiceanu
> http://www.usna.edu/Users/cs/adina/
>