+1

2011/9/27 Mattmann, Chris A (388J) <chris.a.mattm...@jpl.nasa.gov>:
> Hi Folks,
>
> OK, the proposal discussion period has died down and I'm now calling a formal
> VOTE on the Any23 proposal located here:
>
> http://wiki.apache.org/incubator/Any23Proposal
>
> Proposal text copied at the bottom of this email. I'll leave the VOTE open
> through the rest of the week, and close it around Saturday, October 1,
> early AM PDT.
>
> Please VOTE:
>
> [ ] +1 Accept Any23 into the Apache Incubator
> [ ] +0 Don't care
> [ ] -1  Don't Accept Any23 into the Apache Incubator because...
>
> Thanks!
>
> Cheers,
> Chris
>
> P.S. Here's my +1
>
> Proposal Text:
>
> = Any23 =
> == Abstract ==
> The following proposal is about ''Anything To Triples'' (Any23 for short),
> defined as a Java library, a Web service and a set of command line tools to
> extract and validate structured data in [[http://www.w3.org/RDF/|RDF]]
> format from a variety of Web documents and markup formats. Any23 is what is
> informally called an ''RDF Distiller''.
>
> == Proposal ==
> Any23 "Anything to Triples" is a library written in Java 6 and released under 
> the Apache 2.0 License. It provides a set of extractors for scraping semantic 
> markup (such as [[http://microformats.org/|Microformats]], 
> [[http://www.w3.org/TR/rdfa-syntax/|RDFa]] and 
> [[http://www.w3.org/TR/microdata/|Microdata]]) from several sources (HTML4,
> XHTML5, CSV), a set of data validations, and a set of parsers and writers to
> handle the main RDF transport formats (RDF/XML, N-Triples, N-Quads, Turtle).
> The library provides a command line tool for dealing with data extraction, 
> conversion and validation, and a REST service implementation. The library is 
> plugin based, allowing the hot loading of new extractors and validators. 
> Any23 enables third-party developers to access structured data from Web
> pages without the need to implement ad-hoc scraping techniques. In this
> sense, Any23 will relieve developers from building complex solutions when
> developing data acquisition pipelines and processes targeted to semantically
> marked-up Web data.
>
> == Background ==
> Any23 was initially developed at [[http://www.deri.ie/|DERI (Digital
> Enterprise Research Institute)]] as the main component of the RDF extraction
> pipeline used in [[http://sindice.com/|Sindice (the Semantic Web Index)]];
> it is now evolving as a joint effort with [[http://www.fbk.eu/|FBK (Fondazione
> Bruno Kessler)]]. At present, the official Any23
> [[http://developers.any23.org|developers page]] contains all the
> documentation, while the code is maintained on
> [[http://code.google.com/p/any23/|Google Code]]. An official up-to-date
> showcase [[http://any23.org|demo]] is also available.
>
> == Rationale ==
> Providing and maintaining a robust, standard and up-to-date library for
> extracting and validating semantic markup from heterogeneous sources would
> bring large benefits to the entire Open Source Community. Researchers and
> academic projects have been adopting RDF-related technologies for years,
> while the industry is now moving toward Semantic Web technologies more
> concretely. Several industry initiatives related to the
> [[http://en.wikipedia.org/wiki/Semantic_Web|Web of Data]] are taking place
> in these months. [[http://schema.org|Schema.org]], for example, is an
> initiative sponsored by  
> [[http://www.google.com/about/corporate/company/|Google Inc]], 
> [[http://info.yahoo.com/center/us/yahoo/|Yahoo Inc]]  and 
> [[http://www.microsoft.com/about/companyinformation/en/us/default.aspx|Microsoft Corporation]]
> to structure the data in a harmonized way on
> [[http://dev.w3.org/html5/spec/Overview.html|HTML5]] pages.
> [[http://schema.org|Schema.org]] leverages the
> [[http://dev.w3.org/html5/md/|HTML5 Microdata]] native specification.
> [[http://ogp.me/|OpenGraphProtocol]] is the open standard sponsored by  
> [[https://www.facebook.com/pages/Facebooking/114721225206500|Facebook Inc]] 
> to include metadata in HTML page headers.  
> [[http://ogp.me/|OpenGraphProtocol]], initially based on
> [[http://www.w3.org/TR/xhtml-rdfa-primer/|RDFa]], allows describing the
> content of a Web page, and its underlying vocabulary can be directly
> represented using RDF.
>
> = Current Status =
> == Meritocracy ==
> The historical Any23 team believes in meritocracy and has always acted as a
> community. A mailing list, an open issue tracker and other communication
> channels have been in use since the first release. Adoption by a larger
> community, such as Apache, is the natural evolution for Any23. Moreover, the
> Apache standards will reinforce the existing Any23 community practices and
> will be a foundation for future committer involvement.
>
> == Core Developers ==
> In alphabetical order:
>
>  * Davide Palmisano <dpalmisano at gmail dot com>
>  * Giovanni Tummarello <giovanni dot tummarello at deri dot org>
>  * Michele Mostarda <michele dot mostarda at gmail dot com>
>  * Richard Cyganiak <richard at cyganiak dot de>
>  * Reto Bachmann-Gmuer <reto at apache dot org>
>  * Simone Tripodi <simonetripodi at apache dot org>
>  * Szymon Danielczyk <danielczyk.szymon at gmail dot com>
>  * Tommaso Teofili <tommaso at apache dot org>
>
> == Alignment ==
> The main aim of the project is to develop and maintain a full-featured
> semantic markup distiller that can be used by other Apache projects that need
> an RDF extraction tool. The Any23 library core is written using the following
> Apache libraries.
>
>  * [[http://commons.apache.org/lang/|Apache Commons Lang]]
>  * [[http://hc.apache.org/httpclient-3.x/|Apache Commons HTTP Client]]
>  * [[http://commons.apache.org/codec/|Apache Commons Codec]]
>  * [[http://tika.apache.org/|Apache Tika]]
>  * [[http://commons.apache.org/cli/|Apache Commons CLI]]
>  * [[http://poi.apache.org/|Apache POI]]
>
> The Any23 service is targeted to run within any compliant Servlet  container 
> like Tomcat.
>
> = Known Risks =
> == Orphaned Products ==
> The increasing number of Any23 adopters and the rising interest in Semantic
> Web related technologies lead us to believe that there is minimal risk of
> this work being abandoned by the community. Moreover, Any23 has already been
> used in production by Sindice.com and other DERI projects for years.
>
> == Inexperience with Open Source ==
> All of the committers have experience working in one or more open source 
> projects inside and outside ASF.
>
> == Homogeneous Developers ==
> The initial committers are geographically distributed across Europe,
> with no one company being associated with a majority of the developers. Many
> of these initial developers are already experienced Apache committers, and
> all are experienced with working in distributed development communities.
>
> == Reliance on Salaried Developers ==
> To the best of our knowledge, most of the initial committers are
> being paid to develop code for this project due to the adoption of Any23 in
> their organizations' infrastructures. In any case, some of the core historical
> developers (some of them no longer being paid by the original companies
> behind Any23) are still committing even though Any23 is not used in their
> current organizations. Any23 has already proven its capability to attract
> external developers.
>
> == Relationships with Other Apache Products ==
> In recent years, other projects relying on the Semantic Web technology
> stack have entered the ASF incubation process, such as Apache Clerezza,
> Stanbol and Jena. This could be seen as proof of the consolidation and
> growing adoption of such technologies. Apart from the specifics of
> those projects, which share the same underlying stack, Any23 could be employed
> in any project needing a reliable framework to access structured semantic
> markup. The Any23 core could also easily be released as an
> [[http://wiki.apache.org/nutch/PluginCentral|Apache Nutch Plugin]] and then
> used to conveniently fill
> [[http://www.openrdf.org/doc/sesame2/system/ch05.html|SAIL-compliant]] triple
> stores.
>
> == An Excessive Fascination with the Apache Brand ==
> Even though the Any23 community recognizes the power and the attractiveness of
> the ASF brand, we are absolutely aware of our already established role in the
> wider Semantic Web developer community. Any23 has already proved its
> reliability in closely supporting all the new specifications coming from the
> Microformats communities, our major contributors in terms of opened issues
> about new feature requests. Furthermore, we are convinced that we can
> enthusiastically bring new and fresh energy into the ASF in order to improve
> our vision, insight and knowledge about the other projects and, most
> importantly, to have the opportunity to enlarge our small community with
> talented and passionate developers.
>
> = Documentation =
> Any23 Documentation
>
>  1. [[http://developers.any23.org/|Any23 Project Homepage]]
>  1. [[http://code.google.com/p/any23/|Any23 Developer Homepage]]
>  1. [[http://any23.org/|Any23 Live Demo]]
>
> Any23 Related Specifications
>
>  1. [[http://www.w3.org/RDF/|RDF]]
>  1. [[http://www.w3.org/TR/html5/|HTML5]]
>  1. [[http://www.w3.org/TR/rdfa-syntax/|RDFa]]
>  1. [[http://www.w3.org/TR/microdata/|Microdata]]
>  1. [[http://microformats.org/|Microformats]]
>  1. [[http://www.w3.org/TR/rdf-syntax-grammar/|RDF/XML]]
>  1. [[http://www.w3.org/TeamSubmission/turtle/|Turtle]]
>  1. [[http://www.w3.org/TR/rdf-testcases/#ntriples|N-Triples]]
>  1. [[http://sw.deri.org/2008/07/n-quads/|N-Quads]]
>
> Any23 Other documentation
>
>  1. [[http://www.slideshare.net/dpalmisano/distilling-the-web-of-data-drop-by-drop-with-java|Any23 presentation on Slideshare]]
>
> = Initial Source =
> The initial source comprises code developed on 
> [[http://code.google.com/p/any23/|GoogleCode]] licensed under the Apache 
> License 2.0 (to be contributed under Grant from Giovanni Tummarello for 
> Any23).
>
> = Source and Intellectual Property Submission Plan =
> Source code will be moved from [[http://code.google.com/p/any23/|GoogleCode]] 
> space inside the SVN space of the podling.
>
> = External Dependencies =
> All the external dependencies (and their licenses) used by Any23 follow:
>
>  * [[http://nekohtml.sourceforge.net/|Nekohtml]] (Apache 2.0)
>  * [[http://www.openrdf.org|OpenRDF Sesame]] (BSD-style license)
>  * [[http://jetty.codehaus.org/jetty/|Jetty]] (Apache License 2.0 and Eclipse 
> Public License 1.0)
>  * [[http://code.google.com/p/jspf/|Java Simple Plugin Framework]] (new BSD 
> License)
>  * [[http://code.google.com/p/boilerpipe/|Boilerpipe]] (Apache License 2.0)
>  * [[http://www.slf4j.org/|slf4j]] (MIT License)
>  * [[http://www.junit.org/|junit]] (Common Public License - v 1.0)
>  * [[http://mockito.org/|Mockito]] (MIT License)
>
> = Cryptography =
> The project does not handle cryptography in any way.
>
> = Required Resources =
>  * Mailing lists
>    * any23-private (with moderated subscriptions)
>    * any23-dev
>    * any23-user
>    * any23-commits
>  * Subversion directory
>    * https://svn.apache.org/repos/asf/incubator/any23
>  * Website
>    * Confluence (ANY23)
>  * Issue Tracking
>    * JIRA (ANY23)
>
> = Initial Committers =
> Names of initial committers - in alphabetical order - with current ASF status:
>
>  * Chris Mattmann <mattmann at apache dot org> (Member)
>  * Davide Palmisano <dpalmisano at gmail dot com> (ICLA signed)
>  * Giovanni Tummarello <giovanni dot tummarello at deri dot org> (ICLA signed)
>  * Lewis John !McGibbney <lewismc at apache dot org> (PMC Member)
>  * Michele Mostarda <michele dot mostarda at gmail dot com> (ICLA signed)
>  * Paul Ramirez <pramirez at apache dot org> (Member)
>  * Reto Bachmann-Gmuer <reto at apache dot org> (Committer)
>  * Szymon Danielczyk <danielczyk.szymon at gmail dot com> (ICLA signed)
>
> = Sponsors =
> == Champion ==
>  * Chris Mattmann <mattmann at apache dot org> (Member)
>
> == Nominated Mentors ==
>  * Chris Mattmann <mattmann at apache dot org>
>  * Paul Ramirez <pramirez at apache dot org>
>  * Simone Tripodi <simonetripodi at apache dot org>
>  * Tommaso Teofili <tommaso at apache dot org>
>
> == Sponsoring Entity ==
>  * Tika PMC
>
> = Other interested people (in alphabetical order) =
>
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattm...@nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> For additional commands, e-mail: general-h...@incubator.apache.org
>
>



-- 
Olivier Lamy
Talend : http://talend.com
http://twitter.com/olamy | http://linkedin.com/in/olamy

