On 08/11/10 18:32, Jeremy Carroll wrote:
To make the commons discussion more concrete I would suggest the
following items for the commons:

- an IRI library
- some code to do with vocabularies.
- connecting to a URL and doing semweb aware content negotiation (this
is typically done badly)
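
As a rough illustration of the content negotiation item, here is a minimal Java sketch of building an RDF-aware Accept header. The class name RdfConneg, its methods, and the particular media types and q-values are assumptions made for the example, not code from Jena or any existing library.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: not part of Jena or any commons library.
public class RdfConneg {

    /** Builds an Accept header value; a q-value of 1.0 is left implicit. */
    public static String acceptHeader(Map<String, Double> preferences) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Map.Entry<String, Double> entry : preferences.entrySet()) {
            double q = entry.getValue();
            if (q >= 1.0) {
                joiner.add(entry.getKey());
            } else {
                joiner.add(entry.getKey() + ";q=" + String.format(Locale.ROOT, "%.1f", q));
            }
        }
        return joiner.toString();
    }

    /** Assumed preference order: Turtle first, then RDF/XML, then anything. */
    public static Map<String, Double> defaultPreferences() {
        Map<String, Double> prefs = new LinkedHashMap<>();
        prefs.put("text/turtle", 1.0);
        prefs.put("application/rdf+xml", 0.9);
        prefs.put("*/*", 0.1);
        return prefs;
    }

    /** Prepares a connection that asks for RDF serializations; the caller reads the response. */
    public static HttpURLConnection open(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept", acceptHeader(defaultPreferences()));
        return conn;
    }
}
```

A caller would still need to inspect the Content-Type of the response and pick a parser accordingly, since servers are free to ignore the Accept header.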

(Actually the IRI code should probably be wider; Jena initially used the
Xerces URI code, but then the needs exceeded what it supported)

Jeremy

Good idea. The IRI code is independent of the rest of Jena and is valuable in its own right.

ARP (the Jena RDF/XML parser) is also independent of the Jena code structure and was once available separately (is it still possible to get just ARP?). Only the final generation step turns the output of parsing into Jena-specific objects. It might be worth splitting out if that would be useful.

The lowest level of RIOT parsing, which defines the tokens for any of the Turtle family of languages, is not Jena dependent. The actual RIOT parsers themselves are, as they directly generate Jena-specific objects to avoid the copy overhead. It's a performance trade-off.
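
To make the layering concrete, here is a minimal sketch of a token layer that knows nothing about any RDF object model. It is not RIOT's tokenizer: the class TokenSketch, the Kind enum, and the Token type are hypothetical, and the grammar is cut down to IRIs, blank node labels, plain string literals without escapes, and the terminating dot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a model-independent token layer, not RIOT's actual API.
public class TokenSketch {

    public enum Kind { IRI, BNODE, LITERAL, DOT }

    public static final class Token {
        public final Kind kind;
        public final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    /** Splits one N-Triples-like line into tokens; escapes, language tags and datatypes are not handled. */
    public static List<Token> tokenize(String line) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '<') {
                int end = line.indexOf('>', i);
                if (end < 0) throw new IllegalArgumentException("Unterminated IRI at " + i);
                out.add(new Token(Kind.IRI, line.substring(i + 1, end)));
                i = end + 1;
            } else if (c == '"') {
                int end = line.indexOf('"', i + 1);
                if (end < 0) throw new IllegalArgumentException("Unterminated literal at " + i);
                out.add(new Token(Kind.LITERAL, line.substring(i + 1, end)));
                i = end + 1;
            } else if (line.startsWith("_:", i)) {
                // Simplification: the label runs to the next whitespace.
                int end = i + 2;
                while (end < line.length() && !Character.isWhitespace(line.charAt(end))) end++;
                out.add(new Token(Kind.BNODE, line.substring(i + 2, end)));
                i = end;
            } else if (c == '.') {
                out.add(new Token(Kind.DOT, "."));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at " + i);
            }
        }
        return out;
    }
}
```

A parser built on top of such tokens can either hand them to a neutral callback, which costs a copy, or construct its own model objects directly, which is the trade-off described above.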

[RIOT is a set of faster parsers for non-XML serializations of RDF. It is currently part of ARQ but should migrate to Jena core when fully stable. The original need was for parsers capable of delivering data to the TDB database at full loading speed without heavy CPU load.]

But the command-line tools based on RIOT, each of which parses or validates one format, are reusable: they use Jena internally, but the input and output are completely standard.

The RDF validator Eyeball is also a useful tool in its own right.

        Andy

