On 17/01/15 12:00, Bruno P. Kinoshita wrote:
Hi Andy!

Jena can (and does) support multiple APIs over a common core.

A commons-rdf API can be added along side the existing APIs; that means
it is not a "big bang" to have commons-rdf interfaces supported.

That's great! Would the commons-rdf dependency go in jena-core/pom.xml? Is it 
going to be necessary to change some classes in the core? I think it will be 
transparent for other modules like ARQ, Fuseki, Text. Is that right?

I don't think so - Jena's core is "generalized" RDF and this is important.

Just adding new interfaces to the core Node (etc.) objects isn't ideal: you get multiple method names for the same thing. And making the hashCode/equals contract work across implementations (hashCode() of implementation A must equal hashCode() of implementation B whenever the two terms are equal) is really quite tricky.
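To illustrate the contract being discussed, here is a minimal sketch (the class name is invented, this is not Jena or Commons RDF code): the only way two independent implementations can agree on hashCode() is if both derive it from the same value that equals() compares.

```java
// Hypothetical value-based RDF IRI term. Two independent implementations
// of the same interface can only interoperate in hash-based collections
// if they hash the same underlying value in the same way.
final class SimpleIri {
    private final String iriString;

    SimpleIri(String iriString) {
        this.iriString = iriString;
    }

    String getIriString() {
        return iriString;
    }

    // Equality is defined purely by the lexical IRI value.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof SimpleIri)) return false;
        return iriString.equals(((SimpleIri) other).getIriString());
    }

    // The contract: hashCode() must be computed from exactly the value
    // that equals() compares -- here, String.hashCode() of the IRI.
    @Override
    public int hashCode() {
        return iriString.hashCode();
    }
}
```

The difficulty Andy points at is that this rule has to be specified in the shared API and then honoured by every implementation; if one implementation mixes in, say, its interning table identity, cross-implementation Set and Map behaviour silently breaks.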

See also my comments about using classes not interfaces.

I personally do not see the worry about wrappers. For me the important thing is the architectural difference between a presentation API, designed for applications to write code against, and a systems API, designed to support the machinery. Java is really rather good at optimizing away the cost of wrappers, including multi-site method dispatch optimizations and coping with dynamically loaded code that changes assumptions at a later time.
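The wrapper approach can be sketched like this (all names here are invented for illustration, not actual Jena or Commons RDF types): a thin presentation-API object delegates to the systems-level node without copying any state, which is exactly the pattern the JIT is good at inlining.

```java
// Hypothetical presentation-API interface, of the kind an application
// would write code against.
interface CommonTerm {
    String ntriplesString();
}

// Hypothetical systems-level node, of the kind the engine machinery uses.
final class InternalNode {
    private final String uri;

    InternalNode(String uri) {
        this.uri = uri;
    }

    String getURI() {
        return uri;
    }
}

// The wrapper: holds a reference to the internal node and delegates.
// No data is duplicated; the cost is one extra object and one call,
// which the JIT can typically inline away at hot call sites.
final class TermWrapper implements CommonTerm {
    private final InternalNode node;

    TermWrapper(InternalNode node) {
        this.node = node;
    }

    @Override
    public String ntriplesString() {
        return "<" + node.getURI() + ">";
    }
}
```

Usage: `new TermWrapper(new InternalNode("http://example.org/x")).ntriplesString()` yields the N-Triples form of the IRI, while the engine keeps working directly on InternalNode.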

So a new module, "jena-commons-rdf", that provides an application presentation API would be the obvious route to me. Fuseki etc.

And this is only RDF, not Datasets or SPARQL. We discussed that and fairly easily came to the conclusion that getting something common sooner was better than waiting for a complete set of APIs. Some of the natural other ones are a lot more complicated - they would build on the terms provided by commons-rdf.

There is a lot more to working with RDF than the RDF API part - SPARQL
engines don't use that API if they want performance and/or scale. (1)
SPARQL queries collections of graphs, and (2) for scale+persistence, you
need to work in parts at a level somewhat lower than Java objects,
closer to the binary of the persistence structures.

Good point. I'm enjoying learning about the Jena code for JENA-632. Even though
datasets, streaming query collections, and all that part about journaling and
graph persistence can be a bit scary.

:-)

Luckily, journalling and persistence are orthogonal to implementing JENA-632, though as an application feature mapped over the whole system, it's a good way of seeing across several components.

Probably that won't be covered in the commons-rdf, but I think that's correct.

I agree - there is a new world out here - a world of large memory machines, and quite likely, large scale persistent RAM in the not too distant future. Given the longevity of shared APIs, it's very hard to find a balance across requirements and expectations. The graph level is naturally driven by the specs but as soon as systems issues get thrown into the mix, the choice space is much larger.

        Andy


Thanks!
Bruno


----- Original Message -----
From: Andy Seaborne <a...@apache.org>
To: dev@commons.apache.org
Cc:
Sent: Saturday, January 17, 2015 7:40 AM
Subject: Re: [ALL][RDF] github Commons RDF vs. Apache Commons Sandbox RDF

On 15/01/15 11:52, Bruno P. Kinoshita wrote:

  Hello!

  I feel like I can't help much in the current discussion. But just
wanted to chime in and tell that I'm +1 for a [rdf] component in Apache
Commons. As a commons committer I'd like to help.

  I started watching the GitHub repository and have subscribed to the
ongoing discussion. I'll try to contribute in some way; maybe testing
and with small patches.

  My go-to Maven dependency for RDF, Turtle, N3, working with
ontologies, reasoners, etc, is Apache Jena. I think it would be very
positive to have a common interface that I could use in my code (mainly
crawlers and data munging for Hadoop jobs) and that would work with
different implementations.

  Thanks!

  Bruno

Since you mention Jena ... :-)

Jena can (and does) support multiple APIs over a common core.

A commons-rdf API can be added along side the existing APIs; that means
it is not a "big bang" to have commons-rdf interfaces supported.

There is a lot more to working with RDF than the RDF API part - SPARQL
engines don't use that API if they want performance and/or scale. (1)
SPARQL queries collections of graphs, and (2) for scale+persistence, you
need to work in parts at a level somewhat lower than Java objects,
closer to the binary of the persistence structures.

     Andy


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
