Re: your opinion on commons-rdf proposal

Reto Gmür Sun, 15 Feb 2015 12:54:22 -0800

My SMTP server was having some problems so the mail below was not posted.

I've made some more commits today with some initial code for a SPARQL-backed
implementation, but the interesting bits are still missing.



Cheers,
Reto

On Sun, Feb 8, 2015 at 6:37 PM, Reto Gmür <r...@apache.org> wrote:

> Hi Minto, all,
>
> As you suggested, I removed the event listener support from the Graph
> interface; it is now part of the extending WatchableGraph interface.
>
> I've also created a new impl.utils project, providing mainly abstract
> classes to facilitate implementations. With this, implementors don't have to
> care about locking or about graph isomorphism (for .equals in
> ImmutableGraph).
>
> As the final commons-rdf will take a while to emerge, I think we should
> already integrate intermediate steps into Clerezza, for a smoother transition
> and especially to avoid working on different, incompatible branches.
>
> Cheers,
> Reto
>
> On Fri, Jan 23, 2015 at 2:12 PM, Reto Gmür <r...@apache.org> wrote:
>
>> Hi Minto
>>
>> Thanks for your comments.
>>
>> >     1) I am glad you chose to derive from Collections. This opens up the
>>> >     possibility to use Java 8 streams to improve performance
>>> especially in
>>> >     the filter() method.
>>> >     2) Hmm, is filter() still required if we can use java 8 streams
>>> >     (collection.stream().filter())?
>>>
>>
>> I think only a dedicated filter method can be implemented performantly
>> (i.e. using indexes). Correct me if I'm wrong, but I think with
>> stream().filter() an implementation would have to apply the function to
>> every triple.
>>
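To make the difference concrete, here is a minimal sketch in Java. Triple and
IndexedGraphSketch are hypothetical, simplified stand-ins (plain strings, a
single subject index, no duplicate handling); they are not the commons-rdf or
Clerezza API. The point is only that a dedicated filter() can see the pattern
and pick an index, while a generic stream filter receives an opaque function.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical, simplified triple: plain strings instead of RDF term types.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Hypothetical in-memory graph that keeps an index by subject.
final class IndexedGraphSketch {
    private final List<Triple> triples = new ArrayList<>();
    private final Map<String, List<Triple>> bySubject = new HashMap<>();

    void add(Triple t) {
        triples.add(t);
        bySubject.computeIfAbsent(t.subject, k -> new ArrayList<>()).add(t);
    }

    // Dedicated filter: null is a wildcard. Because the pattern is visible,
    // a given subject lets us scan only the indexed bucket for that subject.
    List<Triple> filter(String s, String p, String o) {
        List<Triple> candidates = (s != null)
                ? bySubject.getOrDefault(s, Collections.<Triple>emptyList())
                : triples;
        return candidates.stream()
                .filter(t -> (p == null || p.equals(t.predicate))
                        && (o == null || o.equals(t.object)))
                .collect(Collectors.toList());
    }

    // Generic filter: the predicate is opaque, so every triple is tested.
    List<Triple> streamFilter(Predicate<Triple> test) {
        return triples.stream().filter(test).collect(Collectors.toList());
    }
}
```

With this sketch, filter("a", null, null) only looks at the triples indexed
under subject "a", whereas streamFilter(t -> t.subject.equals("a")) returns
the same result but has to visit the whole graph.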
>>
>>> >     3) I dislike BlankNodeOrIri interface name. Judging from the
>>> >     github:commons-rdf comments the name should be Subject. Taking your
>>> >     comments Resource might be a better name. BTW, the comments for
>>> this
>>> >     interface differ between your sandbox and the github commons-rdf.
>>>
>>
>> BlankNodeOrIri used to be called NonLiteral. The term "resource" is used
>> in RDFS and also includes literals. So the old Resource interface is
>> equivalent to the new RdfTerm interface. The API documentation needs to be
>> improved, as it still uses the old terms.
>>
>>
>>
>>
>>> >     4) Why does GraphEvent only have one triple? What if you remove/add
>>> >     a large number of triples?
>>>
>>
>> If one requests synchronous notification one always gets one event at a
>> time (except maybe for addAll, removeAll, retainAll and clear). With
>> asynchronous notification one will get a bigger list of events. I think it
>> is better to get add-events and remove-events together, rather than getting
>> a single add-event with all the added triples and a single remove event
>> with all the removed triples.
>>
>>
>>> >     5) Events are not ready for extension. AddEvent actually is
>>> something
>>> >     like AddedTriple(s)Event. Same for remove. The (s) depends on the
>>> >     outcome of the previous point. See next point for additional
>>> events.
>>>
>> >     6) The API misses facilities to access/create/query graphs. If
>>> >     this gets
>>> >     included you probably also end up with events like AddedGraphEvent
>>> >     ditto
>>> >     for remove. For this I envision something along the lines of JDBC
>>> and
>>> >     DataSources.
>>>
>> You're right. For now there is no DataSet (aka TcProvider) in the API.
>> The main reason for this was to keep the scope close to the GitHub proposal.
>> If we add DataSet we should add respective events (DataSetEvent).
>>
>>
>>
>>> >     7) Also the whole event mechanism might be extremely difficult to
>>> >     realise. Of course from within the implementation it is easy, but
>>> >     think
>>> >     distributed here. Take for instance a SPARQL endpoint. It is
>>> >     relatively
>>> >     straightforward to create an implementation for this except for the
>>> >     eventing part. I wouldn't know how to implement eventing without
>>> >     polling
>>> >     the sparql endpoint every so often. Shouldn't events be something
>>> >     additional/optional.
>>>
>> Having a graph implementation backed by a SPARQL endpoint is not trivial
>> (unless you don't care about blank nodes).
>>
>> The question is whether the API must guarantee that all changes to the
>> graph fire respective events, or whether it is acceptable for an
>> implementation to only notify about changes made via the instance. I think
>> that even the latter is useful. And given some abstract implementation
>> classes, supporting this doesn't cost the implementors a lot of effort.
>>
>> If, on the other hand, we remove that from the core API and provide a
>> WatchableGraph API as an extension, we could provide a wrapper for
>> non-watchable graphs. I think both approaches would work.
>>
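A rough sketch of the wrapper idea in Java, under simplifying assumptions:
SimpleGraph, SetGraph, GraphListener, WatchableGraphWrapper and
RecordingListener are hypothetical names made up for this illustration, and
triples are plain strings. Only WatchableGraph, GraphEvent and AddEvent are
names from the discussion, and their shapes here are guesses. The wrapper
notifies only about changes made through the wrapper instance; changes made
directly on the wrapped graph go unnoticed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical minimal graph API without any eventing.
interface SimpleGraph {
    boolean add(String triple);
    boolean remove(String triple);
}

// Each event carries exactly one triple, as discussed in point 4.
abstract class GraphEvent {
    final String triple;
    GraphEvent(String triple) { this.triple = triple; }
}
final class AddEvent extends GraphEvent {
    AddEvent(String triple) { super(triple); }
}
final class RemoveEvent extends GraphEvent {
    RemoveEvent(String triple) { super(triple); }
}

// Hypothetical listener: receives events in the order they occurred.
interface GraphListener {
    void graphChanged(List<GraphEvent> events);
}

interface WatchableGraph extends SimpleGraph {
    void addGraphListener(GraphListener listener);
}

// Wraps any non-watchable graph and fires synchronous, per-triple events.
final class WatchableGraphWrapper implements WatchableGraph {
    private final SimpleGraph wrapped;
    private final List<GraphListener> listeners = new CopyOnWriteArrayList<>();

    WatchableGraphWrapper(SimpleGraph wrapped) { this.wrapped = wrapped; }

    public void addGraphListener(GraphListener listener) {
        listeners.add(listener);
    }

    public boolean add(String triple) {
        boolean changed = wrapped.add(triple);
        if (changed) fire(new AddEvent(triple)); // no event if nothing changed
        return changed;
    }

    public boolean remove(String triple) {
        boolean changed = wrapped.remove(triple);
        if (changed) fire(new RemoveEvent(triple));
        return changed;
    }

    private void fire(GraphEvent event) {
        for (GraphListener l : listeners) {
            l.graphChanged(Collections.singletonList(event));
        }
    }
}

// Hypothetical non-watchable graph backed by a HashSet.
final class SetGraph implements SimpleGraph {
    private final Set<String> triples = new HashSet<>();
    public boolean add(String triple) { return triples.add(triple); }
    public boolean remove(String triple) { return triples.remove(triple); }
}

// Hypothetical listener that just records what it was told.
final class RecordingListener implements GraphListener {
    final List<GraphEvent> seen = new ArrayList<>();
    public void graphChanged(List<GraphEvent> events) { seen.addAll(events); }
}
```

Wrapping a SetGraph and registering a RecordingListener, an add followed by a
remove of the same triple yields one AddEvent and then one RemoveEvent; adding
a triple that is already present yields no event.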
>>
>>
>>
>>> >
>>> >     So far for quickly scanning things.
>>> >
>>> >     Personally I'd also like to see a pure in memory based
>>> >     implementation it
>>> >     not only makes testing things easier for the API users, but also
>>> >     helps focus
>>> >     on what is best for a clean/clear API.
>>>
>> I suggest we use what is now the IndexedMGraph in Apache Stanbol for
>> this. It provides more acceptable performance than the SimpleMGraph from
>> Clerezza.
>>
>>
>>> Like I mentioned before,
>>> >     the API
>>> >     should be leading NOT the implementation. Also a test
>>> >     compatibility kit
>>> >     (TCK) might come in handy to ensure other implementations work as
>>> >     expected.
>>>
>>
>> Currently in Clerezza we have rdf.core.tests; part of it could become a
>> part of commons-test.
>>
>> I agree this is very important to ensure interoperability.
>>
>>
>> Cheers,
>>
>> Reto
>>
>>
>>> >
>>> >     And if we get this far we might as well try to make it a standard
>>> by
>>> >     submitting a JSR ;-)
>>>
>>
>>
>>
>>> >
>>> >     Regards,
>>> >
>>> >     Minto
>>> >
>>> >
>>> >     Reto Gmür wrote on 14-1-2015 at 15:15:
>>> >     > Hi Minto
>>> >     >
>>> >     > I would be very interested to learn about your opinion on the
>>> >     > commons-rdf proposal I recently committed.
>>> >     >
>>> >     > Cheers,
>>> >     > Reto
>>> >
>>>
>>
>>
>
