Hi Flavio! I see two initial ways of doing this:
1) Do a series of joins. You start with your subject and join two or three times using "object-of-triple == subject" to make one hop. You can pre-filter the triples by predicate (the "verb") if you are only interested in a specific relationship.

2) If you want to recursively explode the subgraph (something like all reachable subjects) or do a rather long series of hops, then you should be able to model this nicely as a delta iteration, or as a vertex-centric graph computation. For that, you can use either Gelly (the graph library) or the standalone Spargel operator (Giraph-like).

Does that help with your questions?

Greetings,
Stephan

On Thu, Mar 19, 2015 at 2:57 PM, Flavio Pompermaier <pomperma...@okkam.it> wrote:

> Hi to all,
> I'm back to this task again :)
>
> Summarizing again: I have a source dataset that contains RDF "stars"
> (SubjectURI, RdfType, and a list of RDF triples belonging to that subject
> -> the "a.k.a." star schema), and I have to extract some sub-graphs for
> some RDF types of interest.
> As described in the previous email, I'd like to expand a root node (if
> its type is of interest) and explode some of its path(s).
> For example, if I'm interested in the expansion of the RDF type Person (as
> in the example), I might want to create a mini-graph with all of its
> triples plus those obtained by exploding the path(s)
> knows.marriedWith and knows.knows.knows.
> At the moment I do it with a punctual get from HBase, but I haven't
> figured out whether this could be done more efficiently with other
> strategies in Flink.
> @Vasiliki: you said that I could need "something like a BFS from each
> vertex". Do you have an example that could fit my use case? Is it possible
> to filter out those vertices I'm interested in?
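Option 1 above (the series of joins) can be illustrated with a small self-contained sketch. This is plain Python over in-memory triples, not a Flink program; in a Flink job the same condition would be a self-join of the triple DataSet on the object and subject fields. All data and names below are illustrative, not from an actual implementation.

```python
# One hop = self-join of the triple set on "object == subject".
# Pre-filtering on the predicate keeps only the relationship of
# interest before the join, as suggested in option 1.

def one_hop(triples, predicate):
    """For each (s, p, o) with p == predicate, pair it with every triple
    whose subject is o, i.e. expand the path by one hop."""
    by_subject = {}                      # index on the subject side of the join
    for t in triples:
        by_subject.setdefault(t[0], []).append(t)
    hops = []
    for s, p, o in triples:
        if p != predicate:               # predicate pre-filter
            continue
        for t in by_subject.get(o, []):  # the join itself
            hops.append(((s, p, o), t))
    return hops

triples = [
    ("http://test/John",  "knows", "http://test/Jerry"),
    ("http://test/Jerry", "knows", "http://test/Frank"),
    ("http://test/Frank", "marriedWith", "http://test/Mary"),
]

# Expanding knows.marriedWith: join once on "knows", then keep the
# second hops whose predicate is "marriedWith".
path = [(f, s) for f, s in one_hop(triples, "knows")
        if s[1] == "marriedWith"]
```

Repeating the join two or three times expands longer fixed paths such as knows.knows.knows.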
> Thanks in advance,
> Flavio
>
> On Tue, Mar 3, 2015 at 8:32 PM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:
>
> > Hi Flavio,
> >
> > if you want to use Gelly to model your data as a graph, you can load
> > your Tuple3s as Edges.
> > This will result in "http://test/John", "Person", "Frank", etc. being
> > vertices and "type", "name", "knows" being edge values.
> > In the first case, you can use filterOnEdges() to get the subgraph with
> > the relation edges.
> >
> > Once you have the graph, you could probably use a vertex-centric
> > iteration to generate the trees.
> > It seems to me that you need something like a BFS from each vertex. Keep
> > in mind that this can be a very costly operation in terms of memory and
> > communication for large graphs.
> >
> > Let me know if you have any questions!
> >
> > Cheers,
> > V.
> >
> > On 3 March 2015 at 09:13, Flavio Pompermaier <pomperma...@okkam.it> wrote:
> >
> > > I have a nice case of RDF manipulation :)
> > > Let's say I have the following RDF triples (Tuple3) in two files or
> > > tables:
> > >
> > > TABLE A:
> > > http://test/John, type, Person
> > > http://test/John, name, John
> > > http://test/John, knows, http://test/Mary
> > > http://test/John, knows, http://test/Jerry
> > > http://test/Jerry, type, Person
> > > http://test/Jerry, name, Jerry
> > > http://test/Jerry, knows, http://test/Frank
> > > http://test/Mary, type, Person
> > > http://test/Mary, name, Mary
> > >
> > > TABLE B:
> > > http://test/Frank, type, Person
> > > http://test/Frank, name, Frank
> > > http://test/Frank, marriedWith, http://test/Mary
> > >
> > > What is the best way to build up Person-rooted trees with all of each
> > > node's data properties and some expanded paths like
> > > 'Person.knows.marriedWith'?
> > > Is it better to use Graph/Gelly APIs, Flink joins, or multiple point
> > > gets from a key/value store?
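The "BFS from each vertex" idea mentioned above can be sketched without Gelly: first restrict the edge set to the relation predicates (which is what a `filterOnEdges()` call would do on a Gelly `Graph`), then run a plain BFS from each root of interest. This is an illustrative Python sketch of the logic, not the Gelly vertex-centric implementation.

```python
from collections import deque

def bfs_reachable(edges, root):
    """Vertices reachable from root over the (directed) filtered edges."""
    adjacency = {}
    for src, _label, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    seen, queue = {root}, deque([root])
    while queue:
        v = queue.popleft()
        for w in adjacency.get(v, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

edges = [
    ("John", "knows", "Mary"),
    ("John", "knows", "Jerry"),
    ("Jerry", "knows", "Frank"),
    ("Frank", "marriedWith", "Mary"),
    ("John", "name", "John"),   # data property, removed by the filter below
]
# Keep only relation edges, as filterOnEdges() would:
relation_edges = [e for e in edges if e[1] in ("knows", "marriedWith")]
```

As Vasiliki notes, doing this from every vertex of a large graph is costly; filtering the roots first (e.g. only subjects of type Person) keeps the number of BFS runs down.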
> > > The expected 4 trees should be:
> > >
> > > tree 1 (root is John) ------------------
> > > http://test/John, type, Person
> > > http://test/John, name, John
> > > http://test/John, knows, http://test/Mary
> > > http://test/John, knows, http://test/Jerry
> > > http://test/Jerry, type, Person
> > > http://test/Jerry, name, Jerry
> > > http://test/Jerry, knows, http://test/Frank
> > > http://test/Mary, type, Person
> > > http://test/Mary, name, Mary
> > > http://test/Frank, type, Person
> > > http://test/Frank, name, Frank
> > > http://test/Frank, marriedWith, http://test/Mary
> > >
> > > tree 2 (root is Jerry) ------------------
> > > http://test/Jerry, type, Person
> > > http://test/Jerry, name, Jerry
> > > http://test/Jerry, knows, http://test/Frank
> > > http://test/Frank, type, Person
> > > http://test/Frank, name, Frank
> > > http://test/Frank, marriedWith, http://test/Mary
> > > http://test/Mary, type, Person
> > > http://test/Mary, name, Mary
> > >
> > > tree 3 (root is Mary) ------------------
> > > http://test/Mary, type, Person
> > > http://test/Mary, name, Mary
> > >
> > > tree 4 (root is Frank) ------------------
> > > http://test/Frank, type, Person
> > > http://test/Frank, name, Frank
> > > http://test/Frank, marriedWith, http://test/Mary
> > > http://test/Mary, type, Person
> > > http://test/Mary, name, Mary
> > >
> > > Thanks in advance,
> > > Flavio
> > >
> > > On Mon, Mar 2, 2015 at 5:04 PM, Stephan Ewen <se...@apache.org> wrote:
> > >
> > > > Hey Santosh!
> > > >
> > > > RDF processing often involves either joins, or graph-query-like
> > > > operations (transitive). Flink is fairly good at both types of
> > > > operations.
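On this sample data, the four expected trees above happen to be exactly what you get by taking each Person root's own triples plus the triples of every subject reachable over the relation edges. The following plain-Python sketch only illustrates those semantics (the real job would use Flink joins or Gelly as discussed, and Flavio's actual requirement is specific paths rather than full transitive expansion):

```python
# Tables A and B from the mail, as (subject, predicate, object) tuples.
TRIPLES = [
    ("http://test/John",  "type", "Person"),
    ("http://test/John",  "name", "John"),
    ("http://test/John",  "knows", "http://test/Mary"),
    ("http://test/John",  "knows", "http://test/Jerry"),
    ("http://test/Jerry", "type", "Person"),
    ("http://test/Jerry", "name", "Jerry"),
    ("http://test/Jerry", "knows", "http://test/Frank"),
    ("http://test/Mary",  "type", "Person"),
    ("http://test/Mary",  "name", "Mary"),
    ("http://test/Frank", "type", "Person"),
    ("http://test/Frank", "name", "Frank"),
    ("http://test/Frank", "marriedWith", "http://test/Mary"),
]
RELATIONS = {"knows", "marriedWith"}

def tree(root):
    """All triples of root and of every subject reachable via RELATIONS."""
    visited, stack, result = set(), [root], []
    while stack:
        subj = stack.pop()
        if subj in visited:
            continue
        visited.add(subj)
        for s, p, o in TRIPLES:
            if s == subj:
                result.append((s, p, o))
                if p in RELATIONS:       # follow relation edges only
                    stack.append(o)
    return set(result)

# One tree per subject of type Person (the four roots in the example).
roots = sorted(s for s, p, o in TRIPLES if p == "type" and o == "Person")
```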
> > > > I would look into the graph examples and the graph API for a start:
> > > >
> > > > - Graph examples:
> > > > https://github.com/apache/flink/tree/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/graph
> > > > - Graph API:
> > > > https://github.com/apache/flink/tree/master/flink-staging/flink-gelly/src/main/java/org/apache/flink/graph
> > > >
> > > > If you have a more specific question, I can give you better pointers ;-)
> > > >
> > > > Stephan
> > > >
> > > > On Fri, Feb 27, 2015 at 4:48 PM, santosh_rajaguru <sani...@gmail.com> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > how can Flink be useful for processing data into RDF and building
> > > > > the ontology?
> > > > >
> > > > > Regards,
> > > > > Santosh
> > > > >
> > > > > --
> > > > > View this message in context:
> > > > > http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Queries-regarding-RDFs-with-Flink-tp4130.html
> > > > > Sent from the Apache Flink (Incubator) Mailing List archive mailing list archive at Nabble.com.