[jira] [Created] (FLINK-1767) StreamExecutionEnvironment's execute should return JobExecutionResult
Márton Balassi created FLINK-1767: - Summary: StreamExecutionEnvironment's execute should return JobExecutionResult Key: FLINK-1767 URL: https://issues.apache.org/jira/browse/FLINK-1767 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Márton Balassi Although the streaming API does not make use of accumulators, the returned JobExecutionResult is still a nice handle for the execution time and might wrap other features in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
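For illustration, a minimal sketch of what this would enable on the caller side, assuming execute() is changed to return a JobExecutionResult as in the batch API (the pipeline and job name below are placeholders, not part of the ticket):

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // placeholder pipeline, just so there is something to execute
    env.socketTextStream("localhost", 9999).print();

    // with this improvement the caller gets a handle on the result,
    // e.g. the execution time, instead of a void return
    JobExecutionResult result = env.execute("Streaming job (placeholder name)");
    long runtimeMillis = result.getNetRuntime();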
Re: Queries regarding RDFs with Flink
Hi Stephan, thanks for the response. Unfortunately I'm not familiar with either the new Gelly APIs or the old Spargel ones (I still don't understand the difference actually). Do you think it is possible to add such an example to the documentation/examples? Best, Flavio On Sat, Mar 21, 2015 at 7:48 PM, Stephan Ewen wrote: > Hi Flavio! > > I see initially two ways of doing this: > > 1) Do a series of joins. You start with your subject and join two or three > times using the "objects-from-triplets == subject" to make one hop. You can > filter the verbs from the triplets before if you are only interested in a > special relationship. > > 2) If you want to recursively explode the subgraph (something like all > reachable subjects) or do a rather long series of hops, then you should be > able to model this nicely as a delta iteration, or as a vertex-centric > graph computation. For that, you can use either "Gelly" (the graph library) > or the standalone Spargel operator (Giraph-like). > > Does that help with your questions? > > Greetings, > Stephan > > > On Thu, Mar 19, 2015 at 2:57 PM, Flavio Pompermaier > wrote: > > > Hi to all, > > I'm back to this task again :) > > > > Summarizing again: I have some source dataset that contains RDF > "stars" > > (SubjectURI, RdfType and a list of RDF triples belonging to this subject > -> > > the "a.k.a." star schema) > > and I have to extract some sub-graphs for some RDF types of interest. > > As described in the previous email I'd like to expand some root node (if > > its type is of interest) and explode some of its path(s). > > For example, if I'm interested in the expansion of rdf type Person (as in > > the example), I could want to create a mini-graph with all of its triples > > plus those obtained exploding the path(s) > > knows.marriedWith and knows.knows.knows. > > At the moment I do it with a punctual get from HBase but I didn't > > get whether this could be done more efficiently with other strategies in > > Flink. > > @Vasiliki: you said that I could need "something like a BFS from each > > vertex". Do you have an example that could fit my use case? Is it > possible > > to filter out those vertices I'm interested in? > > > > Thanks in advance, > > Flavio > > > > > > On Tue, Mar 3, 2015 at 8:32 PM, Vasiliki Kalavri < > > vasilikikala...@gmail.com> > > wrote: > > > > > Hi Flavio, > > > > > > if you want to use Gelly to model your data as a graph, you can load > your > > > Tuple3s as Edges. > > > This will result in "http://test/John", "Person", "Frank", etc. to be > > > vertices and "type", "name", "knows" to be edge values. > > > In the first case, you can use filterOnEdges() to get the subgraph with > > the > > > relation edges. > > > > > > Once you have the graph, you could probably use a vertex-centric > > iteration > > > to generate the trees. > > > It seems to me that you need something like a BFS from each vertex. Keep > > in > > > mind that this can be a very costly operation in terms of memory and > > > communication for large graphs. > > > > > > Let me know if you have any questions! > > > > > > Cheers, > > > V. 
> > > > > > On 3 March 2015 at 09:13, Flavio Pompermaier > > wrote: > > > > > > > I have a nice case of RDF manipulation :) > > > > Let's say I have the following RDF triples (Tuple3) in two files or > > > tables: > > > > > > > > TABLE A: > > > > http://test/John, type, Person > > > > http://test/John, name, John > > > > http://test/John, knows, http://test/Mary > > > > http://test/John, knows, http://test/Jerry > > > > http://test/Jerry, type, Person > > > > http://test/Jerry, name, Jerry > > > > http://test/Jerry, knows, http://test/Frank > > > > http://test/Mary, type, Person > > > > http://test/Mary, name, Mary > > > > > > > > TABLE B: > > > > http://test/Frank, type, Person > > > > http://test/Frank, name, Frank > > > > http://test/Frank, marriedWith, http://test/Mary > > > > > > > > What is the best way to build up Person-rooted trees with all node's > > data > > > > properties and some expanded path like 'Person.knows.marriedWith' ? > > > > Is it better to use Graph/Gelly APIs, Flink Joins, multiple punctuals > > get > > > > from a Key/value store or what? > > > > > > > > The expected 4 trees should be: > > > > > > > > tree 1 (root is John) -- > > > > http://test/John, type, Person > > > > http://test/John, name, John > > > > http://test/John, knows, http://test/Mary > > > > http://test/John, knows, http://test/Jerry > > > > http://test/Jerry, type, Person > > > > http://test/Jerry, name, Jerry > > > > http://test/Jerry, knows, http://test/Frank > > > > http://test/Mary, type, Person > > > > http://test/Mary, name, Mary > > > > http://test/Frank, type, Person > > > > http://test/Frank, name, Frank > > > > http://test/Frank, marriedWith, http://test/Mary > > > > > > > > tree 2 (root is Jerry) -- > > > > http://test/Jerry, type, Person > > > > http://test/Jerry, name, Jerry > > > > http://test/Jerry, knows,
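To make the first of Stephan's suggestions above more concrete, here is a minimal sketch of the join-based one-hop expansion with the DataSet API. It is only an illustration under assumptions: the triples are read as (subject, predicate, object) Tuple3s from a CSV file, "knows" is the relationship being followed, and the file paths are placeholders.

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.api.java.tuple.Tuple3;

    public class OneHopJoinSketch {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // (subject, predicate, object) triples; the input path is a placeholder
            DataSet<Tuple3<String, String, String>> triples =
                    env.readCsvFile("/path/to/triples.csv")
                       .types(String.class, String.class, String.class);

            // filter the verbs first, as suggested, keeping only the relationship of interest
            DataSet<Tuple3<String, String, String>> knows =
                    triples.filter(new FilterFunction<Tuple3<String, String, String>>() {
                        @Override
                        public boolean filter(Tuple3<String, String, String> t) {
                            return "knows".equals(t.f1);
                        }
                    });

            // one hop: the object of a "knows" triple joined with the subject of the next triple
            DataSet<Tuple2<Tuple3<String, String, String>, Tuple3<String, String, String>>> oneHop =
                    knows.join(triples).where(2).equalTo(0);

            oneHop.writeAsText("/path/to/one-hop-output"); // placeholder sink
            env.execute("One-hop RDF join (sketch)");
        }
    }

Repeating the join against the result would give the second hop (e.g. knows.marriedWith); for deeper or recursive expansion, option 2 (delta iteration / vertex-centric computation) is the more natural fit.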
Re: Queries regarding RDFs with Flink
Gelly has a section in the docs, it should explain the vertex-centric iterations. Is that not extensive enough? Am 22.03.2015 12:04 schrieb "Flavio Pompermaier" : > Hi Stephan, > thanks for the response. Unfortunately I'm not familiar with the new Gelly > APIs and the old Spargel ones (I still don't understand the difference > actually). > Do you think it is possible to add such an example to the > documentation/examples? > > Best, > Flavio > > > > On Sat, Mar 21, 2015 at 7:48 PM, Stephan Ewen wrote: > > > Hi Flavio! > > > > I see initially two ways of doing this: > > > > 1) Do a series of joins. You start with your subject and join two or > three > > times using the "objects-from-triplets == subject" to make one hop. You > can > > filter the verbs from the triplets before if you are only interested in a > > special relationship. > > > > 2) If you want to recursively explode the subgraph (something like all > > reachable subjects) or do a rather long series of hops, then you should > be > > able to model this nicely as a delta iterations, or as a vertex-centric > > graph computation. For that, you can use both "Gelly" (the graph library) > > or the standalone Spargel operator (Giraph-like). > > > > Does that help with your questions? > > > > Greetings, > > Stephan > > > > > > On Thu, Mar 19, 2015 at 2:57 PM, Flavio Pompermaier < > pomperma...@okkam.it> > > wrote: > > > > > Hi to all, > > > I'm back to this task again :) > > > > > > Summarizing again: I have some source dataset that has contains RDF > > "stars" > > > (SubjectURI, RdfType and a list of RDF triples belonging to this > subject > > -> > > > the "a.k.a." star schema) > > > and I have to extract some sub-graphs for some RDF types of interest. > > > As described in the previous email I'd like to expand some root node > (if > > > its type is of interest) and explode some of its path(s). > > > For example, if I'm interested in the expansion of rdf type Person (as > in > > > the example), I could want to create a mini-graph with all of its > triples > > > plus those obtained exploding the path(s) > > > knows.marriedWith and knows.knows.knows. > > > At the moment I do it with a punctual get from HBase but I didn't > > > get whether this could be done more efficiently with other strategies > in > > > Flink. > > > @Vasiliki: you said that I could need "something like a BFS from each > > > vertex". Do you have an example that could fit my use case? Is it > > possible > > > to filter out those vertices I'm interested in? > > > > > > Thanks in advance, > > > Flavio > > > > > > > > > On Tue, Mar 3, 2015 at 8:32 PM, Vasiliki Kalavri < > > > vasilikikala...@gmail.com> > > > wrote: > > > > > > > Hi Flavio, > > > > > > > > if you want to use Gelly to model your data as a graph, you can load > > your > > > > Tuple3s as Edges. > > > > This will result in "http://test/John";, "Person", "Frank", etc to be > > > > vertices and "type", "name", "knows" to be edge values. > > > > In the first case, you can use filterOnEdges() to get the subgraph > with > > > the > > > > relation edges. > > > > > > > > Once you have the graph, you could probably use a vertex-centric > > > iteration > > > > to generate the trees. > > > > It seems to me that you need something like a BFS from each vertex. > > Keep > > > in > > > > mind that this can be a very costly operation in terms of memory and > > > > communication for large graphs. > > > > > > > > Let me know if you have any questions! > > > > > > > > Cheers, > > > > V. 
> > > > > > > > On 3 March 2015 at 09:13, Flavio Pompermaier > > > wrote: > > > > > > > > > I have a nice case of RDF manipulation :) > > > > > Let's say I have the following RDF triples (Tuple3) in two files or > > > > tables: > > > > > > > > > > TABLE A: > > > > > http://test/John, type, Person > > > > > http://test/John, name, John > > > > > http://test/John, knows, http://test/Mary > > > > > http://test/John, knows, http://test/Jerry > > > > > http://test/Jerry, type, Person > > > > > http://test/Jerry, name, Jerry > > > > > http://test/Jerry, knows, http://test/Frank > > > > > http://test/Mary, type, Person > > > > > http://test/Mary, name, Mary > > > > > > > > > > TABLE B: > > > > > http://test/Frank, type, Person > > > > > http://test/Frank, name, Frank > > > > > http://test/Frank, marriedWith, http://test/Mary > > > > > > > > > > What is the best way to build up Person-rooted trees with all > node's > > > data > > > > > properties and some expanded path like 'Person.knows.marriedWith' ? > > > > > Is it better to use Graph/Gelly APIs, Flink Joins, multiple > punctuals > > > get > > > > > from a Key/value store or what? > > > > > > > > > > The expected 4 trees should be: > > > > > > > > > > tree 1 (root is John) -- > > > > > http://test/John, type, Person > > > > > http://test/John, name, John > > > > > http://test/John, knows, http://test/Mary > > > > > http://test/John, knows, http://test/Jerry > > > > > http://test/Jerry, type, Person > >
Re: Queries regarding RDFs with Flink
Hi Flavio, also, Gelly is a superset of Spargel. It provides the same features and much more. Since RDF is graph-structured, Gelly might be a good fit for your use case. Cheers, Fabian
Re: Queries regarding RDFs with Flink
Is there any example about RDF graph generation based on a skeleton structure? On Mar 22, 2015 12:28 PM, "Fabian Hueske" wrote: > Hi Flavio, > > also, Gelly is a superset of Spargel. It provides the same features and > much more. > > Since RDF is graph-structured, Gelly might be a good fit for your use case. > > Cheers, Fabian >
Re: Queries regarding RDFs with Flink
Hi Flavio, We don't have a specific example for generating RDF graphs using Gelly, but I will try to drop some lines of code here and hope you will find them useful. An RDF statement is formed of Subject - Predicate - Object triples. In Edge notation, the Subject and the Object will be the source and target vertices respectively, while the edge value will be the predicate. This being said, say you have an input plain text file that represents the edges. A line would look like this: http://test/Frank, marriedWith, http://test/Mary This is internally coded in Flink as a Tuple3<String, String, String>. So, to read this edge file you should have something like this:

    private static DataSet<Edge<String, String>> getEdgesDataSet(ExecutionEnvironment env) {
        if (fileOutput) {
            return env.readCsvFile(edgesInputPath)
                    .lineDelimiter("\n")
                    // the subject, predicate, object
                    .types(String.class, String.class, String.class)
                    .map(new MapFunction<Tuple3<String, String, String>, Edge<String, String>>() {
                        @Override
                        public Edge<String, String> map(Tuple3<String, String, String> tuple3) throws Exception {
                            return new Edge<String, String>(tuple3.f0, tuple3.f2, tuple3.f1);
                        }
                    });
        } else {
            return getDefaultEdges(env);
        }
    }

After you have this, in your main method, you just write:

    Graph<String, NullValue, String> rdfGraph = Graph.fromDataSet(edges, env);

I picked up the conversation later on, not sure if that's what you meant by "graph generation"... Cheers, Andra On Sun, Mar 22, 2015 at 12:42 PM, Flavio Pompermaier wrote: > Is there any example about RDF graph generation based on a skeleton > structure? > On Mar 22, 2015 12:28 PM, "Fabian Hueske" wrote: > > Hi Flavio, > > also, Gelly is a superset of Spargel. It provides the same features and > > much more. > > Since RDF is graph-structured, Gelly might be a good fit for your use > case. > > Cheers, Fabian > >
Re: Queries regarding RDFs with Flink
Thanks Andra for the help! For graph generation I mean that I'd like to materialize subgraphs of my overall knowledge starting from some root nodes whose RDF type is of interest (something similar to what JSON-LD does). Is that clear? On Mar 22, 2015 1:09 PM, "Andra Lungu" wrote: > Hi Flavio, > > We don't have a specific example for generating RDF graphs using Gelly, but > I will try to drop some lines of code here and hope you will find them > useful. > > An RDF statement is formed of Subject - Predicate - Object triples. In Edge > notation, the Subject and the Object will be the source and target vertices > respectively, while the edge value will be the predicate. > > This being said, say you have an input plain text file that represents the > edges. > A line would look like this: http://test/Frank, marriedWith, > http://test/Mary > > This is internally coded in Flink as a Tuple3. So, to read this edge file > you should have something like this: > > private static DataSet<Edge<String, String>> > getEdgesDataSet(ExecutionEnvironment env) { >    if (fileOutput) { >      return env.readCsvFile(edgesInputPath) >          .lineDelimiter("\n") > >          // the subject, predicate, object > >          .types(String.class, String.class, String.class) >          .map(new MapFunction<Tuple3<String, String, String>, > Edge<String, String>>() { > >            @Override >            public Edge<String, String> map(Tuple3<String, String, String> tuple3) throws Exception { >              return new Edge<String, String>(tuple3.f0, tuple3.f2, tuple3.f1); >            } >          }); >    } else { >      return getDefaultEdges(env); >    } > } > > After you have this, in your main method, you just write: > Graph<String, NullValue, String> rdfGraph = Graph.fromDataSet(edges, env); > > I picked up the conversation later on, not sure if that's what you meant by > "graph generation"... > > Cheers, > Andra > > On Sun, Mar 22, 2015 at 12:42 PM, Flavio Pompermaier > > wrote: > > > Is there any example about RDF graph generation based on a skeleton > > structure? > > On Mar 22, 2015 12:28 PM, "Fabian Hueske" wrote: > > > > > Hi Flavio, > > > > > > also, Gelly is a superset of Spargel. It provides the same features and > > > much more. > > > > > > Since RDF is graph-structured, Gelly might be a good fit for your use > > case. > > > > > > Cheers, Fabian > > > > > >
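Building on the rdfGraph from Andra's snippet, a small sketch of one possible first step (an illustration under assumptions, not code from this thread): use Gelly's filterOnEdges(), which Vasiliki mentioned earlier, to keep only the predicates that occur on the paths to be exploded (e.g. knows and marriedWith from the example), before traversing from the Person roots.

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;
    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.graph.Edge;
    import org.apache.flink.graph.Graph;
    import org.apache.flink.types.NullValue;

    // predicates occurring on the paths of interest; the concrete set is an assumption
    final Set<String> predicatesOfInterest =
            new HashSet<String>(Arrays.asList("knows", "marriedWith"));

    // keep only edges whose predicate lies on one of the paths to expand
    Graph<String, NullValue, String> relationGraph =
            rdfGraph.filterOnEdges(new FilterFunction<Edge<String, String>>() {
                @Override
                public boolean filter(Edge<String, String> edge) {
                    return predicatesOfInterest.contains(edge.getValue());
                }
            });

The actual per-root materialization (one tree per Person vertex) would still need a BFS-style traversal on relationGraph, either as a vertex-centric iteration or as repeated joins, as discussed earlier in the thread.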
[jira] [Created] (FLINK-1768) Fix the bug of BlobServerConnection's LOG.
Sibao Hong created FLINK-1768: - Summary: Fix the bug of BlobServerConnection's LOG. Key: FLINK-1768 URL: https://issues.apache.org/jira/browse/FLINK-1768 Project: Flink Issue Type: Bug Reporter: Sibao Hong The LOG in the BlobServerConnection class should be created with BlobServerConnection.class rather than another class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
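For reference, the change described would amount to declaring the logger with the connection class itself — a minimal sketch assuming the SLF4J logger used elsewhere in Flink:

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // inside the BlobServerConnection class: parameterize the logger with
    // BlobServerConnection.class rather than another class, so log messages
    // are attributed to the right source
    private static final Logger LOG = LoggerFactory.getLogger(BlobServerConnection.class);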