[jira] [Created] (FLINK-1767) StreamExecutionEnvironment's execute should return JobExecutionResult
Márton Balassi created FLINK-1767: - Summary: StreamExecutionEnvironment's execute should return JobExecutionResult Key: FLINK-1767 URL: https://issues.apache.org/jira/browse/FLINK-1767 Project: Flink Issue Type: Improvement Components: Streaming Reporter: Márton Balassi Although the streaming API does not make use of accumulators, the returned JobExecutionResult is still a nice handle for the execution time and might wrap other features in the future. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
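For illustration, a minimal sketch of what this would enable on the caller side, assuming execute() is changed to return a JobExecutionResult as in the batch API (the pipeline and job name below are placeholders, not part of the ticket):

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // placeholder pipeline, just so there is something to execute
    env.socketTextStream("localhost", 9999).print();

    // with this improvement the caller gets a handle on the result,
    // e.g. the execution time, instead of a void return
    JobExecutionResult result = env.execute("Streaming job (placeholder name)");
    long runtimeMillis = result.getNetRuntime();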
Re: Queries regarding RDFs with Flink
Hi Stephan, thanks for the response. Unfortunately I'm not familiar with either the new Gelly APIs or the old Spargel ones (I still don't understand the difference actually). Do you think it is possible to add such an example to the documentation/examples? Best, Flavio On Sat, Mar 21, 2015 at 7:48 PM, Stephan Ewen wrote: > Hi Flavio! > > I see initially two ways of doing this: > > 1) Do a series of joins. You start with your subject and join two or three > times using the "objects-from-triplets == subject" to make one hop. You can > filter the verbs from the triplets before if you are only interested in a > special relationship. > > 2) If you want to recursively explode the subgraph (something like all > reachable subjects) or do a rather long series of hops, then you should be > able to model this nicely as a delta iteration, or as a vertex-centric > graph computation. For that, you can use either "Gelly" (the graph library) > or the standalone Spargel operator (Giraph-like). > > Does that help with your questions? > > Greetings, > Stephan > > > On Thu, Mar 19, 2015 at 2:57 PM, Flavio Pompermaier > wrote: > > > Hi to all, > > I'm back to this task again :) > > > > Summarizing again: I have some source dataset that contains RDF > "stars" > > (SubjectURI, RdfType and a list of RDF triples belonging to this subject > -> > > the "a.k.a." star schema) > > and I have to extract some sub-graphs for some RDF types of interest. > > As described in the previous email I'd like to expand some root node (if > > its type is of interest) and explode some of its path(s). > > For example, if I'm interested in the expansion of rdf type Person (as in > > the example), I could want to create a mini-graph with all of its triples > > plus those obtained exploding the path(s) > > knows.marriedWith and knows.knows.knows. > > At the moment I do it with a punctual get from HBase but I didn't > > get whether this could be done more efficiently with other strategies in > > Flink. > > @Vasiliki: you said that I could need "something like a BFS from each > > vertex". Do you have an example that could fit my use case? Is it > possible > > to filter out those vertices I'm interested in? > > > > Thanks in advance, > > Flavio > > > > > > On Tue, Mar 3, 2015 at 8:32 PM, Vasiliki Kalavri < > > vasilikikala...@gmail.com> > > wrote: > > > > > Hi Flavio, > > > > > > if you want to use Gelly to model your data as a graph, you can load > your > > > Tuple3s as Edges. > > > This will result in "http://test/John", "Person", "Frank", etc. to be > > > vertices and "type", "name", "knows" to be edge values. > > > In the first case, you can use filterOnEdges() to get the subgraph with > > the > > > relation edges. > > > > > > Once you have the graph, you could probably use a vertex-centric > > iteration > > > to generate the trees. > > > It seems to me that you need something like a BFS from each vertex. Keep > > in > > > mind that this can be a very costly operation in terms of memory and > > > communication for large graphs. > > > > > > Let me know if you have any questions! > > > > > > Cheers, > > > V. 
> > > > > > On 3 March 2015 at 09:13, Flavio Pompermaier > > wrote: > > > > > > > I have a nice case of RDF manipulation :) > > > > Let's say I have the following RDF triples (Tuple3) in two files or > > > tables: > > > > > > > > TABLE A: > > > > http://test/John, type, Person > > > > http://test/John, name, John > > > > http://test/John, knows, http://test/Mary > > > > http://test/John, knows, http://test/Jerry > > > > http://test/Jerry, type, Person > > > > http://test/Jerry, name, Jerry > > > > http://test/Jerry, knows, http://test/Frank > > > > http://test/Mary, type, Person > > > > http://test/Mary, name, Mary > > > > > > > > TABLE B: > > > > http://test/Frank, type, Person > > > > http://test/Frank, name, Frank > > > > http://test/Frank, marriedWith, http://test/Mary > > > > > > > > What is the best way to build up Person-rooted trees with all node's > > data > > > > properties and some expanded path like 'Person.knows.marriedWith' ? > > > > Is it better to use Graph/Gelly APIs, Flink Joins, multiple punctuals > > get > > > > from a Key/value store or what? > > > > > > > > The expected 4 trees should be: > > > > > > > > tree 1 (root is John) -- > > > > http://test/John, type, Person > > > > http://test/John, name, John > > > > http://test/John, knows, http://test/Mary > > > > http://test/John, knows, http://test/Jerry > > > > http://test/Jerry, type, Person > > > > http://test/Jerry, name, Jerry > > > > http://test/Jerry, knows, http://test/Frank > > > > http://test/Mary, type, Person > > > > http://test/Mary, name, Mary > > > > http://test/Frank, type, Person > > > > http://test/Frank, name, Frank > > > > http://test/Frank, marriedWith, http://test/Mary > > > > > > > > tree 2 (root is Jerry) -- > > > > http://test/Jerry, type, Person > > > > http://test/Jerry, name, Jerry > > > > http://test/Jerry, knows,
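To make the first of Stephan's suggestions above more concrete, here is a minimal sketch of the join-based one-hop expansion with the DataSet API. It is only an illustration under assumptions: the triples are read as (subject, predicate, object) Tuple3s from a CSV file, "knows" is the relationship being followed, and the file paths are placeholders.

    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.api.java.tuple.Tuple3;

    public class OneHopJoinSketch {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // (subject, predicate, object) triples; the input path is a placeholder
            DataSet<Tuple3<String, String, String>> triples =
                    env.readCsvFile("/path/to/triples.csv")
                       .types(String.class, String.class, String.class);

            // filter the verbs first, as suggested, keeping only the relationship of interest
            DataSet<Tuple3<String, String, String>> knows =
                    triples.filter(new FilterFunction<Tuple3<String, String, String>>() {
                        @Override
                        public boolean filter(Tuple3<String, String, String> t) {
                            return "knows".equals(t.f1);
                        }
                    });

            // one hop: the object of a "knows" triple joined with the subject of the next triple
            DataSet<Tuple2<Tuple3<String, String, String>, Tuple3<String, String, String>>> oneHop =
                    knows.join(triples).where(2).equalTo(0);

            oneHop.writeAsText("/path/to/one-hop-output"); // placeholder sink
            env.execute("One-hop RDF join (sketch)");
        }
    }

Repeating the join against the result would give the second hop (e.g. knows.marriedWith); for deeper or recursive expansion, option 2 (delta iteration / vertex-centric computation) is the more natural fit.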
Re: Queries regarding RDFs with Flink
Gelly has a section in the docs, it should explain the vertex-centric iterations. Is that not extensive enough? Am 22.03.2015 12:04 schrieb "Flavio Pompermaier" : > Hi Stephan, > thanks for the response. Unfortunately I'm not familiar with the new Gelly > APIs and the old Spargel ones (I still don't understand the difference > actually). > Do you think it is possible to add such an example to the > documentation/examples? > > Best, > Flavio > > > > On Sat, Mar 21, 2015 at 7:48 PM, Stephan Ewen wrote: > > > Hi Flavio! > > > > I see initially two ways of doing this: > > > > 1) Do a series of joins. You start with your subject and join two or > three > > times using the "objects-from-triplets == subject" to make one hop. You > can > > filter the verbs from the triplets before if you are only interested in a > > special relationship. > > > > 2) If you want to recursively explode the subgraph (something like all > > reachable subjects) or do a rather long series of hops, then you should > be > > able to model this nicely as a delta iterations, or as a vertex-centric > > graph computation. For that, you can use both "Gelly" (the graph library) > > or the standalone Spargel operator (Giraph-like). > > > > Does that help with your questions? > > > > Greetings, > > Stephan > > > > > > On Thu, Mar 19, 2015 at 2:57 PM, Flavio Pompermaier < > pomperma...@okkam.it> > > wrote: > > > > > Hi to all, > > > I'm back to this task again :) > > > > > > Summarizing again: I have some source dataset that has contains RDF > > "stars" > > > (SubjectURI, RdfType and a list of RDF triples belonging to this > subject > > -> > > > the "a.k.a." star schema) > > > and I have to extract some sub-graphs for some RDF types of interest. > > > As described in the previous email I'd like to expand some root node > (if > > > its type is of interest) and explode some of its path(s). > > > For example, if I'm interested in the expansion of rdf type Person (as > in > > > the example), I could want to create a mini-graph with all of its > triples > > > plus those obtained exploding the path(s) > > > knows.marriedWith and knows.knows.knows. > > > At the moment I do it with a punctual get from HBase but I didn't > > > get whether this could be done more efficiently with other strategies > in > > > Flink. > > > @Vasiliki: you said that I could need "something like a BFS from each > > > vertex". Do you have an example that could fit my use case? Is it > > possible > > > to filter out those vertices I'm interested in? > > > > > > Thanks in advance, > > > Flavio > > > > > > > > > On Tue, Mar 3, 2015 at 8:32 PM, Vasiliki Kalavri < > > > vasilikikala...@gmail.com> > > > wrote: > > > > > > > Hi Flavio, > > > > > > > > if you want to use Gelly to model your data as a graph, you can load > > your > > > > Tuple3s as Edges. > > > > This will result in "http://test/John";, "Person", "Frank", etc to be > > > > vertices and "type", "name", "knows" to be edge values. > > > > In the first case, you can use filterOnEdges() to get the subgraph > with > > > the > > > > relation edges. > > > > > > > > Once you have the graph, you could probably use a vertex-centric > > > iteration > > > > to generate the trees. > > > > It seems to me that you need something like a BFS from each vertex. > > Keep > > > in > > > > mind that this can be a very costly operation in terms of memory and > > > > communication for large graphs. > > > > > > > > Let me know if you have any questions! > > > > > > > > Cheers, > > > > V. 
> > > > > > > > On 3 March 2015 at 09:13, Flavio Pompermaier > > > wrote: > > > > > > > > > I have a nice case of RDF manipulation :) > > > > > Let's say I have the following RDF triples (Tuple3) in two files or > > > > tables: > > > > > > > > > > TABLE A: > > > > > http://test/John, type, Person > > > > > http://test/John, name, John > > > > > http://test/John, knows, http://test/Mary > > > > > http://test/John, knows, http://test/Jerry > > > > > http://test/Jerry, type, Person > > > > > http://test/Jerry, name, Jerry > > > > > http://test/Jerry, knows, http://test/Frank > > > > > http://test/Mary, type, Person > > > > > http://test/Mary, name, Mary > > > > > > > > > > TABLE B: > > > > > http://test/Frank, type, Person > > > > > http://test/Frank, name, Frank > > > > > http://test/Frank, marriedWith, http://test/Mary > > > > > > > > > > What is the best way to build up Person-rooted trees with all > node's > > > data > > > > > properties and some expanded path like 'Person.knows.marriedWith' ? > > > > > Is it better to use Graph/Gelly APIs, Flink Joins, multiple > punctuals > > > get > > > > > from a Key/value store or what? > > > > > > > > > > The expected 4 trees should be: > > > > > > > > > > tree 1 (root is John) -- > > > > > http://test/John, type, Person > > > > > http://test/John, name, John > > > > > http://test/John, knows, http://test/Mary > > > > > http://test/John, knows, http://test/Jerry > > > > > http://test/Jerry, type, Person > >
Re: Queries regarding RDFs with Flink
Hi Flavio, also, Gelly is a superset of Spargel. It provides the same features and much more. Since RDF is graph-structured, Gelly might be a good fit for your use case. Cheers, Fabian
Re: Queries regarding RDFs with Flink
Is there any example about RDF graph generation based on a skeleton structure? On Mar 22, 2015 12:28 PM, "Fabian Hueske" wrote: > Hi Flavio, > > also, Gelly is a superset of Spargel. It provides the same features and > much more. > > Since RDF is graph-structured, Gelly might be a good fit for your use case. > > Cheers, Fabian >
Re: Queries regarding RDFs with Flink
Hi Flavio, We don't have a specific example for generating RDF graphs using Gelly, but I will try to drop some lines of code here and hope you will find them useful. An RDF statement is formed of Subject - Predicate - Object triples. In Edge notation, the Subject and the Object will be the source and target vertices respectively, while the edge value will be the predicate. This being said, say you have an input plain text file that represents the edges. A line would look like this: http://test/Frank, marriedWith, http://test/Mary This is internally coded in Flink as a Tuple3<String, String, String>. So, to read this edge file you should have something like this:

    private static DataSet<Edge<String, String>> getEdgesDataSet(ExecutionEnvironment env) {
        if (fileOutput) {
            return env.readCsvFile(edgesInputPath)
                    .lineDelimiter("\n")
                    // the subject, predicate, object
                    .types(String.class, String.class, String.class)
                    .map(new MapFunction<Tuple3<String, String, String>, Edge<String, String>>() {
                        @Override
                        public Edge<String, String> map(Tuple3<String, String, String> tuple3) throws Exception {
                            return new Edge<String, String>(tuple3.f0, tuple3.f2, tuple3.f1);
                        }
                    });
        } else {
            return getDefaultEdges(env);
        }
    }

After you have this, in your main method, you just write:

    Graph<String, NullValue, String> rdfGraph = Graph.fromDataSet(edges, env);

I picked up the conversation later on, not sure if that's what you meant by "graph generation"... Cheers, Andra On Sun, Mar 22, 2015 at 12:42 PM, Flavio Pompermaier wrote: > Is there any example about RDF graph generation based on a skeleton > structure? > On Mar 22, 2015 12:28 PM, "Fabian Hueske" wrote: > > Hi Flavio, > > also, Gelly is a superset of Spargel. It provides the same features and > > much more. > > Since RDF is graph-structured, Gelly might be a good fit for your use > case. > > Cheers, Fabian > >
Re: Queries regarding RDFs with Flink
Thanks Andra for the help! For graph generation I mean that I'd like to materialize subgraphs of my overall knowledge starting from some root nodes whose RDF type is of interest (something similar to what JSON-LD does). Is that clear? On Mar 22, 2015 1:09 PM, "Andra Lungu" wrote: > Hi Flavio, > > We don't have a specific example for generating RDF graphs using Gelly, but > I will try to drop some lines of code here and hope you will find them > useful. > > An RDF statement is formed of Subject - Predicate - Object triples. In Edge > notation, the Subject and the Object will be the source and target vertices > respectively, while the edge value will be the predicate. > > This being said, say you have an input plain text file that represents the > edges. > A line would look like this: http://test/Frank, marriedWith, > http://test/Mary > > This is internally coded in Flink as a Tuple3. So, to read this edge file > you should have something like this: > > private static DataSet<Edge<String, String>> > getEdgesDataSet(ExecutionEnvironment env) { >    if (fileOutput) { >      return env.readCsvFile(edgesInputPath) >          .lineDelimiter("\n") > >          // the subject, predicate, object > >          .types(String.class, String.class, String.class) >          .map(new MapFunction<Tuple3<String, String, String>, > Edge<String, String>>() { > >            @Override >            public Edge<String, String> map(Tuple3<String, String, String> tuple3) throws Exception { >              return new Edge<String, String>(tuple3.f0, tuple3.f2, tuple3.f1); >            } >          }); >    } else { >      return getDefaultEdges(env); >    } > } > > After you have this, in your main method, you just write: > Graph<String, NullValue, String> rdfGraph = Graph.fromDataSet(edges, env); > > I picked up the conversation later on, not sure if that's what you meant by > "graph generation"... > > Cheers, > Andra > > On Sun, Mar 22, 2015 at 12:42 PM, Flavio Pompermaier > > wrote: > > > Is there any example about RDF graph generation based on a skeleton > > structure? > > On Mar 22, 2015 12:28 PM, "Fabian Hueske" wrote: > > > > > Hi Flavio, > > > > > > also, Gelly is a superset of Spargel. It provides the same features and > > > much more. > > > > > > Since RDF is graph-structured, Gelly might be a good fit for your use > > case. > > > > > > Cheers, Fabian > > > > > >
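Building on the rdfGraph from Andra's snippet, a small sketch of one possible first step (an illustration under assumptions, not code from this thread): use Gelly's filterOnEdges(), which Vasiliki mentioned earlier, to keep only the predicates that occur on the paths to be exploded (e.g. knows and marriedWith from the example), before traversing from the Person roots.

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;
    import org.apache.flink.api.common.functions.FilterFunction;
    import org.apache.flink.graph.Edge;
    import org.apache.flink.graph.Graph;
    import org.apache.flink.types.NullValue;

    // predicates occurring on the paths of interest; the concrete set is an assumption
    final Set<String> predicatesOfInterest =
            new HashSet<String>(Arrays.asList("knows", "marriedWith"));

    // keep only edges whose predicate lies on one of the paths to expand
    Graph<String, NullValue, String> relationGraph =
            rdfGraph.filterOnEdges(new FilterFunction<Edge<String, String>>() {
                @Override
                public boolean filter(Edge<String, String> edge) {
                    return predicatesOfInterest.contains(edge.getValue());
                }
            });

The actual per-root materialization (one tree per Person vertex) would still need a BFS-style traversal on relationGraph, either as a vertex-centric iteration or as repeated joins, as discussed earlier in the thread.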
[jira] [Created] (FLINK-1768) Fix the bug of BlobServerConnection's LOG.
Sibao Hong created FLINK-1768: - Summary: Fix the bug of BlobServerConnection's LOG. Key: FLINK-1768 URL: https://issues.apache.org/jira/browse/FLINK-1768 Project: Flink Issue Type: Bug Reporter: Sibao Hong The LOG in the BlobServerConnection class should be created with BlobServerConnection.class rather than another class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
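For reference, the change described would amount to declaring the logger with the connection class itself — a minimal sketch assuming the SLF4J logger used elsewhere in Flink:

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // inside the BlobServerConnection class: parameterize the logger with
    // BlobServerConnection.class rather than another class, so log messages
    // are attributed to the right source
    private static final Logger LOG = LoggerFactory.getLogger(BlobServerConnection.class);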