Re: LDBC Graph Data into Flink

2015-10-06 Thread Vasiliki Kalavri
Hi Martin, thanks a lot for sharing! This is a very useful tool. I only had a quick look, but if we merge label and payload inside a Tuple2, then it should also be Gelly-compatible :) Cheers, Vasia. On 6 October 2015 at 10:03, Martin Junghanns wrote: > Hi all, > > For our benchmarks with Flink

Re: compile flink-gelly-scala using sbt

2015-10-27 Thread Vasiliki Kalavri
Hi Do, I don't really have experience with sbt, but one thing that might cause problems is that your dependencies point to Flink 0.9.1 and gelly-scala wasn't part of that release. You can either try to use the 0.10-SNAPSHOT or wait a few days for the 0.10 release. Cheers, -Vasia. On 27 October 2

Re: compile flink-gelly-scala using sbt

2015-10-28 Thread Vasiliki Kalavri
>>> Le Quoc Do >>> Dresden University of Technology >>> Faculty of Computer Science >>> Institute for System Architecture >>> Systems Engineering Group >>> 01062 Dresden >>> E-Mail: d...@se.inf.tu-dresden.de >>> >>> On Wed,

Re: Creating Graphs from DataStream in Flink Gelly

2015-11-02 Thread Vasiliki Kalavri
Hi Ufuoma, Gelly doesn't support dynamic streaming graphs yet. The project Andra has linked to is a prototype for *one-pass* streaming graph analytics, i.e. no graph state is maintained. If you would like to keep and maintain the graph state in your streaming program, you would have to implement

Re: Zeppelin Integration

2015-11-04 Thread Vasiliki Kalavri
Great tutorial! Thanks a lot ^^ On 4 November 2015 at 17:12, Leonard Wolters wrote: > Indeed very nice! Thanks > On Nov 4, 2015 5:04 PM, "Till Rohrmann" wrote: > >> Really cool tutorial Trevor :-) >> >> On Wed, Nov 4, 2015 at 3:26 PM, Robert Metzger >> wrote: >> >>> For those interested, Trevo

Creating a representative streaming workload

2015-11-16 Thread Vasiliki Kalavri
Hello squirrels, with some colleagues and students here at KTH, we have started 2 projects to evaluate (1) performance and (2) behavior in the presence of memory interference in cloud environments, for Flink and other systems. We want to provide our students with a workload of representative appli

Re: Creating a representative streaming workload

2015-11-16 Thread Vasiliki Kalavri
k here: > http://www.sparkbigdata.com/102-spark-blog-slim-baltagi/14-results-of-a-benchmark-between-apache-flink-and-apache-spark > > Best regards, > Ovidiu > > On 16 Nov 2015, at 15:21, Vasiliki Kalavri > wrote: > > Hello squirrels, > > with some colleagues and students here at K

Re: LDBC Graph Data into Flink

2015-11-24 Thread Vasiliki Kalavri
> Best, > Martin > > On 06.10.2015 11:00, Martin Junghanns wrote: > > Hi Vasia, > > > > No problem. Sure, Gelly is just a map() call away :) > > > > Best, > > Martin > > > > On 06.10.2015 10:53, Vasiliki Kalavri wrote: > >> Hi Martin, >

Re: store and retrieve Graph object

2015-11-25 Thread Vasiliki Kalavri
Hi Stefane, let me know if I understand the problem correctly. The vertex values are POJOs that you're somehow inferring from the edge list and this value creation is what takes a lot of time? Since a graph is just a set of 2 datasets (vertices and edges), you could store the values to disk and ha

Re: store and retrieve Graph object

2015-11-25 Thread Vasiliki Kalavri
s wrote: > Hi Vasia, > > my graph object is the following: > > Graph graph = Graph.fromCollection( > edgeList.collect(), env); > > The vertex is a POJO not the value. So the problem is how could i store > and retrieve the vertex list? > > Thanks, > Stefanos >

Re: store and retrieve Graph object

2015-11-25 Thread Vasiliki Kalavri
Good to know :) On 25 November 2015 at 21:44, Stefanos Antaris wrote: > Hi, > > It works fine using this approach. > > Thanks, > Stefanos > > On 25 Nov 2015, at 20:32, Vasiliki Kalavri > wrote: > > Hey, > > you can preprocess your data, create the vertic

Re: 2015: A Year in Review for Apache Flink

2015-12-31 Thread Vasiliki Kalavri
Happy new year everyone! Looking forward to all the great things the Apache Flink community will accomplish in 2016 :)) Greetings from snowy Greece! -Vasia. On 31 December 2015 at 04:22, Henry Saputra wrote: > Dear All, > > It is almost end of 2015 and it has been busy and great year for Apache

Re: Machine Learning on Apache Fink

2016-01-09 Thread Vasiliki Kalavri
Hi Ashutosh, Flink has a Machine Learning library, Flink-ML. You can find more information and examples in the documentation [1]. The code is currently in the flink-staging repository. There is also material on Slideshare that you might find interesting [2, 3, 4]. I hope this helps! -Vasia. [1]: ht

Re: Graph with stream of updates

2016-02-26 Thread Vasiliki Kalavri
Hi Ankur, you can have custom state in your Flink operators, including a graph. There is no graph state abstraction provided at the moment, but it shouldn't be too hard for you to implement your own. If your use case only requires processing edge additions, then you might want to take a look

Re: time spent for iteration

2016-03-09 Thread Vasiliki Kalavri
I think it would be useful to allow for easier retrieval of this information. Wouldn't it make sense to expose this to the web UI for example? We actually had a discussion about this some time ago [1]. -Vasia. [1]: https://issues.apache.org/jira/browse/FLINK-1759 On 9 March 2016 at 14:37, Gábor

Re: Memory ran out PageRank

2016-03-14 Thread Vasiliki Kalavri
Hi Ovidiu, this option won't fix the problem if your system doesn't have enough memory :) It only defines whether the solution set is kept in managed memory or not. For more iteration configuration options, take a look at the Gelly documentation [1]. -Vasia. [1]: https://ci.apache.org/projects/f

Re: Intermediate solution set of delta iteration

2016-03-23 Thread Vasiliki Kalavri
Hi Mengqi, if what you are trying to do is output the solution set of every iteration, before the iteration has finished, then that is not possible. i.e. you can not output the solution set to a sink or another operator during the iteration. However, you can add elements to the solution set and g

Re: About flink stream table API

2016-04-26 Thread Vasiliki Kalavri
Hello, the stream table API is currently under heavy development. So far, we support selection, filtering, and union operations. For these operations we use the stream SQL syntax of Apache Calcite [1]. This is as simple as adding the "STREAM" keyword. Registering a datastream table and running a

Re: Job hangs

2016-04-27 Thread Vasiliki Kalavri
Hi Timur, I've previously seen large batch jobs hang because of join deadlocks. We should have fixed those problems, but we might have missed some corner case. Did you check whether there was any CPU activity when the job hangs? Can you try running htop on the taskmanager machines and see if they'

Re: Gelly CommunityDetection in scala example

2016-04-27 Thread Vasiliki Kalavri
Hi Trevor, note that the community detection algorithm returns a new graph where the vertex values correspond to the computed communities. Also, note that the current implementation expects a graph with java.lang.Long vertex values and java.lang.Double edge values. The following should work: imp

Re: aggregation problem

2016-04-28 Thread Vasiliki Kalavri
Hi Riccardo, can you please be a bit more specific? What do you mean by "it didn't work"? Did it crash? Did it give you a wrong value? Something else? -Vasia. On 28 April 2016 at 16:52, Riccardo Diomedi wrote: > Hi everybody > > In a DeltaIteration I have a DataSet>> where, at a > certain poin

Re: Bug while using Table API

2016-05-04 Thread Vasiliki Kalavri
Hi Simone, I tried reproducing your problem with no luck. I ran the WordCountTable example using sbt quickstart with Flink 1.1-SNAPSHOT and Scala 2.10 and it worked fine. Can you maybe post the code you tried? Thanks, -Vasia. On 4 May 2016 at 11:20, Simone Robutti wrote: > Hello, > > while try

Re: Bug while using Table API

2016-05-04 Thread Vasiliki Kalavri
Thanks Simone! I've managed to reproduce the error. I'll try to figure out what's wrong and I'll keep you updated. -Vasia. On May 4, 2016 3:25 PM, "Simone Robutti" wrote: > Here is the code: > > package org.example > > import org.apache.flink.api.scala._ > import org.apache.flink.api.table.Table

Re: Bug while using Table API

2016-05-11 Thread Vasiliki Kalavri
/flink/commit/7ed07933d2dd3cf41948287dc8fd79dbef902311 On 4 May 2016 at 17:33, Vasiliki Kalavri wrote: > Thanks Simone! I've managed to reproduce the error. I'll try to figure out > what's wrong and I'll keep you updated. > > -Vasia. > On May 4, 2016 3:25 PM, "

Re: Bug while using Table API

2016-05-12 Thread Vasiliki Kalavri
Good to know :) On 12 May 2016 at 11:16, Simone Robutti wrote: > Ok, I tested it and it works on the same example. :) > > 2016-05-11 12:25 GMT+02:00 Vasiliki Kalavri : > >> Hi Simone, >> >> Fabian has pushed a fix for the streaming TableSources that removed

Re: normalize vertex values

2016-05-12 Thread Vasiliki Kalavri
Hi Lydia, there is no dedicated Gelly API method that performs normalization. If you know the max value, then a mapVertices() would suffice. Otherwise, you can get the DataSet of vertices with getVertices() and apply any kind of operation supported by the DataSet API on it. Best, -Vasia. On May 1
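
The known-maximum case above amounts to a per-vertex division. Here is a minimal plain-Java sketch of that mapping logic, written without any Flink dependency so it runs on its own; in Gelly, the body of `normalize` would presumably live inside the function passed to mapVertices(). The class name `VertexNormalizer`, the map-based vertex representation, and the sample values are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: vertices are modelled as a plain id -> value map.
class VertexNormalizer {

    // Divides every vertex value by a known, non-zero maximum.
    static Map<Long, Double> normalize(Map<Long, Double> vertexValues, double max) {
        if (max == 0.0) {
            throw new IllegalArgumentException("max must be non-zero");
        }
        Map<Long, Double> result = new LinkedHashMap<>();
        for (Map.Entry<Long, Double> e : vertexValues.entrySet()) {
            result.put(e.getKey(), e.getValue() / max);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, Double> values = new LinkedHashMap<>();
        values.put(1L, 2.0);
        values.put(2L, 8.0);
        values.put(3L, 4.0);
        System.out.println(normalize(values, 8.0)); // prints {1=0.25, 2=1.0, 3=0.5}
    }
}
```

When the maximum is not known up front, it has to be computed in a separate pass over the vertex values first, which is why the reply points to the general DataSet operations for that case.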

Re: Scatter-Gather Iteration aggregators

2016-05-12 Thread Vasiliki Kalavri
Hi Lydia, registered aggregators through the ScatterGatherConfiguration are accessible both in the VertexUpdateFunction and in the MessageFunction. Cheers, -Vasia. On 12 May 2016 at 20:08, Lydia Ickler wrote: > Hi, > > I have a question regarding the Aggregators of a Scatter-Gather Iteration.

Re: Scatter-Gather Iteration aggregators

2016-05-13 Thread Vasiliki Kalavri
ly within each Function or not? > > If I set the aggregator in VertexUpdateFunction then the newly set value > is not visible in the MessageFunction. > Or am I doing something wrong? I would like to have a shared aggregator > to normalize vertices. > > > Am 13.05.2016

Re: Scatter-Gather Iteration aggregators

2016-05-13 Thread Vasiliki Kalavri
sages) { > double sum = 0; > for (double msg : inMessages) { > sum = sum + (msg); > } > > if((Math.abs(sum) > Math.abs(aggregator.getAggregate().getValue()))) { > > aggregator.reset(); > aggregator

Re: "Memory ran out" error when running connected components

2016-05-13 Thread Vasiliki Kalavri
Hi Rob, On 13 May 2016 at 11:22, Arkay wrote: > Hi to all, > > I’m aware there are a few threads on this, but I haven’t been able to solve > an issue I am seeing and hoped someone can help. I’m trying to run the > following: > > val connectedNetwork = new org.apache.flink.api.scala.DataSet[Ver

Re: "Memory ran out" error when running connected components

2016-05-13 Thread Vasiliki Kalavri
Thanks for checking, Rob! I don't see any reason for the job to fail with this configuration and input size. I have no experience running Flink on Windows though, so I might be missing something. Do you get a similar error with smaller inputs? -Vasia. On 13 May 2016 at 13:27, Arkay wrote: > Than

Re: "Memory ran out" error when running connected components

2016-05-13 Thread Vasiliki Kalavri
On 13 May 2016 at 14:28, Arkay wrote: > Hi Vasia, > > It seems to work OK up to about 50MB of input, and dies after that point. > If i disable just this connected components step the rest of my program is > happy with the full 1.5GB test dataset. It seems to be specifically > limited > to GraphA

Re: "Memory ran out" error when running connected components

2016-05-14 Thread Vasiliki Kalavri
Hey Rob, On 13 May 2016 at 15:45, Arkay wrote: > Thanks for the link, I had experimented with those options, apart from > taskmanager.memory.off-heap: true. Turns out that allows it to run through > happily! I don't know if that is a peculiarity of a windows JVM, as I > understand that setting

Re: Gelly scatter/gather

2016-06-13 Thread Vasiliki Kalavri
Hi Alieh, the VertexUpdateFunction and the MessagingFunction both have a method "getSuperstepNumber()" which will give you the current iteration number. -Vasia. On 13 June 2016 at 18:06, Alieh Saeedi wrote: > Hi > Is it possible to access iteration number in gelly scatter/gather? > > thanks in

Re: Gelly Scatter/Gather - Vertex update

2016-06-15 Thread Vasiliki Kalavri
Hi Alieh, the scatter-gather model is built on top of Flink delta iterations exactly for the reason to allow de-activating vertices that do not need to participate in the computation of a certain superstep. If you want all vertices to participate in all iterations of scatter-gather, you can send d
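
To make the de-activation idea concrete, here is a small plain-Java simulation, not Gelly code: only vertices whose value changed in the previous superstep send messages in the next one, and the loop ends when no vertex is active. The class `ActiveVertexSupersteps`, the minimum-label propagation used as the example computation, and the adjacency-list representation are all assumptions of this sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation of supersteps with an active-vertex set.
class ActiveVertexSupersteps {

    // Propagates the minimum label along adjacency lists, updating `labels` in place.
    // Assumes every vertex that appears as a neighbor also has an entry in `labels`.
    // Returns the number of supersteps executed.
    static int propagateMinLabel(Map<Long, List<Long>> neighbors, Map<Long, Long> labels) {
        Set<Long> active = new HashSet<>(labels.keySet()); // every vertex starts active
        int supersteps = 0;
        while (!active.isEmpty()) {
            supersteps++;
            // Scatter: only active vertices send their current label to their neighbors.
            Map<Long, Long> minMessage = new HashMap<>();
            for (long v : active) {
                for (long n : neighbors.getOrDefault(v, Collections.emptyList())) {
                    minMessage.merge(n, labels.get(v), Long::min);
                }
            }
            // Gather: a vertex is active in the next superstep only if its label changed.
            Set<Long> next = new HashSet<>();
            for (Map.Entry<Long, Long> m : minMessage.entrySet()) {
                if (m.getValue() < labels.get(m.getKey())) {
                    labels.put(m.getKey(), m.getValue());
                    next.add(m.getKey());
                }
            }
            active = next;
        }
        return supersteps;
    }
}
```

On a three-vertex path 1-2-3 with each vertex labelled by its own id, vertex 1 never changes and therefore drops out after the first superstep, which is the saving the active set is meant to provide.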

Re: Send to all in gelly scatter

2016-06-15 Thread Vasiliki Kalavri
Hi Alieh, you can send a message from any vertex to any other vertex if you know the vertex ID. In [1] you will find a table that compares the update logic and communication scope for all gelly iteration models. Bear in mind though, that sending a message from all vertices to all other vertices is

Re: Send to all in gelly scatter

2016-06-15 Thread Vasiliki Kalavri
I forgot the reference [1] :S Here it is: [1] https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/gelly.html#iteration-abstractions-comparison On 15 June 2016 at 20:59, Vasiliki Kalavri wrote: > Hi Alieh, > > you can send a message from any vertex to any other vert

Re: Parameters inside an iteration?

2016-07-05 Thread Vasiliki Kalavri
Hi Christoph, if I understand what you want to do correctly, making your RichMapFunction a standalone class and passing your object to the constructor should work. Cheers, -Vasia. On 5 July 2016 at 18:16, Boden, Christoph wrote: > Dear Flink Community, > > > is there a compact and efficient wa
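
The suggestion above boils down to keeping the parameter in a field of a serializable, top-level function class. A minimal sketch in plain Java follows, with `java.util.function.Function` standing in for Flink's RichMapFunction so that it runs without Flink on the classpath; the class `ScaleBy` and its `factor` field are hypothetical names chosen for this example.

```java
import java.io.Serializable;
import java.util.function.Function;

// Standalone function class: the parameter is handed over once, through the constructor,
// and travels with the object wherever the function is shipped.
class ScaleBy implements Function<Double, Double>, Serializable {
    private static final long serialVersionUID = 1L;

    private final double factor; // hypothetical parameter

    ScaleBy(double factor) {
        this.factor = factor;
    }

    @Override
    public Double apply(Double value) {
        return value * factor;
    }

    public static void main(String[] args) {
        Function<Double, Double> scale = new ScaleBy(0.5);
        System.out.println(scale.apply(10.0)); // prints 5.0
    }
}
```

The object passed to the constructor has to be serializable itself, since the function instance is serialized as a whole before it reaches the workers.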

Re: Flink stops deploying jobs on normal iteration

2016-07-05 Thread Vasiliki Kalavri
Hi Truong, I'm afraid what you're experiencing is to be expected. Currently, for loops do not perform well in Flink since there is no support for caching intermediate results yet. This has been a quite often requested feature lately, so maybe it will be added soon :) Until then, I suggest you try

Re: Flink stops deploying jobs on normal iteration

2016-07-07 Thread Vasiliki Kalavri
does not support > that feature, so I wondered if you have a workround for interate or > iterateDelta? > > Thanks, > Truong > > On Tue, Jul 5, 2016 at 8:46 PM, Vasiliki Kalavri < > vasilikikala...@gmail.com> wrote: > >> Hi Truong, >> >> I'm afr

Re: Graph with stream of updates

2016-07-07 Thread Vasiliki Kalavri
Hi Milindu, as far as I know, there is currently no way to query the state from outside of Flink. That's a feature on the roadmap, but I'm not sure when it will be provided. Maybe someone else can give us an update. For now, you can either implement your queries inside your streaming job and output

Re: sampling function

2016-07-11 Thread Vasiliki Kalavri
Hi Do, Paris and Martha worked on sampling techniques for data streams on Flink last year. If you want to implement your own samplers, you might find Martha's master thesis helpful [1]. -Vasia. [1]: http://kth.diva-portal.org/smash/get/diva2:910695/FULLTEXT01.pdf On 11 July 2016 at 11:31, Kosta

Re: [ANNOUNCE] Flink 1.1.0 Released

2016-08-08 Thread Vasiliki Kalavri
yoo-hoo finally announced 🎉 Thanks for managing the release Ufuk! On 8 August 2016 at 18:36, Ufuk Celebi wrote: > The Flink PMC is pleased to announce the availability of Flink 1.1.0. > > On behalf of the PMC, I would like to thank everybody who contributed > to the release. > > The release anno

Re: Flink error: Too few memory segments provided

2016-10-20 Thread Vasiliki Kalavri
Also pay attention to the Flink version you are using. The configuration link you have provided points to an old version (0.8). Gelly wasn't part of Flink then :) You probably need to look in [1]. Cheers, -Vasia. [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.1/setup/config.html

Re: Flink error: Too few memory segments provided

2016-10-21 Thread Vasiliki Kalavri
Hi, On 21 October 2016 at 11:17, otherwise777 wrote: > I tried increasing the taskmanager.network.numberOfBuffers to 4k and > later to > 8k, i'm not sure if my configuration file is even read, it's stored inside > my IDE as follows: http://prntscr.com/cx0vrx > i buil

Re: Retrieving a single element from a DataSet

2016-11-05 Thread Vasiliki Kalavri
Hi all, @Wouter: I'm not sure I completely understand what you want to do, but would broadcast variables [1] help? @all: All-pairs-shortest-paths and betweenness centrality are very challenging algorithms to implement efficiently in a distributed way. APSP requires each vertex to store distances

Re: Too few memory segments provided. Hash Table needs at least 33 memory segments.

2016-11-15 Thread Vasiliki Kalavri
Hi Miguel, I'm sorry for the late reply; this e-mail got stuck in my spam folder. I'm glad that you've found a solution :) I've never used flink with docker, so I'm probably not the best person to advise you on this. However, if I understand correctly, you're changing the configuration before sub

Re: 33 segments problem with configuration set

2016-11-16 Thread Vasiliki Kalavri
Dear Wouter, first of all, as I noted in another thread already, betweenness centrality is an extremely demanding algorithm and a distributed data engine such as Flink is probably not the best system to implement it in. On top of that, the message-passing model for graph computations would gener

Re: Type of TypeVariable 'K' in 'class <> could not be determined

2016-11-17 Thread Vasiliki Kalavri
Hi Wouter, with InitVerticesMapper() are you trying to map the vertex value to a Tuple2 or to a Double? Your mapper is turning the vertex values into a Tuple2<> but your scatter-gather UDFs are defining Double vertex values. -Vasia. On 17 November 2016 at 14:03, otherwise777 wrote: > Hello tim

Re: Type of TypeVariable 'K' in 'class <> could not be determined

2016-11-18 Thread Vasiliki Kalavri
Hi Timo, thanks for looking into this! Are you referring to the 4th argument in [1]? Thanks, -Vasia. [1]: https://github.com/apache/flink/blob/master/flink-libraries/flink-gelly/src/main/java/org/apache/flink/graph/Graph.java#L506 On 18 November 2016 at 10:25, Timo Walther wrote: > I think

Re: Type of TypeVariable 'K' in 'class <> could not be determined

2016-11-18 Thread Vasiliki Kalavri
t know if it solve this problem but in general if the input > type is known it should be passed for input type inference. > > Am 18/11/16 um 11:28 schrieb Vasiliki Kalavri: > > Hi Timo, > > thanks for looking into this! Are you referring to the 4th argument in [1]? > > Thanks,

Re: Executing graph algorithms on Gelly that are larger then memmory

2016-11-30 Thread Vasiliki Kalavri
Hi, can you give us some more details about the algorithm you are testing and your configuration? Flink DataSet operators like join, coGroup, reduce, etc. spill to disk if there is not enough memory. If you are using a delta iteration operator though, the state that is kept across iterations (sol

Re: Apache Flink 1.1.4 - Java 8 - CommunityDetection.java:158 - java.lang.NullPointerException

2017-01-15 Thread Vasiliki Kalavri
Hi Miguel, this is a bug, thanks a lot for reporting! I think the problem is that the implementation assumes that labelsWithHighestScores contains the vertex itself as initial label. Could you please open a JIRA ticket for this and attach your code and data as an example to reproduce? We should a

Re: Apache Flink 1.1.4 - Java 8 - CommunityDetection.java:158 - java.lang.NullPointerException

2017-01-16 Thread Vasiliki Kalavri
Hi Miguel, thank you for opening the issue! Changes/improvements to the documentation are also typically handled with JIRAs and pull requests [1]. Would you like to give it a try and improve the community detection docs? Cheers, -Vasia. [1]: https://flink.apache.org/contribute-documentation.html

Re: Apache Flink 1.1.4 - Java 8 - CommunityDetection.java:158 - java.lang.NullPointerException

2017-01-18 Thread Vasiliki Kalavri
Great! Let us know if you need help. -Vasia. On 17 January 2017 at 10:30, Miguel Coimbra wrote: > Hello Vasia, > > I am going to look into this. > Hopefully I will contribute to the implementation and documentation. > > Regards, > > -- Forwarded message ---

Re: Apache Flink 1.1.4 - Gelly - LocalClusteringCoefficient - Returning values above 1?

2017-01-20 Thread Vasiliki Kalavri
Hi Miguel, the LocalClusteringCoefficient algorithm returns a DataSet of type Result, which basically wraps a vertex id, its degree, and the number of triangles containing this vertex. The number 11 you see is indeed the degree of vertex 5113. The Result type contains the method getLocalClustering
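
For reference, the textbook local clustering coefficient of a vertex in a simple undirected graph can be derived from exactly the two numbers mentioned above, the degree and the triangle count. The sketch below uses that standard definition and is not taken from the Gelly source; the class `LocalClustering`, the method `score`, and the choice to return NaN for degrees below 2 are assumptions of this example.

```java
// Hypothetical helper: textbook local clustering coefficient for undirected graphs.
class LocalClustering {

    // triangleCount / (number of distinct neighbor pairs), i.e. T / (d * (d - 1) / 2).
    static double score(long degree, long triangleCount) {
        if (degree < 2) {
            return Double.NaN; // no neighbor pairs exist; this sketch reports "undefined"
        }
        long neighborPairs = degree * (degree - 1) / 2;
        return (double) triangleCount / neighborPairs;
    }
}
```

A vertex with degree 4 that sits in 3 triangles scores 3 / 6 = 0.5. A vertex of degree 11 has 55 neighbor pairs, so on a simple undirected graph its score cannot exceed 1; values above 1 suggest the input does not meet that assumption.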

Re: Apache Flink 1.1.4 - Gelly - LocalClusteringCoefficient - Returning values above 1?

2017-01-23 Thread Vasiliki Kalavri
>> The '--output print' option describes the values and also displays the >> local clustering coefficient value. >> >> You're running the undirected algorithm on a directed graph. In 1.2 there >> is an option '--simplify true' that will add rev

Re: Questions about the V-C Iteration in Gelly

2017-02-09 Thread Vasiliki Kalavri
Hi Xingcan, On 7 February 2017 at 10:10, Xingcan Cui wrote: > Hi all, > > I got some question about the vertex-centric iteration in Gelly. > > a) It seems the postSuperstep method is called before the superstep > barrier (I got different aggregate values of the same superstep in this > method).

Re: Questions about the V-C Iteration in Gelly

2017-02-10 Thread Vasiliki Kalavri
26(MST Lib&Example). Considering the > complexity, the example is not > provided.) > > Really appreciate for all your help. > > Best, > Xingcan > > On Thu, Feb 9, 2017 at 5:36 PM, Vasiliki Kalavri < > vasilikikala...@gmail.com> wrote: > >> Hi Xingcan

Re: Questions about the V-C Iteration in Gelly

2017-02-13 Thread Vasiliki Kalavri
es become inactive in last phase, >>>> it could be hard to reactive them again by message since we even don't know >>>> which vertices to send to. The only solution is to keep all vertices >>>> active, whether by updating vertices values in each super step

Re: Questions about the V-C Iteration in Gelly

2017-02-14 Thread Vasiliki Kalavri
> in FLINK-1526, though with an ugly format). Now everything's clear and I > think this thread should be closed here. > > Thanks. @Vasia @Greg > > Best, > Xingcan > > On Tue, Feb 14, 2017 at 3:55 PM, Vasiliki Kalavri < > vasilikikala...@gmail.com> wrote: > >

Re: Graph iteration with triplets or access to edges

2017-04-28 Thread Vasiliki Kalavri
Hi Marc, you can access the edge values inside the ScatterFunction using the getEdges() method. For an example look at SingleSourceShortestPaths [1] which sums up edge values to compute distances. I hope that helps! -Vasia. [1]: https://github.com/apache/flink/blob/master/flink-libraries/flink-g

Re: Gelly - bipartite graph runs vertex-centric

2017-06-26 Thread Vasiliki Kalavri
Hi Marc, the BipartiteGraph type doesn't support vertex-centric iterations yet. You can either represent your bipartite graph using the Graph type and e.g. having an extra attribute in the vertex value to distinguish between top and bottom vertices or define your own custom delta iteration on top

Re: Graph Analytics on HBase With HGraphDB and Apache Flink Gelly

2017-07-27 Thread Vasiliki Kalavri
Thank you for sharing! On 28 July 2017 at 05:01, Robert Yokota wrote: > Also Google Cloud Bigtable has such a page at https://cloud.google.com/ > bigtable/docs/integrations > > On Thu, Jul 27, 2017 at 6:57 PM, Robert Yokota wrote: > >> >> One thing I really appreciate about HBase is its flexibi

Re: flink loop

2015-02-05 Thread Vasiliki Kalavri
Hi, I'm not familiar with the particular algorithm, but you can most probably use one of the two iterate operators in Flink. You can read a description and see some examples in the documentation: http://flink.apache.org/docs/0.8/programming_guide.html#iteration-operators Let us know if you have

Re: DeltaIterations: shrink solution set

2015-02-10 Thread Vasiliki Kalavri
Hi, It's hard to tell without details about your algorithm, but what you're describing sounds to me like something you can use the workset for. -V. On Feb 10, 2015 6:54 PM, "Alexander Alexandrov" < alexander.s.alexand...@gmail.com> wrote: > I am not sure whether this is supported at the moment.

Re: Multiple sources shortest path

2015-02-15 Thread Vasiliki Kalavri
Hi, you can certainly use a for-loop like this to run SSSP several times. Just make sure you return or store the result of the computation for each source, by adding a data sink e.g.: for (id : Ids) { graph.run(new SingleSourceShortestPaths(id, maxIterations)) .getVertices().print();
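
The pattern described above, one shortest-paths run per source with each result kept, can be sketched in plain Java as follows. This is a stand-alone illustration that uses Dijkstra's algorithm on an in-memory adjacency map rather than Gelly's SingleSourceShortestPaths; the class `MultiSourceShortestPaths`, its method names, and the graph representation are assumptions of this sketch.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stand-in: runs single-source shortest paths once per source id.
class MultiSourceShortestPaths {

    // Dijkstra over a weighted adjacency map (vertex -> neighbor -> non-negative weight).
    static Map<Long, Double> shortestPaths(Map<Long, Map<Long, Double>> graph, long source) {
        Map<Long, Double> dist = new HashMap<>();
        PriorityQueue<Map.Entry<Long, Double>> queue =
                new PriorityQueue<>(Map.Entry.<Long, Double>comparingByValue());
        dist.put(source, 0.0);
        queue.add(new SimpleEntry<>(source, 0.0));
        while (!queue.isEmpty()) {
            Map.Entry<Long, Double> head = queue.poll();
            long u = head.getKey();
            if (head.getValue() > dist.get(u)) {
                continue; // stale queue entry, a shorter path was already found
            }
            for (Map.Entry<Long, Double> edge
                    : graph.getOrDefault(u, Collections.emptyMap()).entrySet()) {
                double candidate = head.getValue() + edge.getValue();
                if (candidate < dist.getOrDefault(edge.getKey(), Double.POSITIVE_INFINITY)) {
                    dist.put(edge.getKey(), candidate);
                    queue.add(new SimpleEntry<>(edge.getKey(), candidate));
                }
            }
        }
        return dist; // unreachable vertices are simply absent
    }

    // The for-loop from the reply: one run per source, each result stored under its source id.
    static Map<Long, Map<Long, Double>> runForSources(
            Map<Long, Map<Long, Double>> graph, List<Long> sources) {
        Map<Long, Map<Long, Double>> results = new LinkedHashMap<>();
        for (long source : sources) {
            results.put(source, shortestPaths(graph, source));
        }
        return results;
    }
}
```

Storing each result under its source id plays the role of the data sink in the reply: without it, the outcome of every run but the last would be lost.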

Re: Can a master class control the superstep in Flink Spargel ?

2015-02-15 Thread Vasiliki Kalavri
Hi, currently, there is no such built-in master compute class, but you can easily get the equivalent functionality as follows: - If your algorithm has a fixed pattern of superstep types, e.g. an initialization superstep, a main phase and a finalization superstep, then you can simply chain the

Re: Using Spargel's FilterOnVerices gets stuck.

2015-02-18 Thread Vasiliki Kalavri
Hi Hung, can you share some details on your algorithm and dataset? I could not reproduce this by just running a filterOnVertices on large input. Thank you, Vasia. On 18 February 2015 at 19:03, HungChang wrote: > Hi, > > I have a question about generating the sub-graph using Spargel API. > We u

Re: Using Spargel's FilterOnVerices gets stuck.

2015-02-18 Thread Vasiliki Kalavri
Hi Hung, I am under the impression that circular dependencies like the one you are describing are not allowed in the Flink execution graph. I would actually expect something like this to cause an error. Maybe someone else can elaborate on that? In any case, the proper way to write iterative prog

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
Hi Mihail, Robert, I've tried reproducing this, but I couldn't. I'm using the same twitter input graph from SNAP that you link to and also Scala IDE. The job finishes without a problem (both the SSSP example from Gelly and the unweighted version). The only thing I changed to run your version was

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
cation but > that it will work if you throw enough memory at it. > > Or did your setup succeed with an amount of memory comparable to Mihail's > and mine? > > My main point is that it shouldn't take 10x more memory than the input > size for such a job. > > Cheers,

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
630 > > I need the vertices to be generated from a file for my future work. > > Cheers, > Mihail > > > > On 18.03.2015 17:04, Vasiliki Kalavri wrote: > > Hi Mihail, Robert, > > I've tried reproducing this, but I couldn't. > I'm using t

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
workaround. > > An odd thing occurs now though. The distances aren't computed correctly > for the SNAP graph and remain the one set in InitVerticesMapper(). For the > small graph in SSSPDataUnweighted they are OK. I'm currently investigating > this behavior. > > Cheer

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
the small > graph (also read from files) but not for the larger one. > The messages appear to be wrong in the latter case. > > > On 18.03.2015 21:55, Vasiliki Kalavri wrote: > > hmm, I'm starting to run out of ideas... > What's your source ID parameter? I ran mine wi

Re: RuntimeException Gelly API: Memory ran out. Compaction failed.

2015-03-18 Thread Vasiliki Kalavri
clear everything out :-) Cheers, V. On 18 March 2015 at 23:44, Vasiliki Kalavri wrote: > Well, one thing I notice is that your vertices and edges args are flipped. > Might be the source of error :-) > > On 18 March 2015 at 23:04, Mihail Vieru > wrote: > >> I'm a

Re: Gelly available already?

2015-03-24 Thread Vasiliki Kalavri
Hi all, there is no Scala API for Gelly yet and no corresponding JIRA either. It's definitely in our plans, just not for 0.9 :-) Cheers, -V. On 24 March 2015 at 00:21, Henry Saputra wrote: > Any JIRA filed to add Scala counterparts for Gelly? > > - Henry > > On Mon, Mar 23, 2015 at 3:44 PM, An

Re: ArrayIndexOutOfBoundsException when running job from JAR

2015-06-26 Thread Vasiliki Kalavri
Hi Mihail, could you share your code or at least the implementations of getVerticesDataSet() and InitVerticesMapper so I can take a look? Where is InitVerticesMapper called above? Cheers, Vasia. On 26 June 2015 at 10:51, Mihail Vieru wrote: > Hi Robert, > > I'm using the same input data, as

Re: ArrayIndexOutOfBoundsException when running job from JAR

2015-06-28 Thread Vasiliki Kalavri
maxIterations, parameters);* > *}* > > I'll send you the full code via a private e-mail. > > Cheers, > Mihail > > > On 26.06.2015 11:10, Vasiliki Kalavri wrote: > > Hi Mihail, > > could you share your code or at least the implementa

Re: ArrayIndexOutOfBoundsException when running job from JAR

2015-06-29 Thread Vasiliki Kalavri
e at least two JVMs > involved, and code running in the JM/TM can not access the value from the > static variable in the Cli frontend. > > On Sun, Jun 28, 2015 at 9:43 PM, Vasiliki Kalavri < > vasilikikala...@gmail.com> wrote: > >> Hi everyone, >> >> Mihail

Re: Benchmark results between Flink and Spark

2015-07-06 Thread Vasiliki Kalavri
Hi, Apart from the amplab benchmark, you might also find [1] and [2] interesting. The first is a survey on existing benchmarks, while the second proposes one. However, they are also limited to SQL-like queries. Regarding graph processing benchmarks, I recently came across Graphalytics [3]. The be

Re: Gelly forward

2015-07-08 Thread Vasiliki Kalavri
Hi Flavio! Are you talking about vertex-centric iterations in gelly? If yes, you can send messages to a particular vertex with "sendMessageTo(vertexId, msg)" and to all neighbors with "sendMessageToAllNeighbors(msg)". These methods are available inside the MessagingFunction. Accessing received me

Re: Gelly forward

2015-07-08 Thread Vasiliki Kalavri
ages (destination, message) in each vertex and reset it in the > postSuperstep() of the VertexUpdateFunction? > > On Wed, Jul 8, 2015 at 9:38 AM, Vasiliki Kalavri < > vasilikikala...@gmail.com> wrote: > >> Hi Flavio! >> >> Are you talking about vertex-centric iterations in ge

Re: Gelly forward

2015-07-08 Thread Vasiliki Kalavri
and wait for the response. The problem is that node 3 for > example, once queried for property containedIn.name from node 1 it just > has to forward this path to node 4 and tell 4 to reply to 1. > > Is that possible? > > > On Wed, Jul 8, 2015 at 10:19 AM, Vasiliki Kalavri <

Re: Gelly forward

2015-07-08 Thread Vasiliki Kalavri
i.e. it > knows all tuples belonging to it..). > So in Vertex 1 I have a field (an HashMap) containing the following info: > >- type=Person >- livesIn=2 (and I know also that 2 is a vertexId) > > In Vertex 3 I know: > >- type=Place >- name=Berlin >-

Re: Gelly EOFException

2015-07-18 Thread Vasiliki Kalavri
Hi Flavio, Gelly currently makes no sanity checks regarding the input graph data. We decided to leave it to the user to check that they have a valid graph, for performance reasons. That means that there might exist Gelly methods that assume that your input graph is valid, i.e. no duplicate vertice

Re: Containment Join Support

2015-07-18 Thread Vasiliki Kalavri
Hi Martin, I'm really glad to see that you've started using Gelly :) I think that a graph summarization library method would be a great addition! Let me know if you need help and if you want to discuss ideas or other methods. Cheers, Vasia. On 17 July 2015 at 12:25, Martin Junghanns wrote: >

Re: Too few memory segments provided exception

2015-07-20 Thread Vasiliki Kalavri
Hi Shivani, why are you using a vertex-centric iteration to compute the approximate Adamic-Adar? It's not an iterative computation :) In fact, it should be as complex (in terms of operators) as the exact Adamic-Adar, only more efficient because of the different neighborhood representation. Are yo

Re: Too few memory segments provided exception

2015-07-20 Thread Vasiliki Kalavri
is example, I am > curious how we should handle it in the future. > > On Mon, Jul 20, 2015 at 3:15 PM, Vasiliki Kalavri < > vasilikikala...@gmail.com> wrote: > >> Hi Shivani, >> >> why are you using a vertex-centric iteration to compute the approximate >>

Re: Too few memory segments provided exception

2015-07-20 Thread Vasiliki Kalavri
>> I thought we agreed that the BloomFilters are to be sent as messages to >> the vertices? >> >> The exact version is passing all the tests. >> >> On removing the final GroupReduce the program is working but I need it to >> add the Partial Adamic Adar edges