Hi Hung,

can you share some details on your algorithm and dataset?
I could not reproduce this by just running filterOnVertices on a large input.

Thank you,
Vasia

On 18 February 2015 at 19:03, HungChang <unicorn.bana...@gmail.com> wrote:

> Hi,
>
> I have a question about generating a sub-graph using the Spargel API.
> We use filterOnVertices to generate it.
> With 30MB of edges, the code gets stuck at Join(Join at filterOnVertices).
> With 2MB of edges, the code doesn't have this issue.
>
> Log
>
> ------------------------------------------------------------------------------------------------------------------
> 02/18/2015 10:34:23: Join(Join at filterOnVertices(Graph.java:615)) (7/20) switched to FINISHED
> 02/18/2015 10:34:23: Join(Join at filterOnVertices(Graph.java:615)) (12/20) switched to FINISHED
> 02/18/2015 10:34:23: Join(Join at filterOnVertices(Graph.java:615)) (14/20) switched to FINISHED
> 02/18/2015 10:34:23: Join(Join at filterOnVertices(Graph.java:615)) (17/20) switched to FINISHED
> 02/18/2015 10:34:23: Join(Join at filterOnVertices(Graph.java:615)) (20/20) switched to FINISHED
> 02/18/2015 10:34:23: Join(Join at filterOnVertices(Graph.java:615)) (13/20) switched to FINISHED
> 02/18/2015 10:34:24: Join(Join at filterOnVertices(Graph.java:615)) (8/20) switched to FINISHED
> 02/18/2015 10:34:24: Join(Join at filterOnVertices(Graph.java:615)) (2/20) switched to FINISHED
> 02/18/2015 10:34:24: Join(Join at filterOnVertices(Graph.java:615)) (3/20) switched to FINISHED
> 02/18/2015 10:34:24: Join(Join at filterOnVertices(Graph.java:615)) (19/20) switched to FINISHED
> 02/18/2015 10:34:24: Join(Join at filterOnVertices(Graph.java:615)) (16/20) switched to FINISHED
>
> It takes more than 10 minutes to continue, while the other operators complete
> in seconds.
> From the log, it looks like some workers finish and some don't.
>
> The Spargel API shows that it uses join twice, so this operator looks a bit
> expensive.
> Could this be the reason the job gets stuck?
> Our goal in using filterOnVertices is to use the sub-graph as the input for
> the next iteration.
>
> ------------------------------------------------------------------------------------------------------------------
> public Graph<K, VV, EV> filterOnVertices(FilterFunction<Vertex<K, VV>> vertexFilter) {
>
>     DataSet<Vertex<K, VV>> filteredVertices = this.vertices.filter(vertexFilter);
>
>     DataSet<Edge<K, EV>> remainingEdges = this.edges.join(filteredVertices)
>             .where(0).equalTo(0)
>             .with(new ProjectEdge<K, VV, EV>())
>             .join(filteredVertices).where(1).equalTo(0)
>             .with(new ProjectEdge<K, VV, EV>());
>
>     return new Graph<K, VV, EV>(filteredVertices, remainingEdges, this.context);
> }
>
> Best regards,
>
> Hung
>
> --
> View this message in context:
> http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/Using-Spargel-s-FilterOnVerices-gets-stuck-tp743.html
> Sent from the Apache Flink (Incubator) User Mailing List archive. mailing
> list archive at Nabble.com.
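
For reference, the effect of the two joins in the quoted filterOnVertices can be sketched without Flink, using plain Java collections. This is only an illustration of the semantics (an edge is kept only if both its source and its target survive the vertex filter), not the library's implementation, and it says nothing about why the distributed join stalls. The names SubgraphSketch, Edge, filterVertices and remainingEdges are hypothetical and introduced here for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class SubgraphSketch {

    /** Hypothetical minimal edge type: a (source, target) pair of vertex ids. */
    public static final class Edge {
        public final long source;
        public final long target;

        public Edge(long source, long target) {
            this.source = source;
            this.target = target;
        }
    }

    /** Keeps the ids of the vertices accepted by the filter. */
    public static Set<Long> filterVertices(List<Long> vertices, Predicate<Long> keep) {
        Set<Long> kept = new HashSet<>();
        for (Long v : vertices) {
            if (keep.test(v)) {
                kept.add(v);
            }
        }
        return kept;
    }

    /**
     * Keeps an edge only if both endpoints are in keptVertices. The first
     * membership check plays the role of the join on field 0 (source), the
     * second the role of the join on field 1 (target).
     */
    public static List<Edge> remainingEdges(List<Edge> edges, Set<Long> keptVertices) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            if (keptVertices.contains(e.source) && keptVertices.contains(e.target)) {
                result.add(e);
            }
        }
        return result;
    }
}
```

With vertices 1 to 4, edges 1->2, 2->3, 3->4 and a filter that keeps ids up to 3, only 1->2 and 2->3 remain, because 3->4 loses its target.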