Re: GraphX Connected Components

2016-11-08 Thread Robineast
context: http://apache-spark-user-list.1001560.n3.nabble.com/GraphX-Connected-Components-tp10869p28049.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe e-mail: user-unsubscr

RE: Question about GraphX connected-components

2015-10-12 Thread John Lilley
needed. Arriving at the answer through experimentation isn’t a good approach, because that assumes -- chicken-and-egg problem -- that we have already arrived at an optimal configuration. -- Does GraphX connected-components performance degrade slowly or catastrophically when that memory limit is

Re: Question about GraphX connected-components

2015-10-10 Thread Igor Berman
015 at 00:13, John Lilley wrote:
> Greetings,
>
> We are looking into using the GraphX connected-components algorithm on
> Hadoop for grouping operations. Our typical data is on the order of
> 50-200M vertices with an edge:vertex ratio between 2 and 30. While there
> are pathologic

Question about GraphX connected-components

2015-10-09 Thread John Lilley
Greetings, We are looking into using the GraphX connected-components algorithm on Hadoop for grouping operations. Our typical data is on the order of 50-200M vertices with an edge:vertex ratio between 2 and 30. While there are pathological cases of very large groups, they tend to be small. I

Re: SparkR -Graphx Connected components

2015-08-11 Thread Robineast
respectively. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-Graphx-Connected-components-tp24165p24209.html

Re: SparkR -Graphx Connected components

2015-08-09 Thread smagadi
een 6,0 (3,3)-OK (7,7)-This should have been 7,3 (5,3)-OK (2,0)-OK -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-Graphx-Connected-components-tp24165p24190.html

Re: SparkR -Graphx Connected components

2015-08-07 Thread Robineast
Manning Publications Co. http://www.manning.com/malak/ -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-Graphx-Connected-components-tp24165p24166.html

SparkR -Graphx Connected components

2015-08-07 Thread smagadi
Id, Int]= graph.stronglyConnectedComponents(10). Help needed in completing the code. I do not know from here on how to get the strongly connected nodes. Please help in completing this code. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-Graphx-Conn
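
Editor's sketch for the question above: in GraphX, the graph returned by `stronglyConnectedComponents` carries, as each vertex's attribute, the lowest vertex ID in that vertex's component, so its vertices are `(vertexId, componentId)` pairs. Listing the members of each component is then a group-by on the second element. The block below shows that step in plain Python, not Spark; `group_by_component` and the sample pairs are hypothetical and are not taken from the thread.

```python
from collections import defaultdict


def group_by_component(labelled):
    """Group (vertex_id, component_id) pairs into {component_id: sorted members}."""
    groups = defaultdict(list)
    for vertex_id, component_id in labelled:
        groups[component_id].append(vertex_id)
    return {comp: sorted(members) for comp, members in groups.items()}


# Hypothetical sample in the (vertexId, componentId) shape described above.
sample = [(1, 1), (2, 1), (3, 1), (4, 4), (5, 4), (6, 6)]

components = group_by_component(sample)

# Components with more than one vertex are the non-trivial ones;
# a vertex labelled with its own ID and no peers is a singleton.
non_trivial = {comp: members for comp, members in components.items() if len(members) > 1}
```

In Spark the analogous step would presumably be a group-by on the component ID over the result's `vertices`; that part is an assumption here and is not shown in the thread.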

java.nio.channels.CancelledKeyException in Graphx Connected Components

2014-08-18 Thread Jeffrey Picard
Hey all, I’m trying to run connected components in GraphX on about 400GB of data on 50 m3.xlarge nodes on EMR. I keep getting java.nio.channels.CancelledKeyException when it gets to “mapPartitions at VertexRDD.scala:347”. I haven’t been able to find much about this online, and nothing that seem

Re: GraphX Connected Components

2014-07-30 Thread Ankur Dave
On Wed, Jul 30, 2014 at 11:32 PM, Jeffrey Picard wrote:
> That worked! The entire thing ran in about an hour and a half, thanks!
Great!
> Is there by chance an easy way to build spark apps using the master branch
> build of spark? I’ve been having to use the spark-shell.
The easiest way is pro

Re: GraphX Connected Components

2014-07-30 Thread Jeffrey Picard
On Jul 30, 2014, at 4:39 PM, Ankur Dave wrote:
> Jeffrey Picard writes:
>> I tried unpersisting the edges and vertices of the graph by hand, then
>> persisting the graph with persist(StorageLevel.MEMORY_AND_DISK). I still see
>> the same behavior in connected components however, and the same th

Re: GraphX Connected Components

2014-07-30 Thread Ankur Dave
Jeffrey Picard writes:
> I tried unpersisting the edges and vertices of the graph by hand, then
> persisting the graph with persist(StorageLevel.MEMORY_AND_DISK). I still see
> the same behavior in connected components however, and the same thing you
> described in the storage page.
Unfortunately

Re: GraphX Connected Components

2014-07-30 Thread Jeffrey Picard
On Jul 30, 2014, at 5:18 AM, Ankur Dave wrote:
> Jeffrey Picard writes:
>> As the program runs I’m seeing each iteration take longer and longer to
>> complete, this seems counter intuitive to me, especially since I am seeing
>> the shuffle read/write amounts decrease with each iteration. I wo

Re: GraphX Connected Components

2014-07-30 Thread Ankur Dave
Jeffrey Picard writes:
> As the program runs I’m seeing each iteration take longer and longer to
> complete, this seems counter intuitive to me, especially since I am seeing
> the shuffle read/write amounts decrease with each iteration. I would think
> that as more and more vertices converged t

GraphX Connected Components

2014-07-29 Thread Jeffrey Picard
Hey all, I’m currently trying to run connected components using GraphX on a large graph (~1.8b vertices and ~3b edges, most of them are self edges where the only edge that exists for vertex v is v->v) on EMR using 50 m3.xlarge nodes. As the program runs I’m seeing each iteration take longer and
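
Editor's sketch of why one might expect per-iteration work to shrink: connected components is commonly computed by iterative minimum-label propagation, where every vertex starts with its own ID and only vertices whose label changed in one round send messages in the next. The plain-Python block below is a toy model under that assumption, not GraphX's implementation, and it says nothing about caching or lineage costs inside Spark. `connected_components` and the sample graph are hypothetical.

```python
def connected_components(vertices, edges):
    """Label each vertex with the smallest vertex ID reachable from it.

    Returns (labels, active_per_round), where active_per_round records how
    many vertices were still sending messages at the start of each round.
    """
    label = {v: v for v in vertices}
    neighbours = {v: set() for v in vertices}
    for src, dst in edges:
        if src != dst:  # a self edge v->v can never change v's label
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    active = set(vertices)
    active_per_round = []
    while active:
        active_per_round.append(len(active))
        # Each active vertex offers its label to neighbours holding a larger one.
        proposals = {}
        for v in active:
            for n in neighbours[v]:
                if label[v] < label[n]:
                    proposals[n] = min(proposals.get(n, label[n]), label[v])
        # Apply all updates at once so the round behaves synchronously.
        for n, new_label in proposals.items():
            label[n] = new_label
        # Only vertices whose label changed stay active in the next round.
        active = set(proposals)
    return label, active_per_round


# Hypothetical graph: a chain 1-2-3-4 plus two vertices that only have self edges.
labels, active_per_round = connected_components(
    [1, 2, 3, 4, 5, 6],
    [(1, 2), (2, 3), (3, 4), (5, 5), (6, 6)],
)
```

In this toy model the vertices with only self edges drop out after the first round, and the active set keeps shrinking as the chain converges.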