Re: Graphx : Perfomance comparison over cluster

2014-07-23 Thread ShreyanshB
ed in benchmarking it. It'd be great if you could tell me how to configure and invoke this Spark version. On Sun, Jul 20, 2014 at 9:02 PM, ankurdave [via Apache Spark User List] <ml-node+s1001560n10281...@n3.nabble.com> wrote: > On Fri, Jul 18, 2014 at 9:07 PM, ShreyanshB <

Re: Graphx : Perfomance comparison over cluster

2014-07-18 Thread ShreyanshB
then, and the way to configure and invoke Spark is different. I can send you the correct configuration/invocation for this if you're interested in benchmarking it. On Fri, Jul 18, 2014 at 7:14 PM, ShreyanshB <[hidden email]>

Graphx : Perfomance comparison over cluster

2014-07-18 Thread ShreyanshB
Hi, I am trying to compare GraphX with other distributed graph processing systems (GraphLab) on my cluster of 64 nodes, each with 32 cores and connected over InfiniBand. I looked at http://arxiv.org/pdf/1402.2394.pdf and the stats provided there. I had a few questions regarding configura

Re: Graphx : optimal partitions for a graph and error in logs

2014-07-11 Thread ShreyanshB
Perfect! Thanks Ankur. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-optimal-partitions-for-a-graph-and-error-in-logs-tp9455p9488.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Graphx : optimal partitions for a graph and error in logs

2014-07-11 Thread ShreyanshB
Great! Thanks a lot. I hate to say this, but I promise this is the last quick question. I looked at the configurations but didn't find any parameter to tune for network bandwidth, i.e., is there any way to tell GraphX (Spark) that I'm using a 1G network, a 10G network, or InfiniBand? Does it figure out on its ow

Re: Graphx : optimal partitions for a graph and error in logs

2014-07-11 Thread ShreyanshB
Thanks a lot Ankur, I'll follow that. A last quick question: does that error affect performance? ~Shreyansh

Graphx : optimal partitions for a graph and error in logs

2014-07-11 Thread ShreyanshB
Hi, I am trying GraphX on the LiveJournal data. I have a cluster of 17 computing nodes: 1 master and 16 workers. I have a few questions about this. * I built Spark from spark-master (to avoid the partitionBy error in Spark 1.0). * I am using edgeListFile() to load data and I figured I need to provide part