Re: GraphX graph partitioning strategy

2014-09-17 Thread Larry Xiao
Hi Ankur, all, I've implemented a few graph partitioning algorithms and done some evaluation. The goal is to lower the replication factor and produce a better-balanced graph, so that the workload is more evenly balanced. Detailed description and results: https://issues.apache.org/jira/browse/SPARK-3523 Can you
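The two goals named above, a lower replication factor and better balance, can be made concrete with a small metric sketch. This is a plain-Scala illustration that does not use Spark or GraphX; the names `PartitionQualitySketch`, `PlacedEdge`, `replicationFactor`, and `edgeImbalance` are hypothetical, and the exact metric definitions used in SPARK-3523 may differ.

```scala
// Plain-Scala sketch; no Spark required. All names are illustrative assumptions.
object PartitionQualitySketch {
  type VertexId = Long
  type PartitionId = Int

  // An edge together with the partition it was assigned to.
  final case class PlacedEdge(src: VertexId, dst: VertexId, part: PartitionId)

  // Average number of partitions in which each vertex appears.
  // 1.0 means no vertex is replicated.
  def replicationFactor(edges: Seq[PlacedEdge]): Double = {
    val replicas: Map[VertexId, Set[PartitionId]] = edges
      .flatMap(e => Seq(e.src -> e.part, e.dst -> e.part))
      .groupBy(_._1)
      .map { case (v, pairs) => v -> pairs.map(_._2).toSet }
    if (replicas.isEmpty) 0.0
    else replicas.values.map(_.size).sum.toDouble / replicas.size
  }

  // Size of the largest partition divided by the mean partition size.
  // 1.0 means the edges are perfectly balanced.
  def edgeImbalance(edges: Seq[PlacedEdge], numParts: Int): Double = {
    require(numParts > 0, "numParts must be positive")
    if (edges.isEmpty) 1.0
    else {
      val sizes = edges.groupBy(_.part).values.map(_.size)
      sizes.max.toDouble / (edges.size.toDouble / numParts)
    }
  }
}
```

Both functions take the already-placed edges, so any partitioning scheme can be scored the same way and compared on the two numbers.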

Re: GraphX graph partitioning strategy

2014-07-25 Thread Larry Xiao
On 7/26/14, 4:03 AM, Ankur Dave wrote: Oops, the code should be:
    val unpartitionedGraph: Graph[Int, Int] = ...
    val numPartitions: Int = 128
    def getTripletPartition(e: EdgeTriplet[Int, Int]): PartitionID = ...
    // Get the triplets using GraphX, then use Spark to repartition them
    val partitionedEdges

Re: GraphX graph partitioning strategy

2014-07-25 Thread Ankur Dave
Oops, the code should be:
    val unpartitionedGraph: Graph[Int, Int] = ...
    val numPartitions: Int = 128
    def getTripletPartition(e: EdgeTriplet[Int, Int]): PartitionID = ...
    // Get the triplets using GraphX, then use Spark to repartition them
    val partitionedEdges = unpartitionedGraph.triplets .map(e =

Re: GraphX graph partitioning strategy

2014-07-25 Thread Ankur Dave
Hi Larry, GraphX's graph constructor leaves the edges in their original partitions by default. To support arbitrary multipass graph partitioning, one idea is to take advantage of that by partitioning the graph externally to GraphX (though possibly using information from GraphX such as the degrees)
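The multipass idea, gathering information such as degrees first and then using it to place edges, can be illustrated without Spark. The sketch below is an assumption-laden illustration, not code from this thread: it uses plain Scala collections, the names `TwoPassPartitionSketch`, `degrees`, and `assign` are hypothetical, and the heuristic (hash each edge by its lower-degree endpoint) is only one degree-based choice picked for illustration.

```scala
// Plain-Scala sketch of a two-pass, degree-aware edge assignment.
// No Spark required; all names and the heuristic itself are assumptions.
object TwoPassPartitionSketch {
  type VertexId = Long
  type Edge = (VertexId, VertexId)

  // Pass 1: count how many edge endpoints touch each vertex.
  def degrees(edges: Seq[Edge]): Map[VertexId, Int] =
    edges
      .flatMap { case (src, dst) => Seq(src, dst) }
      .groupBy(identity)
      .map { case (v, occurrences) => v -> occurrences.size }

  // Pass 2: place each edge according to its lower-degree endpoint,
  // so that high-degree vertices are the ones replicated across partitions.
  def assign(edges: Seq[Edge], numParts: Int): Seq[(Edge, Int)] = {
    require(numParts > 0, "numParts must be positive")
    val deg = degrees(edges)
    edges.map { case (src, dst) =>
      val anchor = if (deg(src) <= deg(dst)) src else dst
      // floorMod keeps the result in [0, numParts) even for negative IDs.
      val part = java.lang.Math.floorMod(anchor, numParts.toLong).toInt
      ((src, dst), part)
    }
  }
}
```

In a GraphX setting, a partition ID computed this way would play the role of the key used to repartition the edges with Spark before the graph is constructed, since the constructor leaves edges where they already are.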

GraphX graph partitioning strategy

2014-07-24 Thread Larry Xiao
Hi all, I'm implementing a graph partitioning strategy for GraphX, drawing on research in graph computing. I have two questions: - A specific implementation question: in the current design, only the vertex IDs of src and dst are provided (PartitionStrategy.scala). And some strategies require knowledge