It looks like you are adding vertices one by one; you definitely don’t want to 
do that. What happens when you batch the 400 vertices into a single RDD and 
then add all 400 in one go?
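
Here is a minimal sketch of what I mean by batching. It is untested against 
your setup: the helper name addVertices is made up for illustration, and I am 
assuming your edges are an RDD[Edge[Int]], since your message does not say what 
the edge attribute type is. The (100, 100) attribute and the (0, 0) default are 
copied from your method.

```scala
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical helper (not part of GraphX): builds all new vertices as ONE RDD
// and unions it with the existing vertices once, instead of once per vertex.
def addVertices(
    ids: RDD[String],                        // new vertex ids, as in the original method
    vertices: RDD[(VertexId, (Int, Int))],   // existing vertices
    edges: RDD[Edge[Int]]                    // ASSUMPTION: edge attribute type is Int
): Graph[(Int, Int), Int] = {
  val defaultUser = (0, 0)

  // map() runs on the executors: no collect() to the driver and no
  // per-vertex sc.parallelize() call.
  val newVertices: RDD[(VertexId, (Int, Int))] =
    ids.map(id => (id.toLong, (100, 100)))

  // A single union for the whole batch.
  val graph = Graph(vertices.union(newVertices), edges, defaultUser)
  graph.cache()
  graph
}
```

The point is that union() is called once per batch, so the lineage of the 
vertex RDD grows by one step per batch rather than one step per vertex.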
-------------------------------------------------------------------------------
Robin East
Spark GraphX in Action Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action 

> On 24 Feb 2016, at 05:49, Udbhav Agarwal <udbhav.agar...@syncoms.com> wrote:
> 
> Thank you Robin for your reply.
> Actually, I am adding a batch of vertices to a graph in GraphX using the 
> following method, and I am facing a latency problem. The first addition of, 
> say, 400 vertices to a graph with 100,000 nodes takes around 7 seconds; the 
> next one takes 15 seconds. Each subsequent addition takes longer than the 
> previous one, so I tried calling reindex() so that subsequent operations 
> would run faster. 
> FYI, my cluster currently consists of one machine with 8 cores and 8 GB of 
> RAM. I am running in local mode.
>  
> def addVertex(rdd: RDD[String], sc: SparkContext, session: String): Long = {
>     val defaultUser = (0, 0)
>     rdd.collect().foreach { x =>
>       {
>         val aVertex: RDD[(VertexId, (Int, Int))] = sc.parallelize(Array((x.toLong, (100, 100))))
>         gVertices = gVertices.union(aVertex)
>       }
>     }
>     inputGraph = Graph(gVertices, gEdges, defaultUser)
>     inputGraph.cache()
>     gVertices = inputGraph.vertices
>     gVertices.cache()
>     val count = gVertices.count
>     println(count);
> 
>     return 1;
>   }
>  
>  
> From: Robin East [mailto:robin.e...@xense.co.uk] 
> Sent: Tuesday, February 23, 2016 8:15 PM
> To: Udbhav Agarwal <udbhav.agar...@syncoms.com>
> Subject: Re: Reindexing in graphx
>  
> Hi
>  
> Well this is the line that is failing in VertexRDDImpl:
>  
> require(partitionsRDD.partitioner.isDefined)
>  
> But really you shouldn’t need to be calling the reindex() function as it 
> deals with some internals of the GraphX implementation - it looks to me like 
> it ought to be a private method. Perhaps you could explain what you are 
> trying to achieve.
> 
>  
> On 23 Feb 2016, at 12:18, Udbhav Agarwal <udbhav.agar...@syncoms.com 
> <mailto:udbhav.agar...@syncoms.com>> wrote:
>  
> Hi,
> I am trying to add vertices to a graph in GraphX, and I want to reindex the 
> graph. I can see there is a vertices.reindex() option in GraphX, but when I 
> call graph.vertices.reindex() I get 
> java.lang.IllegalArgumentException: requirement failed.
> Please help me understand what I am missing in the syntax; the API 
> documentation mentions only vertices.reindex().
>  
> Thanks,
> Udbhav Agarwal
