First, GraphX is not really designed for real-time graph mutation. That’s not
to say it can’t be done, but you may be running up against some of its design
limitations in that area. As a first step, look at the Web UI to see which
tasks/stages are taking a long time and which resource (CPU, I/O, network,
shuffle) they seem to be bottlenecked on.
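
For reference, the batching suggested earlier in this thread could look
roughly like the sketch below. It is only an illustration under assumptions:
the object name BatchVertexAdd, the method name addVertices and the Int edge
type are made up here (the type of gEdges is not shown in the thread), and the
(100, 100) attribute is copied from the posted addVertex method. It builds all
new vertices with one map and one union instead of collecting to the driver
and unioning one vertex at a time.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical helper, not part of GraphX.
object BatchVertexAdd {
  // Vertex attribute type taken from the posted code; the Int edge type
  // used below is an assumption.
  type Attr = (Int, Int)

  /** Returns a new graph containing the existing vertices plus one vertex
    * per id in `newIds`, added as a single batch. */
  def addVertices(
      graph: Graph[Attr, Int],
      newIds: RDD[String],
      defaultAttr: Attr = (0, 0)): Graph[Attr, Int] = {
    // Build the new vertices in parallel; nothing is collected to the driver.
    val newVertices: RDD[(VertexId, Attr)] =
      newIds.map(id => (id.toLong, (100, 100)))

    // One union for the whole batch rather than one union per vertex.
    val vertices: RDD[(VertexId, Attr)] = graph.vertices.union(newVertices)

    // Rebuild the graph once and cache the result.
    val updated = Graph(vertices, graph.edges, defaultAttr).cache()

    // Force evaluation so the cost shows up here, not in a later action.
    updated.vertices.count()
    updated
  }
}
```

Calling it once per incoming batch, for example
inputGraph = BatchVertexAdd.addVertices(inputGraph, rdd), replaces the
per-vertex loop. Whether this removes the growth in latency is something to
confirm in the Web UI rather than assume.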
-------------------------------------------------------------------------------
Robin East
Spark GraphX in Action Michael Malak and Robin East
Manning Publications Co.
http://www.manning.com/books/spark-graphx-in-action 

> On 24 Feb 2016, at 12:05, Udbhav Agarwal <udbhav.agar...@syncoms.com> wrote:
> 
> Sounds useful, Robin. Thanks, I will try that. FYI, in another case I tested
> adding only one vertex at a time to the graph. There too, the latency of
> each subsequent addition kept increasing: about 3 seconds for the first
> vertex, 7 seconds for the second, and so on. I want to add vertices to the
> graph as and when they arrive in our system, since it’s a real-time system
> that I am trying to build, so vertices will keep coming in.
>  
> Thanks.
> From: Robin East [mailto:robin.e...@xense.co.uk] 
> Sent: Wednesday, February 24, 2016 3:54 PM
> To: Udbhav Agarwal <udbhav.agar...@syncoms.com>
> Cc: user@spark.apache.org
> Subject: Re: Reindexing in graphx
>  
> It looks like you are adding vertices one by one; you definitely don’t want
> to do that. What happens when you batch 400 vertices together into an RDD
> and then add all 400 in one go?
>  
> On 24 Feb 2016, at 05:49, Udbhav Agarwal <udbhav.agar...@syncoms.com> wrote:
>  
> Thank you, Robin, for your reply.
> I am adding a batch of vertices to a graph in GraphX using the method below,
> and I am facing a latency problem. The first addition of, say, 400 vertices
> to a graph with 100,000 nodes takes around 7 seconds; the next one takes 15
> seconds. Each subsequent addition takes longer than the previous one. Hence
> I tried reindex(), hoping it would make subsequent operations fast.
> FYI, my cluster currently consists of one machine with 8 cores and 8 GB of
> RAM. I am running in local mode.
>  
> def addVertex(rdd: RDD[String], sc: SparkContext, session: String): Long = {
>     val defaultUser = (0, 0)
>     rdd.collect().foreach { x =>
>       {
>         val aVertex: RDD[(VertexId, (Int, Int))] =
>           sc.parallelize(Array((x.toLong, (100, 100))))
>         gVertices = gVertices.union(aVertex)
>       }
>     }
>     inputGraph = Graph(gVertices, gEdges, defaultUser)
>     inputGraph.cache()
>     gVertices = inputGraph.vertices
>     gVertices.cache()
>     val count = gVertices.count
>     println(count);
> 
>     return 1;
>   }
>  
>  
> From: Robin East [mailto:robin.e...@xense.co.uk]
> Sent: Tuesday, February 23, 2016 8:15 PM
> To: Udbhav Agarwal <udbhav.agar...@syncoms.com>
> Subject: Re: Reindexing in graphx
>  
> Hi
>  
> Well this is the line that is failing in VertexRDDImpl:
>  
> require(partitionsRDD.partitioner.isDefined)
>  
> But really you shouldn’t need to be calling the reindex() function as it 
> deals with some internals of the GraphX implementation - it looks to me like 
> it ought to be a private method. Perhaps you could explain what you are 
> trying to achieve.
>  
> On 23 Feb 2016, at 12:18, Udbhav Agarwal <udbhav.agar...@syncoms.com> wrote:
>  
> Hi,
> I am trying to add vertices to a graph in GraphX and I want to reindex the
> graph. I can see there is a vertices.reindex() option in GraphX, but when I
> call graph.vertices.reindex() I get
> java.lang.IllegalArgumentException: requirement failed.
> Please help me understand what I am missing in the syntax; the API
> documentation mentions only vertices.reindex().
>  
> Thanks,
> Udbhav Agarwal
