Re: Exception in saving MatrixFactorizationModel

2015-09-06 Thread Ranjana Rajendran
It looks like you hit https://issues.apache.org/jira/browse/SPARK-7837. As I understand it, this occurs if there is skew in unpartitioned data. Can you try partitioning the model before saving it? On Sat, Sep 5, 2015 at 11:16 PM, Madawa Soysa wrote: > outPath is correct. In the path, there are two di

Re: Creating RDD with key and Subkey

2015-08-19 Thread Ranjana Rajendran
Hi Ratika, I tried the following: val l = List("apple", "orange", "banana") var inner = new scala.collection.mutable.HashMap[String, List[String]] inner.put("fruits",l) var list = new scala.collection.mutable.HashMap[String, scala.collection.mutable.HashMap[String, List[String]]] list.put("fo

Graphx - how to add vertices to a HashSet of vertices ?

2015-08-13 Thread Ranjana Rajendran
Hi, sampledVertices is a HashSet of vertices: var sampledVertices: HashSet[VertexId] = HashSet() In each iteration, I am making a list of neighborVertexIds: val neighborVertexIds = burnEdges.map((e:Edge[Int]) => e.dstId) I want to add these neighborVertexIds to the sampledVertices Has
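
One possible way to do what the question above describes, sketched under some assumptions: the set is a scala.collection.mutable.HashSet, and the neighbor IDs are already a local collection on the driver. If burnEdges were an RDD, the IDs would first have to be brought back with collect(). The VertexId alias and the Edge case class below are stand-ins for the GraphX types so the sketch runs without Spark, and addNeighbors is a hypothetical helper name.

```scala
import scala.collection.mutable.HashSet

object AddNeighbors {
  // Stand-ins for org.apache.spark.graphx.VertexId and Edge (assumption:
  // used here only so the sketch compiles without a Spark dependency).
  type VertexId = Long
  case class Edge[ED](srcId: VertexId, dstId: VertexId, attr: ED)

  // Hypothetical helper: adds the destination vertex of every edge to the
  // mutable set in place and returns the same set.
  def addNeighbors(sampledVertices: HashSet[VertexId],
                   burnEdges: Seq[Edge[Int]]): HashSet[VertexId] = {
    val neighborVertexIds = burnEdges.map((e: Edge[Int]) => e.dstId)
    // ++= adds all elements of a collection; duplicates are ignored by the set.
    sampledVertices ++= neighborVertexIds
    sampledVertices
  }

  def main(args: Array[String]): Unit = {
    val sampled = HashSet[VertexId](1L)
    val edges = Seq(Edge(1L, 2L, 0), Edge(1L, 3L, 0), Edge(2L, 3L, 0))
    addNeighbors(sampled, edges)
    println(sampled.toSeq.sorted.mkString(", ")) // prints "1, 2, 3"
  }
}
```

With an immutable HashSet held in a var, the equivalent would be reassigning the variable to the union of the old set and the new IDs.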

Re: Switch from Sort based to Hash based shuffle

2015-08-13 Thread Ranjana Rajendran
Hi Cheez, You can set the parameter spark.shuffle.manager when you submit the Spark job. --conf spark.shuffle.manager=hash Thank you, Ranjana On Thu, Aug 13, 2015 at 2:26 AM, cheez <11besemja...@seecs.edu.pk> wrote: > I understand that the current master branch of Spark uses Sort based > shuff
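
The same spark.shuffle.manager setting mentioned in the reply above can also be applied programmatically. The following is a minimal sketch, assuming a Spark 1.x build on the classpath (the hash shuffle manager was removed in Spark 2.0) and a job launched through spark-submit, which supplies the master URL. The object name and application name are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HashShuffleExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("hash-shuffle-example")    // hypothetical application name
      .set("spark.shuffle.manager", "hash")  // same effect as --conf spark.shuffle.manager=hash
    val sc = new SparkContext(conf)
    try {
      // Print the effective value to confirm the setting was picked up.
      println(sc.getConf.get("spark.shuffle.manager"))
    } finally {
      sc.stop()
    }
  }
}
```

Setting the value in code fixes it for every run of the application, whereas the --conf flag lets it be changed per submission.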