Hi,

  I posted a query yesterday and have tried out all the options suggested in
the responses.

Basically, I am reading a very fat matrix (2000 x 500000) and trying to run
k-means on it. I keep getting Java heap space errors, even now that I am
persisting the data with persist(StorageLevel.DISK_ONLY_2).
How do I process a dataset this large?
This is the conf I am currently using:

    from pyspark import SparkConf

    conf = (SparkConf()
            .set("spark.executor.memory", "16g")
            .set("spark.akka.frameSize", "100000000")
            .set("spark.driver.memory", "4g")
            .set("spark.rdd.compress", "true"))

Is there anything in this conf I should change?
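For reference, this is roughly what the rest of the job looks like (a minimal
sketch continuing from the conf above; the file path, the line parsing, and
k=10 are placeholders rather than my real values):

    from pyspark import SparkContext, StorageLevel
    from pyspark.mllib.clustering import KMeans

    sc = SparkContext(conf=conf)

    # Each line holds one row of the 2000 x 500000 matrix as space-separated floats.
    data = sc.textFile("hdfs:///path/to/matrix.txt") \
             .map(lambda line: [float(x) for x in line.split()])

    # Keep the parsed rows on disk (replicated to two nodes) instead of in memory.
    data.persist(StorageLevel.DISK_ONLY_2)

    model = KMeans.train(data, k=10, maxIterations=20)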
Thanks
