That did the trick, Abhishek! Thanks for the explanation, that answered a lot
of questions I had.
Dave
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
Mich,
spark-perf from Databricks (https://github.com/databricks/spark-perf) is a good
stress test, covering a wide range of Spark functionality, especially ML.
I’ve tested it with Spark 1.6.0 on CDH 5.7; it may need some work for Spark 2.0.
Dave Jaffe
Big Data Performance
VMware
dja...@vmware.com
No, I am not using serialization with either memory or disk persistence.
Dave Jaffe
VMware
dja...@vmware.com
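(For reference, the serialized vs. non-serialized storage levels being discussed can be sketched in a minimal local-mode job; the RDD contents and app name here are illustrative, not from the thread:)

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistLevels {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[*]").setAppName("persist-levels"))
    val rdd = sc.parallelize(1 to 1000000)

    // MEMORY_ONLY (what rdd.cache() uses): deserialized Java objects on the heap.
    rdd.persist(StorageLevel.MEMORY_ONLY).count()
    rdd.unpersist()

    // MEMORY_ONLY_SER: serialized bytes in memory; smaller footprint,
    // but extra CPU to deserialize on each access.
    rdd.persist(StorageLevel.MEMORY_ONLY_SER).count()
    rdd.unpersist()

    // DISK_ONLY: partitions are serialized and written to local disk,
    // not kept on the heap.
    rdd.persist(StorageLevel.DISK_ONLY).count()

    sc.stop()
  }
}
```

The stored size can differ substantially between the deserialized and serialized levels, which is relevant to the memory-usage question below.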
From: Shreya Agarwal
Date: Monday, November 7, 2016 at 3:29 PM
To: Dave Jaffe, "user@spark.apache.org"
Subject: RE: Anomalous Spark RDD persistence behavior
I don’t think this is correct. Why would caching to disk take more memory than
caching to memory?
Is this behavior expected as dataset size exceeds available memory?
Thanks in advance,
Dave Jaffe
Big Data Performance
VMware
dja...@vmware.com