+1, had to learn this the hard way when some of my objects were written
as pointers, rather than translated correctly to strings :)
On 7/18/14, 11:52 AM, Xiangrui Meng wrote:
You can save RDDs to text files using RDD.saveAsTextFile and load them back
using sc.textFile. But if the record type is not primitive, make sure the
record-to-string conversion is correctly implemented and that you have a
parser to load the records back. -Xiangrui
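
For illustration, here is a minimal sketch of that round trip in Python. It
assumes each record is a dict of JSON-friendly values and uses JSON as the
line format. The helper names (record_to_line, line_to_record, save_rdd,
load_rdd) are made up for this example and are not part of Spark; only
RDD.map, RDD.saveAsTextFile and sc.textFile are Spark calls.

```python
import json


def record_to_line(record):
    # Explicit record-to-string conversion. Relying on the default str() of
    # a custom object would typically write something like
    # "<Foo object at 0x...>", which cannot be parsed back.
    # Assumption: the record is a dict of JSON-serializable values.
    return json.dumps(record, sort_keys=True)


def line_to_record(line):
    # The matching parser: turns one line of text back into a record.
    return json.loads(line)


def save_rdd(rdd, path):
    # Hypothetical helper: convert each record to one line, then write it.
    rdd.map(record_to_line).saveAsTextFile(path)


def load_rdd(sc, path):
    # Hypothetical helper: read the lines back and parse each one.
    return sc.textFile(path).map(line_to_record)
```

JSON is only one possible choice here. It keeps every record on a single
line, because newlines inside strings are escaped, and that matters since
sc.textFile splits its input on line breaks.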
On Jul 18, 2014, at 8:39 AM, Roch Denis <rde...@exostatic.com> wrote:
Hello,
Just to make sure I correctly read the doc and the forums. It's my
understanding that currently in Python with Spark 1.0.1 there is no way to
save my RDD to disk in a form that I can just reload. The Hadoop RDDs are
not yet available in Python.
Is that correct? I just want to make sure that's the case before I write a
workaround.
Thanks!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-saving-reloading-RDD-tp10172.html