Re: Python: saving/reloading RDD

Shannon Quinn Fri, 18 Jul 2014 08:56:45 -0700

+1, had to learn this the hard way when some of my objects were written as pointers, rather than translated correctly to strings :)
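
To illustrate the pitfall described above, here is a minimal sketch in plain Python (no Spark needed). It assumes the text output is produced by each record's default string conversion. The classes Point and PrintablePoint are hypothetical and are not from the thread.

```python
class Point(object):
    """Hypothetical record type with no custom string conversion."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class PrintablePoint(Point):
    """Same record, but str() yields a line that can be parsed back."""

    def __str__(self):
        return "%s,%s" % (self.x, self.y)


# Default conversion: the class name plus a memory address, which
# carries none of the record's data and cannot be parsed back.
default_line = str(Point(1, 2))

# Custom conversion: the fields themselves, recoverable with split(",").
parseable_line = str(PrintablePoint(1, 2))
```

A line like the first one is what ends up in the text file if the record type does not define its own string conversion.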


On 7/18/14, 11:52 AM, Xiangrui Meng wrote:

You can save RDDs to text files using RDD.saveAsTextFile and load them back using 
sc.textFile. But make sure the record-to-string conversion is correctly 
implemented if the type is not primitive, and that you have a parser to load the 
records back. -Xiangrui
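
One possible way to follow this advice is sketched below. It assumes each record is a JSON-serializable value (dict, list, string, number). JSON is only one choice of encoding, not something the thread prescribes. The helper names to_line, from_line, save_records and load_records are hypothetical; only RDD.map, RDD.saveAsTextFile and sc.textFile are Spark calls.

```python
import json


def to_line(record):
    """Convert one record to a single line of text (JSON is an assumption)."""
    # json.dumps escapes embedded newlines, so each record stays on one line.
    return json.dumps(record, sort_keys=True)


def from_line(line):
    """Parse one line written by to_line back into a record."""
    return json.loads(line)


def save_records(rdd, path):
    """Write the RDD as text, one explicitly converted record per line."""
    rdd.map(to_line).saveAsTextFile(path)


def load_records(sc, path):
    """Read the text files back and parse every line into a record."""
    return sc.textFile(path).map(from_line)
```

With an existing SparkContext sc, usage would look like save_records(my_rdd, "/tmp/my_rdd") followed later by load_records(sc, "/tmp/my_rdd"); the path is only a placeholder. Note that JSON turns tuples into lists, so records containing tuples do not round-trip exactly.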

On Jul 18, 2014, at 8:39 AM, Roch Denis <rde...@exostatic.com> wrote:

Hello,

Just to make sure I correctly read the docs and the forums: it's my
understanding that currently in Python with Spark 1.0.1 there is no way to
save my RDD to disk in a form that I can just reload. The Hadoop RDDs are
not yet present in Python.

Is that correct? I just want to make sure that's the case before I write a
workaround.

Thanks!



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Python-saving-reloading-RDD-tp10172.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
