Hello, sorry, I am a Spark newbie. In a pyspark session, I want to store an RDD to disk so that the next time I run pyspark, the RDD can be reloaded.
I tried this:

>>> fruit.count()
1000
>>> fruit.take(5)
[('peach', 1), ('apricot', 2), ('apple', 3), ('haw', 1), ('persimmon', 9)]
>>> fruit.persist(StorageLevel.DISK_ONLY)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'StorageLevel' is not defined

The RDD.persist method does not seem to work for me. How do I store an RDD to disk, and how can I reload it later?

Thank you in advance.

Kendall
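P.S. From what I can tell, StorageLevel needs to be imported from the pyspark package, and persist() only caches within the current session, so surviving a restart seems to need an explicit save and reload. Here is a minimal sketch of what I have pieced together; the path /tmp/fruit_rdd and the sample data are just placeholders, so please correct me if this is the wrong approach:

from pyspark import SparkContext, StorageLevel

sc = SparkContext(appName="fruit-example")

# Placeholder data standing in for my real "fruit" RDD.
fruit = sc.parallelize([('peach', 1), ('apricot', 2), ('apple', 3)])

# With the import in place, persist() runs, but (as I understand it)
# it only caches the RDD for the lifetime of this session.
fruit.persist(StorageLevel.DISK_ONLY)

# To keep the RDD across pyspark sessions, write it out explicitly...
fruit.saveAsPickleFile("/tmp/fruit_rdd")

# ...and in a later session, read it back:
reloaded = sc.pickleFile("/tmp/fruit_rdd")
print(reloaded.take(3))

(I assume saveAsTextFile would also work, but then the tuples would come back as plain strings that need re-parsing, so pickling looked simpler to me.)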