Hello,

I tried saveAsTextFile, but it saves each element of the structure as plain text. After reading the text file back, I can't access the structure directly. How can I reload the data with its original structure?
Thanks again.

On Sun, Nov 28, 2021 at 1:24 PM Sean Owen <sro...@gmail.com> wrote:

> You didn't import the class.
> persist() does not save across sessions. You need to write with methods
> like saveAsTextFile or whatever is appropriate, or .write methods on a
> DataFrame.
>
> On Sat, Nov 27, 2021 at 9:13 PM Kendall Wagner <kendawag...@gmail.com>
> wrote:
>
>> Hello,
>>
>> Sorry I am a spark newbie.
>> In pyspark session, I want to store the RDD so that next time I run
>> pyspark again, the RDD will be reloaded.
>>
>> I tried this:
>>
>> >>> fruit.count()
>> 1000
>>
>> >>> fruit.take(5)
>> [('peach', 1), ('apricot', 2), ('apple', 3), ('haw', 1), ('persimmon', 9)]
>>
>> >>> fruit.persist(StorageLevel.DISK_ONLY)
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> NameError: name 'StorageLevel' is not defined
>>
>>
>> RDD.persist method seems not working for me.
>> How to store a RDD to disk and how can I reload it again?
>>
>>
>> Thank you in advance.
>> Kendall
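
For reference, here is a minimal sketch of why the structure seems lost after saveAsTextFile and one way to get it back. PySpark writes each element as its string form, so a tuple is read back as a line of text such as "('peach', 1)". The sketch uses plain Python files as a stand-in for the Spark round trip so it runs without a cluster; the helper names (save_as_text, load_lines, parse_record) and the temporary file path are hypothetical, chosen only for illustration.

```python
import ast
import os
import tempfile

# Sample records copied from the fruit.take(5) output quoted above.
fruit = [('peach', 1), ('apricot', 2), ('apple', 3), ('haw', 1), ('persimmon', 9)]

def save_as_text(records, path):
    """Stand-in for saveAsTextFile: write one str(record) per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(str(record) + "\n")

def load_lines(path):
    """Stand-in for sc.textFile: every line comes back as a plain string."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

def parse_record(line):
    """Turn a line such as "('peach', 1)" back into a Python tuple."""
    # literal_eval only accepts Python literals, so it is safer than eval().
    return ast.literal_eval(line)

# Hypothetical output location for this illustration.
path = os.path.join(tempfile.mkdtemp(), "fruit.txt")
save_as_text(fruit, path)

lines = load_lines(path)                           # plain strings, structure lost
restored = [parse_record(line) for line in lines]  # tuples again
```

In a pyspark session the same parse step would presumably be applied as sc.textFile(path).map(ast.literal_eval). Another option that avoids parsing altogether is RDD.saveAsPickleFile together with SparkContext.pickleFile, which stores and reloads the Python objects themselves.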