Re: pyspark: results differ based on whether persist() has been called

2015-10-19 Thread Davies Liu
| ShuffledRowRDD[29] at persist at NativeMethodAccessorImpl.java:-2 []
> +-(2) MapPartitionsRDD[28] at persist at NativeMethodAccessorImpl.java:-2 []
> | MapPartitionsRDD[27] at persist at NativeMethodAccessorImpl.java:-2 []
> | MapParti

pyspark: results differ based on whether persist() has been called

2015-10-19 Thread peay2
rsist at NativeMethodAccessorImpl.java:-2 []
| ShuffledRowRDD[38] at persist at NativeMethodAccessorImpl.java:-2 []
+-(200) MapPartitionsRDD[37] at persist at NativeMethodAccessorImpl.java:-2 []
| MapPartitionsRDD[36] at persist at NativeMethodAccessorImpl.java:-2 []