In the end, I'm saving the RDDs to a file and then reloading them. That works fine for me: Gibbs sampling for LDA is really slow anyway, so the extra save/reload cost hardly matters. It takes about 10 minutes to sample 10k wiki documents for 10 rounds (about 1 round/min).
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Why-Spark-require-this-object-to-be-serializerable-tp5009p5036.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.