The code is here: https://github.com/Earthson/sparklda/blob/master/src/main/scala/net/earthson/nlp/lda/lda.scala
I've changed it from Broadcast to Serializable, and now it works :) But there are too many cached RDDs. Is that the problem?