That matrix is in the format of a document-term matrix. Each row represents a single document, each column represents one word in the vocabulary, and each element is the number of times that word occurs in that document.
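To make the layout concrete, here is a rough sketch in plain Python (not Spark code) that builds such a matrix from two toy sentences. The function name `document_term_matrix`, the whitespace tokenization, the lowercasing, and the alphabetical column order are all my own assumptions for illustration; a real pipeline would likely tokenize differently.

```python
from collections import Counter

def document_term_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] is the count of vocab[j] in docs[i].

    Hypothetical helper for illustration only: splits on whitespace and lowercases.
    """
    tokenized = [doc.lower().split() for doc in docs]
    # One column per distinct word, in alphabetical order.
    vocab = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One row per document; a missing word counts as 0.
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix

docs = ["I like databases", "I hate databases"]
vocab, matrix = document_term_matrix(docs)
print(vocab)
for row in matrix:
    print(row)
```

Each printed row is one document, and the position in the row tells you which vocabulary word the count belongs to.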
There is a simple example here: http://en.wikipedia.org/wiki/Document-term_matrix