What exactly is this probability distribution? For each word in your vocabulary,
it gives the probability that a word drawn at random from a topic is that word.
Another way to visualise it is as a two-column table where the 1st column is a word
in your vocabulary and the 2nd column is the probability of
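
To make the two-column picture concrete, here is a minimal sketch in plain
Python. The vocabulary and the weights are made up for illustration and are not
taken from the Spark example, and topic_distribution is a hypothetical helper,
not part of any library. It only shows how one topic pairs every vocabulary
word with a probability, and that the probabilities for a topic sum to 1.

```python
# Hypothetical toy data: one topic's weights over a four-word vocabulary.
# These numbers are invented for illustration, not output from any model.
vocabulary = ["spark", "cluster", "topic", "word"]
topic_weights = [2.0, 1.0, 4.0, 1.0]  # unnormalised weights for one topic


def topic_distribution(vocab, weights):
    """Pair each word with its weight, normalised so the column sums to 1."""
    total = sum(weights)
    return [(word, weight / total) for word, weight in zip(vocab, weights)]


rows = topic_distribution(vocabulary, topic_weights)

# Print the two-column view: word on the left, probability on the right.
for word, prob in rows:
    print(f"{word:<10}{prob:.3f}")
```

Each row reads as "if I draw one word at random from this topic, this is how
likely it is to be that word". A model with several topics would have one such
table per topic, all over the same vocabulary.
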
Hello,
I have been trying to understand the LDA topic modeling example provided here:
https://spark.apache.org/docs/latest/mllib-clustering.html#latent-dirichlet-allocation-lda.
In the example, they load word count vectors from a text file
and then they output th