Any help with this problem would be appreciated.

On Sep 12, 2016 11:45 AM, "janardhan shetty" <janardhan...@gmail.com> wrote:
> Hi,
>
> I am trying to visualize an LDA model built with Spark ML 2.0 (Scala) in
> LDAvis.
>
> Are there any references on how to convert the Spark model parameters into
> the following five inputs that LDAvis expects?
>
> 1. φ, the K × W matrix containing the estimated probability mass function
>    over the W terms in the vocabulary for each of the K topics in the model.
>    Note that φ_kw > 0 for all k ∈ 1...K and all w ∈ 1...W, because of the
>    priors (although our software allows values of zero due to rounding).
>    Each of the K rows of φ must sum to one.
> 2. θ, the D × K matrix containing the estimated probability mass function
>    over the K topics in the model for each of the D documents in the corpus.
>    Note that θ_dk > 0 for all d ∈ 1...D and all k ∈ 1...K, because of the
>    priors (although, as above, our software accepts zeroes due to rounding).
>    Each of the D rows of θ must sum to one.
> 3. n_d, the number of tokens observed in document d, where n_d is required
>    to be an integer greater than zero, for documents d = 1...D. Denoted
>    doc.length in our code.
> 4. vocab, the length-W character vector containing the terms in the
>    vocabulary (listed in the same order as the columns of φ).
> 5. M_w, the frequency of term w across the entire corpus, where M_w is
>    required to be an integer greater than zero for each term w = 1...W.
>    Denoted term.frequency in our code.
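
For what it's worth, here is a rough sketch of how the five inputs listed above could be derived. It is not an official Spark-to-LDAvis converter. It assumes the documents were vectorized with CountVectorizer (the original mail does not say so) and that the model outputs have already been copied into plain Scala arrays. As far as I know, in Spark ML 2.0 `LDAModel.topicsMatrix` is a vocabSize × k matrix with one topic per column, `transform` adds a `topicDistribution` column holding each document's topic mixture, and `CountVectorizerModel.vocabulary` gives the terms in column order. The object and method names below (`LdaVisInputs`, `phi`, `docLengths`, `termFrequencies`) are made up for illustration.

```scala
// Hypothetical helper: plain-array versions of the quantities LDAvis expects.
object LdaVisInputs {

  // Rescale every row so that it sums to one (needed for both phi and theta).
  def normalizeRows(m: Array[Array[Double]]): Array[Array[Double]] =
    m.map { row =>
      val total = row.sum
      require(total > 0.0, "each row must have positive mass")
      row.map(_ / total)
    }

  // termByTopic is W x K (rows = terms, columns = topics), the layout I
  // believe topicsMatrix uses. LDAvis wants K x W with rows summing to one,
  // so transpose first and then normalize, in case the columns were not
  // already normalized.
  def phi(termByTopic: Array[Array[Double]]): Array[Array[Double]] =
    normalizeRows(termByTopic.transpose)

  // counts is D x W: the raw term counts per document
  // (assumed to come from the CountVectorizer output column).
  // doc.length: number of tokens in each document.
  def docLengths(counts: Array[Array[Int]]): Array[Int] =
    counts.map(_.sum)

  // term.frequency: total count of each term across the corpus.
  def termFrequencies(counts: Array[Array[Int]]): Array[Int] =
    counts.transpose.map(_.sum)
}
```

θ would be `LdaVisInputs.normalizeRows` applied to the D × K array of `topicDistribution` vectors, and vocab is the vocabulary array as-is. Since LDAvis requires doc.length and term.frequency to be greater than zero, empty documents and unused terms would need to be dropped (together with their rows or columns of θ and φ) before exporting to R.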