Dear Spark users,

I created an LDA model using Spark in Java and would now like to run some 
similarity queries. I'm especially interested in a "query -> most similar docs" 
method. I spent many hours looking for examples of how to map a query into the 
LDA space, but didn't come up with any clear solution. I would be very grateful 
if you could suggest some resources on this.

String query = "java developer";

DistributedLDAModel ldaModel =
    DistributedLDAModel.load(sc.sc(), "test_lda_model");

// topic distribution of each training document (document id -> topic mixture)
JavaPairRDD<Long, org.apache.spark.mllib.linalg.Vector> topicDistributionsOverDocs =
    ldaModel.javaTopicDistributions();



// Inferred topics, where each topic is represented by a distribution over
// terms. k is the number of topics
Matrix topics = ldaModel.topicsMatrix();



The main question is how I can convert my query into the LDA vector space.
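
To make the goal concrete, here is a minimal sketch in plain Java (no Spark) of the two steps around the missing piece: turning the query into term counts, and ranking documents once a topic vector for the query exists. The class QuerySimilaritySketch, its method names, the vocabulary map and the use of cosine similarity are all assumptions made for illustration, not part of the model above. The step in between, inferring the query's topic vector, is not shown; as far as I can tell from the MLlib API, ldaModel.toLocal() returns a LocalLDAModel whose topicDistributions(...) method accepts new documents as term-count vectors, but that should be checked against the Spark version in use.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; none of these names come from Spark.
class QuerySimilaritySketch {

    // Step 1: bag-of-words counts for the query.
    // Assumes vocabulary maps each term to an index in 0..vocabulary.size()-1,
    // using the same tokenization and indexing as the LDA training corpus.
    // Tokens that are not in the vocabulary are skipped.
    static double[] toTermCounts(String query, Map<String, Integer> vocabulary) {
        double[] counts = new double[vocabulary.size()];
        for (String token : query.toLowerCase().trim().split("\\s+")) {
            Integer index = vocabulary.get(token);
            if (index != null) {
                counts[index] += 1.0;
            }
        }
        return counts;
    }

    // Cosine similarity between two vectors of equal length.
    // Returns 0.0 if either vector is all zeros.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Step 3: ids of the n documents whose topic vectors are closest to the
    // query's topic vector, most similar first.
    static List<Long> mostSimilarDocs(double[] queryTopics,
                                      Map<Long, double[]> docTopics,
                                      int n) {
        List<Long> ids = new ArrayList<>(docTopics.keySet());
        // Sort by descending similarity (negated score gives descending order).
        ids.sort(Comparator.comparingDouble(
                (Long id) -> -cosine(queryTopics, docTopics.get(id))));
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }
}
```

In this sketch, docTopics would hold the collected contents of topicDistributionsOverDocs as plain arrays, which is only practical for a small corpus; for a large one the same cosine score could be computed inside a Spark transformation instead.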



Thank you and have a nice day!

Olga

