Dear Spark users,

I created an LDA model using Spark in Java and would now like to run some similarity queries. I am especially interested in the "query -> most similar docs" case. I spent many hours looking for examples of how to map a query into the LDA space, but did not find a clear solution. I would be very grateful if you could suggest some resources on this.
String query = "java developer";

DistributedLDAModel ldaModel = DistributedLDAModel.load(sc.sc(), "test_lda_model");

// topic distribution over docs
JavaPairRDD<Long, org.apache.spark.mllib.linalg.Vector> topicDistributionsOverDocs = ldaModel.javaTopicDistributions();

// Inferred topics, where each topic is represented by a distribution over terms. k is the number of topics
Matrix topics = ldaModel.topicsMatrix();

The main question is how I can convert my query to the LDA vector space.

Thank you and have a nice day!
Olga
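
One possible direction, offered only as a rough sketch: treat the query as a tiny document, turn it into a term-count vector over the same vocabulary (in the same index order) that was used to build the training vectors, infer its topic mixture, and then rank the documents by similarity between topic mixtures. The plain-Java helper below covers only the first and last of those steps. The class and method names (QuerySimilaritySketch, termCounts, cosine, rankBySimilarity) are hypothetical, the whitespace tokenizer is an assumption that would have to match the original preprocessing, and cosine similarity is just one reasonable choice. The inference step in the middle is not shown; as far as I know, the LocalLDAModel obtained from ldaModel.toLocal() offers topicDistributions(...) for new documents in recent Spark versions, but please check that against the API docs for your version.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark: illustrates the non-Spark steps only.
final class QuerySimilaritySketch {

    // Bag-of-words counts for the query, indexed like the training vocabulary.
    // Assumes lower-casing and whitespace tokenization; terms outside the vocabulary are dropped.
    static double[] termCounts(String query, List<String> vocabulary) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < vocabulary.size(); i++) {
            index.put(vocabulary.get(i), i);
        }
        double[] counts = new double[vocabulary.size()];
        for (String token : query.toLowerCase().trim().split("\\s+")) {
            Integer i = index.get(token);
            if (i != null) {
                counts[i] += 1.0;
            }
        }
        return counts;
    }

    // Cosine similarity between two vectors of equal length; 0 if either is all zeros.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Document ids ordered from most to least similar to the query's topic mixture.
    static List<Long> rankBySimilarity(double[] queryTopics, Map<Long, double[]> docTopics) {
        List<Long> ids = new ArrayList<>(docTopics.keySet());
        ids.sort(Comparator.comparingDouble((Long id) -> -cosine(queryTopics, docTopics.get(id))));
        return ids;
    }
}
```

In a real job the map of document topic mixtures would come from collecting (or joining against) the topic distributions over docs, and for a large corpus the ranking would be better done on the RDD itself rather than in driver memory.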