Yes, "MoreLikeThis" is more like what I want.
But there's one problem. Even here one has to run the query against an indexed set of documents, while I would like to create two Queries through "MoreLikeThis" and get a score of how similar they are to each other.

Siddharth

Otis Gospodnetic wrote:
>
> Hi,
>
> Have a look at MoreLikeThis:
>
> [EMAIL PROTECTED] trunk]$ ff \*MoreLikeThis\*.java
> ./contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThisQuery.java
> ./contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java
>
> I think that or something a lot like it is what you are after.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
>> From: Sangrish <[EMAIL PROTECTED]>
>> To: java-user@lucene.apache.org
>> Sent: Friday, June 20, 2008 12:20:02 AM
>> Subject: Re: Arbitrary String to String Similarity Score
>>
>> Given 2 text documents I want to quantitatively find, how similar they
>> are, with respect to each other. Say, I want to find Cosine Similarity
>> score between any two given documents. I am trying to use Lucene for it
>> (is it good for this purpose?)
>>
>> This use case is different from querying against a set of documents
>>
>> I am not sure if Lucene provides a direct API to evaluate this score.
>>
>> Siddharth
>>
>> Grant Ingersoll-6 wrote:
>> >
>> > You might also have a look at the MemoryIndex. Question, though, is
>> > what are you hoping to gain from doing a Query against a single
>> > String? Are you doing a FuzzyQuery? You might look at the
>> > SecondString project on SourceForge for doing string comparisons.
>> >
>> > I guess I am a bit confused by your problem statement. Perhaps you
>> > can explain more what you are trying to do at a higher level, as it
>> > sounds like to me you have str1 and str2, so why do you need to inject
>> > an index into the middle of it?
>> >
>> > -Grant
>> >
>> > On Jun 19, 2008, at 8:33 PM, Sangrish wrote:
>> >
>> >> I have a use case for comparing two given strings (attached to a
>> >> specific field) using Lucene and get the similarity scores.
>> >>
>> >> I tried but could not find any built-in way to do so. Hence
>> >> assuming that Lucene only compares a Query against Indexed documents,
>> >> I came up with the following approach:
>> >> (Let the 2 strings be, str1 and str2 )
>> >>
>> >> 1) Create an IndexWriter using a RAMDirectory (I don't want to store
>> >>    those strings on the disk)
>> >> 2) Index str1 and store it
>> >> 3) Search str2 in the index. ( shall the indexWriter be closed
>> >>    before you search on the index? )
>> >> 4) Get the similarity score & publish it
>> >> 5) Delete str1 from the index and make the index available for a new
>> >>    comparison
>> >>
>> >> Any comments & suggestions on making the process optimal
>> >>
>> >> Siddharth
>> >>
>> >> --
>> >> View this message in context:
>> >> http://www.nabble.com/Arbitrary-String-to-String-Similarity-Score-tp18020806p18020806.html
>> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> >> For additional commands, e-mail: [EMAIL PROTECTED]
>> >
>> > --------------------------
>> > Grant Ingersoll
>> > http://www.lucidimagination.com
>> >
>> > Lucene Helpful Hints:
>> > http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> > http://wiki.apache.org/lucene-java/LuceneFAQ
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Arbitrary-String-to-String-Similarity-Score-tp18020806p18022691.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.

--
View this message in context: http://www.nabble.com/Arbitrary-String-to-String-Similarity-Score-tp18020806p18034468.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
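
For reference, the cosine similarity score asked about earlier in this thread can be computed directly from two strings when only raw term counts are used, with no index in between. The following is a minimal sketch in plain Java, not Lucene's scoring and not anything MoreLikeThis provides. The class name TermCosine and its methods are hypothetical, and the tokenization (lower-casing, then splitting on anything that is not a letter or digit) is an assumption standing in for a real Analyzer. There is no IDF weighting, since that would need a document collection.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not a Lucene class): cosine similarity over raw term counts. */
final class TermCosine {

    /** Lower-cases the text, splits on non-letter/non-digit runs, and counts each term. */
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {          // split can yield an empty leading token
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns dot(a, b) / (|a| * |b|), or 0.0 when either string has no terms. */
    static double cosine(String str1, String str2) {
        Map<String, Integer> a = termFrequencies(str1);
        Map<String, Integer> b = termFrequencies(str2);

        // Dot product: only terms present in both strings contribute.
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            int count = e.getValue();
            int other = b.getOrDefault(e.getKey(), 0);
            dot += (double) count * other;
        }

        double normA = norm(a);
        double normB = norm(b);
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;                     // assumption: empty input scores 0
        }
        return dot / (normA * normB);
    }

    /** Euclidean length of a term-count vector. */
    private static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int count : vector.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        // str1 and str2 as in the original question.
        String str1 = "lucene string similarity";
        String str2 = "string similarity score";
        System.out.println(cosine(str1, str2));
    }
}
```

The score is symmetric in str1 and str2, and for these non-negative count vectors it lies between 0 (no shared terms) and 1 (identical term distribution). Whether raw counts are good enough, or whether some collection-based weighting is needed, depends on the use case.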