Yes, "MoreLikeThis" is more like what I want.

But there's one problem: even here, one has to run the query against an
indexed set of documents.

What I would like instead is to create two Queries through "MoreLikeThis"
and get a score of how similar they are to each other.
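
To make the goal concrete, here is a minimal sketch of the kind of score I
mean, written in plain Java with no Lucene calls at all. The class and method
names (CosineSketch, termFrequencies, cosine) are made up for illustration,
and the tokenization (lowercase, split on non-word characters) and the raw
term-count weighting without IDF are assumptions, not what a Lucene Analyzer
or Similarity would do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: cosine similarity between two
// strings using raw term-frequency vectors.
final class CosineSketch {

    // Count how often each lowercase token occurs in the text.
    // Assumption: tokens are runs of ASCII word characters.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> tf) {
        double sumOfSquares = 0.0;
        for (int count : tf.values()) {
            sumOfSquares += (double) count * count;
        }
        return Math.sqrt(sumOfSquares);
    }

    // Cosine of the angle between the two term-frequency vectors.
    // Returns 0.0 when either string has no tokens.
    static double cosine(String a, String b) {
        Map<String, Integer> ta = termFrequencies(a);
        Map<String, Integer> tb = termFrequencies(b);
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : ta.entrySet()) {
            Integer other = tb.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(ta);
        double nb = norm(tb);
        if (na == 0.0 || nb == 0.0) {
            return 0.0;
        }
        return dot / (na * nb);
    }
}
```

Calling CosineSketch.cosine(str1, str2) gives a value between 0 and 1 without
any index in the middle. What it lacks is exactly what I was hoping to get
from Lucene: proper analysis and term weighting.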

Siddharth

Otis Gospodnetic wrote:
> 
> Hi,
> 
> Have a look at MoreLikeThis:
> 
> [EMAIL PROTECTED] trunk]$ ff \*MoreLikeThis\*.java
> ./contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThisQuery.java
> ./contrib/queries/src/java/org/apache/lucene/search/similar/MoreLikeThis.java
> 
> 
> I think that or something a lot like it is what you are after.
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> ----- Original Message ----
>> From: Sangrish <[EMAIL PROTECTED]>
>> To: java-user@lucene.apache.org
>> Sent: Friday, June 20, 2008 12:20:02 AM
>> Subject: Re: Arbitrary String to String Similarity Score
>> 
>> 
>> Given two text documents, I want to quantify how similar they are to
>> each other. Say, I want to find the cosine similarity score between any
>> two given documents. I am trying to use Lucene for it (is it good for
>> this purpose?).
>> 
>> This use case is different from querying against a set of documents.
>> 
>> I am not sure if Lucene provides a direct API to evaluate this score.
>> 
>> Siddharth
>> 
>> Grant Ingersoll-6 wrote:
>> > 
>> > You might also have a look at the MemoryIndex.  Question, though, is  
>> > what are you hoping to gain from doing a Query against a single  
>> > String?  Are you doing a FuzzyQuery?  You might look at the  
>> > SecondString project on SourceForge for doing string comparisons.
>> > 
>> > I guess I am a bit confused by your problem statement.  Perhaps you
>> > can explain more about what you are trying to do at a higher level, as
>> > it sounds to me like you have str1 and str2, so why do you need to
>> > inject an index into the middle of it?
>> > 
>> > -Grant
>> > 
>> > On Jun 19, 2008, at 8:33 PM, Sangrish wrote:
>> > 
>> >>
>> >> I have a use case for comparing two given strings (attached to a
>> >> specific field) using Lucene and getting a similarity score.
>> >>
>> >> I tried but could not find any built-in way to do so. Hence, assuming
>> >> that Lucene only compares a Query against indexed documents, I came up
>> >> with the following approach (let the two strings be str1 and str2):
>> >>
>> >> 1) Create an IndexWriter using a RAMDirectory (I don't want to store
>> >> those strings on disk)
>> >> 2) Index str1 and store it
>> >> 3) Search for str2 in the index (should the IndexWriter be closed
>> >> before searching the index?)
>> >> 4) Get the similarity score and publish it
>> >> 5) Delete str1 from the index and make the index available for a new
>> >> comparison
>> >>
>> >> Any comments or suggestions on making the process optimal?
>> >>
>> >> Siddharth
>> >>
>> >> -- 
>> >> View this message in context:
>> >> http://www.nabble.com/Arbitrary-String-to-String-Similarity-Score-tp18020806p18020806.html
>> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> >> For additional commands, e-mail: [EMAIL PROTECTED]
>> >>
>> > 
>> > --------------------------
>> > Grant Ingersoll
>> > http://www.lucidimagination.com
>> > 
>> > Lucene Helpful Hints:
>> > http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> > http://wiki.apache.org/lucene-java/LuceneFAQ
>> > 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Arbitrary-String-to-String-Similarity-Score-tp18020806p18022691.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 

-- 
View this message in context: 
http://www.nabble.com/Arbitrary-String-to-String-Similarity-Score-tp18020806p18034468.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.