I think you'll have to go with MoreLikeThis (assuming your emails are tokenized 
suitably) and go through the matches yourself to check the % match.
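
As a rough illustration of the "check the % match yourself" step, here is a minimal sketch in plain Java. It assumes you have already pulled the stored To field out of each candidate hit as a comma-separated string. The class and method names (ToOverlap, parseAddresses, overlapFraction, matchesAtLeast) and the example.com addresses are made up for this example and are not part of Lucene.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not a Lucene class: post-filters candidate hits
// by how many of the reference To addresses they share.
public class ToOverlap {

    // Split a comma-separated To value into a set of lower-cased addresses.
    static Set<String> parseAddresses(String toField) {
        Set<String> out = new HashSet<>();
        for (String part : toField.split(",")) {
            String addr = part.trim().toLowerCase(Locale.ROOT);
            if (!addr.isEmpty()) {
                out.add(addr);
            }
        }
        return out;
    }

    // Fraction of the reference addresses that also appear in the candidate.
    static double overlapFraction(Set<String> reference, Set<String> candidate) {
        if (reference.isEmpty()) {
            return 0.0;
        }
        int shared = 0;
        for (String addr : reference) {
            if (candidate.contains(addr)) {
                shared++;
            }
        }
        return (double) shared / reference.size();
    }

    // True when the candidate shares at least `threshold` (e.g. 0.75) of the reference addresses.
    static boolean matchesAtLeast(Set<String> reference, Set<String> candidate, double threshold) {
        return overlapFraction(reference, candidate) >= threshold;
    }

    public static void main(String[] args) {
        Set<String> ref = parseAddresses("a@example.com, b@example.com, c@example.com, d@example.com");
        Set<String> cand = parseAddresses("A@example.com, b@example.com, c@example.com, x@example.com");
        System.out.println(overlapFraction(ref, cand));
        System.out.println(matchesAtLeast(ref, cand, 0.75));
    }
}
```

You would run each MoreLikeThis hit through matchesAtLeast with the original hit's addresses as the reference set and keep only the ones that pass. Note that the fraction is measured against the reference message, so extra addresses in the candidate do not lower the score.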

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Michael Prichard <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Monday, January 21, 2008 3:38:14 PM
Subject: Matching w/in X% ?

Say I have a field of To addresses from an email archive.  I do a
 search and I get 10 To addresses for a single hit.  Then I want to find
 similar emails whose To addresses contain roughly 75% of those email
 addresses as well.  How would I do this?

In other words:
I get a result with:
To:  [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
 [EMAIL PROTECTED], [EMAIL PROTECTED]

Now I want to find similar emails with 75% of these addresses in the To
 field.

Thanks!
Michael

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




