As you may have already seen, Andrew Schetinin and I have published (at 
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200603.mbox/[EMAIL 
PROTECTED]) source code that handles synonyms at search time (query 
expansion). 

This code also includes a de-boost factor for synonyms (relative to the root 
term). It also fixes the distortion caused by the differing IDF values of the 
root term and its synonyms. It produces the score by simulating an aggregated 
frequency from the frequencies of all synonym terms in each document, and then 
computing a score for that "joint frequency".
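
To make the "joint frequency" idea concrete, here is a minimal sketch in plain Java. It is not the code we posted, and it does not use the Lucene API. The class and method names are made up for illustration, and the formulas (square root of the frequency, 1 + ln(numDocs / (docFreq + 1)) for IDF) are only assumed here because they follow the classic tf-idf shape:

```java
// Hypothetical illustration only: names and formulas are assumptions,
// not taken from the posted source code.
final class JointFrequencyScore {

    /** Root-term occurrences count fully; synonym occurrences are scaled by deBoost. */
    static double jointFrequency(int rootFreq, int[] synonymFreqs, double deBoost) {
        double joint = rootFreq;
        for (int freq : synonymFreqs) {
            joint += deBoost * freq;
        }
        return joint;
    }

    /**
     * A single IDF for the whole synonym group, based on the number of
     * documents containing the root term or any synonym. A rare synonym
     * therefore cannot outscore the root term just because it is rare.
     */
    static double jointIdf(int numDocs, int docsWithAnyTerm) {
        return 1.0 + Math.log((double) numDocs / (docsWithAnyTerm + 1));
    }

    /** Scores the group as if it were one term with the joint frequency. */
    static double score(int rootFreq, int[] synonymFreqs, double deBoost,
                        int numDocs, int docsWithAnyTerm) {
        double tf = Math.sqrt(jointFrequency(rootFreq, synonymFreqs, deBoost));
        return tf * jointIdf(numDocs, docsWithAnyTerm);
    }
}
```

For example, a document with 2 occurrences of "car" and 4 of "auto", with a de-boost of 0.5, gets a joint frequency of 2 + 0.5 * 4 = 4, and that single value is what gets scored.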

I realize you are asking about processing synonyms at index time, but note that, 
in addition to the de-boost issue you raise in your post, term injection does 
not allow you to: 
a) Change the synonym dictionary once the index is built.
b) Change the boost factor once the index is built.
c) Enable or disable the use of synonyms (e.g. some applications have an 
"exact match" feature, or the client simply doesn't want to drift from "car" to 
"auto"). 
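
As a rough sketch of why search-time expansion keeps all three of these options open, the following plain-Java class builds a query string in the standard Lucene query syntax (term^boost). Again, this is not the posted code: SynonymExpander and its parameters are hypothetical names, and the dictionary is just an in-memory map supplied by the caller:

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the posted code and not a Lucene class.
final class SynonymExpander {
    private final Map<String, List<String>> dictionary; // root term -> synonyms
    private final double synonymBoost;                  // de-boost factor, e.g. 0.5

    SynonymExpander(Map<String, List<String>> dictionary, double synonymBoost) {
        this.dictionary = dictionary;
        this.synonymBoost = synonymBoost;
    }

    /**
     * Returns the term itself, followed by its de-boosted synonyms when
     * useSynonyms is true. With useSynonyms false the query is an exact match.
     */
    String expand(String term, boolean useSynonyms) {
        StringBuilder query = new StringBuilder(term);
        if (useSynonyms) {
            for (String synonym : dictionary.getOrDefault(term, List.of())) {
                query.append(' ').append(synonym).append('^').append(synonymBoost);
            }
        }
        return query.toString();
    }
}
```

With a dictionary mapping "car" to "auto" and a boost of 0.5, expand("car", true) gives "car auto^0.5" and expand("car", false) gives just "car". The dictionary, the boost and the on/off switch are all chosen per query, so none of them requires rebuilding the index.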


BTW, to reply please use ziv.gome_gmail_com (replace "_" where appropriate)

Thanx,
Ziv Gome

-----Original Message-----
From: zzzzz shalev [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, May 10, 2006 10:36 AM
To: java-user@lucene.apache.org
Subject: lowering score of doc if synonyms matched (synonyms indexed)

I am currently adding synonyms at index time (and not expanding the query). I 
fear that there is a problem with this implementation:

  Is there a way to lower the score of a document if it was found due to a 
synonym match and not due to a match of the word queried? From what I 
understand, the synonyms are indexed at the same position as the original 
word, which may make this impossible.
   
  thanks,
   
   

                

