As you might have already seen, Andrew Schetinin and I have published source code (at http://mail-archives.apache.org/mod_mbox/lucene-java-user/200603.mbox/[EMAIL PROTECTED]) that handles synonyms at search time (query expansion).
This code also includes a de-boost factor for synonyms (relative to the root term). It also fixes the distortion created by the differing IDF values of the root term and its synonyms. It produces the score by simulating an aggregated frequency from the frequencies of all synonym terms in each document, and then computing a score for that "joint frequency".

I realize you are asking about processing synonyms at index time, but note that, in addition to the de-boost issue you raise in your post, term injection does not allow you to:
a) change the synonym dictionary once the index is built;
b) change the boost factor once the index is built;
c) enable or disable the use of synonyms (e.g. some applications have an "exact match" feature, or the client simply doesn't want to drift from "car" to "auto").

BTW, to reply please use ziv.gome_gmail_com (replace "_" where appropriate).

Thanx,
Ziv Gome

-----Original Message-----
From: zzzzz shalev [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 10, 2006 10:36 AM
To: java-user@lucene.apache.org
Subject: lowering score of doc if synonyms matched (synonyms indexed)

i am currently adding synonyms at index time (and not expanding the query), i fear that there is a problem with this implementation: is there a way to lower the score of a document if it was found due to a synonyms match and not due to a match of the word queried. from what i understand the synonyms are indexed with the same placement as the original word which may make this impossible?

thanks,
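
For readers who want a feel for the "joint frequency" idea described in the reply above, here is a minimal sketch in plain Java. It is not the published code and uses no Lucene API. The class and method names are hypothetical, the tf-idf-style formula (square-root tf, logarithmic idf) is chosen only for illustration, and using a single IDF for the whole synonym group is an assumption about how the IDF distortion could be avoided.

```java
import java.util.Map;

/** Hypothetical illustration only; not the published code and not a Lucene API. */
final class JointFrequencySketch {

    /**
     * Combines the in-document frequency of the root term with the
     * frequencies of its synonyms, each scaled by a de-boost weight
     * (assumed to lie in (0, 1], so a synonym hit counts for less).
     */
    static double jointFrequency(Map<String, Integer> docTermFreqs,
                                 String root,
                                 Map<String, Double> synonymWeights) {
        double joint = docTermFreqs.getOrDefault(root, 0);
        for (Map.Entry<String, Double> e : synonymWeights.entrySet()) {
            joint += e.getValue() * docTermFreqs.getOrDefault(e.getKey(), 0);
        }
        return joint;
    }

    /**
     * Scores the joint frequency with one IDF for the whole group
     * (assumption: docsMatchingGroup counts documents containing the
     * root term or any synonym), so a rare synonym cannot outscore
     * the root term through its own IDF.
     */
    static double score(double jointFreq, int docsMatchingGroup, int numDocs) {
        if (jointFreq <= 0) {
            return 0.0;
        }
        double tf = Math.sqrt(jointFreq);
        double idf = 1.0 + Math.log((double) numDocs / (docsMatchingGroup + 1));
        return tf * idf;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = Map.of("auto", 0.5); // de-boost for the synonym
        Map<String, Integer> docWithRoot = Map.of("car", 2);
        Map<String, Integer> docWithSynonym = Map.of("auto", 2);

        double rootScore = score(jointFrequency(docWithRoot, "car", weights), 2, 10);
        double synScore = score(jointFrequency(docWithSynonym, "car", weights), 2, 10);
        System.out.println("root match:    " + rootScore);
        System.out.println("synonym match: " + synScore);
    }
}
```

Because the synonym dictionary and the weights are plain arguments supplied at query time, they can be changed, or left empty for an "exact match" search, without touching the index, which is the point of items a) to c).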