As far as I know, Lucene only handles single-word synonyms at index time. My life would be much simpler if it were possible to add synonyms that span multiple tokens, such as "lucene in action" = "lia". I have a couple of workarounds that are OK, but they really aren't the same thing when it comes to scoring.
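
To make the problem concrete, here is a minimal sketch in plain Python. It is a toy model of term positions, not Lucene code: `stack_synonyms` and `phrase_matches` are hypothetical helpers made up for illustration, and the assumption is that a single-word synonym is simply stored at the same position as the token it replaces.

```python
def stack_synonyms(tokens, synonyms):
    """Toy index: one (position, term) entry per token, with each
    single-word synonym stacked at the same position as its source token."""
    postings = []
    for pos, tok in enumerate(tokens):
        postings.append((pos, tok))
        for syn in synonyms.get(tok, ()):
            postings.append((pos, syn))
    return postings


def phrase_matches(postings, phrase):
    """True if the terms of `phrase` occur at consecutive positions."""
    positions = {}
    for pos, term in postings:
        positions.setdefault(term, set()).add(pos)
    return any(
        all(start + i in positions.get(term, set()) for i, term in enumerate(phrase))
        for start in positions.get(phrase[0], set())
    )


# Single-word synonym: "quick" shares position 0 with "fast".
single = stack_synonyms(["fast", "car"], {"fast": ["quick"]})
print(phrase_matches(single, ["quick", "car"]))

# Multi-word: "lucene in action" covers positions 1..3,
# but the single token "lia" can sit at only one of them.
tokens = ["read", "lucene", "in", "action", "twice"]
lia_at_start = stack_synonyms(tokens, {}) + [(1, "lia")]
lia_at_end = stack_synonyms(tokens, {}) + [(3, "lia")]
print(phrase_matches(lia_at_start, ["read", "lia"]),
      phrase_matches(lia_at_start, ["lia", "twice"]))
print(phrase_matches(lia_at_end, ["read", "lia"]),
      phrase_matches(lia_at_end, ["lia", "twice"]))
```

In this model the single-word case works, but wherever "lia" is placed inside the three-token span, only one of the two neighbouring phrases ("read lia" or "lia twice") lines up. That is the gap I am trying to close.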

The workaround that scores best is to assemble several permutations of the same document, but it doesn't feel good. In some cases that means several hundred documents, and I have to post-process the results to filter out the "duplicate" hits, which can turn out to be rather expensive. I'm also sure it messes with the scoring in several ways I have not noticed yet.
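
A rough sketch of what I mean by permutations, again in plain Python rather than Lucene code. The names `segment` and `expand_variants` and the shape of the `phrase_synonyms` map are assumptions for illustration only; the real thing would emit one indexed document per variant.

```python
from itertools import product


def segment(tokens, phrase_synonyms):
    """Split tokens into segments; a segment that matches a known
    (non-empty) phrase carries that phrase plus all of its alternatives."""
    segments = []
    i = 0
    while i < len(tokens):
        for phrase, alternatives in phrase_synonyms.items():
            n = len(phrase)
            if n > 0 and tuple(tokens[i:i + n]) == phrase:
                segments.append([phrase] + list(alternatives))
                i += n
                break
        else:
            # No phrase starts here: the token stands alone.
            segments.append([(tokens[i],)])
            i += 1
    return segments


def expand_variants(tokens, phrase_synonyms):
    """Yield every variant of the document, one per combination of choices."""
    for choice in product(*segment(tokens, phrase_synonyms)):
        yield [tok for part in choice for tok in part]


# Hypothetical synonym map: phrase tuple -> list of alternative tuples.
phrase_synonyms = {("lucene", "in", "action"): [("lia",)]}

doc = "i read lucene in action and then lucene in action again".split()
variants = list(expand_variants(doc, phrase_synonyms))
for v in variants:
    print(" ".join(v))
```

Each phrase occurrence with one alternative doubles the number of variants, so the two occurrences above give four documents and eight occurrences would give 256. That is how I end up with several hundred documents for one original.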

I've also considered creating some kind of multi-dimensional term position space, but I'd say that could take a lot of time to implement.

Are there any good solutions to this?


        karl
