The synonym analyzer shown in Lucene in Action is a good place to start. You need to change *all* occurrences of one form into the other, both at index time and at search time, to get consistent results.
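To make the "same change at index and search time" idea concrete, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: the class SynonymCanonicalizer, its methods, and the synonym table are all hypothetical names invented for illustration, and the tokenizer is a naive lowercase/split. In a real application the same logic would live in a custom Analyzer/TokenFilter along the lines of the book's example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper (not part of Lucene): collapses multi-word synonyms
// into one canonical token. Run it on documents AND on queries.
class SynonymCanonicalizer {
    // Illustrative synonym table: tokenized phrase -> canonical token.
    private final Map<List<String>, String> phrases = new LinkedHashMap<>();

    public void addSynonym(String phrase, String canonical) {
        phrases.put(tokenize(phrase), canonical.toLowerCase(Locale.ROOT));
    }

    // Naive tokenizer (an assumption): lowercase, split on non-alphanumerics.
    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Replaces the longest synonym phrase starting at each position with
    // its canonical token; all other tokens pass through unchanged.
    public List<String> analyze(String text) {
        List<String> in = tokenize(text);
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < in.size()) {
            int matched = 0;
            String replacement = null;
            for (Map.Entry<List<String>, String> e : phrases.entrySet()) {
                List<String> p = e.getKey();
                if (p.size() > matched
                        && i + p.size() <= in.size()
                        && in.subList(i, i + p.size()).equals(p)) {
                    matched = p.size();
                    replacement = e.getValue();
                }
            }
            if (replacement != null) {
                out.add(replacement);
                i += matched;
            } else {
                out.add(in.get(i));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SynonymCanonicalizer c = new SynonymCanonicalizer();
        c.addSynonym("transmission control protocol", "tcp");
        // Index time: both documents end up with the same token stream.
        System.out.println(c.analyze("this is a tcp interaction"));
        System.out.println(c.analyze("this is a transmission control protocol interaction"));
        // Search time: the query goes through the same step.
        System.out.println(c.analyze("Transmission Control Protocol"));
    }
}
```

One consequence of this particular choice: because the long form is collapsed into a single token, a query for "transmission" alone no longer matches either document. Whether that is acceptable depends on the application.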
There are some "interesting" implications of this, but they only really need to be considered if you need phrase or span queries. For instance, let's say you have the following doc fragments:

doc1: "this is a tcp interaction that I want to deal with"
doc2: "this is a transmission control protocol interaction that I want to deal with"

Is "this" within 4 of "interaction" in both documents? Do you care? Also, is the phrase "transmission control protocol" a match for the first document? Would the user be confused by matching a document with "tcp" in it for that phrase? For that matter, does searching on "transmission" match doc1?

Mostly, these are issues that may or may not be relevant depending on the intent of the application...

Highlighting also becomes interesting.

Best
Erick

On 6/27/07, Aliaksandr Radzivanovich <[EMAIL PROTECTED]> wrote:
What if I need to search for synonyms, but synonyms can be expanded to phrases of several words?

For example, user enters query "tcp", then my application should also find documents containing phrase "Transmission Control Protocol". And conversely, user enters "Transmission Control Protocol", then my application should also find documents with word "tcp".

It seems like Lucene does not support this scenario out of the box. Then where to look for the solution? What Lucene extensions/classes/interfaces should I investigate?

Thanks.