I recently started using Lucene as the main search library for all documents
in our intranet, and it works extremely well. I am now trying to add
similarity functionality, to measure how similar two sentences or documents
are, and to use WordNet in our search solution. I have used the WordNet
contrib package, and it works well for expanding queries with synonyms.

Where I get stuck is with queries like "Steve Jobs", for which documents
containing "apple" should also be returned. In the same way, "pirated" should
match "willful downloading copyrighted material". This comes down to finding
the meaning of a word with respect to its context.

Has anybody done this kind of context-based indexing, that is, tokenizing
based on the context of each word/token at index time, and then searching the
index after expanding the query with synonyms? I have come across some
SourceForge projects like http://wn-similarity.sourceforge.net/ for
semantically relating words using WordNet, but I am still confused about how
to move ahead with this kind of context-based search.

I appreciate your help. I understand that this might not be directly related
to Lucene, but it falls in the same domain of search solutions.
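
To make the question concrete, here is a minimal sketch of the kind of
query-time expansion I have in mind. It is plain Java and uses no Lucene API.
The class QueryExpansionSketch, the relatedTerms() table and its two entries
are all made up for illustration. A real table would have to come from
somewhere other than WordNet synonyms, which is exactly the part I do not know
how to build. The output is meant to be a query string in Lucene's classic
query syntax (quoted phrases joined by OR).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class QueryExpansionSketch {

    // Hypothetical hand-built association table (an assumption for this
    // sketch); WordNet synonyms alone would not supply these links.
    static Map<String, List<String>> relatedTerms() {
        Map<String, List<String>> related = new LinkedHashMap<>();
        related.put("steve jobs", List.of("apple"));
        related.put("pirated", List.of("willful downloading copyrighted material"));
        return related;
    }

    // Returns the original query as a quoted phrase, OR-ed with every
    // related phrase found for it in the table.
    static String expand(String query, Map<String, List<String>> related) {
        String key = query.trim().toLowerCase(Locale.ROOT);
        List<String> clauses = new ArrayList<>();
        clauses.add(quote(key));
        for (String term : related.getOrDefault(key, List.of())) {
            clauses.add(quote(term));
        }
        return String.join(" OR ", clauses);
    }

    // Wraps a phrase in double quotes, escaping any embedded quotes.
    static String quote(String phrase) {
        return "\"" + phrase.replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        // Prints: "steve jobs" OR "apple"
        System.out.println(expand("Steve Jobs", relatedTerms()));
    }
}
```

This only looks up the whole query as one key, so it says nothing about the
context of individual tokens; it just shows the shape of the expanded query I
would like to end up with.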


- RB


      
