You might want to look at stemming for "de-pluralization". It boils words down
to their "root", so "bombs" and "bombing" both get stemmed to "bomb".
I'm using the Snowball stemmer, which handles other languages as well as
English.
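
To make the idea concrete, here is a rough sketch of suffix stripping in plain Java. This is NOT the Snowball algorithm and does not use any Lucene API; the class name ToyStemmer and its three rules are made up purely for illustration. A real stemmer has many more rules and exceptions.

```java
import java.util.Locale;

// Hypothetical toy stemmer: a handful of suffix rules, for illustration only.
// It is far cruder than Snowball/Porter and will mis-stem many real words.
final class ToyStemmer {

    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        if (w.endsWith("ing") && w.length() > 5) {
            // bombing -> bomb (the length guard leaves short words like "ring" alone)
            return w.substring(0, w.length() - 3);
        }
        if (w.endsWith("ies") && w.length() > 4) {
            // ponies -> pony
            return w.substring(0, w.length() - 3) + "y";
        }
        if (w.endsWith("s") && !w.endsWith("ss") && w.length() > 3) {
            // bombs -> bomb, but "glass" keeps its double s
            return w.substring(0, w.length() - 1);
        }
        return w;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"bombs", "bombing", "bomb"}) {
            System.out.println(s + " -> " + stem(s)); // each line ends in "bomb"
        }
    }
}
```

The point is only that the plural and the -ing form collapse to the same token, so a query for one matches documents containing the other, provided the same stemming is applied at index time and at search time.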
Mufaddal Khumri wrote:
Are there
analyzers that do this already?
It's not an analyzer, but the "norm" feature of this tool does a good job
of getting to the normalized form of the words:
http://umlslex.nlm.nih.gov/lvg/current/
http://umlslex.nlm.nih.gov/lvg/current/docs/userDoc/norm.htm
Hello,
I am just posting this question out here since this might be a common
problem and some of you might have good pointers.
Are there algorithms/APIs built into Lucene that would help de-pluralize
words while indexing and/or while searching the index? Are there
analyzers that do this already?
T