Stemmers are heuristic transformations whose main aim is to reduce the dimensionality of the vocabulary (they serve other purposes too, which I don't want to discuss here). For accurate transformations one would use a lemmatization engine (typically dictionary-driven) combined with morphological analysis to resolve ambiguity. So stemming should be seen as a "one-way" transformation from inflected forms to some kind of shared identifier for a common lemma (that is, for the set of word forms that share it). The stem itself need not be a real word, and nothing in it records which inflected form it came from.
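To make the "one-way" point concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: toy_stem is a made-up suffix stripper standing in for a real stemmer such as Snowball, and the word list is invented. Several inflected forms collapse onto one stem, and the only way back is a lookup table built from the forms you have actually seen.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical toy stemmer: strips a few English suffixes.

    This is NOT the Snowball algorithm, just a stand-in for illustration.
    """
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_reverse_index(words):
    """Map each stem to the set of surface forms that produced it (1:n)."""
    index = defaultdict(set)
    for w in words:
        index[toy_stem(w)].add(w.lower())
    return index


# Invented sample vocabulary.
corpus = ["walk", "walks", "walked", "walking", "talked", "talks"]
reverse_index = build_reverse_index(corpus)

print(sorted(reverse_index["walk"]))
# ['walk', 'walked', 'walking', 'walks']
```

Note that such a table can only return forms that occurred in the input; it cannot produce an inflected form it has never seen, which is where a proper morphological generator comes in.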
I don't know if you can call it a "reverse stemmer", but there are tools that generate the inflected forms of a lemma (let's call it a "root word") given a morphological tag or annotation. This is particularly useful for languages with rich inflection paradigms, where you need it to construct grammatically correct sequences of words. One example of such a project is Morfologik: http://morfologik.blogspot.com/

Like Erick mentioned, though, this is probably far from what you actually need...

Dawid

On Tue, Oct 6, 2009 at 9:31 AM, David Leangen <apa...@leangen.net> wrote:
>
> Hello,
>
> I've been using Lucene in a very basic way for some time now, and I'm
> starting to take advantage of some of the linguistic capabilities only now.
>
> I am making use of the snowball analyzer for stemming, and it works very
> well.
>
> Question: is there any such thing as a "reverse stemmer"? In other words,
> given the stem of a word, is there any algorithm to find the original word?
> Or is this just fantasy? ;-)
>
> Now, I understand that there is a 1:n mapping of stems:words. I can deal
> with that.
>
> Thanks!
> =David

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org