How to use a Hunspell dictionary to do the reverse of stemming?

julien Blaize Tue, 24 Oct 2017 08:05:23 -0700

Hello,

I am looking for a way to efficiently do the reverse of stemming.
Example: if I give the program the verb "drug", it will give me
"drugged", "drugging", "drugs", "drugstore", etc.


I have used the wordforms program from Hunspell to generate all possible
combinations of the input word (even the ridiculous ones that do not
match a real word). Then I use the org.apache.lucene.analysis.hunspell.Dictionary
class to check whether each generated word exists and maps back to the original word.
This is very slow and inefficient.
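
As a rough illustration of this generate-and-filter approach, here is a
minimal, self-contained sketch. It does not call the real Lucene or Hunspell
APIs: the class name ReverseStemSketch, the method expand, and the stemOf
function are hypothetical stand-ins, and the toy map only imitates a
dictionary lookup that returns the stem of a known word (or null).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ReverseStemSketch {

    // Keep only the candidates whose stem equals the input word.
    // `stemOf` is a hypothetical stand-in for a dictionary-backed stemmer;
    // it is assumed to return null when the candidate is not a known word.
    static List<String> expand(String word, List<String> candidates,
                               Function<String, String> stemOf) {
        List<String> forms = new ArrayList<>();
        for (String candidate : candidates) {
            if (word.equals(stemOf.apply(candidate))) {
                forms.add(candidate);
            }
        }
        return forms;
    }

    public static void main(String[] args) {
        // Toy lookup table standing in for the Hunspell dictionary.
        Map<String, String> toyStems = new HashMap<>();
        toyStems.put("drugged", "drug");
        toyStems.put("drugging", "drug");
        toyStems.put("drugs", "drug");

        // Candidates as a form generator might emit them, nonsense included.
        List<String> candidates =
                Arrays.asList("drugged", "druged", "drugging", "drugs", "drugest");

        // prints [drugged, drugging, drugs]
        System.out.println(expand("drug", candidates, toyStems::get));
    }
}
```

Every candidate costs one lookup, so the running time grows with the number
of generated forms, most of which are rejected.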

I was looking at the internals of the Dictionary class and saw that it uses
patterns and FSTs (finite state transducers). This seems a very efficient way to
find the stem of a word, but I was unable to find a way to do the
reverse operation.

I am wondering whether anyone has tried to do something similar. Can someone
who understands FSTs and the use of patterns in the Dictionary class give
me hints as to whether what I am trying to do is possible and would be efficient?

Kind Regards.

--
Julien Blaize
