On Thu, Jan 6, 2011 at 11:53 AM, Pulkit Singhal <pulkitsing...@gmail.com> wrote:
> Hello,
>
> What's a good source to get dictionaries (for spellcorrections) and/or
> thesaurus (for synonyms) that can be used with Lucene for non-English
> languages such as Fresh, Chinese, Korean etc?

if you can't find a wordlist of correctly-spelled words somewhere
else, you can always try
http://wiki.services.openoffice.org/wiki/Dictionaries, grab the
openoffice spellchecker dictionary for that language, and use the
hunspell "unmunch" command (sort of like morphological generation) to
generate a list of words you could then use with PlainTextDictionary.

>
> For example, the wordnet contrib module is based on the data set
> provided by the Princeton based wordnet system but I'm wondering where
> the Lucene users go for similar reliable source for other languages?
>

in this case i would also investigate the openoffice thesaurus data,
if you cant find anything else.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to