On Thu, Jan 6, 2011 at 11:53 AM, Pulkit Singhal <pulkitsing...@gmail.com> wrote: > Hello, > > What's a good source to get dictionaries (for spellcorrections) and/or > thesaurus (for synonyms) that can be used with Lucene for non-English > languages such as Fresh, Chinese, Korean etc?
if you can't find a wordlist of correctly-spelled words somewhere else, you can always try http://wiki.services.openoffice.org/wiki/Dictionaries, grab the openoffice spellchecker dictionary for that language, and use the hunspell "unmunch" command (sort of like morphological generation) to generate a list of words you could then use with PlainTextDictionary. > > For example, the wordnet contrib module is based on the data set > provided by the Princeton based wordnet system but I'm wondering where > the Lucene users go for similar reliable source for other languages? > in this case i would also investigate the openoffice thesaurus data, if you cant find anything else. --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org