Somehow, I had the impression that the European projects TrebleCLEF and EuroMatrix are meant to gather this kind of resource.
But honestly, it's not as homogeneous as in OpenOffice. Mozilla also has dictionaries. Wiktionary can also be helpful.

paul

On 7 Jan 2011 at 22:26, Robert Muir wrote:

> On Thu, Jan 6, 2011 at 11:53 AM, Pulkit Singhal <pulkitsing...@gmail.com> wrote:
>> Hello,
>>
>> What's a good source to get dictionaries (for spelling corrections) and/or
>> thesauri (for synonyms) that can be used with Lucene for non-English
>> languages such as French, Chinese, Korean, etc.?
>
> If you can't find a wordlist of correctly spelled words somewhere
> else, you can always try
> http://wiki.services.openoffice.org/wiki/Dictionaries, grab the
> OpenOffice spellchecker dictionary for that language, and use the
> hunspell "unmunch" command (sort of like morphological generation) to
> generate a list of words you could then use with PlainTextDictionary.
>
>> For example, the wordnet contrib module is based on the data set
>> provided by the Princeton-based WordNet system, but I'm wondering where
>> Lucene users go for similarly reliable sources for other languages?
>
> In this case I would also investigate the OpenOffice thesaurus data,
> if you can't find anything else.
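
In case it helps with the wordlist step Robert describes: PlainTextDictionary reads a plain text file with one word per line, so the list generated by "unmunch" may be worth tidying before you hand it to Lucene. Below is a minimal sketch of that cleanup, assuming the generated list can contain blank lines, stray whitespace, or repeated forms. The function names clean_wordlist and write_wordlist are hypothetical helpers of mine, not part of Lucene or hunspell, and the UTF-8 default is only an assumption (the dictionary's .aff file declares its actual encoding).

```python
import io


def clean_wordlist(lines):
    """Return the distinct non-empty words from an iterable of lines, sorted."""
    words = set()
    for line in lines:
        word = line.strip()  # drop surrounding whitespace and the newline
        if word:             # skip blank lines
            words.add(word)  # the set removes repeated forms
    return sorted(words)


def write_wordlist(words, path, encoding="utf-8"):
    """Write one word per line, the layout PlainTextDictionary expects.

    The encoding default is an assumption; adjust it to match your data.
    """
    with io.open(path, "w", encoding=encoding) as out:
        for word in words:
            out.write(word + u"\n")
```

You would call clean_wordlist on the lines of the unmunch output and pass the result to write_wordlist, then point PlainTextDictionary at the file it wrote.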