2011/4/23 Weiwei Wang <ww.wang...@gmail.com>: > hi,all > I'm working on a Chinese contact search project, I need to transform > the Chinese words to its Pinyin form. > > e.g. > 中国--> zhongguo > > The problem I encounter is that for some chinese words which have more than > one transforms, like. 贾-> jia, 贾->gu, ... > > I already used the ICUTransformFilter(Han->Latin/Names),how could i get all > the transforms instead just one of them? >
Maybe use the unihan database (e.g. generate synonyms or something from it, or make a special filter) ? http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=%E8%B4%BE kMandarin JIA3 GU3 JIA4 you can download this as a zip file. --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org