2011/4/23 Weiwei Wang <ww.wang...@gmail.com>:
> Hi all,
>      I'm working on a Chinese contact search project and need to convert
> Chinese words to their Pinyin form.
>
> e.g.
>  中国--> zhongguo
>
> The problem I've run into is that some Chinese characters have more than
> one reading, e.g. 贾 -> jia, 贾 -> gu, ...
>
> I already use the ICUTransformFilter (Han->Latin/Names). How can I get all
> of the readings instead of just one of them?
>

Maybe use the Unihan database (e.g. generate synonyms from its readings,
or write a special filter)?

http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=%E8%B4%BE
kMandarin       JIA3 GU3 JIA4

You can download the whole database as a zip file.
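
A rough sketch of the "generate synonyms" idea, in case it helps. It assumes
the readings file from the zip is tab-separated (code point, field name,
value) and that kMandarin values look like the line above (space-separated,
with trailing tone digits); newer Unihan releases may format the field
differently. The class and method names (UnihanReadings, parse, expand) are
made up for illustration and are not part of Lucene or ICU.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class UnihanReadings {

    /** Collects kMandarin readings per code point, tone digits stripped. */
    public static Map<Integer, Set<String>> parse(Iterable<String> lines) {
        Map<Integer, Set<String>> readings = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] cols = line.split("\t");
            if (cols.length < 3 || !cols[0].startsWith("U+")
                    || !cols[1].equals("kMandarin")) {
                continue; // not a kMandarin record
            }
            int codePoint = Integer.parseInt(cols[0].substring(2), 16);
            Set<String> set =
                    readings.computeIfAbsent(codePoint, k -> new LinkedHashSet<>());
            for (String reading : cols[2].trim().split("\\s+")) {
                if (reading.isEmpty()) {
                    continue;
                }
                // Assumption: the tone is a trailing digit, e.g. JIA3 -> jia
                set.add(reading.replaceAll("[1-5]$", "").toLowerCase(Locale.ROOT));
            }
        }
        return readings;
    }

    /** Expands text into every combination of the known readings. */
    public static List<String> expand(String text, Map<Integer, Set<String>> readings) {
        List<String> results = new ArrayList<>();
        results.add("");
        for (int codePoint : text.codePoints().toArray()) {
            Set<String> options = readings.get(codePoint);
            if (options == null) {
                // Unknown character: pass it through unchanged
                options = Collections.singleton(new String(Character.toChars(codePoint)));
            }
            List<String> next = new ArrayList<>();
            for (String prefix : results) {
                for (String option : options) {
                    next.add(prefix + option);
                }
            }
            results = next;
        }
        return results;
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> readings =
                parse(List.of("U+8D3E\tkMandarin\tJIA3 GU3 JIA4"));
        System.out.println(expand("\u8D3E", readings));
    }
}
```

The number of combinations grows multiplicatively with each multi-reading
character, so this is probably only reasonable for short fields such as
contact names. Each expanded form could then be indexed as a synonym of the
original term.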
