I have searched this mailing list but I could not find the answer to the 
following problem.

I'm using the Japanese analyzer from Lucene 3.6.1, and it seems that when 
it tokenizes some Japanese words, some characters are dropped and do not 
appear in the returned tokens.

In the attached example, the output is:

[私 日本人 ]
[]

Note the empty token set for the second sample. I could not figure out 
whether I'm doing something wrong or whether this is a bug in the Japanese 
analyzer.

Thanks,
Jerome


