I have searched this mailing list but I could not find the answer to the following problem.
I'm using the Lucene 3.6.1 Japanese analyzer, and it seems that when it tokenizes some Japanese words, some characters are ignored and are not returned in the tokens. In the attached example, the output is:

[私 日本人 ]
[]

Note the empty token set for the second sample. I could not figure out whether I'm doing something wrong or whether this is a bug in the Japanese analyzer.

Thanks,
Jerome