I added the diagnostic code you suggested, and the result is as follows:

Text: AaaBCcDdEFGgHhIiJKkLMmN

Token   PosInc  Start   End
                Offset  Offset
[Aaa]   1       0       3
[B]     1       3       4
[Cc]    1       4       6
[Dd]    1       6       8
[E]     1       8       9
[F]     1       9       10
[Gg]    1       10      12
[Hh]    1       12      14
[Ii]    1       14      16
[J]     1       16      17
[Kk]    1       17      19
[L]     1       19      20
[Mm]    1       20      22
[N]     1       22      23

Output:
<B>AaaBCcDdEFGgHhIiJKkLMmN</B>

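It seems to me that JapaneseAnalyzer produces correct tokens: each
token starts exactly where the previous one ends, so no two tokens
share any characters. Even so, the whole text is highlighted as one
fragment.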
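
To make the offset question concrete, here is a minimal sketch in plain
Java that groups tokens by overlapping offsets, in the spirit of the
grouping Mark describes below. It is not the Highlighter's actual code:
the class OffsetGroupSketch, the Span type, and the
touchingCountsAsOverlap switch are all hypothetical names introduced
only for this illustration, and the offsets are copied from the table
above.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetGroupSketch {

    // Hypothetical stand-in for a token's offsets; NOT Lucene's Token class.
    public static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Counts the groups formed when consecutive spans that "overlap" are merged.
    // Assumption: if touchingCountsAsOverlap is false, a span overlaps the
    // current group only when it starts before the group's end; if true, a
    // span that starts exactly at the group's end is merged as well.
    public static int countGroups(List<Span> spans, boolean touchingCountsAsOverlap) {
        int groups = 0;
        int groupEnd = -1;
        for (Span s : spans) {
            boolean overlaps = groups > 0
                    && (touchingCountsAsOverlap ? s.start <= groupEnd : s.start < groupEnd);
            if (overlaps) {
                groupEnd = Math.max(groupEnd, s.end); // extend the current group
            } else {
                groups++;                             // start a new group
                groupEnd = s.end;
            }
        }
        return groups;
    }

    // The start/end offsets reported in the table above.
    public static List<Span> reportedSpans() {
        int[][] offsets = {
            {0, 3}, {3, 4}, {4, 6}, {6, 8}, {8, 9}, {9, 10}, {10, 12},
            {12, 14}, {14, 16}, {16, 17}, {17, 19}, {19, 20}, {20, 22}, {22, 23}
        };
        List<Span> spans = new ArrayList<>();
        for (int[] o : offsets) {
            spans.add(new Span(o[0], o[1]));
        }
        return spans;
    }

    public static void main(String[] args) {
        System.out.println(countGroups(reportedSpans(), false)); // prints 14
        System.out.println(countGroups(reportedSpans(), true));  // prints 1
    }
}
```

Under the strict rule the 14 tokens stay in 14 separate groups. If
offsets that merely touch were also treated as overlapping, all of them
would merge into a single group, which would resemble the Output above.
Whether the Highlighter actually behaves that way is an assumption that
this sketch does not confirm.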

Any thoughts?

Koji

> -----Original Message-----
> From: markharw00d [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, September 06, 2005 3:37 PM
> To: java-user@lucene.apache.org
> Subject: Re: Highlighter apply to Japanese
> 
> 
> I don't know the behaviour of the Japanese Analyzer you are using.
> Can you add to your example diagnosis the Token.getPositionIncrement, 
> Token.startOffset and Token.endOffset for each of the tokens?
> 
> The highlighter groups tokens with overlapping start and end offsets
> into a single TokenGroup for the purposes of highlighting. This allows
> TokenStreams which produce multiple synonyms for the same source token
> to work. This behaviour was also required to get the CJKAnalyzer to
> work. It could be that the Analyzer you are using is producing a stream
> of tokens which *all* overlap?
> 
> Cheers
> Mark
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 
