> Sent: Tuesday, September 06, 2005 7:22 PM
> To: java-user@lucene.apache.org
> Subject: RE: Highlighter apply to Japanese
>
>
> Try changing TokenGroup.isDistinct().
>
> Maybe the offset test should use >= rather than >,
> i.e.
>
> boolean isDistinct(
Try changing TokenGroup.isDistinct().
Maybe the offset test should use >= rather than >, i.e.

boolean isDistinct(Token token)
{
    // A token that starts exactly where the current group ends
    // is treated as distinct (not part of the group).
    return token.startOffset() >= endOffset;
}

I've just tried the change with the JUnit test and all
still seems well with the non-CJK
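
To make the effect of > versus >= concrete, here is a small standalone sketch. It is not the real Highlighter code: Tok, group() and the inclusive flag are hypothetical names, and it only models the rule discussed in this thread (tokens whose offsets overlap the current group are merged into it). With adjacent tokens, where each token ends exactly where the next one starts, > merges everything into one group while >= keeps them separate. Genuinely overlapping n-gram tokens are merged either way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TokenGroupSketch {
    /** Hypothetical stand-in for a token: only text and character offsets are modelled. */
    static final class Tok {
        final String text;
        final int start;
        final int end;

        Tok(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Groups tokens by offset. A token opens a new group when it is "distinct"
     * from the current one; otherwise it joins the group and may extend its end.
     * inclusive=true uses start >= groupEnd, inclusive=false uses start > groupEnd.
     */
    static List<List<String>> group(List<Tok> tokens, boolean inclusive) {
        List<List<String>> groups = new ArrayList<>();
        List<String> current = null;
        int groupEnd = 0;
        for (Tok t : tokens) {
            boolean distinct = inclusive ? t.start >= groupEnd : t.start > groupEnd;
            if (current == null || distinct) {
                current = new ArrayList<>();
                groups.add(current);
                groupEnd = t.end;
            } else {
                groupEnd = Math.max(groupEnd, t.end);
            }
            current.add(t.text);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Adjacent tokens: each one ends exactly where the next begins.
        List<Tok> adjacent = Arrays.asList(
                new Tok("a", 0, 1), new Tok("b", 1, 2), new Tok("c", 2, 3));
        System.out.println(group(adjacent, false)); // [[a, b, c]]
        System.out.println(group(adjacent, true));  // [[a], [b], [c]]

        // Overlapping bigrams: offsets really do overlap.
        List<Tok> bigrams = Arrays.asList(
                new Tok("ab", 0, 2), new Tok("bc", 1, 3), new Tok("cd", 2, 4));
        System.out.println(group(bigrams, false));  // [[ab, bc, cd]]
        System.out.println(group(bigrams, true));   // [[ab, bc, cd]]
    }
}
```

Whether this matches what happens with a particular Japanese analyzer depends on the offsets that analyzer actually emits, so the sketch should be read as an illustration of the comparison only.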
org
> Subject: Re: Highlighter apply to Japanese
>
>
> Hi, Koji,
>
> I had the same problem as you. This is because CJK n-gram analysis
> differs from single-character analysis.
>
> My workaround is to use CJKHighlighter and
> CJKHighlightAnalyzer in the sandbox.
>
tokens
to me.
Any thoughts?
Koji
> -----Original Message-----
> From: markharw00d [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, September 06, 2005 3:37 PM
> To: java-user@lucene.apache.org
> Subject: Re: Highlighter apply to Japanese
>
>
> I don't know the behaviour
Hi, Koji,
I had the same problem as you. This is because CJK n-gram analysis
differs from single-character analysis.
My workaround is to use CJKHighlighter and CJKHighlightAnalyzer in the sandbox.
--
Chris Lu
Lucene Search RAD on Any Database
http://www.dbsight.net
On 9/5/05, Koji Se
I don't know the behaviour of the Japanese Analyzer you are using.
Can you add the Token.getPositionIncrement(), Token.startOffset() and
Token.endOffset() values for each token to your example diagnostics?
The highlighter groups tokens with overlapping start and end offsets
into a single TokenGroup f