To: java-user@lucene.apache.org
Sent: Sunday, June 17, 2007 8:09:30 PM
Subject: Re: Lucene for chinese search
>> not display for Chinese term.
>>
>> I mixed English and Chinese text in my datasource. The search is
>> working for English terms, but Chinese characters display as '???'
>> in the result output.
>>
>> Please advise or send some samples / solutions.
>>
>> Thanks.
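
One common cause of the '???' symptom described above (not necessarily the cause here) is that the text passes through a charset that cannot represent Chinese characters somewhere between the data source and the output. Below is a minimal sketch using only the JDK, no Lucene. The class name EncodingCheck and the helper roundTrip are made up for illustration; the Chinese characters are written as Unicode escapes so the example does not depend on how the source file itself is saved.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // Hypothetical helper: encode the text to bytes and decode it again with
    // the same charset, imitating a write-then-read through a file, a
    // database column, or an HTTP response.
    static String roundTrip(String text, Charset charset) {
        // String.getBytes(Charset) substitutes the charset's replacement
        // byte for every character the charset cannot represent.
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "Lucene" followed by two Chinese characters (U+4E2D, U+6587).
        String text = "Lucene \u4e2d\u6587";

        // UTF-8 can represent every character, so the text survives intact.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8));

        // ISO-8859-1 cannot represent Chinese characters; each one comes
        // back as '?', while the English part is unaffected.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1));
    }
}
```

If the English terms come through and only the Chinese characters turn into question marks, it is worth checking every place where bytes become strings or strings become bytes (file readers, the database connection, the page or console encoding) and making each one explicitly UTF-8.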
-----Original Message-----
From: Mathieu Lecarme [mailto:[EMAIL PROTECTED]]
Sent: Monday, June 18, 2007 8:58 PM
To: java-user@lucene.apache.org
Subject: Re: Lucene for chinese search
Lee Li Bin wrote:
> Hi,
>
> I still have a problem searching for Chinese words.
> The XML file used as the data source and the analyzer have already been encoded.
> I have tested StandardAnalyzer, CJKAnalyzer, and ChineseAnalyzer, but I
> still can't get any results.
>
> 1. Do we need any encoding
There are three things to watch out for with Chinese or other CJK languages:
1. The content source or database needs to be encoded in UTF-8.
2. StandardAnalyzer doesn't support Chinese words well. Use either
ChineseAnalyzer or CJKAnalyzer. My experience is that CJKAnalyzer is a
little better.
3. The user's q
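
To illustrate the difference behind point 2: ChineseAnalyzer is documented to emit one token per Chinese character, while CJKAnalyzer emits overlapping two-character tokens (bigrams). The sketch below imitates both strategies with plain JDK code. It is not Lucene's implementation; the class and method names are made up, and it assumes the input is a run of CJK characters with no surrogate pairs, spaces, or Latin text.

```java
import java.util.ArrayList;
import java.util.List;

public class CjkTokenSketch {
    // One token per character, in the style of ChineseAnalyzer.
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.substring(i, i + 1));
        }
        return tokens;
    }

    // Overlapping two-character tokens, in the style of CJKAnalyzer.
    // A single character is kept as a token of its own.
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        if (text.length() == 1) {
            tokens.add(text);
            return tokens;
        }
        for (int i = 0; i + 1 < text.length(); i++) {
            tokens.add(text.substring(i, i + 2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Four Chinese characters, written as escapes: U+4E2D U+6587 U+641C U+7D22.
        String text = "\u4e2d\u6587\u641c\u7d22";
        System.out.println(unigrams(text)); // four single-character tokens
        System.out.println(bigrams(text));  // three overlapping two-character tokens
    }
}
```

Whichever analyzer is chosen, the same one has to be used at index time and at query time, otherwise the query tokens will not line up with the indexed tokens and the search returns nothing.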