Re: Lucene for chinese search

2007-06-22 Thread Otis Gospodnetic
To: java-user@lucene.apache.org Sent: Sunday, June 17, 2007 8:09:30 PM Subject: Re: Lucene for chinese search
There are three things to watch out for with Chinese or CJK languages:
1. The content source or database needs to be encoded in UTF-8.
2. StandardAnalyzer doesn't support ch

RE: Lucene for chinese search

2007-06-19 Thread Lee Li Bin
and Chinese text in my datasource; the search is working for English terms, and Chinese characters display as '???' in the result output. Please advise or send some samples / solutions. Thanks.

Re: Lucene for chinese search

2007-06-18 Thread karl wettin
I mixed English and Chinese text in my datasource; the search is working for English terms, and Chinese characters display as '???' in the result output. Please advise or send some samples / solutions. Thanks.

Re: Lucene for chinese search

2007-06-18 Thread Chris Lu
not display for Chinese terms. I mixed English and Chinese text in my datasource; the search is working for English terms, and Chinese characters display as '???' in the result output. Please advise or send some samples / solutions.

Re: Lucene for chinese search

2007-06-18 Thread karl wettin
output. Please advise or send some samples / solutions. Thanks. -Original Message- From: Mathieu Lecarme [mailto:[EMAIL PROTECTED] Sent: Monday, June 18, 2007 8:58 PM To: java-user@lucene.apache.org Subject: Re: Lucene for chinese search Lee Li Bin wrote: Hi, I still m

Re: Lucene for chinese search

2007-06-18 Thread Chris Lu
2007 8:58 PM To: java-user@lucene.apache.org Subject: Re: Lucene for chinese search Lee Li Bin wrote: Hi, I still have a problem searching for Chinese words. The XML file that is the datasource and the analyzer have already been encoded. I have tested StandardAnalyzer, CJKAnalyzer,

RE: Lucene for chinese search

2007-06-18 Thread Lee Li Bin
hieu Lecarme [mailto:[EMAIL PROTECTED] Sent: Monday, June 18, 2007 8:58 PM To: java-user@lucene.apache.org Subject: Re: Lucene for chinese search Lee Li Bin wrote: Hi, I still have a problem searching for Chinese words. The XML file that is the datasource and the analyzer have already

Re: Lucene for chinese search

2007-06-18 Thread Mathieu Lecarme
Lee Li Bin wrote: Hi, I still have a problem searching for Chinese words. The XML file that is the datasource and the analyzer have already been encoded. I have tested StandardAnalyzer, CJKAnalyzer, and ChineseAnalyzer, but it still can't get any results. 1. Do we need any encoding

RE: Lucene for chinese search

2007-06-18 Thread Lee Li Bin
t: Re: Lucene for chinese search
There are three things to watch out for with Chinese or CJK languages:
1. The content source or database needs to be encoded in UTF-8.
2. StandardAnalyzer doesn't support Chinese words well. Use either ChineseAnalyzer or CJKAnalyzer. My experience is that CJKAnalyzer i
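A rough illustration of why the analyzer choice in point 2 matters. As far as I understand it, CJKAnalyzer indexes CJK text as overlapping two-character tokens (bigrams), while ChineseAnalyzer emits one token per character. The sketch below does not call Lucene at all; the class BigramSketch and its methods are hypothetical names made up for this example, and it assumes plain BMP characters with no mixed Latin text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of Lucene: mimics the two tokenization
// styles on a run of CJK characters (BMP only, no surrogate pairs).
public class BigramSketch {

    // One token per character, roughly the per-character style.
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.substring(i, i + 1));
        }
        return tokens;
    }

    // Overlapping two-character tokens, roughly the bigram style.
    // A single character is returned as its own token.
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        if (text.length() == 1) {
            tokens.add(text);
            return tokens;
        }
        for (int i = 0; i + 1 < text.length(); i++) {
            tokens.add(text.substring(i, i + 2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Unicode escapes avoid depending on the source file encoding.
        String text = "\u4e2d\u6587\u641c\u7d22";
        System.out.println(unigrams(text)); // 4 single-character tokens
        System.out.println(bigrams(text));  // 3 overlapping bigrams
    }
}
```

Whichever style is used, the same analyzer generally has to be applied at index time and at query time, otherwise the query tokens will not line up with the indexed ones.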

Re: Lucene for chinese search

2007-06-17 Thread Chris Lu
There are three things to watch out for with Chinese or CJK languages:
1. The content source or database needs to be encoded in UTF-8.
2. StandardAnalyzer doesn't support Chinese words well. Use either ChineseAnalyzer or CJKAnalyzer. My experience is that CJKAnalyzer is a little better.
3. The user's q
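The '???' symptom reported in this thread is what typically appears when Chinese text passes through a charset that cannot represent it. A minimal sketch of point 1, using only the JDK (no Lucene); the class name EncodingCheck and the method roundTrip are made up for this example, and ISO-8859-1 is only an assumed stand-in for whatever non-UTF-8 charset is in play:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encodes text to bytes and decodes it again
// with the same charset, to show which charsets preserve Chinese text.
public class EncodingCheck {

    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which is '?' for ISO-8859-1.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Unicode escapes avoid depending on the source file encoding.
        String text = "Lucene \u4e2d\u6587";

        // UTF-8 keeps the Chinese characters intact.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8));

        // ISO-8859-1 cannot represent them, so they come back as '?'.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1));
    }
}
```

If the stored field values already contain literal '?' characters, the damage happened before or during indexing, and the source has to be re-read as UTF-8 and re-indexed. If the index is fine, the encoding of the output page or console is the next thing to check.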