There are three things to watch out for with Chinese or other CJK languages:

1. The content source or database needs to be encoded in UTF-8.
2. StandardAnalyzer doesn't handle Chinese words well. Use either
ChineseAnalyzer or CJKAnalyzer. In my experience, CJKAnalyzer is a
little better.
3. The user's query should be encoded in UTF-8.
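
To make the points above concrete, here is a rough sketch using only the JDK, not Lucene itself. The class name CjkSketch and its methods are made up for illustration. The bigrams method only imitates the overlapping two-character tokens that CJKAnalyzer produces for a run of CJK characters; it assumes the input is pure CJK text and ignores the Latin text and punctuation handling a real analyzer does. The rest shows why the bytes of a query have to be decoded as UTF-8 on the receiving side.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only; not part of Lucene.
class CjkSketch {

    // Overlapping two-character tokens for a run of CJK characters,
    // roughly the shape of what a bigram analyzer would index.
    // Assumption: the input holds only CJK characters.
    static List<String> bigrams(String text) {
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        if (codePoints.length == 1) {
            tokens.add(text); // a lone character is kept as its own token
            return tokens;
        }
        for (int i = 0; i + 1 < codePoints.length; i++) {
            tokens.add(new String(codePoints, i, 2));
        }
        return tokens;
    }

    // Decode raw bytes (from a database row or an HTTP request) as UTF-8.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Four Chinese characters, written as Unicode escapes so the
        // example does not depend on the source file's encoding.
        String query = "\u4e2d\u6587\u641c\u7d22";
        byte[] raw = query.getBytes(StandardCharsets.UTF_8);

        // Decoding with the same charset round-trips the text.
        System.out.println(decodeUtf8(raw).equals(query));

        // Decoding the same bytes with another charset garbles the text,
        // so the query would no longer match the indexed terms.
        String garbled = new String(raw, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.equals(query));

        // Number of overlapping bigram tokens for the four characters.
        System.out.println(bigrams(query).size());
    }
}
```

The same idea applies at every hop: whatever charset the content was written in must be the one used to read it back, and using UTF-8 everywhere is the simplest way to keep the index and the query consistent.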

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes


On 6/17/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
Hi,

I would like to know whether StandardAnalyzer supports searching for Chinese
words.

Also, to support Chinese search, is any particular encoding needed when
developing the application?

I'm currently using Jetty as the web server and JSP for the application. Search
results are saved in an XML file and displayed using XSL. Is a particular
encoding needed for any of these files (XML, XSL, etc.), as well as when
parsing the query?

Thanks a lot


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



