There are three things to watch out for with Chinese or other CJK languages:

1. The content source or database needs to be encoded in UTF-8.
2. StandardAnalyzer doesn't handle Chinese words well. Use either ChineseAnalyzer or CJKAnalyzer. In my experience, CJKAnalyzer is a little better.
3. The user's query should be encoded in UTF-8.

-- Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

On 6/17/07, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
Hi,

I would like to know whether StandardAnalyzer allows searching of Chinese words. And in order to support Chinese searching, is any particular encoding needed when developing the application? I'm currently using Jetty as the web server and JSP for the application; search results are saved to an XML file and displayed using XSL. So is an encoding needed for any of the files (XML, XSL, etc.) as well as when parsing the query?

Thanks a lot.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
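
On the question of encoding while parsing the query (points 1 and 3 of the reply), here is a minimal sketch that uses only the Java standard library. The class `QueryEncodingDemo` and its methods are hypothetical names for illustration; nothing here is Jetty or Lucene API, and how the container itself decodes request parameters is configuration not shown. It only demonstrates that the same percent-encoded bytes come back as the original text when decoded as UTF-8 and as unrelated characters when decoded as ISO-8859-1.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Illustration only: class and method names are hypothetical.
final class QueryEncodingDemo {

    // Percent-encode a query term as a browser would when the page charset is UTF-8.
    static String encodeUtf8(String term) {
        try {
            return URLEncoder.encode(term, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Decode a percent-encoded query parameter with the given charset name.
    static String decode(String rawParam, String charsetName) {
        try {
            return URLDecoder.decode(rawParam, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the source file
        // encoding does not matter when compiling.
        String query = "\u4e2d\u6587";
        String raw = encodeUtf8(query);

        // Same charset on both sides: the original term is recovered.
        System.out.println(decode(raw, "UTF-8").equals(query));

        // Mismatched charset: each UTF-8 byte becomes a separate Latin-1
        // character, so the term no longer matches anything in the index.
        System.out.println(decode(raw, "ISO-8859-1").equals(query));
    }
}
```

The same reasoning applies to the XML and XSL files: whatever writes them and whatever reads them should agree on UTF-8.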
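
To make point 2 of the reply more concrete: as far as I understand it, ChineseAnalyzer indexes each Chinese character as its own token, while CJKAnalyzer indexes overlapping pairs of adjacent characters (bigrams). The sketch below is not Lucene code. `CjkTokenSketch` and its methods are hypothetical, use only the standard library, and assume the input contains only CJK characters from the Basic Multilingual Plane. It is meant only to show the difference in the tokens produced.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: this is NOT how Lucene implements its analyzers.
// Assumes every character is a single Java char (no surrogate pairs).
final class CjkTokenSketch {

    // One token per character.
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.substring(i, i + 1));
        }
        return tokens;
    }

    // Overlapping pairs of adjacent characters. A single character is
    // kept as a token on its own so that it is not lost.
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        if (text.length() == 1) {
            tokens.add(text);
            return tokens;
        }
        for (int i = 0; i + 1 < text.length(); i++) {
            tokens.add(text.substring(i, i + 2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Four Chinese characters, written as escapes so the source file
        // encoding does not matter when compiling.
        String text = "\u4e2d\u6587\u641c\u7d22";
        System.out.println(unigrams(text)); // four single-character tokens
        System.out.println(bigrams(text));  // three overlapping two-character tokens
    }
}
```

Because many Chinese words are two characters long, bigram tokens tend to line up with real words more often than single characters do, which may be why CJKAnalyzer gave somewhat better results here.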