On Sep 6, 2005, at 7:15 AM, Hacking Bear wrote:

On 9/6/05, Olivier Jaquemet <[EMAIL PROTECTED]> wrote:


As far as your usage is concerned, it seems to be the right approach,
and I think the StandardAnalyzer does the job pretty right when it has
to deal with whatever language you want.


I should look into exactly what it does. Does this StandardAnalyzer handle
non-European languages like Chinese?

StandardAnalyzer recognizes the CJK range of characters and emits them each as individual tokens. This is not an ideal way to deal with Chinese (不好 :) - but it does at least maintain the characters rather than throwing them away or blurring consecutive characters together.

    Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to