On Sep 6, 2005, at 7:15 AM, Hacking Bear wrote:
On 9/6/05, Olivier Jaquemet <[EMAIL PROTECTED]> wrote:
As far as your usage is concerned, it seems to be the right approach,
and I think the StandardAnalyzer does the job pretty right when it
has
to deal with whatever language you want.
I should look into exactly what it does. Does this
StandardAnalyzer handle
non-European languages like Chinese?
StandardAnalyzer recognizes the CJK range of characters and emits
them each as individual tokens. This is not an ideal way to deal
with Chinese (不好 :) - but it does at least maintain the characters
rather than throwing them away or blurring consecutive characters
together.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]