StandardAnalyzer's JavaCC-based grammar (StandardTokenizer.jj) does not cover the Korean (Hangul) range of the Unicode character blocks.
You can either 1) use CJKAnalyzer, or 2) add the Korean character block (0xAC00-0xD7AF, Hangul Syllables) to the CJK token definition in the StandardTokenizer.jj file. Hope it helps.

On 10/4/05, John Wang <[EMAIL PROTECTED]> wrote:
> Hi:
>
> We are running into problems with searching on Korean documents. We are
> using the StandardAnalyzer and everything works with Chinese and Japanese.
> Are there known problems with Korean with Lucene?
>
> Thanks
>
> -John

-- Cheolgoo
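For option 2, here is a rough sketch of what the edit to the CJK token production might look like. The production name (CJ) and the existing ranges below are taken from the 1.4-era grammar; check the StandardTokenizer.jj shipped with your Lucene version before editing, as the exact ranges may differ:

```
// StandardTokenizer.jj -- extend the CJK token definition with the
// Hangul Syllables block (U+AC00..U+D7AF) so Korean text is tokenized:
| <CJ: [
    "\u3040"-"\u318f",   // existing Japanese/CJK ranges
    "\u3300"-"\u337f",
    "\u3400"-"\u3d2d",
    "\u4e00"-"\u9fff",
    "\uf900"-"\ufaff",
    "\uac00"-"\ud7af"    // added: Korean Hangul syllables
  ]>
```

After changing the grammar you need to regenerate the tokenizer with JavaCC and rebuild Lucene, which is why simply switching to CJKAnalyzer is usually the easier route.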