Thomas:

There are some rather extensive threads on this list about the "interesting"
issues that exist when indexing/searching other languages. I think you'd
find it worthwhile to search the list archive for foreign language or some
such...

The short answer as I remember is that there *is* a built-in stemmer, but
whether it does what you want when indexing multiple languages depends upon
what results you expect to get...and there's no clear answer that I
remember....

Erick

On 11/18/06, Thomas Klein <[EMAIL PROTECTED]> wrote:

Hi there,

I'm fairly new to lucene, I just developped a multi threaded indexing
tcp server using lucene to hmmm, let me remember, index stuffs :)

I have to index not only english, but french and german, and, I don't
know, perhaps other languages in the future.

Did lucene use a default stemmer or do I have to stem the texts before
indexing ?

Does a multi-language stemmer exists ?

(sorry if the answers are in the documentation, I didn't manage to
fully read it)

Thanks in advance !

Thomas Klein.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Reply via email to