How international languages are supported in Lucene

2008-06-05 Thread Michael Siu
Would someone tell me how Lucene supports indexing and searching documents that contain international languages? What do I need to do in addition to using the StandardAnalyzer? Thanks.

Re: How international languages are supported in Lucene

2008-06-05 Thread Grant Ingersoll
Hi Michael, That's a pretty open-ended question and I'm assuming that by "international languages" you mean non-English :-). You might get some mileage out of http://wiki.apache.org/lucene-java/IndexingOtherLanguages but it is a bit out of date (namely the sandbox references). Lucene inde

RE: How international languages are supported in Lucene

2008-06-05 Thread Michael Siu
Grant, Thanks for the timely reply. :-) No, we do not have a specific language in mind. Basically, our document source could potentially contain any language in the world. Supporting English, Spanish, Italian, French, Chinese, Russian and Japanese would be the minimum set. Do you mean we will n

Multi-language support within a single index

2008-06-05 Thread Glen Newton
I would like to be able to get multi-language support within a single index. I would appreciate input on what I am suggesting: Assuming that you want something like the following in your document: Title_english Title_french Title_german Keyword_english Keyword_french Keyword_german Let's pretend

Re: How international languages are supported in Lucene

2008-06-05 Thread Erick Erickson
See below On Thu, Jun 5, 2008 at 12:04 PM, Michael Siu <[EMAIL PROTECTED]> wrote: > Grant, > > Thanks for the timely reply. :-) > > No, we do not have a specific language in mind. Basically, our document > source could potentially contain any language in the world. Supporting > English, Spanish,

Re: Multi-language support within a single index

2008-06-05 Thread Erick Erickson
I'm not sure what you're getting at, but it seems awfully similar to PerFieldAnalyzerWrapper, which already exists and seems (to me, on a quick scan) to do exactly what you want. And it works for both indexing and querying out of the box. Best, Erick On Thu, Jun 5, 2008 at 12:14 PM, Glen Ne
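The per-field idea Erick points to can be sketched in plain Java. This is only an illustration of the dispatch pattern and deliberately does not use the Lucene API: SimpleAnalyzer, PerFieldAnalyzer, withStopWords and buildExample are hypothetical names made up for this sketch, and the tiny stop-word lists are placeholders. The field names Title_english and Title_french come from Glen's message. In Lucene itself, PerFieldAnalyzerWrapper is the class that plays this role.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class PerFieldAnalysisSketch {

    /** Hypothetical stand-in for an analyzer: turns text into a list of tokens. */
    interface SimpleAnalyzer {
        List<String> tokens(String text);
    }

    /** Lowercases, splits on runs of non-letter/non-digit characters, drops the given stop words. */
    static SimpleAnalyzer withStopWords(String... stopWords) {
        Set<String> stops = new HashSet<>(Arrays.asList(stopWords));
        return text -> {
            List<String> out = new ArrayList<>();
            for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
                if (!t.isEmpty() && !stops.contains(t)) {
                    out.add(t);
                }
            }
            return out;
        };
    }

    /** Picks an analyzer by field name and falls back to a default for unknown fields. */
    static final class PerFieldAnalyzer {
        private final SimpleAnalyzer fallback;
        private final Map<String, SimpleAnalyzer> byField = new HashMap<>();

        PerFieldAnalyzer(SimpleAnalyzer fallback) {
            this.fallback = fallback;
        }

        PerFieldAnalyzer add(String field, SimpleAnalyzer analyzer) {
            byField.put(field, analyzer);
            return this;
        }

        List<String> analyze(String field, String text) {
            return byField.getOrDefault(field, fallback).tokens(text);
        }
    }

    /** Wires one analyzer per language-specific field; the stop-word lists are placeholders. */
    static PerFieldAnalyzer buildExample() {
        SimpleAnalyzer english = withStopWords("the", "of", "and", "a");
        SimpleAnalyzer french = withStopWords("le", "la", "les", "de", "et");
        return new PerFieldAnalyzer(withStopWords())
                .add("Title_english", english)
                .add("Title_french", french);
    }

    public static void main(String[] args) {
        PerFieldAnalyzer analyzer = buildExample();
        System.out.println(analyzer.analyze("Title_english", "The history of the index"));
        System.out.println(analyzer.analyze("Title_french", "Histoire de la recherche"));
    }
}
```

The same PerFieldAnalyzer instance would be used both when indexing a field and when analyzing a query against that field, which is the property Erick highlights: the field name alone decides which language rules apply.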

Re: Multi-language support within a single index

2008-06-05 Thread Glen Newton
Yes, thank you for the pointer, and apologies for not doing my homework better. :-) It is exactly what I want. The scenario is one where I have articles that tend to be in English and have abstracts, and some of them also have French-language abstracts. Users may want to search the English abstracts o

RE: How international languages are supported in Lucene

2008-06-05 Thread Michael Siu
Thanks Erick. -Original Message- From: Erick Erickson [mailto:[EMAIL PROTECTED] Sent: Thursday, June 05, 2008 9:51 AM To: java-user@lucene.apache.org Subject: Re: How international languages are supported in Lucene See below On Thu, Jun 5, 2008 at 12:04 PM, Michael Siu <[EMAIL PROTECTED