Re: Subject indexing and searching documents with multiple languages

2006-05-09 Thread Grant Ingersoll
[EMAIL PROTECTED] wrote: Grant, considering the answer from Karl, it seems that we have the choice of putting all the documents in one index or using an index for each language. You are using an index for each language. We are currently discussing the pros and cons of both solutions. Thus we would be

Re: Subject indexing and searching documents with multiple languages

2006-05-09 Thread karl wettin
On Tue, 2006-05-09 at 10:18 +0200, [EMAIL PROTECTED] wrote: > > considering the answer from Karl, it seems that we have the choice of > putting all the documents in one index or using an index for each language. > You are using an index for each language. We are currently discussing > the pros and cons f

Re: Subject indexing and searching documents with multiple languages

2006-05-09 Thread pbatcoi
o use a separate index for each language. Thanks again for taking the time to answer our question! Greetings, Stefan and Peter > --- Original Message --- > From: karl wettin <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Subject: Re: Subject indexing and s

Re: Subject indexing and searching documents with multiple languages

2006-05-08 Thread karl wettin
On Mon, 2006-05-08 at 16:08 +0200, karl wettin wrote: > On Mon, 2006-05-08 at 08:34 -0400, Grant Ingersoll wrote: > > This seems to be necessary because the IndexWriter takes an analyzer > > as a parameter. Thus we can pass the English documents to the > > IndexWriter created with the English analyze

Re: Subject indexing and searching documents with multiple languages

2006-05-08 Thread karl wettin
On Mon, 2006-05-08 at 08:34 -0400, Grant Ingersoll wrote: > This seems to be necessary because the IndexWriter takes an analyzer > as a parameter. Thus we can pass the English documents to the > IndexWriter created with the English analyzer and so on. /** * Adds a document to this index, using the
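The Javadoc quoted in this reply is cut off, but Lucene releases of that period offered an `addDocument(Document, Analyzer)` overload on `IndexWriter`, which is presumably what is being pointed at: a single index can hold documents analyzed with different analyzers. The following is a minimal sketch of that idea in plain Java, not Lucene code. The class `SharedIndex`, its methods, and the language field are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical in-memory stand-in for ONE index shared by all languages.
// None of these names are Lucene API.
class SharedIndex {
    // term -> ids of the documents that contain it, regardless of language
    private final Map<String, List<Integer>> postings = new HashMap<>();
    // document id -> language code, playing the role of a stored "language" field
    private final Map<Integer, String> docLanguage = new HashMap<>();

    // The analyzer is supplied per document instead of once per index.
    void add(int docId, String lang, String text, Function<String, List<String>> analyzer) {
        docLanguage.put(docId, lang);
        for (String term : analyzer.apply(text)) {
            List<Integer> docs = postings.computeIfAbsent(term, t -> new ArrayList<>());
            if (!docs.contains(docId)) {
                docs.add(docId);
            }
        }
    }

    // Single-term lookup; a null langFilter searches across all languages.
    List<Integer> search(String term, String langFilter) {
        List<Integer> hits = new ArrayList<>();
        for (Integer docId : postings.getOrDefault(term, List.of())) {
            if (langFilter == null || langFilter.equals(docLanguage.get(docId))) {
                hits.add(docId);
            }
        }
        return hits;
    }
}
```

One consequence of sharing the index is visible in the sketch: a term such as "die" from an English document and from a German document lands in the same postings list, so a language filter is needed whenever results should be restricted to one language.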

Re: Subject indexing and searching documents with multiple languages

2006-05-08 Thread Grant Ingersoll
We wrote our own MultiSearcher type class that manages this problem. It takes in a query in the user's native language and then feeds it to the searcher for that language, which uses a machine translation component to create a query for that index using that language's Analyzer. -Grant [EMAI
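The routing described in this message can be sketched roughly as follows. This is not the poster's class and not Lucene's `MultiSearcher`; `CrossLanguageSearcher`, `Translator`, and `LanguageSearcher` are hypothetical names, the translation step is a placeholder for whatever machine translation component is available, and hits are simply concatenated without any score merging.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: fan one user query out to per-language searchers.
class CrossLanguageSearcher {
    // Placeholder for a machine translation component (assumption).
    interface Translator {
        String translate(String query, String fromLang, String toLang);
    }

    // Placeholder for a searcher bound to one language's index and analyzer.
    interface LanguageSearcher {
        List<String> search(String query);
    }

    private final Map<String, LanguageSearcher> searchers = new LinkedHashMap<>();
    private final Translator translator;

    CrossLanguageSearcher(Translator translator) {
        this.translator = translator;
    }

    void register(String lang, LanguageSearcher searcher) {
        searchers.put(lang, searcher);
    }

    // Translate the query into each index's language, then search that index.
    List<String> search(String query, String userLang) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, LanguageSearcher> entry : searchers.entrySet()) {
            String targetLang = entry.getKey();
            String localized = targetLang.equals(userLang)
                    ? query
                    : translator.translate(query, userLang, targetLang);
            hits.addAll(entry.getValue().search(localized));
        }
        return hits;
    }
}
```

Results come back in registration order of the languages; a real implementation would need some way to rank hits from different indexes against each other.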

Subject indexing and searching documents with multiple languages

2006-05-08 Thread pbatcoi
Hello, we need to index and search documents in multiple languages. Our current approach is: Determine the language of each document before passing it to Lucene and use a Lucene index for each language. This seems to be necessary because the IndexWriter takes an analyzer as a parameter. Thus we c
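The approach described here, one index per language with a language-specific analyzer, can be illustrated with a small self-contained sketch. It uses plain Java collections rather than Lucene, and it assumes the language of each document has already been determined. `PerLanguageIndexes`, `registerLanguage`, and the toy `analyze` helper (lowercasing plus stop-word removal) are hypothetical names standing in for a real index and analyzer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical in-memory stand-in for "one Lucene index per language".
// None of these names are Lucene API.
class PerLanguageIndexes {
    // language code -> analyzer (text -> terms)
    private final Map<String, Function<String, List<String>>> analyzers = new HashMap<>();
    // language code -> (term -> ids of documents containing the term)
    private final Map<String, Map<String, List<Integer>>> indexes = new HashMap<>();

    void registerLanguage(String lang, Function<String, List<String>> analyzer) {
        analyzers.put(lang, analyzer);
        indexes.put(lang, new HashMap<>());
    }

    // Route the document to the index of its (already detected) language.
    void add(int docId, String lang, String text) {
        Map<String, List<Integer>> index = indexFor(lang);
        for (String term : analyzers.get(lang).apply(text)) {
            List<Integer> postings = index.computeIfAbsent(term, t -> new ArrayList<>());
            if (!postings.contains(docId)) {
                postings.add(docId);
            }
        }
    }

    // Analyze the query with the same analyzer and AND the terms together.
    List<Integer> search(String lang, String query) {
        Map<String, List<Integer>> index = indexFor(lang);
        List<Integer> hits = null;
        for (String term : analyzers.get(lang).apply(query)) {
            List<Integer> postings = index.getOrDefault(term, List.of());
            if (hits == null) {
                hits = new ArrayList<>(postings);
            } else {
                hits.retainAll(postings);
            }
        }
        return hits == null ? new ArrayList<Integer>() : hits;
    }

    private Map<String, List<Integer>> indexFor(String lang) {
        Map<String, List<Integer>> index = indexes.get(lang);
        if (index == null) {
            throw new IllegalArgumentException("No index for language: " + lang);
        }
        return index;
    }

    // Toy analyzer: lowercase, split on non-letters, drop stop words.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }
}
```

Because each language has its own term dictionary, a query is only ever matched against documents analyzed the same way; the cost is that a search across languages has to visit every index and combine the results.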