Hello,
you could have each node build a separate index, and then merge the
results back into a single consistent index using

org.apache.lucene.index.IndexWriter.addIndexes(Directory...)
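
Once everything is merged, term statistics are computed over the whole
collection, so the distributed idf concern only arises if the partitions stay
separate and are searched individually. Below is a rough sketch of the
arithmetic, assuming the classic TF-IDF formula idf = 1 + ln(numDocs /
(docFreq + 1)). The class name and the document counts are made up for
illustration and are not taken from any real index.

```java
class IdfPartitionDemo {
    // Classic TF-IDF inverse document frequency (assumed formula):
    // 1 + ln(numDocs / (docFreq + 1)).
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Hypothetical per-partition statistics for one term:
        // rare on the first node, common on the second.
        long[] docFreq = {9, 99};
        long[] numDocs = {1000, 1000};

        long totalDocFreq = 0;
        long totalDocs = 0;
        for (int i = 0; i < docFreq.length; i++) {
            // idf as each partition would compute it on its own
            System.out.printf("partition %d idf = %.4f%n", i, idf(docFreq[i], numDocs[i]));
            totalDocFreq += docFreq[i];
            totalDocs += numDocs[i];
        }
        // idf over the combined statistics, as a single merged index sees them
        System.out.printf("merged idf      = %.4f%n", idf(totalDocFreq, totalDocs));
    }
}
```

The same term gets a different weight on each partition, and neither matches
the weight computed from the combined counts.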

Regards,
Sanne

2011/6/30 Guru Chandar <guru.chan...@consona.com>:
> Thanks for the response. The documents are all distinct. My (limited)
> understanding is that partitioning the index will lead to results that
> differ from the case where everything is in one partition, because Lucene
> does not currently support distributed idf. Is this correct? Is there a way
> to make it work seamlessly?
>
> Regards,
> -gc
>
>
> -----Original Message-----
> From: Danil ŢORIN [mailto:torin...@gmail.com]
> Sent: Thursday, June 30, 2011 3:04 PM
> To: java-user@lucene.apache.org
> Subject: Re: distributing the indexing process
>
> It depends...
>
> If all documents are distinct then, yeah, go for it.
>
> If you have multiple versions of the same document in your data and you
> only want to index the latest version... then you need a clever way to
> split the data to make sure that all versions of a document will be indexed
> on the same host, and you won't have duplicates later.
>
> But my biggest concern is: if your index is so big that you need to
> build it on different hosts, are you sure you want it combined into
> a single index?
> Maybe it's a good idea to partition it?
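
One simple way to keep all versions of a document on the same host is to
route by a hash of a stable document key. This is only a sketch: the class
name, the key format and the node count are hypothetical, and it assumes each
document carries an identifier that stays the same across versions.

```java
class VersionRouter {
    // Pick an indexing host for a document from its stable key.
    // Math.floorMod keeps the result in [0, numNodes) even when
    // hashCode() is negative.
    static int nodeFor(String docKey, int numNodes) {
        return Math.floorMod(docKey.hashCode(), numNodes);
    }

    public static void main(String[] args) {
        // Hypothetical input: {document key, version}.
        String[][] docs = {
            {"doc-17", "v1"},
            {"doc-42", "v1"},
            {"doc-17", "v2"},
        };
        int numNodes = 4;
        for (String[] doc : docs) {
            // Both versions of doc-17 are routed to the same node.
            System.out.println(doc[0] + " " + doc[1] + " -> node " + nodeFor(doc[0], numNodes));
        }
    }
}
```

Each host then sees every version of the documents it owns, so it can keep
only the latest one locally (for example with IndexWriter.updateDocument
keyed on the id term) before the indexes are merged.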
>
> On Thu, Jun 30, 2011 at 12:12, Guru Chandar <guru.chan...@consona.com> wrote:
>>
>>
>> If we have to index a lot of documents, is there a way to divide the
>> documents into multiple sets and index them on multiple machines in
>> parallel, and then merge the resulting indexes back onto a single
>> machine? If yes, will the result be logically equivalent to indexing all
>> the documents on a single machine?
>>
>>
>>
>> Thanks,
>>
>> -gc
>>
>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>

