http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message -
> From: Guru Chandar
> To: java-user@lucene.apache.org
> Cc:
> Sent: Thursday, June 30, 2011 5:12 AM
> Subject: distributing the indexing process
>
On Thu, 2011-06-30 at 11:45 +0200, Guru Chandar wrote:
> Thanks for the response. The documents are all distinct. My (limited)
> understanding is that partitioning the indexes will lead to results being
> different from the case where you have all in one partition, due to
> Lucene currently not supp
-gc
>
>
> -Original Message-
> From: Danil ŢORIN [mailto:torin...@gmail.com]
> Sent: Thursday, June 30, 2011 3:04 PM
> To: java-user@lucene.apache.org
> Subject: Re: distributing the indexing process
>
> It depends
>
> If all documents are distinct then, y
it work
seamlessly?
Regards,
-gc
-Original Message-
From: Danil ŢORIN [mailto:torin...@gmail.com]
Sent: Thursday, June 30, 2011 3:04 PM
To: java-user@lucene.apache.org
Subject: Re: distributing the indexing process
It depends
If all documents are distinct then, yeah, go for it.
If you have multiple versions of the same document in your data and you
only want to index the latest version...then you need a clever way to
split data to make sure that all versions of a document will be indexed
on the same host, and you
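
One possible way to meet the "all versions on the same host" requirement is to route each document by a hash of a stable key. The following is only a minimal sketch, assuming every document carries a key string that is identical across its versions. ShardRouter and hostFor are hypothetical names for illustration and are not part of Lucene.

```java
// Hypothetical helper, not a Lucene class: decides which indexing host
// should receive a document, based only on the document's stable key.
final class ShardRouter {
    private ShardRouter() {}

    // Returns a host number in [0, numHosts).
    // Assumption: docKey is the same for every version of a document,
    // so all versions are routed to the same host.
    static int hostFor(String docKey, int numHosts) {
        if (numHosts <= 0) {
            throw new IllegalArgumentException("numHosts must be positive");
        }
        // String.hashCode() can be negative; floorMod keeps the result
        // inside [0, numHosts).
        return Math.floorMod(docKey.hashCode(), numHosts);
    }
}
```

With this kind of routing, each host sees every version of the documents assigned to it and can keep only the latest one locally before the per-host indexes are merged. Note that changing numHosts changes the assignment, so the host count would have to stay fixed for the duration of one indexing run.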
If we have to index a lot of documents, is there a way to divide the
documents into multiple sets and index them on multiple machines in
parallel, and then merge the resulting indexes back into a single
machine? If yes, will the result be logically equivalent to indexing all
the documents on a s