Re: distributing the indexing process

2011-07-06 Thread Otis Gospodnetic
http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message -
> From: Guru Chandar
> To: java-user@lucene.apache.org
> Cc:
> Sent: Thursday, June 30, 2011 5:12 AM
> Subject: distributing the indexing process
>

RE: distributing the indexing process

2011-06-30 Thread Toke Eskildsen
On Thu, 2011-06-30 at 11:45 +0200, Guru Chandar wrote:
> Thanks for the response. The documents are all distinct. My (limited)
> understanding on partitioning the indexes will lead to results being
> different from the case where you have all in one partition, due to
> Lucene currently not supp

Re: distributing the indexing process

2011-06-30 Thread Sanne Grinovero
-gc
>
>
> -Original Message-
> From: Danil ŢORIN [mailto:torin...@gmail.com]
> Sent: Thursday, June 30, 2011 3:04 PM
> To: java-user@lucene.apache.org
> Subject: Re: distributing the indexing process
>
> It depends
>
> If all documents are distinct then, y

RE: distributing the indexing process

2011-06-30 Thread Guru Chandar
it work seamlessly?
Regards,
-gc
-Original Message-
From: Danil ŢORIN [mailto:torin...@gmail.com]
Sent: Thursday, June 30, 2011 3:04 PM
To: java-user@lucene.apache.org
Subject: Re: distributing the indexing process
It depends
If all documents are distinct then, yeah, go for it

Re: distributing the indexing process

2011-06-30 Thread Danil ŢORIN
It depends.
If all documents are distinct then, yeah, go for it.
If you have multiple versions of the same document in your data and you only want to index the latest version... then you need a clever way to split the data to make sure that all versions of a document will be indexed on the same host, and you
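One simple way to get such a split is to route each document by a hash of its key, so every version of a given document ends up on the same host. The following is only a minimal sketch of that idea, not something taken from this thread or from Lucene: the class `DocumentRouter`, the methods `shardFor` and `partition`, and the sample keys are all hypothetical, and it assumes each document carries a stable string key shared by all of its versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: routes documents to shards by key.
class DocumentRouter {

    /** Maps a document key to a shard index in [0, numShards). */
    static int shardFor(String docKey, int numShards) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(docKey.hashCode(), numShards);
    }

    /** Groups keys by shard; equal keys always land in the same shard. */
    static List<List<String>> partition(List<String> docKeys, int numShards) {
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            shards.add(new ArrayList<>());
        }
        for (String key : docKeys) {
            shards.get(shardFor(key, numShards)).add(key);
        }
        return shards;
    }

    public static void main(String[] args) {
        // "doc-1" and "doc-2" each appear twice, standing in for two versions.
        List<String> keys = Arrays.asList("doc-1", "doc-2", "doc-1", "doc-3", "doc-2");
        List<List<String>> shards = partition(keys, 3);
        for (int i = 0; i < shards.size(); i++) {
            System.out.println("shard " + i + ": " + shards.get(i));
        }
    }
}
```

Because the shard depends only on the key, each host sees every version of the documents assigned to it and can decide locally which version is the latest. Note that changing the number of shards changes the assignment, so the split has to be computed with the same `numShards` everywhere.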

distributing the indexing process

2011-06-30 Thread Guru Chandar
If we have to index a lot of documents, is there a way to divide the documents into multiple sets and index them on multiple machines in parallel, and then merge the resulting indexes back into a single machine? If yes, will the result be logically equivalent to indexing all the documents on a s
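To make the "split, index in parallel, merge" idea concrete, here is a toy sketch using a plain in-memory inverted index (term to set of document ids). It is not Lucene and does not use any Lucene API; `ToyIndexMerge`, `index`, `merge`, and the sample documents are all made up for illustration. It assumes the document sets are disjoint and only compares which documents match which terms; it says nothing about scoring or about how a real index numbers its documents internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy illustration only: an in-memory inverted index, not Lucene.
class ToyIndexMerge {

    /** Builds term -> sorted set of ids of the documents containing that term. */
    static Map<String, TreeSet<String>> index(Map<String, String> docs) {
        Map<String, TreeSet<String>> postings = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            for (String term : doc.getValue().toLowerCase().split("\\s+")) {
                if (term.isEmpty()) {
                    continue;
                }
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(doc.getKey());
            }
        }
        return postings;
    }

    /** Merges partial indexes by taking the union of the postings for each term. */
    static Map<String, TreeSet<String>> merge(List<Map<String, TreeSet<String>>> parts) {
        Map<String, TreeSet<String>> merged = new TreeMap<>();
        for (Map<String, TreeSet<String>> part : parts) {
            for (Map.Entry<String, TreeSet<String>> e : part.entrySet()) {
                merged.computeIfAbsent(e.getKey(), t -> new TreeSet<>()).addAll(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two disjoint document sets, as if indexed on two machines.
        Map<String, String> setA = new TreeMap<>();
        setA.put("d1", "lucene index merge");
        Map<String, String> setB = new TreeMap<>();
        setB.put("d2", "parallel index build");

        // The same documents indexed in one place, for comparison.
        Map<String, String> all = new TreeMap<>(setA);
        all.putAll(setB);

        Map<String, TreeSet<String>> merged = merge(Arrays.asList(index(setA), index(setB)));
        System.out.println("merged equals single index: " + merged.equals(index(all)));
    }
}
```

In this toy model the merged postings are the same as those built from all documents at once, which is the sense of "logically equivalent" the sketch can show. Whether a real merged index behaves identically in every respect is a separate question that this example does not settle.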