On Thu, 2011-06-30 at 11:45 +0200, Guru Chandar wrote:
>  Thanks for the response. The documents are all distinct. My (limited)
>  understanding on partitioning the indexes will lead to results being
>  different from the case where you have all in one partition, due to
>  Lucene currently not supporting distributed idf. Is this correct?

Yes, that is a prime blocker for proper distributed search with Lucene.

>  Is there a way to make it work seamlessly?

There's some work being done with Solr, but it is not stable:
https://issues.apache.org/jira/browse/SOLR-1632

We're experimenting with distributed idf by assigning different weights
to the queries sent to different searchers, based on term statistics
from the different indexes. However, it is quite a hack and one we've
done because one of the indexes is external and out of our control.


3 years ago (or was it 4?) we also used distributed indexing with
searching being done on a single merged index. It worked surprisingly
well, but it was replaced by indexing on a single machine, when we
finally got around to doing a proper profiling of our indexing process
and removed some serious bottlenecks.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to