Hi Christian,

You didn't mention Solr, so I'm not sure whether you are aware of it. Maybe Solr meets your needs?
Otis

----- Original Message ----
> From: Christian Reuschling <christian.reuschl...@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Tuesday, August 4, 2009 5:50:16 AM
> Subject: ParallelMultiSearcher and idf
>
> Hello,
>
> when searching over multiple indices, we create one IndexReader for each
> index and wrap them into a MultiReader, which we use to create the
> IndexSearcher.
>
> This is fine for searching multiple indices on one machine, but when the
> indices are distributed over the (intra)net, this scenario has several
> drawbacks:
>
> - searching/scoring/sorting happens 100% on the client machine, so you need
>   all the RAM and CPU power at every client.
> - all the data necessary for scoring must go over the net, so the traffic
>   should be significantly higher
> - thus, overall performance suffers
>
> Nevertheless, creating a MultiReader and making a searcher out of it has one
> advantage (or at least it can be an advantage, depending on the scenario):
> the document frequencies of a term are summed up, so it is 100% transparent
> for scoring whether the indices are split or not.
>
> I'm wondering whether it is possible to get the advantages of both
> scenarios, e.g. by first summing up the document frequencies of the query
> terms, and sending them together with the query to every (remote) searcher
> of ParallelMultiSearcher, for scoring.
>
> Maybe this is exactly what ParallelMultiSearcher does, and I haven't seen it?
>
> Thanks for clarification!
>
> Chris
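
The idea raised in the quoted question, summing each query term's document frequency over all indices and then scoring against those global numbers, can be sketched in plain Java. This is only an illustration under stated assumptions: GlobalIdfSketch, ShardStats and merge are hypothetical names and not part of Lucene, and the idf formula used is the classic 1 + ln(numDocs / (docFreq + 1)). It makes no claim about what ParallelMultiSearcher does internally.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, NOT Lucene API: merge per-index term statistics
// so that every (remote) searcher could score with the same global idf.
class GlobalIdfSketch {

    /** Statistics of one index: its document count and per-term document frequencies. */
    static final class ShardStats {
        final int numDocs;
        final Map<String, Integer> docFreq;

        ShardStats(int numDocs, Map<String, Integer> docFreq) {
            this.numDocs = numDocs;
            this.docFreq = docFreq;
        }
    }

    /** Sums document counts and document frequencies over all indices. */
    static ShardStats merge(List<ShardStats> shards) {
        int totalDocs = 0;
        Map<String, Integer> totalDocFreq = new HashMap<>();
        for (ShardStats shard : shards) {
            totalDocs += shard.numDocs;
            for (Map.Entry<String, Integer> e : shard.docFreq.entrySet()) {
                totalDocFreq.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return new ShardStats(totalDocs, totalDocFreq);
    }

    /** Classic idf: 1 + ln(numDocs / (docFreq + 1)). */
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log(numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // Two indices in which the same term is rare in one and common in the other.
        ShardStats a = new ShardStats(1000, Map.of("lucene", 10));
        ShardStats b = new ShardStats(1000, Map.of("lucene", 400));

        ShardStats global = merge(List.of(a, b));
        int globalDf = global.docFreq.get("lucene");

        // Local idf values differ per index; the global value is identical everywhere.
        System.out.println("idf in index a: " + idf(10, a.numDocs));
        System.out.println("idf in index b: " + idf(400, b.numDocs));
        System.out.println("global idf:     " + idf(globalDf, global.numDocs));
    }
}
```

In a distributed setup the merged statistics would be computed once per query and shipped to each searcher along with the query, so that scores from different indices are comparable when the results are merged.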