Re: IndexSearcher with two Indexes

Robert Muir Fri, 27 Jan 2012 14:10:55 -0800

On Fri, Jan 27, 2012 at 4:53 PM, Hany Azzam <h...@eecs.qmul.ac.uk> wrote:
> Hi Robert,
>
> Thanks for the reply. I am trying to do something different. If I use a
> MultiReader then the searching/scoring will take place over the two indexes at
> the same time. However, in my case the subcomponents of the retrieval model
> are calculated over separate evidence spaces. For example, the retrieval
> model calculates something like this:
>
> score := P(query_term | documents) * P(query_term | relevant_documents)
>
> The P(query_term | documents) can be estimated using the index over the whole 
> collection of documents. The P(query_term | relevant_documents) can be 
> estimated using the index over the relevant documents only (which are known 
> prior to the execution of the query).
>


In this situation, if you want to combine the statistics from
different indexes in your own way, you can look at
IndexSearcher.termStatistics() and
IndexSearcher.collectionStatistics().
These are intended for situations like distributed search, but maybe
you can make use of them.

Here is some pseudocode:

    IndexReader relevant = IndexReader.open(relevantDirectory);
    IndexReader documents = IndexReader.open(documentsDirectory);

    final IndexSearcher relevantSearcher = new IndexSearcher(relevant);
    IndexSearcher documentsSearcher = new IndexSearcher(documents) {

      @Override
      public CollectionStatistics collectionStatistics(String field) throws IOException {
        CollectionStatistics documentStats = super.collectionStatistics(field);
        return new CollectionStatistics(... someCombinationOf(documentStats + stuff from relevantSearcher));
      }

      // do a similar thing for termStatistics()....
    };

    documentsSearcher.search(...);
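
To make the "separate evidence spaces" idea concrete, here is a minimal plain-Java sketch of the score from the quoted message, P(query_term | documents) * P(query_term | relevant_documents). It does not use Lucene at all: IndexStats, probability() and score() are hypothetical names made up for illustration, and the maximum-likelihood estimate (term occurrences divided by total tokens) is only one assumed way to estimate each probability. In a real setup the two counts would presumably come from each searcher's term and collection statistics.

```java
import java.util.Collections;
import java.util.Map;

/** Plain-Java stand-in for illustration; none of these types are Lucene classes. */
class TwoIndexScore {

    /** Hypothetical per-index statistics for a single field. */
    static final class IndexStats {
        final Map<String, Long> termFreq; // term -> total occurrences in this index
        final long totalTokens;           // total number of term occurrences in the field

        IndexStats(Map<String, Long> termFreq, long totalTokens) {
            this.termFreq = termFreq;
            this.totalTokens = totalTokens;
        }

        /** Assumed estimator: maximum-likelihood P(term | this index). */
        double probability(String term) {
            if (totalTokens <= 0) {
                return 0.0; // empty index: no evidence for any term
            }
            long freq = termFreq.containsKey(term) ? termFreq.get(term) : 0L;
            return freq / (double) totalTokens;
        }
    }

    /** score := P(term | documents) * P(term | relevant_documents). */
    static double score(String term, IndexStats documents, IndexStats relevant) {
        return documents.probability(term) * relevant.probability(term);
    }

    public static void main(String[] args) {
        // Made-up counts: 50 of 1000 tokens in the full index, 20 of 100 in the relevant one.
        IndexStats documents = new IndexStats(Collections.singletonMap("lucene", 50L), 1000L);
        IndexStats relevant = new IndexStats(Collections.singletonMap("lucene", 20L), 100L);
        System.out.println(score("lucene", documents, relevant));
    }
}
```

The point of keeping the two IndexStats objects apart is that each factor is estimated from its own index only, rather than from statistics merged across both.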

-- 
lucidimagination.com

