Boolean Query search performance

Beard, Brian Wed, 05 Mar 2008 12:09:03 -0800

I'm using lucene 2.2.0.

I'm in the process of re-writing some queries to build BooleanQueries
instead of using query parser.


Bypassing query parser provides almost an order of magnitude improvement
for very large queries, but then the search performance takes 20-30%
longer. I'm adding boost values as a way to sort (long story, no
workaround I can see for now).

If I do a query.toString(), both queries give different results, which
is probably a clue (additional paren's with the BooleanQuery)

Query.toString the old way using queryParser:
    +(id:1^2.0 id:2 ... ) +type:CORE

Query.toString the new way using BooleanQuery:
    +((id:1^2.0) (id:2) ... ) +type:CORE

Does anyone have any ideas as to why the performance difference in
searching. I would think I could get the search time to be the same,
since after the query is formed everything remains the same. The same
number of hits are going through the hitCollector that gets called after
this, so all variables seem constant. QueryParser must be doing some
optimization I'm not taking advantage of. Code snippet follows.....

The previous way using queryParser was:

  String queryStr = (id:1^2 OR id:2 .... id:n) AND type:CORE
  Query query = parser.parse(queryStr);


The new way using BooleanQuery is...

  BooleanQuery totalBooleanQuery = new BooleanQuery();
  BooleanClause docTypeClause = 
    new BooleanClause( new TermQuery( 
    new Term("type", "CORE")), BooleanClause.Occur.MUST);

  BooleanQuery idBooleanQuery = new BooleanQuery();
  for (String uid : uniqueIdMap.keySet()) {
        TermQuery termQuery = 
         new TermQuery(new Term("id", uid));
        termQuery.setBoost(loopBoostValue);
        idBooleanQuery.add(
         new BooleanClause(termQuery, BooleanClause.Occur.SHOULD));
  } 

  totalBooleanQuery.add(docTypeClause);
  totalBooleanQuery.add(idBooleanQuery, Occur.MUST); 




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Boolean Query search performance

Reply via email to