Thanks, Uwe, for your explanation.

Indeed, that is what I understood: the term scan happens first.
Is there a way to run a subquery in Lucene, i.e. to run a query only on
the results of a first query, so as to avoid scanning the whole index?
Is it worth forwarding this request to the developers? Do you think it
is feasible to implement such a short-circuit operator, where the term is
evaluated "late", only if the expression to its left evaluates to true, to
avoid scanning the index in its entirety?
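
To make the idea concrete, here is a rough sketch in plain Java of the
"late" evaluation I have in mind. It does not use any Lucene classes: the
class LateSuffixFilter, the method filterBySuffix and the field2Values map
are all hypothetical names, and the sketch assumes that each document's
single field2 value can be looked up by document id (for example from a
stored field).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LateSuffixFilter {

    /**
     * Keeps only the candidate documents whose field2 value ends with the suffix.
     *
     * @param candidates   doc ids already matched by the cheap first clause (e.g. field1:foo)
     * @param field2Values hypothetical per-document lookup of field2's value (an assumption)
     * @param suffix       the part after the leading wildcard, e.g. "bar" for *bar
     */
    public static List<Integer> filterBySuffix(List<Integer> candidates,
                                               Map<Integer, String> field2Values,
                                               String suffix) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : candidates) {
            String value = field2Values.get(doc);
            // Only the candidates are inspected; the term dictionary is never scanned.
            if (value != null && value.endsWith(suffix)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<Integer, String> field2Values = new HashMap<>();
        field2Values.put(0, "crowbar");
        field2Values.put(1, "baz");
        field2Values.put(2, "minibar");
        field2Values.put(3, "sidebar");

        // Pretend field1:foo matched documents 0, 1 and 3.
        List<Integer> fooDocs = Arrays.asList(0, 1, 3);

        System.out.println(filterBySuffix(fooDocs, field2Values, "bar")); // [0, 3]
    }
}
```

The cost here grows with the number of hits of the first clause rather
than with the number of terms in the index, which is the trade-off I am
asking about.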

Thanks in advance for your help

-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de] 
Sent: 15 February 2012 21:16
To: java-user@lucene.apache.org
Subject: RE: Short circuit AND or subquerying in lucene for performance

> : Basically for queries such as field1:foo AND field2:*bar, I think it
> : would be highly beneficial to restrict evaluation of the second field on
> : the result of the first to avoid scanning the index in its entirety due
> : to the leading wildcard.
> 
> This is exactly how the BooleanQuery class in Lucene works.
> 
> Please note the logic in ConjunctionScorer and BooleanScorer2 (how much
> optimizing can be done depends on whether all of the clauses are required
> or not)

The problem here is more the leading wildcard query. The terms are scanned
before the scoring/result collection occurs (partly during query rewrite,
partly as bitset before the scorer starts - depends on term density). The
problem is that short circuiting in BS2 occurs when the wild card bitsets
are already calculated... For wildcard queries there is no possibility to
optimize the document collection, because *every* matching term has to be
scanned and termdocs retrieved.
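
The ordering described above can be illustrated with a toy model in plain
Java. This is only a sketch under stated assumptions, not Lucene's actual
code: ToyIndex, expandLeadingWildcard and conjunction are hypothetical
names, and the "index" is just a sorted map from term to document ids. It
shows that the wildcard clause is expanded over the whole term dictionary
before the intersection with the first clause takes place.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class WildcardExpansionDemo {

    /** Toy inverted index for a single field: term -> ids of documents containing it. */
    public static final class ToyIndex {
        public final SortedMap<String, int[]> postings = new TreeMap<>();
        public int termsVisited = 0; // counts dictionary entries examined

        public void add(String term, int... docIds) {
            postings.put(term, docIds);
        }

        /** Expands "*suffix": there is no prefix to seek to, so every term is examined. */
        public BitSet expandLeadingWildcard(String suffix) {
            BitSet docs = new BitSet();
            for (Map.Entry<String, int[]> entry : postings.entrySet()) {
                termsVisited++;
                if (entry.getKey().endsWith(suffix)) {
                    for (int doc : entry.getValue()) {
                        docs.set(doc);
                    }
                }
            }
            return docs;
        }
    }

    /** AND of an already-computed first clause with the wildcard clause. */
    public static BitSet conjunction(BitSet firstClauseDocs, ToyIndex field2, String suffix) {
        // The full term scan happens here, before any intersection.
        BitSet wildcardDocs = field2.expandLeadingWildcard(suffix);
        BitSet result = (BitSet) firstClauseDocs.clone();
        result.and(wildcardDocs); // the intersection itself is cheap
        return result;
    }

    public static void main(String[] args) {
        ToyIndex field2 = new ToyIndex();
        field2.add("baz", 1);
        field2.add("crowbar", 0);
        field2.add("minibar", 2);
        field2.add("sidebar", 3);

        BitSet fooDocs = new BitSet();
        fooDocs.set(3); // field1:foo matches a single document

        System.out.println(conjunction(fooDocs, field2, "bar")); // {3}
        System.out.println(field2.termsVisited);                 // 4
    }
}
```

Even though the first clause matches only one document, all four terms of
the toy field are visited, because the expansion does not know about the
first clause.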

Uwe


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

