Thanks Uwe for your explanation. Indeed, that is what I understood: the term scanning happens first. Is there a way to run a subquery in Lucene, i.e. to run a query only on the results of a first query, so that the whole index is not scanned? Is it worth forwarding this request to the developers? Do you think it is feasible to implement a short-circuit operator in which the right-hand term is evaluated "late", only if the expression to its left evaluates to true, to avoid scanning the index in its entirety?
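To make the "late evaluation" idea concrete, here is a minimal sketch in plain Java. It is a toy model under stated assumptions, not Lucene code: the class `LateEvaluationSketch`, its methods, and the per-document `field2Values` lookup are all hypothetical names invented for illustration. It resolves field1:foo through an inverted index first and then tests the *bar condition only against the field2 value of each candidate document.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: none of these names are Lucene APIs.
final class LateEvaluationSketch {

    // Inverted index for field1: term -> sorted set of document ids.
    private final Map<String, TreeSet<Integer>> field1Postings = new TreeMap<>();
    // Per-document value of field2 (assumed to be cheaply retrievable by doc id).
    private final Map<Integer, String> field2Values = new TreeMap<>();

    void addDocument(int docId, String field1, String field2) {
        field1Postings.computeIfAbsent(field1, k -> new TreeSet<>()).add(docId);
        field2Values.put(docId, field2);
    }

    // Evaluates "field1:term AND field2:*suffix".
    // field2 is inspected only for documents that already matched field1,
    // so the term dictionary of field2 is never enumerated.
    List<Integer> search(String field1Term, String field2Suffix) {
        List<Integer> hits = new ArrayList<>();
        TreeSet<Integer> candidates =
                field1Postings.getOrDefault(field1Term, new TreeSet<>());
        for (int docId : candidates) {
            String value = field2Values.get(docId);
            if (value != null && value.endsWith(field2Suffix)) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        LateEvaluationSketch index = new LateEvaluationSketch();
        index.addDocument(1, "foo", "crowbar");
        index.addDocument(2, "foo", "barn");
        index.addDocument(3, "baz", "minibar");
        index.addDocument(4, "foo", "bar");
        // Only documents 1, 2 and 4 are candidates; document 3 is never checked.
        System.out.println(index.search("foo", "bar"));
    }
}
```

The trade-off in this sketch is that the second clause costs one value lookup per candidate, so it only pays off when the first clause is selective. Whether something equivalent can be done efficiently inside Lucene is exactly the open question.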
Thanks in advance for your help

-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: 15 February 2012 21:16
To: java-user@lucene.apache.org
Subject: RE: Short circuit AND or subquerying in lucene for performance

> : Basically for queries such as field1:foo AND field2:*bar, I think it
> : would be highly beneficial to restrict evaluation of the second field on
> : the result of the first to avoid scanning the index in its entirety due
> : to the leading wildcard.
>
> This is exactly how the BooleanQuery class in Lucene works.
>
> Please note the logic in ConjunctionScorer and BooleanScorer2 (how much
> optimizing can be done depends on whether all of the clauses are required or
> not)

The problem here is more the leading wildcard query. The terms are scanned before the scoring/result collection occurs (partly during query rewrite, partly as a bitset before the scorer starts, depending on term density). The problem is that short circuiting in BS2 occurs when the wildcard bitsets are already calculated...

For wildcard queries there is no possibility to optimize the document collection, because *every* matching term has to be scanned and its termdocs retrieved.

Uwe

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org
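The term-scanning cost described in the quoted reply can be illustrated with a small plain-Java sketch. This is a toy model, not Lucene's implementation: `TermScanSketch`, its methods, and the `termsVisited` counter are hypothetical names, and a `TreeSet<String>` stands in for a sorted term dictionary. With a trailing wildcard (bar*) the scan can seek to the prefix and stop at the first non-matching term; with a leading wildcard (*bar) the sort order gives no help, so every term is visited.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model only: a TreeSet stands in for a sorted term dictionary.
final class TermScanSketch {

    final TreeSet<String> terms = new TreeSet<>();
    int termsVisited; // number of dictionary entries touched by the last call

    // Trailing wildcard (prefix*): seek to the prefix, stop at the first miss.
    List<String> prefixMatches(String prefix) {
        termsVisited = 0;
        List<String> matches = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            termsVisited++;
            if (!term.startsWith(prefix)) {
                break; // sorted order guarantees no later term can match
            }
            matches.add(term);
        }
        return matches;
    }

    // Leading wildcard (*suffix): sort order is useless, visit every term.
    List<String> suffixMatches(String suffix) {
        termsVisited = 0;
        List<String> matches = new ArrayList<>();
        for (String term : terms) {
            termsVisited++;
            if (term.endsWith(suffix)) {
                matches.add(term);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        TermScanSketch dict = new TermScanSketch();
        dict.terms.addAll(List.of("apple", "bar", "barn", "crowbar", "minibar", "zebra"));
        System.out.println(dict.prefixMatches("bar") + " visited=" + dict.termsVisited);
        System.out.println(dict.suffixMatches("bar") + " visited=" + dict.termsVisited);
    }
}
```

In this toy dictionary the prefix scan touches three of the six terms while the suffix scan touches all six. On a real index the gap grows with the number of distinct terms in the field.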