I noticed there is still no JIRA ticket for this, do we have any type on 
consensus on how this issue will/will not be resolved?  

If MultiSearcher and and MultiReader do not give the same results, I would 
think one would be considered "broken" and/or possibly "unfixable".  Is 
MultiSearcher going to be deprecated and removed?  Should one always use 
MultiReader in order to get accurate results?


-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com] 
Sent: Wednesday, November 10, 2010 1:19 PM
To: java-user@lucene.apache.org
Subject: Re: Search returning documents matching a NOT range

On Wed, Nov 10, 2010 at 1:04 PM, Uwe Schindler <u...@thetaphi.de> wrote:
> I know where the bug is...
>
> The problem has nothing to to with MultiSearcher at all, its just the 
> rewritten query. Because (as Robert said) MultiSearcher rewrites per index, 
> the rewritten query is different for each sub-index. The problems, Robert 
> mentioned only affect scoring (which is different). As Range is ConstantScore 
> it should still return same documents.
>

I disagree, it really is because of how MultiSearcher combine()s the
queries from the subsearchers.

and the problems again don't affect scoring:

* you might hit boolean max clause count with AUTO if you use
MultiSearcher (e.g. 5 subs return 250 terms each and the total unique
across all in combine() is > 1024). In this case the query won't work
at all, but it works with MultiReader. And this is really bad, because
i think we document you will not hit this boolean max clause count
with AUTO, and we default to it for this reason.

* fuzzyquery etc will return completely different documents, or if you
manually use TopTermsRewrite for some other query.

none of these problems happen if you use MultiReader. This is why i
suggested removing MultiSearcher, at the least, if you use this thing
i think you should explicitly set FILTER_REWRITE.

> The problem here seems to be that in the first searcher der Boolean rewrites 
> to an empty BooleanQuery(), which is fine for ranges that hit no terms at all 
> (this is also supported and works). The problem seems to be that a SHOULD_NOT 
> clause on an empty Boolean query produces this bug!

yes, i'm not sure this is the "problem", i think the problem is in
combine(), but this is the root cause.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to