Hi,

----Original Message-----
> From: Sujit Pal [mailto:sujitatgt...@gmail.com] On Behalf Of SUJIT PAL
> Sent: Monday, April 15, 2013 9:43 PM
> To: java-user@lucene.apache.org
> Subject: Re: Statically store sub-collections for search (faceted search?)
> 
> Hi Uwe,
> 
> Thanks for the info, I was under the impression that it didn't... I got this
> info (that filters don't have a limit because they are not scoring) from a
> document like the one below. I can't say this is the exact doc because it's
> been a while since I saw it, though.
> 
> http://searchhub.org/2009/06/08/bringing-the-highlighter-back-to-wildcard-queries-in-solr-14/
> 
> """
> As a response to this performance pitfall on very large indices’s (and the
> infamous TooManyClauses exception), new queries were developed that
> relied on a new Query class called ConstantScoreQuery.
> ConstantScoreQuerys accept a filter of matching documents and then score
> with a constant value equal to the boost. Depending on the qualities of your
> index, this method can be faster than the Boolean expansion method, and
> more importantly, does not suffer from TooManyClauses exceptions. Rather
> than matching and scoring n BooleanQuery clauses (potentially thousands of
> clauses), a single filter is enumerated and then traveled for scoring. On the
> other hand, constructing and scoring with a BooleanQuery containing a few
> clauses is likely to be much faster than constructing and traveling a Filter.
> """

This is true, but you misunderstood it: the text is about MultiTermQueries
(MultiTermQuery is the superclass of WildcardQuery, FuzzyQuery, and the range
queries). Those queries are not native Lucene queries, so they are rewritten
to basic/native queries. In earlier Lucene versions, wildcards were always
rewritten to a BooleanQuery with many TermQuery clauses (one for each term
that matches the wildcard), which led to the problem with too many clauses.
This rewrite is still used, but only within limits (only if the wildcard
expands to few terms). Those BooleanQueries are then wrapped as
ConstantScoreQuery(Query).

The above text talks about another mode (the one used today when there are
many terms): *no* BooleanQuery is built at all. Instead, the documents of all
matching terms are marked in a BitSet, and this BitSet is used with a Filter
to construct a different query type: ConstantScoreQuery(Filter). The
BooleanQuery max clause count does not apply, because no BooleanQuery is
involved in the whole process. If you use ConstantScoreQuery(BooleanQuery),
the limit still applies, but it does not apply to
ConstantScoreQuery(internalWildcardFilter).
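
To make the difference between the two modes concrete, here is a toy model in
plain Java. It is only a sketch and deliberately uses no Lucene classes: the
"index" is a sorted map from term to doc IDs, and sampleIndex, booleanRewrite,
filterRewrite and TooManyClausesSketch are hypothetical names made up for
this illustration. The only value taken from Lucene is 1024, the default
BooleanQuery max clause count.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the two rewrite modes. None of these types are Lucene classes. */
class RewriteModesSketch {

    /** Stand-in for the BooleanQuery max clause count (Lucene's default is 1024). */
    static final int MAX_CLAUSE_COUNT = 1024;

    /** Hypothetical stand-in for the TooManyClauses exception. */
    static class TooManyClausesSketch extends RuntimeException {
        TooManyClausesSketch(int limit) {
            super("more than " + limit + " clauses");
        }
    }

    /** Builds a toy index: term "t00000", "t00001", ... where term i occurs in doc i. */
    static TreeMap<String, int[]> sampleIndex(int termCount) {
        TreeMap<String, int[]> index = new TreeMap<>();
        for (int i = 0; i < termCount; i++) {
            index.put(String.format("t%05d", i), new int[] {i});
        }
        return index;
    }

    /** Mode 1: expand the prefix into one clause per matching term. The limit applies. */
    static List<String> booleanRewrite(TreeMap<String, int[]> index, String prefix) {
        List<String> clauses = new ArrayList<>();
        // Terms sharing a prefix are contiguous in sorted order, starting at the prefix.
        for (String term : index.tailMap(prefix).keySet()) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= MAX_CLAUSE_COUNT) {
                throw new TooManyClausesSketch(MAX_CLAUSE_COUNT);
            }
            clauses.add(term);
        }
        return clauses;
    }

    /** Mode 2: mark the docs of every matching term in a BitSet. No clauses, no limit. */
    static BitSet filterRewrite(TreeMap<String, int[]> index, String prefix) {
        BitSet docs = new BitSet();
        for (Map.Entry<String, int[]> entry : index.tailMap(prefix).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break;
            }
            for (int doc : entry.getValue()) {
                docs.set(doc);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        TreeMap<String, int[]> index = sampleIndex(2000);
        System.out.println("filter mode matched docs: " + filterRewrite(index, "t").cardinality());
        try {
            booleanRewrite(index, "t");
        } catch (TooManyClausesSketch e) {
            System.out.println("boolean mode failed: " + e.getMessage());
        }
    }
}
```

In this model the prefix "t" expands to 2000 terms, so the clause-per-term
mode hits the limit while the BitSet mode simply collects the matching
documents. Real Lucene chooses between the modes internally when the
MultiTermQuery is rewritten; the sketch only shows why the clause limit is
irrelevant once no BooleanQuery is built.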

Uwe

> On Apr 15, 2013, at 1:04 AM, Uwe Schindler wrote:
> 
> > The limit also applies for filters. If you have a list of terms ORed
> > together, the fastest way is not to use a BooleanQuery at all, but instead
> > a TermsFilter (which has no limits).
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> >> -----Original Message-----
> >> From: Carsten Schnober [mailto:schno...@ids-mannheim.de]
> >> Sent: Monday, April 15, 2013 9:53 AM
> >> To: java-user@lucene.apache.org
> >> Subject: Re: Statically store sub-collections for search (faceted
> >> search?)
> >>
> >> Am 12.04.2013 20:08, schrieb SUJIT PAL:
> >>> Hi Carsten,
> >>>
> >>> Why not use your idea of the BooleanQuery but wrap it in a Filter instead?
> >>> Since you are not doing any scoring (only filtering), the max boolean
> >>> clauses limit should not apply to a filter.
> >>
> >> Hi Sujit,
> >> thanks for your suggestion! I wasn't aware that the max clause limit
> >> does not apply to a BooleanQuery wrapped in a filter. I suppose the
> >> ideal way would be to use a BooleanFilter but not a QueryWrapperFilter,
> >> right?
> >>
> >> However, I am also not sure how to apply a filter in my use case
> >> because I perform a SpanQuery. Although SpanQuery#getSpans() does
> >> take a Bits object as an argument (acceptDocs), I haven't been able
> >> to figure out how to generate this Bits object correctly from a Filter
> >> object.
> >>
> >> Best,
> >> Carsten
> >>
> >> --
> >> Institut für Deutsche Sprache | http://www.ids-mannheim.de
> >> Projekt KorAP                 | http://korap.ids-mannheim.de
> >> Tel. +49-(0)621-43740789      | schno...@ids-mannheim.de
> >> Korpusanalyseplattform der nächsten Generation
> >> Next Generation Corpus Analysis Platform
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> For additional commands, e-mail: java-user-h...@lucene.apache.org

