On 03/11/2013 01:22 PM, Michael McCandless wrote:
> On Mon, Mar 11, 2013 at 9:32 AM, Carsten Schnober
> <schno...@ids-mannheim.de> wrote:
>> On 11.03.2013 13:38, Michael McCandless wrote:
>>> On Mon, Mar 11, 2013 at 7:08 AM, Uwe Schindler <u...@thetaphi.de> wrote:
>>>> Set the rewrite method to e.g. SCORING_BOOLEAN_QUERY_REWRITE, then this
>>>> should work (after rewrite your query is a BooleanQuery, which supports
>>>> extractTerms()).
>>>
>>> ... as long as you don't exceed the max number of terms allowed by BQ
>>> (1024 by default, but you can raise it).
>>
>> True, I've noticed this in the meantime. Are there any recommendations
>> for setting the limit as high as possible while keeping performance
>> reasonable? Of course, this is highly subjective, but what's the
>> magnitude here? Will a limit of 1,024,000 typically increase the query
>> time by a factor of 1,000 too?
>>
>> Carsten
>
> I think 1024 may already be too high ;)
>
> But really it depends on your situation: test different limits and see.
> How much slower a larger query is depends on the specifics of the terms ...

This doesn't directly address the OP's question about selecting terms,
but I thought it might be interesting.

We've measured how query performance scales as terms are added, since
we tend to generate large lists of query terms when restricting access
to content by user entitlements. I went back and read a theoretical
result on the scaling (sorry, I lost the link; I think it was in an
early paper by Doug Cutting): it seems there is a logarithmic component
and a linear component. We saw mostly the linear behavior in our tests.

In practice, considering how much time goes to search versus the other
components of a complete system, I think 1024 is a reasonable limit.
We've basically told our customers that if they want to entitle lists
of more than 1024 items, they should group the items and sell the
groups instead. Of course there is flexibility to go to, say, 2K if we
have to. Anyway, just confirming that the default seems sensible, but
yes, queries will slow down with more terms.
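
To make the grouping idea concrete, here is a minimal sketch in plain
Java (no Lucene calls). It assumes each document is also indexed with a
group id, so the access filter can match on a handful of group terms
rather than on every entitled item. The class name, method name, and
sample ids are all hypothetical, made up for illustration, and not taken
from any real system.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: collapses per-item entitlements into per-group query terms. */
final class EntitlementGroups {
    /** The default BooleanQuery clause limit mentioned in the thread. */
    static final int MAX_TERMS = 1024;

    /**
     * Maps each entitled item to its group and returns the distinct group ids.
     * Throws if an item has no group, or if even the grouped list exceeds maxTerms.
     */
    static Set<String> groupTerms(Collection<String> entitledItems,
                                  Map<String, String> itemToGroup,
                                  int maxTerms) {
        Set<String> groups = new TreeSet<>();
        for (String item : entitledItems) {
            String group = itemToGroup.get(item);
            if (group == null) {
                throw new IllegalArgumentException("No group defined for item: " + item);
            }
            groups.add(group);
        }
        if (groups.size() > maxTerms) {
            throw new IllegalStateException(
                groups.size() + " group terms exceed the limit of " + maxTerms);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up sample data: three items that fall into two groups.
        Map<String, String> itemToGroup = new HashMap<>();
        itemToGroup.put("journal-001", "science-pack");
        itemToGroup.put("journal-002", "science-pack");
        itemToGroup.put("journal-003", "law-pack");

        Set<String> terms = groupTerms(
            Arrays.asList("journal-001", "journal-002", "journal-003"),
            itemToGroup, MAX_TERMS);
        System.out.println(terms); // prints [law-pack, science-pack]
    }
}
```

The resulting set would then become the terms of the access filter, one
clause per group, which keeps the clause count well below the limit even
when the underlying item list is long.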
-Mike Sokolov