Re: Query with many clauses

2014-10-29 Thread Michael Sokolov
I did some analysis with access-control lists and found that our customers have significant overlap in the documents they have access to, so we could substantially reduce the number of terms in access-control queries by indexing overlapping subsets. However, this is a fai
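
To make the "overlapping subsets" idea concrete, here is a minimal sketch of one way it could work. This is an assumption on my part and not the scheme from the message above: every document is tagged with a group term derived from its exact set of readers, so a user's access-control query lists group terms instead of individual document ids. The class name, method name and "g0"-style ids are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not Lucene API and not the poster's design.
class AclGroups {

    /**
     * Input: document id -> users allowed to read it.
     * Output: user -> group terms that user would have to query for.
     * Documents with identical reader sets share a single group term.
     */
    static Map<String, Set<String>> groupTermsPerUser(Map<Integer, Set<String>> readersByDoc) {
        // One group id per distinct reader set, assigned in iteration order.
        Map<Set<String>, String> groupIdByReaders = new LinkedHashMap<>();
        Map<String, Set<String>> termsByUser = new TreeMap<>();

        for (Set<String> docReaders : readersByDoc.values()) {
            // Copy into a TreeSet so the map key is never mutated by the caller.
            Set<String> readers = new TreeSet<>(docReaders);
            String groupId = groupIdByReaders.get(readers);
            if (groupId == null) {
                groupId = "g" + groupIdByReaders.size();
                groupIdByReaders.put(readers, groupId);
            }
            // Each reader of this document needs this group term in their query.
            for (String user : readers) {
                termsByUser.computeIfAbsent(user, u -> new TreeSet<>()).add(groupId);
            }
        }
        return termsByUser;
    }
}
```

With three documents readable by both alice and bob and a fourth readable only by alice, alice's query needs two group terms rather than four document ids, and bob's needs one. How much this helps in practice depends entirely on how much the reader sets really overlap.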

Re: Query with many clauses

2014-10-29 Thread Yonik Seeley
For queries with many terms, where each term matches few documents (actually a single document for "ID filters" in my tests), I saw speedups between 4x and 8x: http://heliosearch.org/solr-terms-query/ (the 3rd chart). -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets

Re: Query with many clauses

2014-10-29 Thread Michael McCandless
I suggested TermsFilter, not TermFilter :) Note the sneaky extra "s". Mike McCandless http://blog.mikemccandless.com On Wed, Oct 29, 2014 at 8:20 AM, Pawel Rog wrote: > Hi, > I already tried to transform Queries to filter (TermQuery -> TermFilter) > but didn't see much speed up. I wrote tha

Re: Query with many clauses

2014-10-29 Thread Pawel Rog
Hi, I already tried converting the queries to filters (TermQuery -> TermFilter) but didn't see much of a speedup. I wrapped this filter in a ConstantScoreQuery, and in another test I used FilteredQuery with MatchAllDocsQuery and BooleanFilter. Both cases seem to behave quite similarly in terms of pe

Re: Query with many clauses

2014-10-29 Thread Michael Sokolov
I'm curious to know more about your use case, because I have an idea for something that addresses this but haven't found the opportunity to develop it yet - maybe somebody else wants to :). The basic idea is to reduce the number of terms that need to be looked up by collapsing commonly-occurring

Re: Query with many clauses

2014-10-29 Thread Michael McCandless
Are the clauses simple TermQuery instances? If so, try TermsFilter: it sorts the terms, which should give some [small] speedup when visiting them in the terms dict, and it reuses a single TermsEnum across all terms. Mike McCandless http://blog.mikemccandless.com On Tue, Oct 28, 2014 at 9:40 PM, Pawel Ro
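
For anyone wondering why sorting the terms and reusing one enumerator could help, here is a toy model in plain Java. It is only a sketch under stated assumptions: a sorted String[] stands in for the terms dictionary, an int[][] for the postings, and a single int cursor for the reused enumerator. None of these names are Lucene API, and the real TermsFilter internals differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model only; SortedTermLookup and matchAny are hypothetical names.
class SortedTermLookup {

    /** Returns the sorted, de-duplicated doc ids that match any of the query terms. */
    static List<Integer> matchAny(String[] sortedDict, int[][] postings, List<String> queryTerms) {
        // Sort (and de-duplicate) the query terms so the dictionary is only walked forward.
        String[] terms = new TreeSet<>(queryTerms).toArray(new String[0]);
        TreeSet<Integer> docs = new TreeSet<>();
        int cursor = 0; // the one shared position, playing the role of a reused enumerator

        for (String term : terms) {
            // Seek from the current position, never from the start of the dictionary.
            int pos = Arrays.binarySearch(sortedDict, cursor, sortedDict.length, term);
            if (pos >= 0) {
                for (int doc : postings[pos]) {
                    docs.add(doc);
                }
                cursor = pos + 1;
            } else {
                // Term is absent; later (larger) terms can only sit at or after
                // the insertion point, so move the cursor there.
                cursor = -pos - 1;
            }
        }
        return new ArrayList<>(docs);
    }
}
```

Because the terms arrive in order, each seek searches a shrinking tail of the dictionary, and only one cursor is ever allocated. A separate TermQuery per clause would instead do an independent lookup from scratch for every term.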