Suppose q is highly selective (matches only a few docs) and fq is weakly
selective (matches many documents; think of an access control query).
If we run them as is, it takes a while to materialize the heavy filter
query eagerly, only to intersect it with a handful of main query results.
If instead we put fq={!... cost=200 cache=false } ..., filter execution is
deferred, and the filter is checked only against the few docs matching the
main query: a huge performance boost.
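
To make the difference concrete, here is a minimal sketch in Python. It is a toy model, not Solr code: the names eager_filter, deferred_filter and allowed are made up for illustration, and the "index" is just a range of doc ids. It only counts how many times the filter condition is evaluated in each strategy.

```python
# Toy model (not Solr internals): count how often the filter condition runs.

def eager_filter(index_size, filter_pred, main_hits):
    """Materialize the filter over the whole index, then intersect."""
    checks = 0
    matching = set()
    for doc in range(index_size):
        checks += 1
        if filter_pred(doc):
            matching.add(doc)
    hits = [doc for doc in main_hits if doc in matching]
    return hits, checks


def deferred_filter(filter_pred, main_hits):
    """Check the filter only for docs that already match the main query."""
    checks = 0
    hits = []
    for doc in main_hits:
        checks += 1
        if filter_pred(doc):
            hits.append(doc)
    return hits, checks


INDEX_SIZE = 100_000                      # hypothetical index size
main_hits = [10, 5_000, 23_457, 99_998]   # highly selective q: 4 docs


def allowed(doc):
    """Hypothetical access-control rule: matches 90% of the index."""
    return doc % 10 != 7


eager_hits, eager_checks = eager_filter(INDEX_SIZE, allowed, main_hits)
deferred_hits, deferred_checks = deferred_filter(allowed, main_hits)

print(eager_hits == deferred_hits)        # same result set either way
print(eager_checks, deferred_checks)      # work done by each strategy
```

Both strategies return the same docs, but the eager one evaluates the filter once per document in the index, while the deferred one evaluates it once per main-query hit. With caching enabled the eager cost is paid only on a cache miss, so the gain matters most when the filter is rarely reused.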

Note: if one can get by with bare Lucene via q= +{!.. main q} +{! .. filter
q}, it is smart enough to combine them in the most efficient way, letting
the main query drive the intersection.
NB: no query parser parses Occur.FILTER yet, although it is already printed
as #, so combining the two queries might not be easy due to query parser
quirks.
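
What "letting the main query drive the intersection" means can be sketched as a leapfrog over two sorted doc id lists. This is a simplified illustration in Python, not Lucene's actual code; PostingIterator and conjunction are hypothetical names, and the cheaper (sparser) iterator is chosen as the lead by a simple length-based cost.

```python
from bisect import bisect_left


class PostingIterator:
    """Sorted doc ids with an advance(target) operation (toy model)."""

    def __init__(self, docs):
        self.docs = docs
        self.pos = 0
        self.advances = 0          # how many times advance() was called

    def cost(self):
        return len(self.docs)

    def advance(self, target):
        """Move to the first doc >= target; return it, or None if exhausted."""
        self.advances += 1
        self.pos = bisect_left(self.docs, target, self.pos)
        return self.docs[self.pos] if self.pos < len(self.docs) else None


def conjunction(a, b):
    """Intersect two iterators, letting the cheaper one lead."""
    lead, follow = sorted((a, b), key=lambda it: it.cost())
    hits = []
    doc = lead.advance(0)
    while doc is not None:
        other = follow.advance(doc)
        if other is None:
            break
        if other == doc:
            hits.append(doc)
            doc = lead.advance(doc + 1)
        else:
            # follow skipped past doc; catch the lead up to it
            doc = lead.advance(other)
    return hits


main = PostingIterator([10, 5_000, 23_457, 99_998])               # sparse q
acl = PostingIterator([d for d in range(100_000) if d % 10 != 7])  # dense filter

hits = conjunction(main, acl)
print(hits)
print(acl.advances)   # the dense side is touched only a handful of times
```

The dense iterator is advanced only as many times as the sparse one has candidates, so it is never walked in full.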

On Sun, Dec 22, 2024 at 4:27 PM Mingchun Zhao <mingchun.zha...@gmail.com>
wrote:

> Hi Mikhail,
>
> Thanks for your answer!
>
> > Here are two answers. Order of calling BQ.Builder.add() doesn't decide
> > the order of execution, as well as occur.
> > BooleanQuery lazily executes intersection dynamically and adjusts to
> > actual values, with many conditions and spec cases.
>
> Understood!
>
> > However, if you check SolrIndexSearcher.getProcessedFilter() you notice
> > that it executes filters eagerly and cache them (both up to parameters).
> > So, here the filterQuery in most cases will be bitset (or other) ie
> > materialized filter.
>
> Understood, I checked the getProcessedFilter() method in the source code,
> and it was as you explained.
>
> > It depends on relative selectivity (number of matched documents) of q and
> > fq.
> > In an edge case deferring filters with cost>100 might get significant
> > gain.
>
> I didn’t quite understand this part. Could you explain it in more detail
> please?
> Are you saying that the overall search performance differs depending on the
> number of documents matched by q and fq, due to the varying load of
> calculating the intersection? Or are you suggesting that the load of the
> filtering process in the filterQuery can change depending on the number of
> documents matched by q and fq as well?
> My understanding was that the filterQuery is executed independently of the
> scoreQuery, applying the filter logic to the entire index and then
> calculating the intersection of the respective results. Therefore, I
> thought the processing order of q and fq wouldn’t affect the overall search
> performance.
>
>
> Regards,
> Mingchun
>


-- 
Sincerely yours
Mikhail Khludnev
