Suppose we have a highly selective q (matches a few docs) and a weakly selective fq (matches many documents; say it is a kind of access control query). If we run them as is, it will take a while to eagerly materialize the heavy filter query, only to then intersect it with a handful of main query results. If instead we put fq={!... cost=200 cache=false } ..., filter execution is deferred and the filter is checked only against the few docs matching the main query, which is a huge performance boost.
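To put rough numbers on the selectivity argument, here is a toy sketch in plain Python. It is not Solr or Lucene code: the index size, the doc ids and the acl_allows predicate are all made up for illustration, and it ignores the filter cache, which in real Solr amortizes the eager cost across requests. It only counts how many times the expensive filter check runs in each strategy.

```python
# Toy model only: plain Python sets and predicates, not Solr or Lucene code.
# All names and numbers below are hypothetical, chosen for illustration.

def eager_filter(num_docs, main_matches, filter_pred):
    """Materialize the filter over the whole index, then intersect."""
    checks = 0
    filter_set = set()
    for doc in range(num_docs):
        checks += 1                      # filter evaluated for every doc
        if filter_pred(doc):
            filter_set.add(doc)
    hits = sorted(d for d in main_matches if d in filter_set)
    return hits, checks


def deferred_filter(main_matches, filter_pred):
    """Let the main query drive; check the filter only on its matches."""
    checks = 0
    hits = []
    for doc in sorted(main_matches):
        checks += 1                      # filter evaluated only for q matches
        if filter_pred(doc):
            hits.append(doc)
    return hits, checks


NUM_DOCS = 100_000                       # assumed index size
main_matches = {10, 501, 77_777}         # highly selective q: three docs


def acl_allows(doc):
    """Weakly selective 'access control' filter: passes about 90% of docs."""
    return doc % 10 != 1


eager_hits, eager_checks = eager_filter(NUM_DOCS, main_matches, acl_allows)
deferred_hits, deferred_checks = deferred_filter(main_matches, acl_allows)

print(eager_hits, eager_checks)          # [10, 77777] 100000
print(deferred_hits, deferred_checks)    # [10, 77777] 3
```

Both strategies return the same docs; only the amount of filter work differs, and the gap grows with the size of the index and shrinks as q becomes less selective.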
Note: if one gets by with bare Lucene syntax via q= +{!.. main q} +{! .. filter q}, Lucene is smart enough to combine them in the most effective way, letting the main query drive the intersection. NB: no query parser parses Occur.FILTER yet, although it is already printed as #, so combining two queries might not be easy due to query parser quirks.

On Sun, Dec 22, 2024 at 4:27 PM Mingchun Zhao <mingchun.zha...@gmail.com> wrote:

> Hi Mikhail,
>
> Thanks for your answer!
>
> > Here are two answers. Order of calling BQ.Builder.add() doesn't decide
> > the order of execution, as well as occur.
> > BooleanQuery lazily executes intersection dynamically and adjusts to
> > actual values, with many conditions and spec cases.
>
> Understood!
>
> > However, if you check SolrIndexSearcher.getProcessedFilter() you notice
> > that it executes filters eagerly and cache them (both up to parameters).
> > So, here the filterQuery in most cases will be bitset (or other) ie
> > materialized filter.
>
> Understood, I checked the getProcessedFilter() method in the source code,
> and it was as you explained.
>
> > It depends on relative selectivity (number of matched documents) of q
> > and fq.
> > In an edge case deferring filters with cost>100 might get significant
> > gain.
>
> I didn’t quite understand this part. Could you explain it in more detail
> please?
> Are you saying that the overall search performance differs depending on
> the number of documents matched by q and fq, due to the varying load of
> calculating the intersection? Or are you suggesting that the load of the
> filtering process in the filterQuery can change depending on the number
> of documents matched by q and fq as well?
> My understanding was that the filterQuery is executed independently of
> the scoreQuery, applying the filter logic to the entire index and then
> calculating the intersection of the respective results. Therefore, I
> thought the processing order of q and fq wouldn’t affect the overall
> search performance.
>
> Regards,
> Mingchun

--
Sincerely yours
Mikhail Khludnev