[ https://issues.apache.org/jira/browse/LUCENE-5418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886925#comment-13886925 ]
Michael McCandless commented on LUCENE-5418:
--------------------------------------------
Actually, I've removed the SlowBitsDocIdSetIterator (need to post a patch again
soon...), because it's too trappy. I think it's better for the user to get an
exception here than for the search to silently run super slowly, and at least
for this issue there are always ways to run the Filter "quickly" (use
DrillSideways or DrillDownQuery, or create a FilteredQuery directly).
> Don't use .advance on costly (e.g. distance range facets) filters
> -----------------------------------------------------------------
>
> Key: LUCENE-5418
> URL: https://issues.apache.org/jira/browse/LUCENE-5418
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 5.0, 4.7
>
> Attachments: LUCENE-5418.patch
>
>
> If you use a distance filter today (see
> http://blog.mikemccandless.com/2014/01/geospatial-distance-faceting-using.html
> ), then drill down on one of those ranges, under the hood Lucene is using
> .advance on the Filter, which is very costly because we end up computing
> distance on (possibly many) hits that don't match the query.
> Performance is better if we find the hits matching the Query first, and then
> check the filter only on those hits.
> FilteredQuery can already do this today, when you use its
> QUERY_FIRST_FILTER_STRATEGY. This essentially accomplishes the same thing as
> Solr's "post filters" (I think?) but with a far simpler/better/less code
> approach.
> E.g., I believe ElasticSearch uses this API when it applies costly filters.
> Longish term, I think a Query/Filter ought to know for itself that it's
> expensive, and in cases where such a Query/Filter is MUST'd onto a
> BooleanQuery (e.g. ConstantScoreQuery), or the Filter is a clause in
> BooleanFilter, or it's passed to IndexSearcher.search, we should also be
> "smart" here and not call .advance on such clauses. But that'd be a biggish
> change ... so for today the "workaround" is that the user must carefully
> construct the FilteredQuery themselves.
> In the meantime, as another workaround, I want to fix DrillSideways so that
> when you drill down on such filters it doesn't use .advance; this should give
> a good speedup for the "normal path" API usage with a costly filter.
> I'm iterating on the lucene server branch (LUCENE-5376) but once it's working
> I plan to merge this back to trunk / 4.7.
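
To make the cost difference described in the issue concrete, here is a minimal
sketch in plain Java. It does not use the Lucene API: FilterStrategySketch,
CountingFilter, queryFirst, advanceDriven and advanceFilter are hypothetical
names, and the modulo predicate merely stands in for a costly per-document
check such as a distance computation. It assumes the query hits are sorted doc
IDs below maxDoc.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration only; none of these names come from Lucene.
final class FilterStrategySketch {

    /** Wraps a per-document check and counts how often it is evaluated. */
    static final class CountingFilter implements IntPredicate {
        private final IntPredicate delegate;
        int calls = 0;

        CountingFilter(IntPredicate delegate) {
            this.delegate = delegate;
        }

        @Override
        public boolean test(int doc) {
            calls++;
            return delegate.test(doc);
        }
    }

    /** Query first: evaluate the filter only on documents the query matched. */
    static List<Integer> queryFirst(int[] queryHits, IntPredicate filter) {
        List<Integer> matches = new ArrayList<>();
        for (int doc : queryHits) {
            if (filter.test(doc)) {
                matches.add(doc);
            }
        }
        return matches;
    }

    /** Advance-style: move the filter forward to each query hit by scanning. */
    static List<Integer> advanceDriven(int[] queryHits, IntPredicate filter, int maxDoc) {
        List<Integer> matches = new ArrayList<>();
        int filterDoc = -1;
        for (int queryDoc : queryHits) {
            if (filterDoc < queryDoc) {
                filterDoc = advanceFilter(filter, queryDoc, maxDoc);
                if (filterDoc == maxDoc) {
                    break; // the filter has no more matching documents
                }
            }
            if (filterDoc == queryDoc) {
                matches.add(queryDoc);
            }
        }
        return matches;
    }

    /** Returns the first doc >= target accepted by the filter, or maxDoc if none. */
    static int advanceFilter(IntPredicate filter, int target, int maxDoc) {
        for (int doc = target; doc < maxDoc; doc++) {
            if (filter.test(doc)) {
                return doc;
            }
        }
        return maxDoc;
    }

    public static void main(String[] args) {
        int[] queryHits = {10, 500, 990};
        CountingFilter a = new CountingFilter(doc -> doc % 50 == 40);
        CountingFilter b = new CountingFilter(doc -> doc % 50 == 40);
        System.out.println(queryFirst(queryHits, a) + " filter checks: " + a.calls);
        System.out.println(advanceDriven(queryHits, b, 1000) + " filter checks: " + b.calls);
    }
}
```

With these inputs both strategies return the same single match (doc 990), but
the query-first loop evaluates the filter 3 times while the advance-style loop
evaluates it 73 times, because each advance scans documents the query never
matched.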