[
https://issues.apache.org/jira/browse/LUCENE-8017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236168#comment-16236168
]
Adrien Grand commented on LUCENE-8017:
--------------------------------------
bq. reader-specific, because it uses global stats
In case there is confusion: getReaderHelper does not mean caching on the
top-level reader; it means taking deletes and dv updates into account (I got a
bit confused since you mentioned top-level statistics).
I agree there are use-cases for caching on the core+dv-updates+deletes key, but
we have no way to know whether it is safe to do. While caching on a core is ok
because segments get merged less and less often as they get bigger, caching on
deletes and dv updates is problematic: if there is a constant stream of
updates, there would be very little reuse. This wouldn't be a
big deal with a regular cache, but the query cache has the unusual property
that caching a clause of a query can take 10x longer than running the query
(think e.g. of a selective query and a filter that matches most of the index).
This makes caching dangerous if reuse of cache entries is unlikely. And
{{FunctionMatchQuery}} is a worst-case scenario since it requires a linear
scan of the documents of the segment.
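To make the cost argument concrete, here is a minimal sketch of the break-even arithmetic. It is a toy model, not Lucene code: the class {{CacheCostModel}}, its method names and the abstract cost units are all made up for illustration, and the only figure taken from the discussion above is the 10x ratio.

```java
/** Toy cost model (hypothetical, not a Lucene class): when does caching a clause pay off? */
final class CacheCostModel {
    private CacheCostModel() {}

    /** Total cost of `uses` searches when the first one populates the cache entry. */
    static long costWithCache(long cachingCost, long hitCost, long uses) {
        if (uses < 1) {
            throw new IllegalArgumentException("uses must be >= 1");
        }
        return cachingCost + (uses - 1) * hitCost;
    }

    /** Total cost of `uses` searches when nothing is cached. */
    static long costWithoutCache(long runCost, long uses) {
        return uses * runCost;
    }

    /**
     * Smallest number of uses of one cache entry at which caching is no more
     * expensive than not caching, or -1 if a cache hit is not cheaper than a
     * plain run (caching can then never pay off).
     */
    static long breakEvenUses(long runCost, long cachingCost, long hitCost) {
        if (hitCost >= runCost) {
            return -1;
        }
        long savedPerHit = runCost - hitCost;     // gain of each reuse
        long extraUpFront = cachingCost - hitCost; // cost to amortize
        if (extraUpFront <= 0) {
            return 1;
        }
        // Solve cachingCost + (n - 1) * hitCost <= n * runCost for n, rounding up.
        return Math.max(1, (extraUpFront + savedPerHit - 1) / savedPerHit);
    }
}
```

With a run cost of 1, a caching cost of 10 and a cache hit assumed to be free, {{CacheCostModel.breakEvenUses(1, 10, 0)}} is 10: the entry has to serve ten searches before caching stops being a net loss. If deletes or dv updates change the cache key before that point, the entry only added cost.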
If we want to start caching on the core+dv-updates+deletes key, we should find
a way to make sure that the index is mostly static so that cache entries would
be reused.
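One possible shape for such a check, purely as a sketch: only allow caching once the same reader-level key has been seen for a minimum number of consecutive searches. {{StableKeyGate}} and its threshold are hypothetical, nothing like this exists in Lucene as far as this thread says, and a real implementation would have to avoid holding a strong reference to the key.

```java
/**
 * Hypothetical gate, not a Lucene API: allow caching on a reader-level key
 * only once that key has survived a minimum number of searches unchanged.
 */
final class StableKeyGate {
    private final int minStableSearches;
    private Object lastKey;        // stands in for a core+dv-updates+deletes cache key
    private int searchesOnLastKey; // consecutive searches seen with lastKey

    StableKeyGate(int minStableSearches) {
        if (minStableSearches < 1) {
            throw new IllegalArgumentException("minStableSearches must be >= 1");
        }
        this.minStableSearches = minStableSearches;
    }

    /** Records one search against `key` and returns true if caching is currently allowed. */
    synchronized boolean recordSearch(Object key) {
        if (key != lastKey) {
            // Identity comparison: a new view of the index (new deletes or
            // dv updates) is assumed to come with a new key object.
            lastKey = key;
            searchesOnLastKey = 0;
        }
        searchesOnLastKey++;
        return searchesOnLastKey >= minStableSearches;
    }
}
```

Under a constant stream of updates the key keeps changing, the counter keeps resetting, and the gate never opens, which is the behaviour we would want for expensive-to-cache queries.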
> FunctionRangeQuery and FunctionMatchQuery can pollute the QueryCache
> --------------------------------------------------------------------
>
> Key: LUCENE-8017
> URL: https://issues.apache.org/jira/browse/LUCENE-8017
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Major
> Attachments: LUCENE-8017.patch, LUCENE-8017.patch
>
>
> The QueryCache assumes that queries will return the same set of documents
> when run over the same segment, independent of all other segments held by the
> parent IndexSearcher. However, both FunctionRangeQuery and
> FunctionMatchQuery can select hits based on score, which depends on term
> statistics over the whole index, and could therefore theoretically return
> different result sets on a given segment.
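A small illustration of the problem described in the issue, under stated assumptions: the sketch below uses a BM25-style idf as a stand-in for "score" and a plain minimum-score check as a stand-in for a function range filter. {{GlobalStatsDemo}} and its methods are hypothetical names, and the numbers are invented.

```java
/** Hypothetical demo, not Lucene code: a score threshold that depends on index-wide statistics. */
final class GlobalStatsDemo {
    private GlobalStatsDemo() {}

    /** BM25-style inverse document frequency computed from index-wide statistics. */
    static double idf(long docFreq, long docCount) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** Whether a document whose score is just the idf passes a minimum-score filter. */
    static boolean matches(long docFreq, long docCount, double minScore) {
        return idf(docFreq, docCount) >= minScore;
    }
}
```

Here {{GlobalStatsDemo.matches(10, 100, 3.0)}} is false while {{GlobalStatsDemo.matches(10, 1000, 3.0)}} is true: the document and its segment have not changed, only the document count of the whole index has, so a result set cached per segment could be stale.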