[ 
https://issues.apache.org/jira/browse/LUCENE-8017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236168#comment-16236168
 ] 

Adrien Grand commented on LUCENE-8017:
--------------------------------------

bq. reader-specific, because it uses global stats

In case there is confusion, {{getReaderCacheHelper}} does not mean caching on
the top-level reader; it means taking deletes and dv updates into account (I
got a bit confused since you mentioned top-level statistics).

I agree there are use-cases for caching on the core+dv-updates+deletes key, but
we have no way to know whether it is safe to do so. While caching on a core is
ok because segments get merged less and less often as they get bigger, caching
on deletes and dv updates is problematic: if there is a constant stream of
updates, there would be very little reuse. This wouldn't be a big deal with a
regular cache, but the query cache has the unusual property that caching a
clause of a query can take 10x longer than running the query (think e.g. of a
selective query and a filter that matches most of the index). This makes
caching dangerous if reuse of cache entries is not likely. And
{{FunctionMatchQuery}} is a worst-case scenario since it requires a linear scan
of the documents of the segment.
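
To make the reuse argument concrete, here is a toy simulation. It is not Lucene
code: the class, the method names, and the cost figures (10 units to populate a
cache entry, 1 unit to run the query uncached, cache hits treated as free) are
all assumptions chosen to mirror the 10x figure above. It compares a cache
keyed on the segment core alone with one keyed on core plus a generation that
is bumped by every delete or dv update.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model only; none of these names exist in Lucene. */
public class CacheKeySketch {

    /**
     * Runs `searches` searches against a single segment whose core never changes.
     * A delete or dv update arrives before every `updateEvery`-th search
     * (0 means the segment is never updated) and bumps its generation.
     * Returns the number of searches answered from the cache.
     */
    public static int countHits(boolean keyIncludesUpdates, int searches, int updateEvery) {
        Set<String> cache = new HashSet<>();
        int generation = 0;
        int hits = 0;
        for (int i = 0; i < searches; i++) {
            if (updateEvery > 0 && i > 0 && i % updateEvery == 0) {
                generation++; // a delete or dv update landed on this segment
            }
            // Hypothetical keys: core only, or core plus update generation.
            String key = keyIncludesUpdates ? "core#" + generation : "core";
            if (!cache.add(key)) {
                hits++; // key was already present: cached entry reused
            }
        }
        return hits;
    }

    /** Assumed cost model: every miss pays 10 units to build the entry, hits are free. */
    public static long costWithCache(int searches, int hits) {
        return (searches - hits) * 10L;
    }

    public static void main(String[] args) {
        int searches = 100;
        int coreHits = countHits(false, searches, 1);
        int fullHits = countHits(true, searches, 1);
        System.out.println("core key hits:              " + coreHits);
        System.out.println("core+updates key hits:      " + fullHits);
        System.out.println("cost, core key:             " + costWithCache(searches, coreHits));
        System.out.println("cost, core+updates key:     " + costWithCache(searches, fullHits));
        System.out.println("cost, no caching (1/search): " + searches);
    }
}
```

With an update before every search, the core key is reused 99 times out of 100
while the core+updates key is never reused, so under these assumed costs
caching on it is 10x more expensive than not caching at all.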

If we want to start caching on the core+dv-updates+deletes key, we should find 
a way to make sure that the index is mostly static so that cache entries would 
be reused.

> FunctionRangeQuery and FunctionMatchQuery can pollute the QueryCache
> --------------------------------------------------------------------
>
>                 Key: LUCENE-8017
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8017
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8017.patch, LUCENE-8017.patch
>
>
> The QueryCache assumes that queries will return the same set of documents 
> when run over the same segment, independent of all other segments held by the 
> parent IndexSearcher.  However, both FunctionRangeQuery and 
> FunctionMatchQuery can select hits based on score, which depends on term 
> statistics over the whole index, and could therefore theoretically return 
> different result sets on a given segment.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
