[
https://issues.apache.org/jira/browse/LUCENE-6077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adrien Grand updated LUCENE-6077:
---------------------------------
Attachment: LUCENE-6077.patch
Updated patch:
- CachingWrapperFilter now uses a policy that only caches on merged segments
by default (instead of all segments)
- applied other suggestions about typos/naming
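
For readers who have not opened the patch, here is a minimal sketch of what a
"cache only on merged segments" policy could look like. It is plain Java with
no Lucene dependency; SegmentSource, CachingPolicySketch, ALL_SEGMENTS and
MERGED_ONLY are hypothetical names chosen for illustration and are not taken
from the patch.

```java
// Hypothetical illustration only: these types are not part of Lucene or the patch.
enum SegmentSource { FLUSH, MERGE }

interface CachingPolicySketch {
    // Decide whether a filter may be cached on a segment with the given origin.
    boolean shouldCache(SegmentSource source);

    // Cache on every segment, regardless of how it was created.
    CachingPolicySketch ALL_SEGMENTS = source -> true;

    // Cache only on segments produced by a merge; skip freshly flushed ones.
    CachingPolicySketch MERGED_ONLY = source -> source == SegmentSource.MERGE;
}
```

With MERGED_ONLY, a freshly flushed segment is never cached, so no work is
spent building a cache entry that a merge would soon discard.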
> Add a filter cache
> ------------------
>
> Key: LUCENE-6077
> URL: https://issues.apache.org/jira/browse/LUCENE-6077
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Fix For: 5.0
>
> Attachments: LUCENE-6077.patch, LUCENE-6077.patch
>
>
> Lucene already has filter caching abilities through CachingWrapperFilter, but
> CachingWrapperFilter requires you to know which filters you want to cache
> up-front.
> Caching filters is not trivial. If you cache too aggressively, then you slow
> things down since you need to iterate over all documents that match the
> filter in order to load it into an in-memory cacheable DocIdSet. On the other
> hand, if you don't cache at all, you are potentially missing interesting
> speed-ups on frequently-used filters.
> It would be nice to have a generic filter cache that tracks usage for
> individual filters and decides whether or not to cache a filter on a given
> segment based on usage statistics and various heuristics, such as:
> - the overhead to cache the filter (for instance some filters produce
> DocIdSets that are already cacheable)
> - the cost to build the DocIdSet (the getDocIdSet method is very expensive
> on some filters such as MultiTermQueryWrapperFilter that potentially need to
> merge lots of postings lists)
> - the segment we are searching on (flush segments will likely be merged
> right away so it's probably not worth building a cache on such segments)
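
The heuristics listed in the description can be sketched as a small
usage-tracking policy. This is a rough illustration under stated assumptions,
not the implementation in the patch: the class name UsageTrackingSketch, its
methods, the string filter key and the two thresholds are all hypothetical. It
also assumes that filters which are costly to build, or whose DocIdSets are
already cacheable, should be cached after fewer uses than other filters.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from Lucene or the patch.
final class UsageTrackingSketch {
    private final Map<String, Integer> uses = new HashMap<>();
    private final int minUsesEarly;   // threshold for costly or already-cacheable filters
    private final int minUsesDefault; // threshold for all other filters

    UsageTrackingSketch(int minUsesEarly, int minUsesDefault) {
        this.minUsesEarly = minUsesEarly;
        this.minUsesDefault = minUsesDefault;
    }

    // Record one use of the filter identified by a caller-chosen key.
    void onUse(String filterKey) {
        uses.merge(filterKey, 1, Integer::sum);
    }

    // Decide whether to cache the filter on the current segment.
    boolean shouldCache(String filterKey, boolean costlyToBuild,
                        boolean alreadyCacheable, boolean mergedSegment) {
        if (!mergedSegment) {
            // Flush segments are likely to be merged away soon.
            return false;
        }
        // Assumption: cache sooner when building is expensive or caching is cheap.
        int threshold = (costlyToBuild || alreadyCacheable) ? minUsesEarly : minUsesDefault;
        return uses.getOrDefault(filterKey, 0) >= threshold;
    }
}
```

A caller would invoke onUse each time a filter is executed and consult
shouldCache per segment before loading the filter into a cacheable DocIdSet.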