[
https://issues.apache.org/jira/browse/LUCENE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler resolved LUCENE-5803.
-----------------------------------
Resolution: Fixed
Thank you also to Shay Banon, who provided the original idea and patch inside
the ES code tree.
> Add another AnalyzerWrapper class that does not have its own cache, so
> delegate-only wrappers don't create thread local resources several times
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-5803
> URL: https://issues.apache.org/jira/browse/LUCENE-5803
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 4.9
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 5.0, 4.10
>
> Attachments: LUCENE-5803.patch, LUCENE-5803.patch, LUCENE-5803.patch,
> LUCENE-5803.patch, LUCENE-5803.patch, LUCENE-5803.patch
>
>
> This is a follow-up issue for the following Elasticsearch issue:
> https://github.com/elasticsearch/elasticsearch/pull/6714
> Basically the problem is the following:
> - Elasticsearch has a pool of Analyzers that are used for analysis in several
> indexes
> - Each index uses a different PerFieldAnalyzerWrapper
> PerFieldAnalyzerWrapper uses PER_FIELD_REUSE_STRATEGY. Because of this, it
> caches the token streams for every field. If there are many fields, this adds
> up to a lot. In addition, the underlying analyzers may also cache token
> streams, and other PerFieldAnalyzerWrappers do the same, although the delegate
> Analyzer can always return the same components.
> We should add code similar to Elasticsearch's directly to Lucene: if the
> delegating Analyzer just delegates per field or just wraps CharFilters around
> the Reader, there is no need to cache the TokenStreamComponents a second time
> in the delegating Analyzer. This is only needed if the delegating Analyzer
> adds additional TokenFilters (like ShingleAnalyzerWrapper).
> We should name this new class DelegatingAnalyzerWrapper; it extends
> AnalyzerWrapper. The wrapComponents method must be final, because we are not
> allowed to add additional TokenFilters, but unlike ES, we don't need to
> disallow wrapping with CharFilters.
> Internally this class uses a private ReuseStrategy that just delegates to the
> underlying analyzer. It does not matter here whether the strategy of the
> delegate is global or per field; that is private to the delegate.
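
The caching behaviour described in the issue can be illustrated with a small,
self-contained toy model. This is only a sketch and does not use the Lucene
API: ReuseDemo, Components, Reuse, ToyAnalyzer, Leaf, CachingWrapper and
DelegatingWrapper are hypothetical names invented for this example. They stand
in, loosely, for TokenStreamComponents, ReuseStrategy, Analyzer, a pooled
analyzer, a wrapper with its own per-field cache, and a delegate-only wrapper.
The model assumes that every field maps to the same shared delegate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model (not the Lucene API) of wrapper caches vs. delegated reuse. */
public class ReuseDemo {
    /** Hypothetical stand-in for an expensive cached analysis chain. */
    static final class Components {}

    /** Hypothetical stand-in for a reuse strategy: decides where entries live. */
    interface Reuse {
        Components get(ToyAnalyzer owner, String field);
        void put(ToyAnalyzer owner, String field, Components c);
    }

    /** Strategy that stores entries in the owner's own per-thread cache. */
    static Reuse ownCache(boolean perField) {
        return new Reuse() {
            public Components get(ToyAnalyzer o, String f) {
                return o.cache.get().get(perField ? f : "");
            }
            public void put(ToyAnalyzer o, String f, Components c) {
                o.cache.get().put(perField ? f : "", c);
            }
        };
    }

    static final AtomicInteger CREATED = new AtomicInteger(); // allocation counter

    abstract static class ToyAnalyzer {
        final ThreadLocal<Map<String, Components>> cache =
            ThreadLocal.withInitial(HashMap::new);
        final Reuse reuse;

        ToyAnalyzer(Reuse reuse) { this.reuse = reuse; }

        abstract Components create(String field);

        /** Returns cached components if the strategy has them, else creates them. */
        final Components components(String field) {
            Components c = reuse.get(this, field);
            if (c == null) {
                c = create(field);
                reuse.put(this, field, c);
            }
            return c;
        }
    }

    /** Pooled analyzer with a global (field-independent) cache entry. */
    static final class Leaf extends ToyAnalyzer {
        Leaf() { super(ownCache(false)); }
        Components create(String field) {
            CREATED.incrementAndGet();
            return new Components();
        }
    }

    /** Wrapper that keeps its own per-field cache on top of the delegate. */
    static final class CachingWrapper extends ToyAnalyzer {
        final ToyAnalyzer delegate;
        CachingWrapper(ToyAnalyzer d) { super(ownCache(true)); delegate = d; }
        Components create(String field) { return delegate.create(field); }
    }

    /** Wrapper without a cache of its own: reuse is forwarded to the delegate. */
    static final class DelegatingWrapper extends ToyAnalyzer {
        final ToyAnalyzer delegate;
        DelegatingWrapper(ToyAnalyzer d) {
            super(new Reuse() {
                public Components get(ToyAnalyzer o, String f) { return d.reuse.get(d, f); }
                public void put(ToyAnalyzer o, String f, Components c) { d.reuse.put(d, f, c); }
            });
            delegate = d;
        }
        Components create(String field) { return delegate.create(field); }
    }

    /** Counts allocations when several wrappers share one pooled analyzer. */
    static int allocations(boolean delegating, int wrappers, int fields) {
        CREATED.set(0);
        Leaf shared = new Leaf();
        for (int w = 0; w < wrappers; w++) {
            ToyAnalyzer wrapper =
                delegating ? new DelegatingWrapper(shared) : new CachingWrapper(shared);
            for (int f = 0; f < fields; f++) {
                wrapper.components("field" + f);
                wrapper.components("field" + f); // repeat call is served from a cache
            }
        }
        return CREATED.get();
    }

    public static void main(String[] args) {
        System.out.println("caching wrappers:    " + allocations(false, 3, 4)); // 12
        System.out.println("delegating wrappers: " + allocations(true, 3, 4));  // 1
    }
}
```

In this model, three caching wrappers over four fields allocate twelve
component objects on one thread, while three delegating wrappers share the
single entry held by the pooled analyzer. Whether the delegate caches globally
or per field is decided by the delegate's own strategy, which the wrapper
never inspects.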