[ https://issues.apache.org/jira/browse/LUCENE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14537160#comment-14537160 ]
Markus Heiden commented on LUCENE-6365:
---------------------------------------
I agree for the normal use case. However, I ran into long build times when
building lookups for big dictionaries (using the AnalyzingSuggester). I
profiled the creation of the lookups, and one hotspot was the allocation of
the internal stack. One problem is that the initial size of the internal stack
is too small (4 entries), so the stack gets resized over and over again. I
will increase its initial size to 16.
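
To illustrate the resizing problem, here is a minimal sketch of an
array-backed stack that counts how often it has to reallocate. It is not the
code from the patch: the class name GrowableIntStack, the helper
resizesForDepth, and the doubling growth policy are all assumptions made for
this example.

```java
import java.util.Arrays;

/** Hypothetical array-backed stack that records how often it had to grow. */
public class GrowableIntStack {
    private int[] items;
    private int size;
    private int resizeCount;

    public GrowableIntStack(int initialCapacity) {
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("initialCapacity must be positive");
        }
        items = new int[initialCapacity];
    }

    public void push(int value) {
        if (size == items.length) {
            // Assumed growth policy: double the capacity and copy the old entries.
            items = Arrays.copyOf(items, items.length * 2);
            resizeCount++;
        }
        items[size++] = value;
    }

    public int pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        return items[--size];
    }

    public int resizeCount() {
        return resizeCount;
    }

    /** Pushes {@code depth} entries onto a fresh stack and reports the reallocations. */
    public static int resizesForDepth(int initialCapacity, int depth) {
        GrowableIntStack stack = new GrowableIntStack(initialCapacity);
        for (int i = 0; i < depth; i++) {
            stack.push(i);
        }
        return stack.resizeCount();
    }

    public static void main(String[] args) {
        System.out.println("initial 4,  depth 12: " + resizesForDepth(4, 12) + " resizes");
        System.out.println("initial 16, depth 12: " + resizesForDepth(16, 12) + " resizes");
    }
}
```

Under these assumptions, a stack that starts at 4 entries reallocates twice
to reach depth 12, while one that starts at 16 never reallocates. If a fresh
stack is allocated many times during a build, those reallocations are paid
each time.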
> Optimized iteration of finite strings
> -------------------------------------
>
> Key: LUCENE-6365
> URL: https://issues.apache.org/jira/browse/LUCENE-6365
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/other
> Affects Versions: 5.0
> Reporter: Markus Heiden
> Priority: Minor
> Labels: patch, performance
> Attachments: FiniteStringsIterator.patch,
> FiniteStringsIterator2.patch, FiniteStringsIterator3.patch
>
>
> Replaced Operations.getFiniteStrings() with an optimized FiniteStringsIterator.
> Benefits:
> Avoids a huge hash set of finite strings.
> Avoids massive object/array creation during processing.
> "Downside":
> The iteration order changed, so when iterating with a limit, the result may
> differ slightly. Old: emit the current node if it accepts, then recurse. New:
> recurse, then emit the current node if it accepts.
> The old method Operations.getFiniteStrings() still exists because it eases
> the tests. It is now implemented using the new FiniteStringsIterator.
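
The effect of the changed iteration order under a limit can be sketched on a
toy trie. This is only an illustration of the two orders described in the
issue, not the Lucene implementation: the class EmitOrderDemo, its Node type,
and the methods oldOrder and newOrder are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy trie comparing "emit, then recurse" with "recurse, then emit". */
public class EmitOrderDemo {
    static final class Node {
        boolean accept;
        final Map<Character, Node> next = new TreeMap<>();
    }

    static Node build(String... words) {
        Node root = new Node();
        for (String word : words) {
            Node node = root;
            for (char c : word.toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            node.accept = true;
        }
        return root;
    }

    /** Old order: emit the current node if it accepts, then recurse. */
    private static void emitFirst(Node node, StringBuilder path, int limit, List<String> out) {
        if (out.size() >= limit) {
            return;
        }
        if (node.accept) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.next.entrySet()) {
            path.append(e.getKey());
            emitFirst(e.getValue(), path, limit, out);
            path.setLength(path.length() - 1);
        }
    }

    /** New order: recurse first, then emit the current node if it accepts. */
    private static void recurseFirst(Node node, StringBuilder path, int limit, List<String> out) {
        for (Map.Entry<Character, Node> e : node.next.entrySet()) {
            path.append(e.getKey());
            recurseFirst(e.getValue(), path, limit, out);
            path.setLength(path.length() - 1);
        }
        if (node.accept && out.size() < limit) {
            out.add(path.toString());
        }
    }

    public static List<String> oldOrder(int limit, String... words) {
        List<String> out = new ArrayList<>();
        emitFirst(build(words), new StringBuilder(), limit, out);
        return out;
    }

    public static List<String> newOrder(int limit, String... words) {
        List<String> out = new ArrayList<>();
        recurseFirst(build(words), new StringBuilder(), limit, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println("old, limit 1: " + oldOrder(1, "a", "ab"));
        System.out.println("new, limit 1: " + newOrder(1, "a", "ab"));
    }
}
```

For the words "a" and "ab" with a limit of 1, the old order returns "a" and
the new order returns "ab". Without a limit, both return the same set of
strings, only in a different order.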