[ 
https://issues.apache.org/jira/browse/LUCENE-6365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612534#comment-14612534
 ] 

Markus Heiden commented on LUCENE-6365:
---------------------------------------

I adapted my patch to the latest changes in trunk. 

I think reuse of the iterator is a core part of this patch. I tried to rework 
the iterator's API so that the reuse case and the no-reuse case are handled in 
a similar way. I hope you like it now (at least a bit). Lucene already does 
this kind of reuse elsewhere, e.g. see Transition.
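
To illustrate what I mean by handling both cases in a similar way, here is a 
minimal sketch. It is not the API from the patch: ReusingIterator, ReuseDemo 
and their methods are hypothetical names, and a StringBuilder stands in for 
whatever buffer type the real iterator returns. The only assumption is that 
next() hands out one shared buffer and returns null when exhausted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical iterator (not the patch's API): next() returns a shared buffer. */
final class ReusingIterator {
    private final String[] source;
    // Single scratch buffer that is overwritten on every call to next().
    private final StringBuilder scratch = new StringBuilder();
    private int pos = 0;

    ReusingIterator(String... source) {
        this.source = source;
    }

    /** Returns the next value in the shared buffer, or null when exhausted. */
    StringBuilder next() {
        if (pos == source.length) {
            return null;
        }
        scratch.setLength(0);
        scratch.append(source[pos++]);
        return scratch;
    }
}

final class ReuseDemo {
    /** Reuse case: each value is consumed before next() is called again. */
    static int totalLength(ReusingIterator it) {
        int total = 0;
        for (StringBuilder s = it.next(); s != null; s = it.next()) {
            total += s.length();
        }
        return total;
    }

    /** No-reuse case: the same loop, but the caller copies what it keeps. */
    static List<String> collect(ReusingIterator it) {
        List<String> out = new ArrayList<>();
        for (StringBuilder s = it.next(); s != null; s = it.next()) {
            out.add(s.toString()); // copy, because the buffer will be overwritten
        }
        return out;
    }
}
```

Both callers use the same loop; the only difference is whether the caller 
copies the value before asking for the next one.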

FuzzyCompletionQuery was added recently and relies on the old, large set of 
finite strings. I am not sure how to rework it. It currently still uses the 
set. It might be better to use the iterator inside FuzzyCompletionWeight, but 
that would mean recomputing the finite strings over and over again. What do 
you think?

BTW, topoSortStates() is implemented identically in AnalyzingSuggester and 
CompletionTokenStream. Maybe it should be moved to a single place, perhaps 
Operations?

> Optimized iteration of finite strings
> -------------------------------------
>
>                 Key: LUCENE-6365
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6365
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/other
>    Affects Versions: 5.0
>            Reporter: Markus Heiden
>            Priority: Minor
>              Labels: patch, performance
>         Attachments: FiniteStrings_reuse.patch
>
>
> Replaced Operations.getFiniteStrings() with an optimized FiniteStringIterator.
> Benefits:
> * Avoids a huge hash set of finite strings.
> * Avoids massive object/array creation during processing.
> "Downside":
> * Iteration order changed, so when iterating with a limit, the result may 
>   differ slightly. Old: emit current node, if accept / recurse. New: recurse / 
>   emit current node, if accept.
> The old method Operations.getFiniteStrings() still exists, because it eases 
> the tests. It is now implemented using the new FiniteStringIterator.
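
The effect of the changed iteration order under a limit can be sketched on a 
toy automaton. This is only an illustration under stated assumptions: 
OrderDemo, State and the methods below are hypothetical stand-ins, not 
Lucene's Automaton or the code in the patch, and the way the limit is checked 
is my own simplification.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OrderDemo {
    /** Toy acyclic automaton state; a stand-in, not Lucene's Automaton. */
    static final class State {
        final boolean accept;
        final Map<Character, State> next = new LinkedHashMap<>();

        State(boolean accept) {
            this.accept = accept;
        }
    }

    /** Old order: emit current node if accept, then recurse. */
    static void emitFirst(State s, String prefix, int limit, List<String> out) {
        if (out.size() >= limit) {
            return;
        }
        if (s.accept) {
            out.add(prefix);
        }
        for (Map.Entry<Character, State> e : s.next.entrySet()) {
            emitFirst(e.getValue(), prefix + e.getKey(), limit, out);
        }
    }

    /** New order: recurse, then emit current node if accept. */
    static void recurseFirst(State s, String prefix, int limit, List<String> out) {
        for (Map.Entry<Character, State> e : s.next.entrySet()) {
            recurseFirst(e.getValue(), prefix + e.getKey(), limit, out);
        }
        if (s.accept && out.size() < limit) {
            out.add(prefix);
        }
    }

    /** Builds an automaton accepting exactly "a", "ab" and "ac". */
    static State sample() {
        State root = new State(false);
        State a = new State(true);
        a.next.put('b', new State(true));
        a.next.put('c', new State(true));
        root.next.put('a', a);
        return root;
    }

    public static void main(String[] args) {
        List<String> oldOrder = new ArrayList<>();
        List<String> newOrder = new ArrayList<>();
        emitFirst(sample(), "", 2, oldOrder);
        recurseFirst(sample(), "", 2, newOrder);
        System.out.println(oldOrder); // [a, ab]
        System.out.println(newOrder); // [ab, ac]
    }
}
```

Without a limit both orders produce the same three strings, only in a 
different sequence. With a limit of 2 the old order keeps the shorter prefix 
"a", while the new order fills the limit with the longer strings first.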



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
