[
https://issues.apache.org/jira/browse/LUCENE-8332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490210#comment-16490210
]
David Smiley commented on LUCENE-8332:
--------------------------------------
{quote}It's a TokenStream because it consumes the entire input in the first
call to incrementToken (which invokes input.reset(), input.end(), input.close())
{quote}
Is that policy documented anywhere? FingerprintFilter behaves similarly yet is
a TokenFilter. So I'm thinking that ConcatenateGraphFilter has a nice ring to
it ;)
I'm working on it and can post a patch or GitHub PR when done, which I expect
to be sometime Friday.
> New ConcatenateGraphTokenStream (move/rename CompletionTokenStream)
> -------------------------------------------------------------------
>
> Key: LUCENE-8332
> URL: https://issues.apache.org/jira/browse/LUCENE-8332
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Major
>
> Let's move the CompletionTokenStream from the suggest module into the analysis
> module and rename it ConcatenateGraphTokenStream. See the comments in
> LUCENE-8323 leading to this idea. Such a TokenStream (or TokenFilter?) has
> several uses:
> * for the suggest module
> * by the SolrTextTagger for NER/ERD use cases – SOLR-12376
> * for doing complete match search efficiently
> It will need a factory – a TokenFilterFactory, even though we don't have a
> TokenFilter-based subclass of TokenStream.
> There appears to be no back-compat concern with it suddenly disappearing from
> the suggest module: it's marked experimental, and it seems to be public now
> perhaps only due to some technicality (it has package-level constructors).