Hi Roxana,
I think you can use TeeSinkTokenFilter (https://lucene.apache.org/core/5_4_0/analyzers-common/org/apache/lucene/analysis/sinks/TeeSinkTokenFilter.html), as suggested earlier.
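
To make the idea concrete, here is a minimal plain-Java sketch of the tee/sink pattern: run the analysis once, cache the typed tokens, then replay them through a separate type filter per field. It deliberately uses no Lucene classes. `Token`, `analyzeOnce`, `keepTypes` and the "term/TYPE" input format are hypothetical names and conventions made up for illustration; with Lucene the tee would wrap your shared analysis chain and each sink would feed a type-based filter, so please check the Javadoc above for the exact calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java illustration only; none of these names are Lucene API.
final class Token {
    final String term;
    final String type; // stands in for the token type attribute

    Token(String term, String type) {
        this.term = term;
        this.type = type;
    }
}

final class SharedAnalysisSketch {
    // Counts how often the (expensive) shared analysis actually runs.
    static int analysisRuns = 0;

    // Hypothetical analyzer: expects "term/TYPE" pairs separated by whitespace.
    static List<Token> analyzeOnce(String taggedText) {
        analysisRuns++;
        List<Token> tokens = new ArrayList<>();
        for (String part : taggedText.trim().split("\\s+")) {
            if (part.isEmpty()) {
                continue; // blank input yields no tokens
            }
            int slash = part.lastIndexOf('/');
            if (slash < 0) {
                tokens.add(new Token(part, "word")); // untagged term gets a default type
            } else {
                tokens.add(new Token(part.substring(0, slash), part.substring(slash + 1)));
            }
        }
        return tokens;
    }

    // Replays the cached tokens and keeps only the requested types.
    static List<String> keepTypes(List<Token> cached, String... types) {
        Set<String> wanted = new HashSet<>(Arrays.asList(types));
        List<String> terms = new ArrayList<>();
        for (Token t : cached) {
            if (wanted.contains(t.type)) {
                terms.add(t.term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // One analysis pass feeds both derived fields.
        List<Token> tokens = analyzeOnce("quick/ADJ fox/NOUN jumps/VERB lazy/ADJ dog/NOUN");
        System.out.println("verbs      = " + keepTypes(tokens, "VERB"));
        System.out.println("adjectives = " + keepTypes(tokens, "ADJ"));
        System.out.println("analysis runs = " + analysisRuns);
    }
}
```

The point of the sketch is that the *verbs* and *adjectives* values are both derived from the same cached token list, so the shared chain runs once per document rather than once per field.
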

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 22 Nov 2017, at 11:43, Roxana Danger <[email protected]> wrote:
>
> Hi Emir,
> Many thanks for your reply.
> The UpdateProcessor can do this work, but is analyzer.reusableTokenStream
> <https://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/analysis/Analyzer.html#reusableTokenStream(java.lang.String, java.io.Reader)>
> the way to obtain a previous generated tokenstream? is it
> guarantee to get access to the token stream and not reconstruct it?
> Thanks,
> Roxana
>
>
> On Wed, Nov 22, 2017 at 10:26 AM, Emir Arnautović <[email protected]> wrote:
>
>> Hi Roxana,
>> I don’t think that it is possible. In some cases (seems like yours is good
>> fit) you could create custom update request processor that would do the
>> shared analysis (you can have it defined in schema) and after analysis use
>> those tokens to create new values for those two fields and remove source
>> value (or flag it as ignored in schema).
>>
>> HTH,
>> Emir
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>>
>>> On 22 Nov 2017, at 11:09, Roxana Danger <[email protected]> wrote:
>>>
>>> Hello all,
>>>
>>> I would like to reuse the tokenstream generated for one field, to create a
>>> new tokenstream (adding a few filters to the available tokenstream), for
>>> another field without the need of executing again the whole analysis.
>>>
>>> The particular application is:
>>> - I have field *tokens* that uses an analyzer that generate the tokens (and
>>> maintains the token type attributes)
>>> - I would like to have another two new fields: *verbs* and *adjectives*.
>>> These should reuse the tokenstream generated for the field *tokens* and
>>> filter the verbs and adjectives for the respective fields.
>>>
>>> Is this feasible? How should it be implemented?
>>>
>>> Many thanks.
>>
>>
