Re: read more tokens during analysis

Ahmet Arslan Fri, 12 Feb 2010 09:09:24 -0800

> I want to treat the current word
> and the next one as a single term.
> 
> When analyzing "Arun Kumar",
> I want my analyzer to treat "Arun" and "Arun Kumar"
> as synonyms.
> 
> In the tokenStream method, how do we read the next token,
> "Kumar"?
> I am looking at the setPositionIncrements method for
> treating them as synonyms, but I don't understand how to
> implement look-ahead in the analyzer.


So you want to implement a synonym filter that takes a list of custom
synonyms?
If so, why not use Solr's SynonymFilterFactory [1], which does this
automatically? It can handle multi-word synonyms such as "Arun" and
"Arun Kumar".
I can share the code to integrate it into Lucene if you want.

[1]http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
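
If you would rather see the look-ahead idea itself, here is a rough sketch in
plain Java. It does not use the Lucene API at all: the Tok class and the
expand method are hypothetical names made up for illustration, and the input
is assumed to be already split into words. For every word it emits the word
at a new position (increment 1), then peeks at the next word and, if there is
one, emits the two-word term at the same position (increment 0), which is how
stacked synonyms are usually represented.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a stand-in for an analysis chain, not a Lucene class.
class LookAheadSketch {

    // Hypothetical token type: a term plus its position increment.
    static final class Tok {
        final String term;
        final int positionIncrement; // 1 = next position, 0 = same position as the previous token

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    // Emits each word, followed by "word nextWord" stacked at the same position.
    static List<Tok> expand(List<String> words) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String current = words.get(i);
            out.add(new Tok(current, 1));
            // Look ahead by one word; the last word has no successor.
            if (i + 1 < words.size()) {
                out.add(new Tok(current + " " + words.get(i + 1), 0));
            }
        }
        return out;
    }
}
```

Calling LookAheadSketch.expand(Arrays.asList("Arun", "Kumar")) yields
Arun(+1), Arun Kumar(+0), Kumar(+1). In a real token filter the same effect
needs buffering, since the stream hands out one token at a time: you read the
next token, remember it, and return it on a later call.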




