Hi all,

How can I get a token's position from the TokenStream / Tokenizer / Analyzer? I know that there is a PositionIncrementAttribute and a PositionLengthAttribute, but is there an easy way to get the absolute token position, or do I need to implement my own attribute based on one of the attributes mentioned above?

The reason I need it is that I wrote a ShingleFilter implementation that breaks shingles at punctuation, so the tokens [token number one, word two] produce the shingles [ "token number", "number one", "word two" ] but not [ "one word" ], because of the comma. I want it to break shingles at position increment gaps as well.
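
To make the goal concrete, here is a minimal plain-Java sketch of the behaviour I am after. It is not real Lucene code: the class and method names (PositionSketch, absolutePositions, bigramShingles) are made up for illustration, and the increments array stands in for the values a filter would read from PositionIncrementAttribute.getPositionIncrement() on each token. It assumes the usual convention that the position counter starts at -1, so a first token with increment 1 lands at position 0, and it assumes a stop word was removed between "one" and "word", leaving an increment of 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; no Lucene classes are used here.
class PositionSketch {

    // Absolute position is the running sum of position increments,
    // starting from -1 so that the first token (increment 1) is at 0.
    static int[] absolutePositions(int[] increments) {
        int[] positions = new int[increments.length];
        int position = -1;
        for (int i = 0; i < increments.length; i++) {
            position += increments[i];
            positions[i] = position;
        }
        return positions;
    }

    // Builds two-word shingles, but only joins a token to its predecessor
    // when the increment is exactly 1. An increment above 1 is a gap and
    // breaks the shingle; stacked tokens (increment 0) are also skipped
    // in this simplified sketch.
    static List<String> bigramShingles(String[] terms, int[] increments) {
        List<String> shingles = new ArrayList<>();
        for (int i = 1; i < terms.length; i++) {
            if (increments[i] == 1) {
                shingles.add(terms[i - 1] + " " + terms[i]);
            }
        }
        return shingles;
    }

    public static void main(String[] args) {
        String[] terms = {"token", "number", "one", "word", "two"};
        int[] increments = {1, 1, 1, 2, 1}; // assumed gap before "word"

        // prints [0, 1, 2, 4, 5]
        System.out.println(Arrays.toString(absolutePositions(increments)));
        // prints [token number, number one, word two]
        System.out.println(bigramShingles(terms, increments));
    }
}
```

In an actual TokenFilter I would expect the same accumulation to happen inside incrementToken(), with the counter cleared in reset(), but I have not verified that this is the recommended approach.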

Thanks,


Igal

