Hi all,

How can I get a token's position from the TokenStream / Tokenizer /
Analyzer? I know there is a PositionIncrementAttribute and a
PositionLengthAttribute, but is there an easy way to get the token's
absolute position, or do I need to implement my own attribute based on
one of the attributes mentioned above?
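
To make clear what I mean by "position", here is a minimal sketch in plain Java, with no Lucene classes involved. The class and method names (`TokenPositions`, `absolutePositions`) are made up for illustration. It assumes the absolute position is simply the running sum of the position increments, with counting starting at -1 so that a first token with increment 1 lands at position 0.

```java
import java.util.Arrays;

class TokenPositions {
    // Absolute position of each token = running sum of its position increments.
    // Assumption: the counter starts at -1, so a first token with
    // increment 1 ends up at position 0.
    static int[] absolutePositions(int[] positionIncrements) {
        int[] positions = new int[positionIncrements.length];
        int position = -1;
        for (int i = 0; i < positionIncrements.length; i++) {
            position += positionIncrements[i];
            positions[i] = position;
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical increments for five tokens; the fourth token has an
        // increment of 2, i.e. there is a gap in front of it.
        int[] increments = {1, 1, 1, 2, 1};
        // prints [0, 1, 2, 4, 5]
        System.out.println(Arrays.toString(absolutePositions(increments)));
    }
}
```

Inside a real filter, I imagine the same running sum would be kept in a field and updated from the increment attribute on every incrementToken() call, but I have not verified that this is the recommended way.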

The reason I need it is that I wrote an implementation of ShingleFilter
that breaks shingles at punctuation, so the text [token number one, word
two] produces the shingles [ "token number", "number one", "word two" ]
but not [ "one word" ], because of the comma. I want it to break
shingles at position increment gaps as well.
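
The behaviour I am after could be sketched roughly as follows. This is plain Java, not my actual filter and not Lucene code. `GapAwareShingles` and `bigrams` are hypothetical names, and the gap in front of "word" is simulated with a position increment of 2. The sketch only joins two neighbouring tokens when the second one has an increment of exactly 1; increments of 0 (stacked tokens) are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class GapAwareShingles {
    // Builds two-word shingles, but never joins two tokens when the second
    // one does not directly follow the first (position increment != 1).
    static List<String> bigrams(String[] terms, int[] positionIncrements) {
        List<String> shingles = new ArrayList<>();
        for (int i = 1; i < terms.length; i++) {
            if (positionIncrements[i] == 1) {
                shingles.add(terms[i - 1] + " " + terms[i]);
            }
        }
        return shingles;
    }

    public static void main(String[] args) {
        String[] terms = {"token", "number", "one", "word", "two"};
        // Assumed increments: a gap (increment 2) in front of "word".
        int[] increments = {1, 1, 1, 2, 1};
        // prints [token number, number one, word two]
        System.out.println(bigrams(terms, increments));
    }
}
```

Checking the increment alone is enough for breaking at gaps; the absolute position would only be needed if the filter had to compare tokens that are not adjacent in the stream.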

Thanks,
Igal