Hi Erick,
Thanks for the reply. The use case I have is this:
Say you have a synonym expansion like this:
ac -> air conditioning
And to keep it simple, a document where the first term is ac. When
analyzing the document I currently create a token stream that looks
something like this for the
Erick Erickson wrote:
Offhand, I expect this will affect span queries, phrase
queries, and who knows what else? Maybe scoring?
I believe that the offsets are just metadata stored with the term
vectors, used by the highlighter etc. Phrase and span queries use term
position in the stream (p
Is this a theoretical question or is there a use-case you're trying
to support? If the latter, a statement of the problem you're trying
to solve would be helpful.
If the former, setting all your start offsets to 0 seems wrong. You're
essentially saying that all tokens are at the beginning of the d
Hi,
I have a TokenStream that inserts synonym tokens into the stream when
matched. One thing I am wondering about is what is the effect of the
startOffset and endOffset. I have something like this:
Token synonymToken = new Token(originalToken.startOffset(),
originalToken.endOffset(), "SYN
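
To make the question in this thread concrete, here is a minimal, self-contained sketch of the idea being discussed: a synonym token that reuses the original token's start and end offsets and sits at the same position. It deliberately does not use Lucene classes. Tok, SynonymSketch, injectSynonym, positions and covered are hypothetical names invented for illustration, the synonym "aircon" is made up, and the sketch assumes a single-term synonym rather than the multi-word "air conditioning" case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token; NOT a Lucene class.
final class Tok {
    final String term;
    final int startOffset;   // character offset into the source text (inclusive)
    final int endOffset;     // character offset into the source text (exclusive)
    final int posIncrement;  // distance from the previous token's position; 0 = same position

    Tok(String term, int startOffset, int endOffset, int posIncrement) {
        this.term = term;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
        this.posIncrement = posIncrement;
    }
}

final class SynonymSketch {
    // Copies the input tokens and, after each token whose term equals `match`,
    // adds a synonym token with the same offsets and a position increment of 0.
    static List<Tok> injectSynonym(List<Tok> in, String match, String synonym) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            out.add(t);
            if (t.term.equals(match)) {
                out.add(new Tok(synonym, t.startOffset, t.endOffset, 0));
            }
        }
        return out;
    }

    // Turns position increments into absolute positions, which is what a
    // phrase-style matcher would compare. Offsets play no part here.
    static int[] positions(List<Tok> tokens) {
        int[] pos = new int[tokens.size()];
        int current = -1;
        for (int i = 0; i < tokens.size(); i++) {
            current += tokens.get(i).posIncrement;
            pos[i] = current;
        }
        return pos;
    }

    // Returns the slice of the source text a token's offsets point at,
    // which is what a highlighter-style consumer would mark up.
    static String covered(String text, Tok t) {
        return text.substring(t.startOffset, t.endOffset);
    }
}
```

In this sketch, for the text "ac unit" the injected synonym shares position 0 with "ac", and its offsets point back at the two characters "ac" in the source text. Setting every start offset to 0 instead would leave the positions unchanged but make covered() return text from the start of the document.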