I thought that went to the "index" of the token. I may not understand it completely, but this is how I currently view the TokenStream.
For example, if my text was the following:

This is an Example

"This" has index 1, "is" has index 2, "an" has index 3, and "Example" has index 4.

What I have is the actual "character position" in the original text. "This" is characters 0-3, "is" is characters 5-6, "an" is characters 8-9, and "Example" is characters 11-17. I know that given token 4 (Example) I can get the startOffset and endOffset (11 and 18, since endOffset is one past the last character).

What I'm wondering is, given a character offset, can I get a token index? (I.e., given character offset 12, it would return 4, because "Example" is the token closest to character 12.)

--JP

On 7/6/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: I never got a response to this and thought maybe I was too wordy.
:
: I'm wondering if there's a way where, given a position in the original text,
: you can retrieve the token index that is nearest to that position using the
: StandardToken/StandardTokenizer classes?

I may not be understanding the question, but wouldn't that just be...

    TokenStream s = getTokenStreamForOriginalText();
    Token t = null;
    for (int i = 0; i < thePositionYouKnow; i++) {
        t = s.next();
    }
    return t;

?

-Hoss
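
For the offset-to-index lookup asked about at the top of this message, here is a minimal sketch of one possible approach. It does not use Lucene at all: OffsetLookup, Span, tokenize, and tokenIndexAt are hypothetical names made up for illustration, and the whitespace splitting is only a stand-in for a real tokenizer. It assumes start offsets are inclusive, end offsets are exclusive, and token indexes are 1-based as in the example text.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetLookup {

    /** Hypothetical stand-in for a token: start is inclusive, end is exclusive. */
    public static final class Span {
        public final int start;
        public final int end;

        public Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Splits on whitespace and records the character offsets of each token. */
    public static List<Span> tokenize(String text) {
        List<Span> spans = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace between tokens.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume the token itself.
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start) {
                spans.add(new Span(start, i));
            }
        }
        return spans;
    }

    /**
     * Returns the 1-based index of the token nearest to charOffset, or -1 if
     * there are no tokens. An offset inside a token has distance 0; ties go
     * to the earlier token.
     */
    public static int tokenIndexAt(List<Span> spans, int charOffset) {
        int best = -1;
        int bestDistance = Integer.MAX_VALUE;
        for (int i = 0; i < spans.size(); i++) {
            Span s = spans.get(i);
            int distance;
            if (charOffset < s.start) {
                distance = s.start - charOffset;      // offset is before the token
            } else if (charOffset >= s.end) {
                distance = charOffset - (s.end - 1);  // offset is after the token
            } else {
                distance = 0;                         // offset is inside the token
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i + 1;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Span> spans = tokenize("This is an Example");
        // Offset 12 falls inside "Example", the fourth token.
        System.out.println(tokenIndexAt(spans, 12)); // prints 4
    }
}
```

With a real TokenStream, the same list could presumably be built by recording each token's startOffset and endOffset while iterating; that part is an assumption and is not shown here. The linear scan keeps the sketch short; since tokens arrive in offset order, a binary search over the start offsets would be the natural next step for long texts.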