I thought that went to the "index" of the token.  I may not understand it
completely but this is how I currently view the TokenStream

For example if my text was the following:

This is an Example

This is index of 1, is has index 2, an has index 3 Example has index 4.
What I have is the actual "character position" in the original text.  "This"
is characters 0-3, "is" is characters 5-6, "an" is characters 8-9, and
"Example" is characters 11-17.  I know that given Token 4 (Example) I can
get the startOffset and endOffset (11, and 17).  What I'm wondering is given
character offset can I get a tokenIndex.  (I.E.  given character offset 12,
it would return 3, because Example is the closest token that starts at
character 12).

--JP

On 7/6/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:

: I never got a response to this and thought maybe I was too wordy.
:
: I'm wondering if there's a way where given a position in the original
text
: you can retrieve the token index that is nearest to that position using
the
: StandardToken/StandardTokenizer classes?

i may not be understanding the question, but wouldn't that just be...

  TokenStream s = getTokenStreamForOrriginalText()
  Token t;
  for (i=0; i<thePositionYouKnow; i++) {
    t = s.next();
  }
  return t;

?


-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Reply via email to