This is not coupled because: termLength() is the number of chars in the term buffer, where the offsets give the offsets in the orginal char stream. If you use a CharFilter to e.g. remove chars, the termLength will get shorter, but the offset are still the original ones. Also both things are indexed in different ways, the termLength and offsets have no relation and must (as said before) not even follow a contract like end-start=length.
----- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -----Original Message----- > From: Babak Farhang [mailto:farh...@gmail.com] > Sent: Friday, November 13, 2009 11:50 PM > To: java-user@lucene.apache.org > Subject: Redundant fields Token class? > > I'm writing a TokenFilter and am confused about why class Token has > both an *endOffset* and a *termLength* field. It would appear that > the following invariant should always hold for a Token instance: > > termLength() == endOffset() - startOffset() > > If so, then > > 1) Why 2 fields, instead of 1? > 2) Why isn't the invariant enforced in the class? > > -Babak > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org