Okay, Koji, hopefully I'll be more luckily suggesting this this time. Have you tried http://issues.apache.org/jira/browse/LUCENE-1448 yet? I am not sure if its in an applyable state, but I hope that covers your issue.
On Fri, Jan 16, 2009 at 7:15 PM, Koji Sekiguchi <k...@r.email.ne.jp> wrote: > Hello, > > I'm writing a highlighter by using term offsets info (yes, I borrowed > the idea > of LUCENE-644). In my highlighter, I'm seeing unexpected term offsets info > when getting multi-valued field. > > For example, if I indexed [" aaaa"," bbb "] (multi-valued), I got term info > bbb(7,10). This is expected result. But if I indexed [" aaa "," bbb "] > (note that using " aaa " instead of " aaaa"), I got term info bbb(6,9) > which > is unexpected. I would like to get same offset info for bbb because they > are same length of field values. > > Please use the following program to see the problem I'm seeing. I'm > using trunk: > > public static void main(String[] args) throws Exception { > // create an index > Directory dir = new RAMDirectory(); > Analyzer analyzer = new WhitespaceAnalyzer(); > IndexWriter writer = new IndexWriter( dir, analyzer, true, > MaxFieldLength.LIMITED ); > Document doc = new Document(); > doc.add( new Field( "f", " aaa ", Store.YES, Index.ANALYZED, > TermVector.WITH_OFFSETS ) ); > //doc.add( new Field( "f", " aaaa", Store.YES, Index.ANALYZED, > TermVector.WITH_OFFSETS ) ); > doc.add( new Field( "f", " bbb ", Store.YES, Index.ANALYZED, > TermVector.WITH_OFFSETS ) ); > writer.addDocument( doc ); > writer.close(); > > // print the offsets > IndexReader reader = IndexReader.open( dir ); > TermPositionVector tpv = (TermPositionVector)reader.getTermFreqVector( > 0, "f" ); > for( int i = 0; i < tpv.getTerms().length; i++ ){ > System.out.print( "term = \"" + tpv.getTerms()[i] + "\"" ); > TermVectorOffsetInfo[] tvois = tpv.getOffsets( i ); > for( TermVectorOffsetInfo tvoi : tvois ){ > System.out.println( "(" + tvoi.getStartOffset() + "," + > tvoi.getEndOffset() + ")" ); > } > } > reader.close(); > } > > regards, > > Koji > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > >