Okay, Koji, hopefully I'll have better luck with my suggestion this time.

Have you tried http://issues.apache.org/jira/browse/LUCENE-1448 yet? I am
not sure whether the patch is in a state where it can be applied, but I hope
it covers your issue.
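
For what it's worth, the numbers you report look consistent with each later
value's offsets being based on the end of the previous value's last token,
rather than on the previous value's full length. I have not checked this
against the indexing code, so treat the following as a toy model only. It does
not use Lucene at all; OffsetSketch and offsets() are made-up names, and the
"last token end + 1" rule is my assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetSketch {

    /**
     * Splits each value on whitespace and returns "term(start,end)" strings,
     * with offsets counted across all values of the field.
     * ASSUMPTION (not taken from Lucene's source): the base offset for the
     * next value is the end offset of the previous value's last token, plus 1.
     */
    public static List<String> offsets(String... values) {
        List<String> out = new ArrayList<>();
        int base = 0;
        for (String value : values) {
            int lastEnd = 0; // end offset of the last token within this value
            int i = 0;
            while (i < value.length()) {
                // skip leading whitespace
                while (i < value.length() && Character.isWhitespace(value.charAt(i))) {
                    i++;
                }
                int start = i;
                // consume one token
                while (i < value.length() && !Character.isWhitespace(value.charAt(i))) {
                    i++;
                }
                if (i > start) {
                    out.add(value.substring(start, i)
                            + "(" + (base + start) + "," + (base + i) + ")");
                    lastEnd = i;
                }
            }
            // trailing whitespace after the last token is NOT counted here
            base += lastEnd + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(offsets(" aaaa", " bbb ")); // [aaaa(1,5), bbb(7,10)]
        System.out.println(offsets(" aaa ", " bbb ")); // [aaa(1,4), bbb(6,9)]
    }
}
```

Under that assumption the trailing space in " aaa " is dropped from the running
offset, which would explain why bbb moves from (7,10) to (6,9) even though both
first values are five characters long.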

On Fri, Jan 16, 2009 at 7:15 PM, Koji Sekiguchi <k...@r.email.ne.jp> wrote:

> Hello,
>
> I'm writing a highlighter that uses term offset info (yes, I borrowed the
> idea from LUCENE-644). In my highlighter, I'm seeing unexpected term offsets
> for a multi-valued field.
>
> For example, if I index [" aaaa"," bbb "] (multi-valued), I get term info
> bbb(7,10). This is the expected result. But if I index [" aaa "," bbb "]
> (note " aaa " instead of " aaaa"), I get term info bbb(6,9), which is
> unexpected. I would like to get the same offsets for bbb in both cases,
> because the field values have the same length.
>
> Please use the following program to see the problem I'm seeing. I'm
> using trunk:
>
> public static void main(String[] args) throws Exception {
>   // create an index
>   Directory dir = new RAMDirectory();
>   Analyzer analyzer = new WhitespaceAnalyzer();
>   IndexWriter writer = new IndexWriter( dir, analyzer, true, MaxFieldLength.LIMITED );
>   Document doc = new Document();
>   doc.add( new Field( "f", " aaa ", Store.YES, Index.ANALYZED, TermVector.WITH_OFFSETS ) );
>   //doc.add( new Field( "f", " aaaa", Store.YES, Index.ANALYZED, TermVector.WITH_OFFSETS ) );
>   doc.add( new Field( "f", " bbb ", Store.YES, Index.ANALYZED, TermVector.WITH_OFFSETS ) );
>   writer.addDocument( doc );
>   writer.close();
>
>   // print the offsets
>   IndexReader reader = IndexReader.open( dir );
>   TermPositionVector tpv = (TermPositionVector)reader.getTermFreqVector( 0, "f" );
>   for( int i = 0; i < tpv.getTerms().length; i++ ){
>     System.out.print( "term = \"" + tpv.getTerms()[i] + "\"" );
>     TermVectorOffsetInfo[] tvois = tpv.getOffsets( i );
>     for( TermVectorOffsetInfo tvoi : tvois ){
>       System.out.println( "(" + tvoi.getStartOffset() + "," + tvoi.getEndOffset() + ")" );
>     }
>   }
>   reader.close();
> }
>
> regards,
>
> Koji
