Hello,
I am trying to work through term positions and how to get them from a collection of hits. Does indexing a field with TermVector.WITH_POSITIONS_OFFSETS store the start/end character offsets of each term in the source text? (I _think_ it does.)

If so, where would I start in making that information accessible in a "result set"? I believe it would mean extending a Query, a Scorer, a Hit, and/or a Weight object. I want to process ALL hits, so I think I will need to implement a HitCollector.

As an example of what I want, if I were looking for the offset position of "brown" in a properly indexed field containing "the lazy brown fox", I would like to get:
start==9
end==14 (counting characters from 0, with the end exclusive; assuming my counting is right)
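
To pin down what I mean by start/end, here is a plain-Java sketch that computes character offsets for whitespace-separated tokens. It does not use any Lucene API: TokenOffsets, Token and tokenize are names I made up for illustration, and it assumes whitespace-only tokenization with offsets counted from 0 and an exclusive end. Under those assumptions "brown" spans 9 to 14; the same span counted from 1 is 10 to 15.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates start/end character offsets.
class TokenOffsets {

    /** One token plus its character offsets: start inclusive, end exclusive, both 0-based. */
    static final class Token {
        final String text;
        final int start;
        final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Splits on whitespace and records where each token sits in the source string. */
    static List<Token> tokenize(String source) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        int n = source.length();
        while (i < n) {
            // Skip any whitespace before the next token.
            while (i < n && Character.isWhitespace(source.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume the token itself.
            while (i < n && !Character.isWhitespace(source.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(new Token(source.substring(start, i), start, i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("the lazy brown fox")) {
            System.out.println(t.text + " start==" + t.start + " end==" + t.end);
        }
    }
}
```

Running main prints one line per token; the line for "brown" is `brown start==9 end==14`. Whether the indexed offsets follow exactly this convention depends on the analyzer used at index time, so treat it only as a statement of the output I am after.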

Based on Paul Elschot's previous response to a similar question I had (which I am still working on), I _think_ I need to extend something like the ExactPhraseScorer. While debugging with my IDE (Eclipse) I can see that the weight object in the scorer contains a reference to the query. The query contains the fields:
   Vector positions (just has ints of term positions in phrase?)
   Vector terms (vector of Term, just field name and field contents?)

The weight also seems to have an array of TermPositions, which have SegmentTermPositions. I thought this was what I wanted, but I don't see the proper start/end fields, or anything which seems to be on the right track.

Can anyone point me in the right direction?
Thanks,

Sean


