Re: what is the offsets and payload in DocsAndPositionsEnum for ??

2012-11-18 Thread wgggfiy
Thanks, Mike. About the 3rd question: is "encode them all into the payload" better than "a new postings format with the codec"? I mean replacing the original posting item (position, startOffset, endOffset, payload) with my own inverted item, such as class TestPostingItem { int termId; l

Re: what is the offsets and payload in DocsAndPositionsEnum for ??

2012-11-18 Thread Michael McCandless
On Sun, Nov 18, 2012 at 12:09 PM, wgggfiy wrote:
> I'm now studying Lucene 4.0.
> 1. What are the startOffset and endOffset for? Is there a code example?

These are set by the analyzer, to the start and end character offsets of this token (using the OffsetAttribute). The offsets are used for hig
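Since the question asks for a code example, here is a minimal plain-Java sketch of what start and end offsets are. It does not use Lucene's API: `OffsetSketch`, `Token`, `tokenize`, and `highlight` are hypothetical names, and the whitespace splitting only stands in for whatever a real analyzer does. It assumes, as in Lucene, that the end offset points one past the token's last character.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetSketch {
    /** A token plus the character range it came from (endOffset is exclusive). */
    static final class Token {
        final String text;
        final int startOffset;
        final int endOffset;

        Token(String text, int startOffset, int endOffset) {
            this.text = text;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Splits on whitespace and records where each token starts and ends. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            // Skip whitespace between tokens.
            while (i < input.length() && Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            int start = i;
            // Consume the token itself.
            while (i < input.length() && !Character.isWhitespace(input.charAt(i))) {
                i++;
            }
            if (i > start) {
                tokens.add(new Token(input.substring(start, i), start, i));
            }
        }
        return tokens;
    }

    /** Wraps every occurrence of term in <b>...</b>, located via the offsets. */
    static String highlight(String input, String term) {
        StringBuilder out = new StringBuilder();
        int last = 0;
        for (Token t : tokenize(input)) {
            if (t.text.equalsIgnoreCase(term)) {
                out.append(input, last, t.startOffset);
                out.append("<b>").append(input, t.startOffset, t.endOffset).append("</b>");
                last = t.endOffset;
            }
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: I am studying <b>lucene</b> 4.0
        System.out.println(highlight("I am studying lucene 4.0", "lucene"));
    }
}
```

The point of the sketch is that the offsets let you map a matched term back to the exact characters of the original text, even if the indexed term was lowercased or otherwise normalized.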

what is the offsets and payload in DocsAndPositionsEnum for ??

2012-11-18 Thread wgggfiy
I'm now studying Lucene 4.0.
1. What are the startOffset and endOffset for? Is there a code example?
2. What is a payload? I know just a little about it, and that it can be used for things like font weight or an enclosing XML tag.
3. I have an item like (lucene, 350, 450, 33.2, 2), where 350,450 is the o
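A payload is an opaque run of bytes stored alongside a position, so packing several per-position values into one payload comes down to a fixed byte layout. The sketch below is plain Java and only shows that packing step for an item shaped like the numeric part of the example above (two ints, a float, an int). The class name `PayloadSketch` and the field names `first`, `second`, `score`, and `extra` are hypothetical, since the message does not fully say what each value means, and attaching the bytes to a token in Lucene is left out.

```java
import java.nio.ByteBuffer;

class PayloadSketch {
    // Assumed layout: int, int, float, int = 16 bytes, big-endian (ByteBuffer default).
    static final int ENCODED_LENGTH = 4 + 4 + 4 + 4;

    /** Decoded view of the bytes; field names are placeholders. */
    static final class Item {
        final int first;
        final int second;
        final float score;
        final int extra;

        Item(int first, int second, float score, int extra) {
            this.first = first;
            this.second = second;
            this.score = score;
            this.extra = extra;
        }
    }

    /** Packs the four values into a fixed-length byte array. */
    static byte[] encode(int first, int second, float score, int extra) {
        ByteBuffer buf = ByteBuffer.allocate(ENCODED_LENGTH);
        buf.putInt(first).putInt(second).putFloat(score).putInt(extra);
        return buf.array();
    }

    /** Reads the values back in the same order they were written. */
    static Item decode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        return new Item(buf.getInt(), buf.getInt(), buf.getFloat(), buf.getInt());
    }

    public static void main(String[] args) {
        byte[] payload = encode(350, 450, 33.2f, 2);
        Item item = decode(payload);
        System.out.println(payload.length + " bytes, first=" + item.first
                + ", second=" + item.second);
    }
}
```

A fixed layout like this costs 16 bytes per position; a variable-length integer encoding would be smaller but needs more careful decoding.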