Re: Last/max term in Lucene 4.x

2011-02-21 Thread Jason Rutherglen
> Maybe we need a seekFloor in the TermsEnum? (What we have now is
> really seekCeil). But, what's the larger use case here..?

I opened an issue LUCENE-2930 to simply store the last/max term, however the seekFloor would work just as well. The use case is finding the last of the ordered IDs stor
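To make the seekCeil / seekFloor distinction concrete, here is a minimal sketch. It does not use the Lucene TermsEnum API: a java.util.TreeSet stands in for one field's sorted term dictionary, and the class name SeekSketch and its methods are hypothetical. Terms are plain ASCII strings here, so Java's String order matches the byte order the thread talks about.

```java
import java.util.TreeSet;

// Illustrative stand-in only: a TreeSet plays the role of one field's
// sorted term dictionary. None of these names are Lucene APIs.
class SeekSketch {
    private final TreeSet<String> terms = new TreeSet<>();

    SeekSketch(String... input) {
        for (String t : input) {
            terms.add(t);
        }
    }

    // Ceiling seek: smallest term >= target, or null if there is none.
    String seekCeil(String target) {
        return terms.ceiling(target);
    }

    // Floor seek (the proposed addition): largest term <= target, or null.
    String seekFloor(String target) {
        return terms.floor(target);
    }

    // The last/max term, which a floor seek with a large enough target
    // would also return.
    String lastTerm() {
        return terms.isEmpty() ? null : terms.last();
    }
}
```

With terms id0001, id0007 and id0042, a ceiling seek to id9999 finds nothing, while a floor seek to the same target lands on id0042, the last term. That is why either storing the max term or adding a floor seek would cover the "last ordered ID" case.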

Re: Last/max term in Lucene 4.x

2011-02-21 Thread Michael McCandless
On Sun, Feb 20, 2011 at 8:47 PM, Jason Rutherglen wrote:
>> Though, if you just want to get to the last term... VarGap's terms
>> index can quickly tell you the last indexed term, and from there you
>> can scan to the last term? (It'd be at most 32 (by default) scans).
>
> In VariableGapTermsInde
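The "jump to the last indexed term, then scan" idea can be sketched roughly as below. This is not the VariableGapTermsIndexReader code: it assumes a fixed gap of 32 terms between index entries for simplicity, and the class LastTermScan, the INTERVAL constant and the scans counter are all hypothetical names used only for illustration.

```java
// Illustrative sketch only: a sorted String[] stands in for the on-disk
// term dictionary, and every INTERVAL-th term is assumed to be indexed.
class LastTermScan {
    // Assumed fixed gap between indexed terms (the thread mentions 32 as
    // the default upper bound on scans).
    static final int INTERVAL = 32;

    // Number of forward scan steps taken by the most recent lastTerm() call.
    int scans;

    String lastTerm(String[] sortedTerms) {
        scans = 0;
        if (sortedTerms.length == 0) {
            return null;
        }
        // Position of the last indexed term: the largest multiple of
        // INTERVAL that is still a valid position.
        int pos = ((sortedTerms.length - 1) / INTERVAL) * INTERVAL;
        // Scan forward term by term until the end of the dictionary.
        while (pos + 1 < sortedTerms.length) {
            pos++;
            scans++;
        }
        return sortedTerms[pos];
    }
}
```

For 100 terms the last indexed position is 96, so three scan steps reach the final term. Under the fixed-gap assumption the scan count always stays below INTERVAL.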

Re: Last/max term in Lucene 4.x

2011-02-20 Thread Jason Rutherglen
> Though, if you just want to get to the last term... VarGap's terms
> index can quickly tell you the last indexed term, and from there you
> can scan to the last term? (It'd be at most 32 (by default) scans).

In VariableGapTermsIndexReader, IndexEnum doesn't support ord. How would I seek to the

Re: Last/max term in Lucene 4.x

2011-02-20 Thread Michael McCandless
On Sat, Feb 19, 2011 at 8:42 AM, Jason Rutherglen wrote:
>> But you have to
>> use a terms index impl that supports ord (eg FixedGap).
>
> Ok, and the VariableGap is the new standard because the FST is much
> more efficient as a terms index? Perhaps I'd need to create a codec
> (or patch the exi

Re: Last/max term in Lucene 4.x

2011-02-19 Thread Jason Rutherglen
> Instead of docFreq, did you mean numUniqueTerms?

Right.

> But you have to
> use a terms index impl that supports ord (eg FixedGap).

Ok, and the VariableGap is the new standard because the FST is much more efficient as a terms index? Perhaps I'd need to create a codec (or patch the existing) t

Re: Last/max term in Lucene 4.x

2011-02-19 Thread Michael McCandless
I don't quite understand your question Jason...

Seeking to the first term of the field just gets you the smallest term (in unsigned byte[] order, ie Unicode order if the byte[] is UTF8) across all docs.

Instead of docFreq, did you mean numUniqueTerms? Ie, you want to seek to the largest term for
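The "unsigned byte[] order" mentioned above can be illustrated with a small comparator. This is a sketch, not Lucene's own comparator: the class UnsignedByteOrder and its two helper methods are hypothetical names. It shows why bytes must be masked to 0..255 before comparing, and that for UTF-8 input the result follows Unicode code point order.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: compares two terms byte by byte, treating each
// byte as an unsigned value in the range 0..255.
class UnsignedByteOrder {
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            // Java bytes are signed; mask so that 0x80..0xFF sort after 0x7F.
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // All shared bytes are equal, so the shorter term sorts first.
        return a.length - b.length;
    }

    // Convenience helper: encode a String as UTF-8 bytes.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

A signed comparison would put any term starting with a non-ASCII UTF-8 byte before all ASCII terms. Note also that String.compareTo uses UTF-16 code unit order, which differs from code point order for characters outside the Basic Multilingual Plane, so comparing Java strings directly is not always equivalent.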