Re: API access to in-memory tii file (3.x not flex).

2010-11-10 Thread Jason Rutherglen
Yeah that's customizing the Lucene source. :) I should have gone into more detail, I will next time.
On Wed, Nov 10, 2010 at 2:10 PM, Michael McCandless wrote:
> Actually, the .tii file pre-flex (3.x) is nearly identical to the .tis
> file, just that it only contains every 128th term.
>
> If you

Re: API access to in-memory tii file (3.x not flex).

2010-11-10 Thread Michael McCandless
Actually, the .tii file pre-flex (3.x) is nearly identical to the .tis file, just that it only contains every 128th term. If you just make SegmentTermEnum public (or sneak your class into the oal.index package) then you can instantiate SegmentTermEnum, passing it an IndexInput opened on the .tii file
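As a rough illustration of the "every 128th term" relationship described above, here is a minimal, library-free Java sketch. It does not touch Lucene at all: `TermSampler` and `sampleEveryNth` are hypothetical names, the sorted term list stands in for the .tis contents, and the choice of which positions are kept (zero-based positions divisible by the interval) is an assumption for illustration, not a statement about the exact on-disk layout of the .tii file.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Lucene: keeps every Nth element of a
// sorted term stream, loosely mirroring how a sparse term index relates to
// the full term dictionary.
public class TermSampler {

    // Returns the terms whose zero-based position is a multiple of interval.
    // Which offsets a real .tii file keeps is an assumption here.
    static <T> List<T> sampleEveryNth(Iterator<T> terms, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        List<T> sample = new ArrayList<>();
        long position = 0;
        while (terms.hasNext()) {
            T term = terms.next();
            if (position % interval == 0) {
                sample.add(term);
            }
            position++;
        }
        return sample;
    }

    public static void main(String[] args) {
        // Stand-in for the full, sorted term dictionary.
        List<String> allTerms = new ArrayList<>();
        for (int i = 0; i < 300; i++) {
            allTerms.add(String.format("term%04d", i));
        }
        // 128 is the interval mentioned in the thread.
        System.out.println(sampleEveryNth(allTerms.iterator(), 128));
    }
}
```

Note that this sketch walks every term to build the sample, which is exactly the cost that reading the already-sampled .tii file would avoid.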

Re: API access to in-memory tii file (3.x not flex).

2010-11-10 Thread Jason Rutherglen
In a word, no. You'd need to customize the Lucene source to accomplish this.
On Wed, Nov 10, 2010 at 1:02 PM, Burton-West, Tom wrote:
> Hello all,
>
> We have an extremely large number of terms in our indexes. I want to be able
> to extract a sample of the terms, say something like every 128th

API access to in-memory tii file (3.x not flex).

2010-11-10 Thread Burton-West, Tom
Hello all, We have an extremely large number of terms in our indexes. I want to be able to extract a sample of the terms, say something like every 128th term. If I use code based on org.apache.lucene.misc.HighFreqTerms or org.apache.lucene.index.CheckIndex I would get a TermsEnum, call term