Re: Scanning through inverted index

2013-11-27 Thread Michael Berkovsky
I will take a look. Thanks for your help! On Wed, Nov 27, 2013 at 1:37 PM, Earl Hood wrote: > On Wed, Nov 27, 2013 at 3:31 PM, Michael Berkovsky wrote: > > > My goal is to simply store records term->[doc1, doc2, ] on disk. I > > tried to get these records through do

Re: Scanning through inverted index

2013-11-27 Thread Michael Berkovsky
My goal is simply to store records term->[doc1, doc2, ] on disk. I tried to get these records through docsEnum, but it was too slow. I am not sure whether it is possible to get them faster, hence my enquiry. (Perhaps there is some low-level API to scan through the posting list?) Thanks, mb
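
To make the record format in the message above concrete, here is a minimal sketch of a term-ordered scan that writes one term->[docs] record per line. It does not use Lucene's API: the class PostingsDump, its methods buildIndex and dump, and the whitespace tokenization are all hypothetical stand-ins, and the in-memory TreeMap only imitates the sorted term dictionary of a real index.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for an inverted index; not part of Lucene.
public class PostingsDump {

    // Builds term -> ascending doc IDs from whitespace-tokenized documents.
    public static SortedMap<String, List<Integer>> buildIndex(List<String> docs) {
        SortedMap<String, List<Integer>> index = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String term : docs.get(docId).toLowerCase().split("\\s+")) {
                if (term.isEmpty()) {
                    continue; // skip the empty token produced by blank input
                }
                List<Integer> postings =
                        index.computeIfAbsent(term, t -> new ArrayList<>());
                // Doc IDs arrive in increasing order, so checking the last
                // entry is enough to avoid duplicates within one document.
                if (postings.isEmpty()
                        || postings.get(postings.size() - 1) != docId) {
                    postings.add(docId);
                }
            }
        }
        return index;
    }

    // Scans the terms in sorted order and writes one record per line.
    public static void dump(SortedMap<String, List<Integer>> index, Writer out)
            throws IOException {
        for (Map.Entry<String, List<Integer>> entry : index.entrySet()) {
            out.write(entry.getKey());
            out.write("->");
            out.write(entry.getValue().toString());
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> docs = Arrays.asList("fast index scan", "index terms");
        StringWriter out = new StringWriter();
        dump(buildIndex(docs), out);
        System.out.print(out);
    }
}
```

Writing through a single Writer keeps the scan sequential; in a real setting the StringWriter would presumably be replaced by a buffered file writer.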

Re: Scanning through inverted index

2013-11-27 Thread Michael Berkovsky
ere is. > Reconstructing the entire document? Just finding out > what documents a few words belong to? > > The former will be painful and lossy, Luke does that > for instance. > > FWIW, > Erick > > > On Mon, Nov 25, 2013 at 11:54 AM, Michael Berkovsky < > michael.berkov...@gm

Scanning through inverted index

2013-11-25 Thread Michael Berkovsky
Hello! I wonder if there is a fast way to scan through the entire inverted index to collect words and documents they belong to. Thanks, mb

Re: How to extract highest TF-IDF terms from Lucene index?

2012-05-09 Thread Michael Berkovsky
Thanks! On Wed, May 9, 2012 at 2:01 PM, Mike McCandless wrote: > There is a tool named HighFreqTerms, in contrib/misc that does this... > > Mike > > Sent from my iPad > > On May 9, 2012, at 4:18 PM, Michael Berkovsky > wrote: > > > Hi, > > > > Ass