I will take a look. Thanks for your help!
On Wed, Nov 27, 2013 at 1:37 PM, Earl Hood wrote:
> On Wed, Nov 27, 2013 at 3:31 PM, Michael Berkovsky wrote:
>
> > My goal is to simply store records term->[doc1, doc2, ] on disk. I
> > tried to get these records through docsEnum but it was too slo
On Wed, Nov 27, 2013 at 3:31 PM, Michael Berkovsky wrote:
> My goal is to simply store records term->[doc1, doc2, ] on disk. I
> tried to get these records through docsEnum but it was too slow. Not sure
> if it is possible to get them faster, hence the reason for my enquiry. (Perhaps
> there is
My goal is simply to store records term -> [doc1, doc2, ...] on disk. I
tried to get these records through DocsEnum, but it was too slow. I am not sure
whether it is possible to get them faster, hence my enquiry. (Perhaps
there is some low-level API to scan through the posting list?)
Thanks,
mb
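
For illustration, the on-disk record described above could look roughly like the sketch below. It is not Lucene code and not something the poster shared: the class name PostingsRecords, its helper methods, and the record layout (term, doc count, then variable-length gaps between ascending doc IDs) are all assumptions made for the example. In the thread's scenario the doc IDs would come from a DocsEnum; here they are hard-coded, and an in-memory stream stands in for a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical record format for term -> [doc1, doc2, ...]; not a Lucene API.
final class PostingsRecords {

    // Writes one record: the term, the number of docs, then the gaps
    // between consecutive doc IDs. Doc IDs must be ascending and non-negative.
    static void writeRecord(DataOutputStream out, String term, int[] sortedDocIds)
            throws IOException {
        out.writeUTF(term);
        out.writeInt(sortedDocIds.length);
        int previous = 0;
        for (int docId : sortedDocIds) {
            writeVInt(out, docId - previous);
            previous = docId;
        }
    }

    // Reads the doc IDs of one record; the caller has already read the term.
    static int[] readDocIds(DataInputStream in) throws IOException {
        int count = in.readInt();
        int[] docIds = new int[count];
        int previous = 0;
        for (int i = 0; i < count; i++) {
            previous += readVInt(in);
            docIds[i] = previous;
        }
        return docIds;
    }

    // Variable-length encoding: 7 payload bits per byte, high bit set
    // when more bytes follow. Small gaps therefore take a single byte.
    static void writeVInt(DataOutputStream out, int value) throws IOException {
        while ((value & ~0x7F) != 0) {
            out.writeByte((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.writeByte(value);
    }

    static int readVInt(DataInputStream in) throws IOException {
        int value = 0;
        int shift = 0;
        int b;
        do {
            b = in.readUnsignedByte();
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeRecord(out, "lucene", new int[] {3, 7, 120, 100000});
        }
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            String term = in.readUTF();
            int[] docIds = readDocIds(in);
            System.out.println(term + " -> " + Arrays.toString(docIds));
        }
    }
}
```

For a real dump, the ByteArrayOutputStream would presumably be replaced by a buffered FileOutputStream. Storing gaps instead of absolute doc IDs is one common way to keep such records small; whether it matters for an index of this size is something only a measurement can tell.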
: The goal is to construct the iterator
:
: Iterator: term -> [doc1, doc2, ...]
That iterator already exists -- it's a DocsEnum.
Erick's question is what your *end* goal is ... what are you attempting to
do that you are asking about accessing a low level iterator over all the
docs that contain
The goal is to construct the iterator
Iterator: term -> [doc1, doc2, ...]
It would run through the entire Lucene index. The index contains more than
100 million documents.
Thanks,
mb
On Wed, Nov 27, 2013 at 5:47 AM, Erick Erickson wrote:
> Probably should explain what your end goal here is.
> Reconstructi
Probably should explain what your end goal here is.
Reconstructing the entire document? Just finding out
what documents a few words belong to?
The former will be painful and lossy; Luke does that,
for instance.
FWIW,
Erick
On Mon, Nov 25, 2013 at 11:54 AM, Michael Berkovsky <
michael.berkov...@g
Hello!
I wonder if there is a fast way to scan through the entire inverted index
to collect words and the documents they belong to.
Thanks,
mb