> This (very large number of unique terms) is a problem for Lucene currently.
>
> There are some simple improvements we could make to the terms dict
> format to not require so much RAM per term in the terms index...
> LUCENE-1458 (flexible indexing) has these improvements, but
> unfortunately tied in w/ lots of other changes.  Maybe we should break
> out a separate issue for this... this'd be a great contained
> improvement, if anyone out there has "the itch" :)

Resurrecting an old thread, but it's a concern that I have as well, so
I thought I'd add on to this.

It looks like LUCENE-1458 was resolved on Dec. 3, but I couldn't figure
out what the resolution was.  Does Lucene 3.0 have a more
memory-friendly alternative to reading the entire .tii file into RAM?
If not, would mmap'ing the .tii file and seeking around within the
mapping be a better solution than reading the entire file and
keeping it in arrays on the heap?
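
To make the mmap idea concrete, here is a rough sketch of what I have in
mind.  It is NOT Lucene's actual .tii format or API: the file layout
(sorted, fixed-width 8-byte keys) and all of the names (MmapIndexSketch,
writeSortedKeys, map, floorIndex) are made up for illustration.  The real
terms index holds variable-length terms, so it would presumably need an
offset table or similar before it could be binary-searched this way.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapIndexSketch {
    // Hypothetical layout: the file is just sorted big-endian longs.
    static final int RECORD_BYTES = Long.BYTES;

    // Write the sorted keys to disk as fixed-width records.
    static void writeSortedKeys(Path file, long[] sortedKeys) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(sortedKeys.length * RECORD_BYTES);
        for (long key : sortedKeys) {
            buf.putLong(key);
        }
        Files.write(file, buf.array());
    }

    // Map the whole file read-only. Pages are faulted in by the OS on
    // access, so nothing is copied into heap arrays up front.
    static MappedByteBuffer map(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Binary search directly in the mapping: index of the last key <= target,
    // or -1 if every key is greater than target.
    static int floorIndex(ByteBuffer mapped, long target) {
        int lo = 0;
        int hi = mapped.capacity() / RECORD_BYTES - 1;
        int answer = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long key = mapped.getLong(mid * RECORD_BYTES); // absolute read
            if (key <= target) {
                answer = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return answer;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("index-sketch", ".bin");
        file.toFile().deleteOnExit();
        writeSortedKeys(file, new long[] {10L, 20L, 30L, 40L});
        MappedByteBuffer mapped = map(file);
        System.out.println(floorIndex(mapped, 25L));
    }
}
```

The lookup returns the entry at or before the target, which is the kind of
"find the nearest preceding index entry, then scan forward" access I
assume a terms index needs.  Only the pages touched by the search have to
be resident, at the cost of a possible page fault per probe.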

