I'm converting a Lucene 2.3.2 to 2.4.1 (with a view to going to 2.9.4).
Many of our indexes are 5M+ Documents, however, only a small subset of these are relevant to any user. As a DocIdSet, backed by a BitSet or OpenBitSet, is rather inefficient in terms of memory use, what is the recommended way to DocIdSet implementation to use in this scenario?
Seems like SortedVIntList can be used to store the info, but it has no methods to build the list in the first place, requiring an array or bitset in the constructor.
I had used Nutch's DocSet and HashDocSet implementations in my 2.3.2 deployment, but want to move away from that Nutch dependency, so wondered if Lucene had a way to do this?
Thanks --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org