This is a follow-up to the earlier thread I started about memory usage patterns of SegmentReader instances, but I decided to create a separate post since this issue is much more serious than the heap overhead created by stored field compression.
Here is the use case, once again. The index totals around 300M documents, with 7 string, 2 long, 1 integer, 1 date and 1 float fields, all of which are both indexed and stored. It is split into 4 shards, which are basically separate indices, if that matters. After the index is populated (but not optimized, since we don't do that), the overall heap usage by Lucene exceeds 1 GB, much of it held by instances of BlockTreeTermsReader. For instance, for the largest segment in one such index, the retained heap size of the internal tree map is around 50 MB. This is evident from heap dump analysis; I have screenshots I can post here if that helps. Since the index contains many segments of various sizes, the total heap usage for one shard, as expected, stands at around 280 MB.

Could someone shed some light on whether this is expected, and if so, how I could trim down memory usage here? Is there a way to switch to a different terms index implementation, one that doesn't preload all the terms into RAM, or does so only partially, i.e. as a cache? I'm not sure I'm framing my questions correctly, as I'm obviously not an expert on Lucene's internals, but this is going to become a critical issue for large-scale use cases of our system.
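In case it clarifies what I'm asking: below is a rough, untested sketch of the only knob I've found so far, namely plugging in a postings format with larger term block sizes so the in-heap terms index is sparser. This assumes Lucene 4.x; the codec class, the block-size values (128/256) and the index path are just placeholders I picked for illustration, not something I've verified.

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.codecs.PostingsFormat;
import org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat;
import org.apache.lucene.codecs.lucene46.Lucene46Codec;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class SparserTermsIndexSketch {
    public static void main(String[] args) throws Exception {
        // Larger min/max term block sizes should mean fewer index blocks,
        // so the in-heap terms index held by BlockTreeTermsReader shrinks,
        // presumably at the cost of slower term lookups. 128/256 are
        // arbitrary values, not a recommendation.
        final PostingsFormat sparserTermsIndex = new Lucene41PostingsFormat(128, 256);

        // Per-field override hook on the default codec.
        Lucene46Codec codec = new Lucene46Codec() {
            @Override
            public PostingsFormat getPostingsFormatForField(String field) {
                return sparserTermsIndex;
            }
        };

        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_46,
                new StandardAnalyzer(Version.LUCENE_46));
        iwc.setCodec(codec);

        // Hypothetical index location; only newly written segments would
        // pick up the custom postings format.
        try (IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("/path/to/index")), iwc)) {
            // ... add/update documents as usual
        }
    }
}

Whether larger blocks would actually make a meaningful dent in the numbers above, or whether there is a better approach entirely (a different terms index implementation, a cache-based one, etc.), is exactly what I'm hoping someone can confirm.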