On 8/11/2021 6:04 AM, Satya Nand wrote:
*Filter cache stats:*
https://drive.google.com/file/d/19MHEzi9m3KS4s-M86BKFiwmnGkMh3DGx/view?usp=sharing


This shows the current size as 3912, almost full.

There is an alternate format for filterCache entries that just lists the IDs of the matching documents.  It only gets used when the hit count for the filter is low.  I do not know what threshold Solr uses to decide that the hit count is low enough for the alternate format, and I do not know where in the code to look for the answer.  This is probably why you can have 3912 entries in the cache without blowing the heap.
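For a back-of-the-envelope comparison, here is a rough sketch (not Solr's actual classes) of what the two formats cost in memory.  The 101.6 million document count is an assumption implied by the 12.7 MB per-entry figure, and the 10,000-hit filter is just an example of a low hit count:

public class FilterEntrySizes {
    public static void main(String[] args) {
        long maxDoc = 101_600_000L;          // assumed index size, implied by 12.7 MB per bitmap entry

        // Full format: one bit per document in the index, regardless of hit count.
        long bitmapBytes = maxDoc / 8;       // ~12.7 million bytes

        // Alternate format: one int (4 bytes) per matching document.
        long hitCount = 10_000L;             // example of a low-hit-count filter
        long idListBytes = hitCount * 4;     // ~40 KB

        System.out.printf("bitmap entry:  %,d bytes%n", bitmapBytes);
        System.out.printf("id-list entry: %,d bytes (for %,d hits)%n", idListBytes, hitCount);
    }
}

A low-hit-count entry is several orders of magnitude smaller, which is why a cache full of them fits comfortably in the heap.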

I bet that when the heap gets blown, the filter queries Solr receives are such that they cannot use the alternate format, and thus each require the full 12.7 million bytes.  Get enough of those and you're going to need more heap than 30GB.  I bet that if you set the heap to 31G, the OOMEs would occur a little less frequently.  Note that if you set the heap to 32G, you actually have less memory available than if you set it to 31G: at 32GB, Java can no longer use compressed object pointers and must switch from 32-bit pointers to 64-bit pointers.  Solr creates a LOT of objects on the heap, so that difference adds up.
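To put a number on "enough of those", here is a rough worst-case sketch, assuming every one of the 3912 entries ended up in the full bitmap format:

public class WorstCaseCache {
    public static void main(String[] args) {
        long entries = 3912L;                 // current filterCache size from the stats above
        long bytesPerEntry = 12_700_000L;     // full bitmap format
        double totalGib = entries * bytesPerEntry / (1024.0 * 1024.0 * 1024.0);
        // Roughly 46 GiB if every entry were a full bitmap, far beyond a 30-31 GB heap.
        System.out.printf("worst-case filterCache footprint: %.1f GiB%n", totalGib);
    }
}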

Discussion item for those with an interest in the low-level code:  What kind of performance impact would result from using a filter bitmap compressed with run-length encoding?  Would that happen at the Lucene level rather than the Solr level?
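To make the idea concrete, here is a minimal sketch of run-length encoding over a filter bitmap.  This is not Lucene or Solr code, just an illustration:

import java.util.ArrayList;
import java.util.List;

/** Sketch: encode a filter bitmap as alternating run lengths of 0 and 1 bits. */
public class RleBitmapSketch {

    /** Returns run lengths, starting with a run of 0 bits (possibly of length 0). */
    static List<Long> encode(boolean[] bits) {
        List<Long> runs = new ArrayList<>();
        boolean current = false;
        long length = 0;
        for (boolean bit : bits) {
            if (bit == current) {
                length++;
            } else {
                runs.add(length);
                current = bit;
                length = 1;
            }
        }
        runs.add(length);
        return runs;
    }

    /** True if the given document ID is a match, found by walking the runs. */
    static boolean get(List<Long> runs, long docId) {
        boolean current = false;
        long pos = 0;
        for (long run : runs) {
            pos += run;
            if (docId < pos) {
                return current;
            }
            current = !current;
        }
        return false;
    }

    public static void main(String[] args) {
        boolean[] bits = new boolean[100];
        for (int i = 20; i < 30; i++) bits[i] = true;      // one dense run of matches
        List<Long> runs = encode(bits);
        System.out.println(runs);                          // [20, 10, 70]
        System.out.println(get(runs, 25) + " " + get(runs, 50));  // true false
    }
}

Storage collapses when the matches are clustered, but a membership test becomes a walk over the runs instead of a single bit test, which is where the performance question comes in.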

To fully solve this issue, you may need to re-engineer your queries so that fq values are highly reusable, with non-reusable filters moved into the main query instead.  Then you would not need a very large cache to obtain a good hit ratio.
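As a sketch of what I mean, using SolrJ (the field names are made up, and the {!cache=false} local param is one way to keep a known one-off filter out of the cache):

import org.apache.solr.client.solrj.SolrQuery;

public class ReusableFilters {
    public static void main(String[] args) {
        SolrQuery q = new SolrQuery();

        // Main query: the parts that vary per request and are rarely repeated.
        q.setQuery("title:(solar panels) AND price:[100 TO 500]");

        // fq values: coarse, highly reusable filters that many requests share,
        // so the cached entries actually get re-used.
        q.addFilterQuery("status:active");
        q.addFilterQuery("doctype:product");

        // A one-off filter that would pollute the cache can be marked non-cached.
        q.addFilterQuery("{!cache=false}user_id:12345");

        System.out.println(q);
    }
}

The first two fq values are shared by many requests and come straight out of the cache; everything request-specific goes into q or is marked non-cached.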

Thanks,
Shawn
