Hi Dominic,

We also see occasional spikes in the memory footprint of some of our dense vector clouds. We have no conclusive evidence tying them to any specific behavior, but segment merges are a plausible cause. You could try tweaking the MergePolicyFactory / MergeScheduler settings to see whether that has any effect on your spikes. Bear in mind, though, that larger and fewer segments have generally been found to be particularly beneficial for dense vector search performance, which might exacerbate the spikes if merging is indeed the root cause.
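To illustrate why merges could plausibly produce spikes of that size, here is a rough back-of-envelope sketch rather than a measurement. It relies on one fact: a merge writes the new segment in full before the old segments are deleted, so the merged data briefly exists twice on disk. Everything else is an assumption for illustration: the document count and the share of the index being merged are hypothetical, 1024 stands in for "a thousand-odd" dimensions, and the estimate ignores the HNSW graph and all non-vector data.

```python
def raw_vector_bytes(num_docs: int, dims: int, bytes_per_value: int = 4) -> int:
    """Size of the raw vectors alone (float32 = 4 bytes per dimension)."""
    return num_docs * dims * bytes_per_value


def peak_bytes_during_merge(steady_bytes: int, merged_percent: int) -> int:
    """Transient disk use while a merge rewrites `merged_percent` of the index.

    The old segments stay on disk until the new one is fully written,
    so the merged share is temporarily stored twice.
    """
    return steady_bytes + steady_bytes * merged_percent // 100


if __name__ == "__main__":
    GB = 1_000_000_000

    # Hypothetical corpus: 2 million documents, 1024-dimensional vectors.
    raw = raw_vector_bytes(num_docs=2_000_000, dims=1024)
    print(f"raw vectors: {raw / GB:.1f} GB")

    # 13 GB steady state is from the original report; the 60% merged
    # share is a made-up figure chosen only to show the effect.
    peak = peak_bytes_during_merge(steady_bytes=13 * GB, merged_percent=60)
    print(f"transient peak: {peak / GB:.1f} GB")
```

Under these assumptions a single large merge is enough to take a 13GB core past 20GB for a while, which would be consistent with the pattern described, though it does not prove that merges are the cause.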
-Kevin

From: users@solr.apache.org
At: 08/21/24 07:15:44 UTC-4:00
To: users@solr.apache.org
Subject: Vector search core size/disk usage

We've been testing using vector search in 9.4 recently, and whilst the quality of searches is looking good, we're concerned by what we're seeing in core size/disk use.

The core we're testing is typically around 2.5GB in size. In the test instance with vectors added, it's typically hovering around 13GB, but sometimes spikes up to over 20GB for no apparent reason we can see.

I get why having a thousand-odd floats per entry would make for a bigger core, but we'd hoped there would be some cleverness in the storage mechanism to stop it being quite so drastic; and we're very concerned by the way the needed storage space seems to spike up to double the normal usage from time to time.

Are there any settings we're missing that might reduce the amount of space vectors take up? And does anyone know why we're seeing such big fluctuations?

Thanks
Dominic