Hi Dominic,

We also see some spikes in the memory footprint of some of our dense vector 
clouds. There's no conclusive evidence that it's tied to any specific behavior, 
but it could be due to segment merges. You could try tweaking the 
MergePolicyFactory / MergeScheduler settings and see if that has any effect on 
your spikes. In general, though, larger and fewer segments have been found to 
be particularly beneficial for dense vector search performance (which might 
exacerbate the behavior if merges are the root cause).
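
For a rough sense of scale, here is a back-of-envelope sketch in Python. It is 
not a diagnosis. It assumes 4-byte float32 values per dimension and that a 
merge leaves the old segments on disk until the merged segment is fully 
written. The function names, the document count and the merged fraction are 
hypothetical, picked only for illustration.

```python
def raw_vector_bytes(num_docs: int, dims: int, bytes_per_value: int = 4) -> int:
    """Size of the raw vector values alone.

    Excludes the HNSW graph and every other index file, so the real
    on-disk size will be larger than this figure.
    """
    return num_docs * dims * bytes_per_value


def merge_peak_gb(steady_gb: float, merged_fraction: float) -> float:
    """Approximate peak disk use while a merge is in flight.

    Assumption: the segments being merged (merged_fraction of the index)
    exist twice, as the old copies plus the new merged segment, until the
    old files are deleted.
    """
    if not 0.0 <= merged_fraction <= 1.0:
        raise ValueError("merged_fraction must be between 0 and 1")
    return steady_gb * (1.0 + merged_fraction)


if __name__ == "__main__":
    # Hypothetical corpus: 1,000,000 documents with 1024-dimensional vectors.
    print(raw_vector_bytes(1_000_000, 1024) / 1e9, "GB of raw vector values")
    # Hypothetical merge that rewrites half of a 13 GB index.
    print(merge_peak_gb(13.0, 0.5), "GB at peak")
```

Under those assumptions, a merge that rewrites about half of a 13GB index 
would briefly need about 19.5GB, which is in the same range as the spikes you 
describe.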

-Kevin

From: users@solr.apache.org At: 08/21/24 07:15:44 UTC-4:00
To: users@solr.apache.org
Subject: Vector search core size/disk usage

We've been testing using vector search in 9.4 recently, and whilst the
quality of searches is looking good, we're concerned by what we're seeing
in core size/disk use.

The core we're testing is typically around 2.5GB in size. In the test
instance with vectors added, it's typically hovering around 13GB, but
sometimes spikes up to over 20GB for no apparent reason we can see.

I get why having a thousand-odd floats per entry would make for a bigger
core, but we'd hoped there would be some cleverness in the storage
mechanism to stop it being quite so drastic; and we're very concerned by
the way the needed storage space seems to spike up to double the normal
usage from time to time.

Are there any settings we're missing that might reduce the amount of space
vectors take up? And does anyone know why we're seeing such big
fluctuations?

Thanks

Dominic

