Aside from the hyperparameters that, as Matthias said, are indexing-time
ones (so you need to re-index), there is a nice talk from Bloomberg at the
latest Berlin Buzzwords that can give you some ideas for reducing query
latency:

https://www.youtube.com/watch?v=cDiCX3mVAlQ
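
For reference, those indexing-time knobs live on the dense vector field
type in the schema (Solr 9), which is why changing them means a full
re-index; a minimal sketch, with illustrative values:

    <fieldType name="knn_vector" class="solr.DenseVectorField"
               vectorDimension="512" similarityFunction="cosine"
               knnAlgorithm="hnsw"
               hnswMaxConnections="16" hnswBeamWidth="100"/>
    <field name="vector" type="knn_vector" indexed="true" stored="true"/>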

Cheers
--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr Chair of PMC*

e-mail: a.benede...@sease.io


*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io <http://sease.io/>
LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
<https://twitter.com/seaseltd> | Youtube
<https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
<https://github.com/seaseltd>


On Mon, 7 Jul 2025 at 10:43, Derek C <de...@hssl.ie> wrote:

> We have about 4M documents with a 512-dimension vector field.  We have to
> make sure that the SOLR node instances have more memory available to Linux
> than the SOLR collection size, so that it's available to the OS for
> caching.  I can see with iotop that there are zero bytes of disk reads
> while our SOLR nodes are working away.  Memory caching makes all the
> difference - without it, the performance isn't good enough (for us to
> deliver web pages).
>
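> (Back-of-envelope, assuming Lucene's default float32 vector encoding:
> 4,000,000 docs x 512 dims x 4 bytes ≈ 7.6 GiB of raw vector data alone,
> before the HNSW graph and the rest of the index are counted - a floor
> for how much page cache the OS needs.)
>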
> We started using dense vector searches back in SOLR 8.  At the time I was
> experimenting with mounting RAM disks, but I later realized that Linux is
> really good at automatically caching data, so messing around with RAM
> disks wasn't necessary.  This means you end up choosing how much memory to
> give the JVM and how much to leave to Linux - and that Linux memory choice
> is really important for performance: more memory than the collection(s)
> size in any case (I'm not sure how much more is required; I just know that
> what we have now works, arrived at through trial and error).
>
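> A minimal sketch of that split, assuming the stock bin/solr start
> scripts (the value below is a placeholder, not a recommendation):
>
>     # solr.in.sh: cap the JVM heap; the RAM left over stays with
>     # Linux for the page cache
>     SOLR_HEAP="8g"
>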
> Derek
>
