What does your synthetic randomized benchmark look like? Did you try
different values for hnswMaxConnections and hnswBeamWidth? Do your curves
differ wildly from https://ann-benchmarks.com/luceneknn.html ?
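
For comparison, here is a minimal sketch of how such a top-1 error rate could
be measured against a brute-force baseline. Everything in it is an assumption
for illustration: the vectors are small random Gaussian vectors, and
`approx_search` is a hypothetical stand-in for whatever issues the real kNN
query against the index and returns the id of the best hit. It is not taken
from your setup or from the Solr/Lucene API.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top1(query, vectors):
    """Brute-force nearest neighbour by cosine similarity (ground truth)."""
    return max(range(len(vectors)), key=lambda i: cosine(query, vectors[i]))


def top1_error_rate(queries, vectors, approx_search):
    """Fraction of queries where approx_search misses the exact best hit."""
    misses = sum(1 for q in queries if approx_search(q) != exact_top1(q, vectors))
    return misses / len(queries)


def random_vectors(n, dim, rng):
    """Synthetic data: n vectors with independent Gaussian components."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
vectors = random_vectors(200, 16, rng)
queries = random_vectors(50, 16, rng)


def exact_search(query):
    # Hypothetical placeholder: replace with a call to the real index.
    # Using the exact search here only checks the harness itself.
    return exact_top1(query, vectors)


print(top1_error_rate(queries, vectors, exact_search))  # 0.0
```

Running the same harness once per (hnswMaxConnections, hnswBeamWidth) setting
on a small sample of the data would give a rough curve before committing to a
full reindex.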


On Tue, Jan 30, 2024 at 3:49 PM Moll, Dr. Andreas <m...@juris.de.invalid>
wrote:

> Hi,
>
> the HNSW documentation for the Lucene HnswGraph and the Solr vector search
> is not very detailed, especially with regard to the parameters hnswMaxConn
> and hnswBeamWidth.
> I find it hard to come up with sensible values for these parameters by
> reading the paper from 2018.
> Does anyone have experience with the influence of the parameters on the
> results? As far as I understand the code, the graph is created at indexing
> time, so finding the optimal values for a specific use case by trial and
> error would be time-consuming.
>
> We have a Solr index with roughly 100 million embeddings, and in a
> synthetic randomized benchmark around 14% of requests return a
> suboptimal answer (based on cosine vector similarity).
> I expected this "error" rate to be much smaller. I would love to hear your
> experiences.
>
> Best regards
>
> Andreas Moll
>
