The dense vector code in Lucene is a very active area and has been for
the last couple of years.
This means that each Lucene version brings many changes: optimizations,
bug fixes, new features, and potentially new bugs.

Apache Solr 9.4.1 uses Lucene 9.8.0.
Apache Solr 9.7.0 uses Lucene 9.11.1.

Furthermore, HNSW (the approach used by Solr/Lucene for vector search) also
has a random factor involved with its entry points.

In short, differences in search results are expected and not necessarily a
bad signal.

So, let's rephrase: when you say "The newer version is ignoring some
documents", do you mean that an exact nearest-neighbor search would return
those documents, but you didn't get them via the approximate search? (Also
remember that Solr knn starts approximate and redirects to exact only in
certain (slow) scenarios.)
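
One way to check this is to compute the exact nearest neighbors yourself
and measure how many of them the approximate search returned. The sketch
below is only an illustration in plain Python: the tiny `docs` map, the
query and the `approx` id list are made-up stand-ins for your own vectors
and for the ids coming back from your knn query, and Euclidean distance is
assumed because that is the metric mentioned in the original message.

```python
import math

def exact_knn(query, docs, k):
    """Brute-force Euclidean top-k. docs maps doc id -> vector."""
    ranked = sorted(docs, key=lambda doc_id: math.dist(query, docs[doc_id]))
    return ranked[:k]

def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact neighbors that the approximate search also returned."""
    if not exact_ids:
        return 1.0
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)

# Hypothetical data, for illustration only.
docs = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [0.0, 2.0], "d": [5.0, 5.0]}
query = [0.1, 0.0]

exact = exact_knn(query, docs, k=3)
approx = ["a", "c", "d"]  # stand-in for the ids returned by the approximate search

print("exact:", exact)
print("recall@3:", recall_at_k(exact, approx))
```

Running the same measurement against both Solr versions tells you whether
one of them really has lower recall, or whether the two simply return
different, equally valid approximations.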

Your best bet is to experiment with the HNSW hyper-parameters, but I'm not
sure what your real underlying issue is in terms of relevance/results
quality.
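
As a rough sketch of what tuning those hyper-parameters could look like,
the snippet below builds a Schema API "add-field-type" payload for a
solr.DenseVectorField with non-default hnswMaxConnections and
hnswBeamWidth. The field type name, the dimension of 768 and the values
32 and 200 are illustrative assumptions, not recommendations; the snippet
only prints the JSON and does not contact Solr.

```python
import json

def dense_vector_field_type(name, dimension, max_connections, beam_width):
    """Build a Schema API 'add-field-type' payload for a DenseVectorField."""
    return {
        "add-field-type": {
            "name": name,
            "class": "solr.DenseVectorField",
            "vectorDimension": dimension,
            "similarityFunction": "euclidean",
            "knnAlgorithm": "hnsw",
            # Larger values usually trade indexing time and index size
            # for better recall.
            "hnswMaxConnections": max_connections,
            "hnswBeamWidth": beam_width,
        }
    }

# Hypothetical name and values, for illustration only.
payload = dense_vector_field_type("knn_vector_tuned", 768, 32, 200)
print(json.dumps(payload, indent=2))
```

These parameters shape the graph at index time, so the documents need to
be reindexed after changing them before any effect shows up in the
results.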

Cheers

--------------------------
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr Chair of PMC*

e-mail: a.benede...@sease.io




On Sat, 7 Jun 2025 at 15:31, Κούτσιανος Λάζαρος <l.koutsia...@gmail.com>
wrote:

> Hello to the Community! I’m currently facing an issue regarding the Dense
> Vector Search in Apache Solr and was hoping you might have a small tip or
> suggestion.
>
> I've indexed the exact same data (with identical vectors) in Solr 9.4.1
> and Solr 9.7.0. However, when performing Dense Vector Search, I’m getting
> different results for some queries between the two versions. It seems to me
> that the newer version is ignoring some documents. I’ve double-checked that
> the vectors are the same across both setups, but I still can’t explain the
> discrepancy in results.
>
> According to the Solr documentation:
> https://solr.apache.org/guide/solr/latest/query-guide/dense-vector-search.html
> there are no differences in the default Dense Vector Search configurations
> between the two versions. I’m using the default similarity metric in both
> cases, which should be Euclidean.
>
> Any idea or hint would be greatly appreciated!
> Thank you all in advance
>
>
