Re: Do we know why Lucene's HNSW may be slower than other HNSW implementations?

Chris Hegarty Tue, 05 Aug 2025 08:06:41 -0700

Just to close the loop on this. I say “close”, but the work is still active; 
it is worth an update all the same.


Trevor and I added a bulk scoring abstraction to RandomVectorScorer [1]. The 
API can be adapted a little too, in order to improve some common use cases [2], 
but is otherwise “solid”.  Trevor has prototyped a bulk scorer implementation 
using Rust, and I’ve verified similar with a native scorer - both prefetch 
yet-to-be-scored vectors from the given set of ordinals (improvements can be 
seen in the linked PR comments). This is great, but we can do better in Lucene.
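
To make the shape of a bulk scoring abstraction concrete, here is a minimal 
sketch in plain Java. Every name in it (BulkScoreSketch, OrdinalScorer, 
bulkScore, dotProductScorer) is hypothetical and chosen for illustration only; 
it is not the API from the linked PRs. It assumes vectors are packed back to 
back in a single on-heap float[]. The point is just that a batch entry point 
lets a specialised scorer see the whole set of ordinals up front, which is 
what makes prefetching possible.

```java
/** Hypothetical sketch only: the names below are illustrative, not Lucene's API. */
final class BulkScoreSketch {

  /** Scores candidates by ordinal, one at a time or as a batch. */
  interface OrdinalScorer {
    float score(int ord);

    /** Default bulk path: a plain loop. A specialised scorer could override this to prefetch. */
    default void bulkScore(int[] ords, float[] scores, int count) {
      for (int i = 0; i < count; i++) {
        scores[i] = score(ords[i]);
      }
    }
  }

  /** Dot-product scorer over vectors packed back to back in one on-heap float[] (an assumption). */
  static OrdinalScorer dotProductScorer(float[] packed, int dim, float[] query) {
    return ord -> {
      float sum = 0f;
      int base = ord * dim;
      for (int i = 0; i < dim; i++) {
        sum += query[i] * packed[base + i];
      }
      return sum;
    };
  }

  public static void main(String[] args) {
    float[] packed = {1, 0, 0, 1, 1, 1}; // three 2-d vectors
    OrdinalScorer scorer = dotProductScorer(packed, 2, new float[] {1, 2});
    int[] neighbours = {2, 0}; // ordinals to score in one call
    float[] scores = new float[neighbours.length];
    scorer.bulkScore(neighbours, scores, neighbours.length);
    System.out.println(java.util.Arrays.toString(scores));
  }
}
```

The default method keeps single-vector scorers working unchanged, while an 
implementation that knows its storage layout can replace the loop.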

Now that we have a bulk scorer API, I’ve implemented an off-heap bulk scorer 
for float32s and oriented a trivial dot product implementation using the Panama 
Vector API around scoring 4 vectors at a time, shaping the code so that it more 
easily provokes the CPU into overlapping the memory fetches (at least, that’s 
the idea!).
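
As a rough illustration of the "4 vectors at a time" shape, here is a scalar 
sketch in plain Java. It deliberately does not use the Panama Vector API or 
off-heap memory, and all names (DotProduct4Sketch, dotProduct4, bulkScore) are 
hypothetical. It assumes vectors are packed back to back in one float[]. The 
four accumulators are independent of one another, so the loads from the four 
candidate vectors do not wait on each other; whether the CPU actually overlaps 
them is an assumption, not something this sketch measures.

```java
/** Hypothetical scalar sketch of scoring four vectors per pass; not the code from the PR. */
final class DotProduct4Sketch {

  /** Dot product of q against the four vectors starting at offsets o0..o3 in data. */
  static void dotProduct4(float[] q, float[] data, int o0, int o1, int o2, int o3, float[] out) {
    float s0 = 0f, s1 = 0f, s2 = 0f, s3 = 0f; // independent accumulators
    for (int i = 0; i < q.length; i++) {
      float qi = q[i]; // the query element is loaded once and reused four times
      s0 += qi * data[o0 + i];
      s1 += qi * data[o1 + i];
      s2 += qi * data[o2 + i];
      s3 += qi * data[o3 + i];
    }
    out[0] = s0;
    out[1] = s1;
    out[2] = s2;
    out[3] = s3;
  }

  /** Scores the first count ordinals in groups of four, then handles the remainder singly. */
  static void bulkScore(float[] q, float[] data, int[] ords, float[] scores, int count) {
    int dim = q.length;
    float[] tmp = new float[4];
    int i = 0;
    for (; i + 4 <= count; i += 4) {
      dotProduct4(q, data, ords[i] * dim, ords[i + 1] * dim, ords[i + 2] * dim, ords[i + 3] * dim, tmp);
      System.arraycopy(tmp, 0, scores, i, 4);
    }
    for (; i < count; i++) { // tail: fewer than four ordinals left
      float sum = 0f;
      int base = ords[i] * dim;
      for (int j = 0; j < dim; j++) {
        sum += q[j] * data[base + j];
      }
      scores[i] = sum;
    }
  }

  public static void main(String[] args) {
    float[] data = {1, 0, 0, 1, 1, 1, 2, 2, 3, -1}; // five 2-d vectors
    int[] ords = {4, 0, 3, 1, 2};
    float[] scores = new float[ords.length];
    bulkScore(new float[] {1, 2}, data, ords, scores, ords.length);
    System.out.println(java.util.Arrays.toString(scores));
  }
}
```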

Running luceneutil with 1M Cohere 768d vectors, search latencies improve by 
~33%, and merge times by 40-50% [3]. I have some ideas about how to 
further improve indexing, but they can be done separately.

-Chris.

[1] https://github.com/apache/lucene/pull/14978
[2] https://github.com/apache/lucene/pull/15021
[3] https://github.com/apache/lucene/pull/14980#issuecomment-3155502316


> On 19 Jun 2025, at 20:37, Adrien Grand <jpou...@gmail.com> wrote:
> 
> Thanks Mike, this is useful information. Then I'll try to reproduce this 
> benchmark to better understand what is happening.
> 
> On Thu, Jun 19, 2025 at 8:16 PM Michael Sokolov <msoko...@gmail.com> wrote:
> We've recently been comparing Lucene's HNSW w/FAISS' and there is not
> a 2x difference there. FAISS does seem to be around 10-15% faster I
> think?  The 2x difference is roughly what I was seeing in comparisons
> w/hnswlib prior to the dot-product improvements we made in Lucene.
> 
> On Thu, Jun 19, 2025 at 2:12 PM Adrien Grand <jpou...@gmail.com> wrote:
> >
> > Chris,
> >
> > FWIW I was looking at luceneknn 
> > (https://github.com/erikbern/ann-benchmarks/blob/f402b2cc17b980d7cd45241ab5a7a4cc0f965e55/ann_benchmarks/algorithms/luceneknn/Dockerfile#L15)
> >  which is on 9.7, though I don't know if it enabled the incubating vector 
> > API at runtime?
> >
> > I hope that mentioning ANN benchmarks did not add noise to this thread, I 
> > was mostly looking at whether I could find another benchmark that suggests 
> > that Lucene is significantly slower in similar conditions. Does it align 
> > with other people's experience that Lucene is 2x slower or more compared 
> > with other good HNSW implementations?
> >
> > Adrien
> >
> > Le jeu. 19 juin 2025, 18:44, Chris Hegarty 
> > <christopher.hega...@elastic.co.invalid> a écrit :
> >>
> >> Hi Adrien,
> >>
> >> > Even though it uses Elasticsearch to run the benchmark, it really 
> >> > benchmarks Lucene functionality,
> >>
> >> Agreed.
> >>
> >> > This seems consistent with results from 
> >> > https://ann-benchmarks.com/index.html though I don't know if the cause 
> >> > of the performance difference is the same or not.
> >>
> >> On ann-benchmarks specifically. Unless I’m looking in the wrong place, 
> >> then they’re using Elasticsearch 8.7.0 [1], which predates our usage of 
> >> the Panama Vector API for vector search. We added support for that in 
> >> Lucene 9.7.0 -> Elasticsearch 8.9.0.  So those benchmarks are wildly out 
> >> of date, no ?
> >>
> >> -Chris.
> >>
> >> [1] 
> >> https://github.com/erikbern/ann-benchmarks/blob/f402b2cc17b980d7cd45241ab5a7a4cc0f965e55/ann_benchmarks/algorithms/elasticsearch/Dockerfile#L2
> >>
> >>
> >> > On 19 Jun 2025, at 16:39, Adrien Grand <jpou...@gmail.com> wrote:
> >> >
> >> > Hello all,
> >> >
> >> > I have been looking at this benchmark against Vespa recently: 
> >> > https://blog.vespa.ai/elasticsearch-vs-vespa-performance-comparison/. 
> >> > (The report is behind an annoying email wall, but I'm copying relevant 
> >> > data below, so hopefully you don't need to download the report.) Even 
> >> > though it uses Elasticsearch to run the benchmark, it really benchmarks 
> >> > Lucene functionality, I don't believe that Elasticsearch does anything 
> >> > that meaningfully alters the results that you would get if you were to 
> >> > run Lucene directly.
> >> >
> >> > The benchmark seems designed to highlight the benefits of Vespa's 
> >> > realtime design, that's fair game I guess. But it also runs some queries 
> >> > in read-only scenarios when I was expecting Lucene to perform better.
> >> >
> >> > One thing that got me curious is that it reports about 2x worse latency 
> >> > and throughput for pure unfiltered vector search on a force-merged index 
> >> > (so single segment/graph). Does anybody know why Lucene's HNSW may 
> >> > perform slower than Vespa's HNSW? This seems consistent with results 
> >> > from https://ann-benchmarks.com/index.html though I don't know if the 
> >> > cause of the performance difference is the same or not.
> >> >
> >> > For reference, here are details that apply to both Lucene and Vespa's 
> >> > vector search:
> >> >  - HNSW,
> >> >  - float32 vectors, no quantization,
> >> >  - embeddings generated using Snowflake's Arctic-embed-xs model,
> >> >  - 1M docs,
> >> >  - 384 dimensions,
> >> >  - dot product,
> >> >  - m = 16,
> >> >  - max connections = 200,
> >> >  - search for top 10 hits,
> >> >  - no filter,
> >> >  - single client, so no search concurrency,
> >> >  - purple column is force-merged, so single segment/graph like Vespa.
> >> >
> >> > I never seriously looked at Lucene's vector search performance, so I'm 
> >> > very happy to be educated if I'm making naive assumptions!
> >> >
> >> > Somewhat related, is this the reason why I'm seeing many threads around 
> >> > bringing 3rd party implementations into Lucene, including ones that are 
> >> > very similar to Lucene on paper? To speed up vector search?
> >> >
> >> > --
> >> > Adrien
> >> > <vespa-vs-es-screenshot.png>
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> >> > For additional commands, e-mail: dev-h...@lucene.apache.org
> >>
> 
> 
> -- 
> Adrien


