large topk,
but I guess you need it?
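For reference, that topK is the parameter on Solr's knn query parser, so a
query ends up looking roughly like the sketch below (field name, vector
values and the topK value are placeholders, not taken from this thread);
lowering topK is usually the cheapest thing to try before touching
index-time settings:

  q={!knn f=article_vector topK=10}[0.012, -0.094, 0.271, 0.648]&fl=id,score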
Best regards
Kent Fitch
On Thu, 28 Mar 2024, 5:12 am Iram Tariq wrote:
> Hi All,
>
> I am using Dense vectors in SOLR and searches are slow. Each search is
> taking 10-25 seconds. I want to reduce the time to 5 seconds (or less,
> ideally).
h api's "k" parameter,
I believe that the HNSW graph for a new segment (whether created by
merging or by indexing new records) does need to be entirely in memory,
allocated in the JVM heap: I have needed to increase the JVM heap when some
extremely large HNSW segment files ha
segment they were written to is
eventually merged as the index grows). So, each HNSW segment is searched
independently and the results from all segments are combined and ranked by
score. I guess deleted documents will degrade results until they are
removed when the segment containing them is merged, but I have no
knowledge/experience of how results degrade based on the % of deleted
documents, "M" and beamwidth (construction and search) settings.
best regards
Kent Fitch
s required a lot of memory:
I guess with multiple segment merges happening in the JVM, each dealing
with multiple large in-memory representations of their incoming and
outgoing segments' HNSW graphs, a lot of heap is required!
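If this is running inside Solr rather than raw Lucene, the usual levers are
raising the heap (SOLR_HEAP in solr.in.sh) and bounding how large and how
concurrent merges can get in solrconfig.xml; a rough sketch, values purely
illustrative:

  <indexConfig>
    <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
      <double name="maxMergedSegmentMB">2048</double>
    </mergePolicyFactory>
    <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
      <int name="maxMergeCount">4</int>
      <int name="maxThreadCount">2</int>
    </mergeScheduler>
  </indexConfig>

Fewer and smaller concurrent merges mean fewer HNSW graphs held in heap at
once, at some cost to indexing throughput.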
best regards
Kent Fitch
On Wed, Mar 1, 2023 at 2:23 A
get embeddings for 160M articles for this test - we
are just trying to test whether Lucene's HNSW is feasible for our
use-case), so in the overwhelming majority of "misses", the top article is
indeed very similar to the article sought. That is, for our use case, the
results are satisfactory.