Re: Performance Suggestion for Dense Vectors

2024-03-27 Thread Kent Fitch
large topk, but I guess you need it? Best regards Kent Fitch On Thu, 28 Mar 2024, 5:12 am Iram Tariq, wrote: > Hi All, > > I am using Dense vectors in SOLR and facing slowness in it. Each search is > taking 10-25 seconds. I want to reduce the time to 5 seconds (or less > ide

Re: KNN HNSW - How does "indexing" and "updating" work ?

2023-03-15 Thread Kent Fitch
h API's "k" parameter, I believe that the HNSW graph for a new segment (whether being created due to merging or new records) does need to be entirely in memory, allocated in the JVM heap: I have had the need to increase the JVM heap when some extremely large HNSW segment files ha

Re: How/When does SOLR build KNN HNSW index ?

2023-02-28 Thread Kent Fitch
egment they were written to is eventually merged as the index grows). So, each HNSW segment is searched independently and the results from all segments are combined and ranked by score. I guess deleted documents will degrade results until they are removed by the segment containing them being merged, but I have no knowledge/experience of how results degrade based on % of deleted documents, "M" and beamwidth (construction and search) settings. best regards Kent Fitch
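
The per-segment search described above can be illustrated with a small sketch. This is not Lucene's or Solr's actual code; the function name `merge_segment_hits`, the `(score, doc_id)` tuple shape, and the sample scores are all hypothetical, and it assumes a higher score means a closer match. Each segment contributes its own top hits, and the combined list is ranked by score to give the overall top k.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical hit shape: (score, doc_id); higher score is assumed to be better.
Hit = Tuple[float, str]


def merge_segment_hits(per_segment_hits: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Combine independently computed per-segment hit lists into one top-k list."""
    # Flatten the hits from every segment into a single stream.
    all_hits = (hit for hits in per_segment_hits for hit in hits)
    # Keep the k highest-scoring hits, returned in descending score order.
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])


# Made-up example: three segments, each searched on its own.
segments = [
    [(0.91, "a"), (0.40, "b")],
    [(0.88, "c")],
    [(0.95, "d"), (0.10, "e")],
]
top3 = merge_segment_hits(segments, 3)
```

A hit from a small, recently written segment competes on equal terms with hits from large merged segments, because only the score decides the final order.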

Re: KNN HNSW - performance over time with document updates

2023-02-28 Thread Kent Fitch
s required a lot of memory: I guess with multiple segment merges happening in the JVM, and each dealing with multiple large in-memory representations of their incoming and outgoing segments' HNSW graphs, a lot of heap is required! best regards Kent Fitch On Wed, Mar 1, 2023 at 2:23 A

Re: KNN HNSW - performance over time with document updates

2023-02-26 Thread Kent Fitch
get embeddings for 160M articles for this test - we are just trying to test whether Lucene's HNSW is feasible for our use-case), so in the overwhelming majority of "misses", the top article is indeed very similar to the article sought. That is, for our use case, the results are sa