Hi all,

I'm a bit uncertain how KNN with HNSW works in Solr for dense vector
fields and searching.

Recently I've been testing by loading dense vectors produced by running
inference on images, then checking the closest matches by eye. The results
look odd: very similar images are not among the nearest results, as I'd
normally expect them to be.
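
One way to tell whether odd-looking results come from the approximate HNSW
search or from the vectors themselves is to compare Solr's top-k against an
exact brute-force ranking on a small sample. Below is a minimal sketch,
assuming cosine similarity and that the vectors have been exported into a
Python dict keyed by document id. The function names are hypothetical and
not part of Solr.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Return the ids of the k vectors most similar to `query`.

    `vectors` maps document id -> list of floats (assumed export format).
    This scans every vector, so it is only practical on a sample.
    """
    ranked = sorted(
        vectors,
        key=lambda doc_id: cosine(query, vectors[doc_id]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbours that the approximate search found."""
    exact = set(exact_ids)
    return len(exact.intersection(approx_ids)) / len(exact)
```

If the exact ranking also looks wrong by eye, the problem is more likely in
the vectors or the similarity function than in HNSW. If the exact ranking
looks right but recall against Solr's results is low, the graph parameters
or the state of the index are the better suspects.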

I'm unclear about HNSW in general: for example, what are the best policies,
or a good guide or starting point, for choosing hnswMaxConnections and
hnswBeamWidth values when the dense vector length is known (512) and there
are 2 million+ documents?
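
For concreteness, both parameters are set on the dense vector field type.
The sketch below builds a Schema API "add-field-type" payload for a
512-dimension field. The field type name, the cosine similarity function,
and the values 16 and 100 are illustrative placeholders only, not a
recommendation and not the settings used in the tests described here.

```python
import json


def build_field_type(max_connections, beam_width, dimension=512):
    """Build a Schema API payload for a DenseVectorField type.

    The name "knn_vector_512" is hypothetical; similarityFunction is an
    assumption, since the original post does not say which one is used.
    """
    return {
        "add-field-type": {
            "name": "knn_vector_512",
            "class": "solr.DenseVectorField",
            "vectorDimension": dimension,
            "similarityFunction": "cosine",
            "knnAlgorithm": "hnsw",
            "hnswMaxConnections": max_connections,
            "hnswBeamWidth": beam_width,
        }
    }


if __name__ == "__main__":
    # Placeholder values; tune and measure recall before settling on any.
    print(json.dumps(build_field_type(16, 100), indent=2))
```

Broadly, raising either value tends to improve recall at the cost of slower
indexing and a larger graph, so any change is worth measuring against an
exact baseline rather than judging by eye alone.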

One thing I'm wondering right now: with a dataset where documents have been
added and removed over time, can this affect the KNN search (i.e. would it
be better if all documents, or at least the dense vector field, had been
indexed fresh)?

BTW, I haven't yet moved from Solr 9.0 to 9.1, but I have read that the
HNSW codec has changed in some way, so a reindex is required. I should
probably try 9.1 (I would prioritise this if anyone says 9.1 gives better
quality or better performance for KNN searches!).

Thanks for any info!

Derek

-- 
Derek Conniffe
Harvey Software Systems Ltd T/A HSSL
Telephone (IRL): 086 856 3823
Telephone (US): (650) 449 6044
Skype: dconnrt
Email: de...@hssl.ie


