[ 
https://issues.apache.org/jira/browse/SOLR-17928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18032183#comment-18032183
 ] 

Chris M. Hostetter commented on SOLR-17928:
-------------------------------------------

misc comments skimming the patch:
 * Jira summary/description (and the CHANGES entry in the patch) revert to 
adding an {{"efSearch"}} parameter – which is what the code variable gets 
named, but the actual parameter users will see is named {{"ef-search"}} in the 
code ... i don't know of any other local parameters in solr that use a dash 
character? using the cameal case {{"efSearch"}} would certainly be more 
consistent with other params
 * I had to re-read the {{SolrKnnXXXVectorQuery}} classes 4 times before I 
could wrap my head around why it was maintaining a local copy of {{private 
final int topK}} since the super class already has a {{protected final int k}} 
... can we please add a comment drawing attention to the fact that we pass 
{{efSearch}} to the super constructor to use as the "k" for the graph walk
 * Neither of the {{SolrKnnXXXVectorQuery}} classes override hashCode or 
equals, so similar queries with distinct {{topK}} values are going to cause 
false positive cache hits.
 * IIUC defaulting {{efSearch}} to {{topK * 2}} is a backcompat change – won't 
that make all knn queries two twice as much work under the covers (ideally 
providing better accuracy) on upgrade? ... not sure how i feel about that.  I 
feel like a lot of people may already setting {{topK}} based on how 
aggressively the want the graph walk to be, even if they only intend to look at 
a smaller number of results.
 * I don't see any validation that {{efSearch}} is bigger then {{topK}} ... 
should there be?

> Add efSearch parameter to knn query
> -----------------------------------
>
>                 Key: SOLR-17928
>                 URL: https://issues.apache.org/jira/browse/SOLR-17928
>             Project: Solr
>          Issue Type: Improvement
>          Components: vector-search
>            Reporter: Ishan Chattopadhyaya
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, only topK can be requested. efSearch is a standard overfetch 
> parameter.
> Proposing that we add it for better recall accuracy.
> (FYI, Elasticsearch calls it num_candidates. Commonly referred to as 
> efSearch, similar to efConstruction that we call beamWidth)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to