[ https://issues.apache.org/jira/browse/SOLR-17447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881420#comment-17881420 ]
Siju Varghese edited comment on SOLR-17447 at 9/12/24 10:16 PM:
----------------------------------------------------------------

[~epugh] Our latency budget for the "search as you type" use case is < 10 ms, and our index is about 150 GB. We plan to use both timeAllowed and maxHits: timeAllowed is an upper bound, maxHits is a lower bound. The idea is that, for the search-as-you-type use case, the searcher goes through a limited number of docs and returns what it finds. Attached a WIP patch for reference.

> Add support for maxHits
> -----------------------
>
>                 Key: SOLR-17447
>                 URL: https://issues.apache.org/jira/browse/SOLR-17447
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: SearchComponents - other
>            Reporter: Siju Varghese
>            Priority: Minor
>         Attachments: Add_support_for_maxHits__Max_hits_is_a_hard_value_for_number__of_hits_the_searcher_iterate1.patch
>
>
> Currently there are 3 mechanisms to control the number of hits for a query:
> * Use of the _timeAllowed_ query parameter. Though this does not directly control the number of hits, it has a similar effect, with the collector terminating after the specified time budget has been exceeded. The primary objective of this switch is to control runaway queries.
> * Use of the {{segmentTerminateEarly}} parameter. This parameter is only applicable for sorted segments where the sort criteria requested matches the sort criteria used in the SortingMergePolicy.
> * Use of the _cpuAllowed_ parameter to put an upper bound on CPU time for a query.
>
> I would like to propose a new _maxHits_ parameter. This parameter terminates the query early once it has gone past the provided number of hits per shard.
> For us the motivation for such a parameter is the following:
> Our search is extremely latency sensitive, and the query set is a mix of very high-frequency tokens, where we favor fast recall, and typical search queries, where we favor precision at low latency. The former can be thought of as a search-as-you-type use case: we want to return results quickly and go over just enough documents, which we plan to control via the maxHits parameter. We can't use a sorted index for our use case because the sort criteria is a ranking function based on document features and the user input.
> With the maxHits parameter, it is quite likely that the results returned will not be the most relevant ones; however, that is acceptable for us.
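
For illustration only, the proposed behaviour can be sketched outside Solr as a collector loop over one shard that stops after a fixed number of hits or once a time budget has elapsed, whichever comes first. This is not the attached patch and uses no Solr or Lucene API: the function `collect_with_limits` and its parameters (`max_hits`, `time_allowed_ms`, `rows`, `clock`) are hypothetical names chosen for this sketch, and the scoring function stands in for the ranking function mentioned in the description.

```python
import heapq
import time
from typing import Callable, Iterable, List, Optional, Tuple


def collect_with_limits(
    matching_docs: Iterable[int],
    score: Callable[[int], float],
    max_hits: Optional[int] = None,
    time_allowed_ms: Optional[float] = None,
    rows: int = 10,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Tuple[float, int]], bool]:
    """Collect the best `rows` docs seen before a limit is reached.

    Hypothetical single-shard sketch: returns (results, partial), where
    `partial` is True if collection stopped before all matches were visited.
    """
    deadline = None
    if time_allowed_ms is not None:
        deadline = clock() + time_allowed_ms / 1000.0

    top: List[Tuple[float, int]] = []  # min-heap of (score, doc_id)
    hits = 0
    partial = False

    for doc_id in matching_docs:
        # Hit budget: stop once max_hits documents have been collected.
        if max_hits is not None and hits >= max_hits:
            partial = True
            break
        # Time budget: stop once the deadline has passed.
        if deadline is not None and clock() >= deadline:
            partial = True
            break

        hits += 1
        entry = (score(doc_id), doc_id)
        if len(top) < rows:
            heapq.heappush(top, entry)
        elif entry > top[0]:
            heapq.heapreplace(top, entry)

    # Best score first.
    return sorted(top, reverse=True), partial


if __name__ == "__main__":
    results, partial = collect_with_limits(
        range(1000), score=float, max_hits=100, rows=3
    )
    print(results, partial)  # [(99.0, 99), (98.0, 98), (97.0, 97)] True
```

In this sketch only the first 100 matching documents are ever scored, so the returned top 3 are the best of those 100 rather than the best of all 1000. That is the trade-off the description accepts: bounded work per shard in exchange for possibly less relevant results.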