[ 
https://issues.apache.org/jira/browse/SOLR-17319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855738#comment-17855738
 ] 

Alessandro Benedetti commented on SOLR-17319:
---------------------------------------------

Hi [~hossman], I suggest we move all discussion to the Pull Request, where 
[~dsmiley] and I are currently iterating.

Thanks for your input, by the way. Let's see how the PR evolves and take it 
from there!

Regarding distributed support, I would not say that "it doesn't return 
correct results": 

Combining query results (as Reciprocal Rank Fusion does) can happen either per 
node (1) or in the coordinator (2), after the results have been merged from 
the different shards.
(1) is not ideal in my opinion, but I don't think it's strictly wrong: the new 
score from Reciprocal Rank Fusion is based on rank, so it is still 
comparable across shards (we do the same for Learning To Rank in Apache Solr).
(2) I believe this would be a welcome improvement because we would first find 
the distributed best-ranked list for each query and only combine the lists 
afterwards.
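
To make the difference between (1) and (2) concrete, here is a minimal Python 
sketch. It is not the code from the PR: the function names, the constant k=60 
(the value used by Cormack et al.), the toy shard data, and the assumption that 
the coordinator in (1) simply merges shards by their RRF score are all 
hypothetical and only for illustration.

```python
from collections import defaultdict

K = 60  # assumed smoothing constant (value used by Cormack et al.)

def rrf(rankings, k=K):
    """Sum 1 / (k + rank) for every list a doc appears in (rank starts at 1)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return dict(scores)

def top(scores, n):
    # Highest score first; ties broken by doc id so the output is deterministic.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ordered[:n]]

# Hypothetical data: shard -> query -> [(doc_id, original score)], best first.
SHARDS = {
    "shard1": {"q1": [("a", 2.0), ("b", 1.0)],
               "q2": [("a", 2.0), ("b", 1.0)]},
    "shard2": {"q1": [("c", 9.0), ("d", 8.0), ("e", 7.0)],
               "q2": [("d", 9.0), ("e", 8.0), ("c", 7.0)]},
}

def fuse_per_node(shards, n):
    # (1) Each shard fuses its own lists; shards are then merged by RRF score.
    merged = {}
    for per_query in shards.values():
        local_lists = [[doc_id for doc_id, _ in hits] for hits in per_query.values()]
        merged.update(rrf(local_lists))  # doc ids are unique across shards here
    return top(merged, n)

def fuse_in_coordinator(shards, n):
    # (2) Build the distributed ranked list per query first, then fuse once.
    # Assumes the original scores are comparable across shards.
    query_names = sorted({q for per_query in shards.values() for q in per_query})
    global_lists = []
    for q in query_names:
        hits = [hit for per_query in shards.values() for hit in per_query.get(q, [])]
        hits.sort(key=lambda hit: (-hit[1], hit[0]))
        global_lists.append([doc_id for doc_id, _ in hits])
    return top(rrf(global_lists), n)

print(fuse_per_node(SHARDS, 3))
print(fuse_in_coordinator(SHARDS, 3))
```

With this toy data, (1) returns a, d, c while (2) returns d, c, e. The data was 
chosen so that the two strategies differ; whether such differences matter in 
practice is the judgment call discussed above.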

If we consider (1) unacceptable, I am happy to wait for someone to contribute 
(2), but I am afraid I won't be able to work on it anytime soon, and it would 
possibly be a shame to delay this feature until then.

Let's keep the discussion in one place and continue on GitHub, where we can 
also go through the rest of your ideas, which are much appreciated!

> Introduce support for Reciprocal Rank Fusion (combining queries)
> ----------------------------------------------------------------
>
>                 Key: SOLR-17319
>                 URL: https://issues.apache.org/jira/browse/SOLR-17319
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query
>    Affects Versions: 9.6.1
>            Reporter: Alessandro Benedetti
>            Assignee: Alessandro Benedetti
>            Priority: Major
>
> Reciprocal Rank Fusion (RRF) is an algorithm that takes multiple ranked 
> lists as input and produces a unified result set. 
> Examples of use cases where RRF can be used include hybrid search and 
> multiple Knn vector queries executed concurrently. 
> RRF is based on the concept of reciprocal rank, which is the inverse of the 
> rank of a document in a ranked list of search results. 
> The search results are combined by taking into account the position of the 
> items in the original rankings, giving a higher score to items that are 
> ranked higher in multiple lists. RRF was first introduced by 
> Cormack et al. in [1].
> The syntax proposed:
> JSON Request
> {code:json}
> {
>     "queries": {
>         "lexical1": {
>             "lucene": {
>                 "query": "id:(10^=2 OR 2^=1 OR 4^=0.5)"
>             }
>         },
>         "lexical2": {
>             "lucene": {
>                 "query": "id:(2^=2 OR 4^=1 OR 3^=0.5)"
>             }
>         }
>     },
>     "limit": 10,
>     "fields": "[id,score]",
>     "params": {
>         "combiner": true,
>         "combiner.upTo": 5,
>         "facet": true,
>         "facet.field": "id",
>         "facet.mincount": 1
>     }
> }
> {code}
> [1] Cormack, Gordon V. et al. “Reciprocal rank fusion outperforms Condorcet 
> and individual rank learning methods.” Proceedings of the 32nd international 
> ACM SIGIR conference on Research and development in information retrieval 
> (2009).
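
As a worked illustration of the description above, the following Python sketch 
applies RRF to the two rankings implied by the JSON request: the constant-score 
boosts (^=) order lexical1 as 10, 2, 4 and lexical2 as 2, 4, 3. The function 
name `rrf` and the constant k=60 (taken from Cormack et al.) are assumptions; 
the actual parameter names and defaults in Solr may differ, and 
"combiner.upTo" is not modelled here.

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Return (doc_id, score) pairs, best first; score = sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # rank starts at 1
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by doc id for a deterministic order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Rankings implied by the constant-score boosts in the example request.
lexical1 = ["10", "2", "4"]
lexical2 = ["2", "4", "3"]

fused = rrf([lexical1, lexical2])
for doc_id, score in fused:
    print(doc_id, round(score, 5))
```

Under these assumptions the fused order is 2, 4, 10, 3: documents 2 and 4 
appear in both lists and therefore outrank document 10, even though 10 is the 
top hit of lexical1.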



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

