[
https://issues.apache.org/jira/browse/SOLR-8776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190864#comment-15190864
]
Diego Ceccarelli commented on SOLR-8776:
----------------------------------------
I uploaded a new patch. Groups are now reranked according to their maximum
reranked scores. In the {{finish()}} method of the grouping {{CommandField}} I added:
{code:java}
if (result != null && query instanceof RankQuery && groupSort == Sort.RELEVANCE) {
  // if we are sorting for relevance and query is a RankQuery, it may be that
  // the order of the groups changed, we need to reorder
  GroupDocs[] groups = result.groups;
  Arrays.sort(groups, new Comparator<GroupDocs>() {
    @Override
    public int compare(GroupDocs o1, GroupDocs o2) {
      if (o1.maxScore > o2.maxScore) return -1;
      if (o1.maxScore < o2.maxScore) return 1;
      return 0;
    }
  });
}
{code}
This will reorder the groups if we re-rank the documents with the rank query.
The second test succeeds.
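To make the reordering step easy to try outside Solr, here is a minimal standalone sketch of the same idea. {{ScoredGroup}} and {{GroupReorder}} are hypothetical names used only for illustration: {{ScoredGroup}} stands in for Lucene's {{GroupDocs}} and keeps just a group value and a max score. The sketch uses {{Float.compare}} instead of the hand-written comparisons, which is a substitution on my side, not what the patch does.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical stand-in for Lucene's GroupDocs: only the fields needed here.
final class ScoredGroup {
    final String groupValue;
    final float maxScore;

    ScoredGroup(String groupValue, float maxScore) {
        this.groupValue = groupValue;
        this.maxScore = maxScore;
    }
}

final class GroupReorder {
    // Sorts the groups in place by descending maxScore.
    // Arrays.sort on object arrays is stable, so groups with equal
    // scores keep their original relative order.
    static void sortByMaxScoreDescending(ScoredGroup[] groups) {
        Arrays.sort(groups, new Comparator<ScoredGroup>() {
            @Override
            public int compare(ScoredGroup a, ScoredGroup b) {
                // Arguments are swapped to get descending order.
                return Float.compare(b.maxScore, a.maxScore);
            }
        });
    }
}
```

Calling {{GroupReorder.sortByMaxScoreDescending(groups)}} on groups with max scores 0.2, 0.9 and 0.5 leaves them ordered 0.9, 0.5, 0.2.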
I'm still thinking about what the correct semantics for reranking + grouping
should be.
When you apply a query {{q}} and then a rank query {{rq}}, you first score all
the documents with {{q}} and then rescore the top-N documents with {{rq}}. The
problem with grouping is that, in order to get the top groups, you first need
to score the whole collection: a document may score very low with {{q}} but
get a high score with {{rq}}, and the only way to find it is to rerank the
whole collection, which is impractical. There are two possible solutions:
- if we want to apply {{rq}} to the top 1000 documents, we can collect the
groups that appear in those top 1000 documents; these are the same groups we
would obtain by scoring directly with {{rq}}, but in a different order;
- we can collect more groups than we need, and then rerank the top
documents in each group. I would call this solution **Group Reranking**.
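The first option, and the limitation described above, can be illustrated with a small sketch in plain Java. Everything here is hypothetical and independent of the Solr and Lucene APIs: {{Doc}} carries precomputed {{q}} and {{rq}} scores, and a group's score is assumed to be the maximum {{rq}} score among its documents in the top N.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TopNGroupRerank {
    // Hypothetical document with precomputed scores for q and rq.
    static final class Doc {
        final String group;
        final float qScore;
        final float rqScore;

        Doc(String group, float qScore, float rqScore) {
            this.group = group;
            this.qScore = qScore;
            this.rqScore = rqScore;
        }
    }

    // Returns the group values found in the top-n documents by q score,
    // ordered by the maximum rq score seen in each group.
    static List<String> groupsAfterRerank(List<Doc> docs, int n) {
        // First pass: rank all documents with q and keep the top n.
        List<Doc> byQ = new ArrayList<>(docs);
        byQ.sort((a, b) -> Float.compare(b.qScore, a.qScore));
        List<Doc> topN = byQ.subList(0, Math.min(n, byQ.size()));

        // Second pass: rescore only those n documents with rq and
        // remember the best rq score per group.
        Map<String, Float> maxRq = new LinkedHashMap<>();
        for (Doc d : topN) {
            maxRq.merge(d.group, d.rqScore, Float::max);
        }

        // Order the groups by their best rq score, descending.
        List<String> groups = new ArrayList<>(maxRq.keySet());
        groups.sort((g1, g2) -> Float.compare(maxRq.get(g2), maxRq.get(g1)));
        return groups;
    }
}
```

In this sketch the set of groups is fixed by the first pass and {{rq}} only changes their order. With three documents A (q=0.9, rq=0.1), B (q=0.8, rq=0.7) and C (q=0.1, rq=1.0) and n=2, the result is B, A: group C is never found although it has the highest {{rq}} score.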
In my opinion group reranking is the better solution. Imagine a single group
contains all of the top 1000 documents ranked with {{q}}: we would rerank all
of them, maybe just to return one document. I guess the best approach, assuming
that we want to apply the rerank query to N documents and return the top K
groups, is to retrieve the top K*y groups, for some over-collection factor y,
and then rerank N/(K*y) documents in each group.
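A rough sketch of this group reranking budget, again in plain Java and not tied to the patch. The names are hypothetical, {{y}} is treated as a positive integer over-collection factor, and two further assumptions are made for illustration: a group is ranked in the first pass by its best {{q}} score, and after reranking by its best {{rq}} score.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupRerankSketch {
    // Hypothetical document with precomputed scores for q and rq.
    static final class Doc {
        final String group;
        final float qScore;
        final float rqScore;

        Doc(String group, float qScore, float rqScore) {
            this.group = group;
            this.qScore = qScore;
            this.rqScore = rqScore;
        }
    }

    // Rerank budget per group: N / (K * y), but at least one document.
    // Assumes k and y are positive.
    static int docsPerGroup(int n, int k, int y) {
        return Math.max(1, n / (k * y));
    }

    static List<String> topGroups(List<Doc> docs, int n, int k, int y) {
        // Bucket documents by group, each bucket sorted by q score descending.
        Map<String, List<Doc>> byGroup = new LinkedHashMap<>();
        for (Doc d : docs) {
            byGroup.computeIfAbsent(d.group, g -> new ArrayList<>()).add(d);
        }
        for (List<Doc> members : byGroup.values()) {
            members.sort((a, b) -> Float.compare(b.qScore, a.qScore));
        }

        // Keep the top K*y groups, ranked by their best q score (assumption).
        List<String> candidates = new ArrayList<>(byGroup.keySet());
        candidates.sort((g1, g2) -> Float.compare(
            byGroup.get(g2).get(0).qScore, byGroup.get(g1).get(0).qScore));
        List<String> kept = candidates.subList(0, Math.min(k * y, candidates.size()));

        // Rerank only the first N/(K*y) documents of each kept group with rq.
        int perGroup = docsPerGroup(n, k, y);
        Map<String, Float> maxRq = new HashMap<>();
        for (String g : kept) {
            List<Doc> members = byGroup.get(g);
            float best = Float.NEGATIVE_INFINITY;
            for (Doc d : members.subList(0, Math.min(perGroup, members.size()))) {
                best = Math.max(best, d.rqScore);
            }
            maxRq.put(g, best);
        }

        // Return the top K groups by best rq score.
        List<String> result = new ArrayList<>(kept);
        result.sort((g1, g2) -> Float.compare(maxRq.get(g2), maxRq.get(g1)));
        return new ArrayList<>(result.subList(0, Math.min(k, result.size())));
    }
}
```

For example, with N=1000, K=10 and y=2, the sketch keeps 20 candidate groups and reranks at most 50 documents in each, so the total rerank cost stays bounded by N while no single group can consume the whole budget.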
> Support RankQuery in grouping
> -----------------------------
>
> Key: SOLR-8776
> URL: https://issues.apache.org/jira/browse/SOLR-8776
> Project: Solr
> Issue Type: Improvement
> Components: search
> Affects Versions: master
> Reporter: Diego Ceccarelli
> Priority: Minor
> Fix For: master
>
> Attachments: 0001-SOLR-8776-Support-RankQuery-in-grouping.patch,
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch,
> 0001-SOLR-8776-Support-RankQuery-in-grouping.patch
>
>
> Currently it is not possible to use RankQuery [1] and Grouping [2] together
> (see also [3]). In some situations Grouping can be replaced by Collapse and
> Expand Results [4] (that supports reranking), but i) collapse cannot
> guarantee that at least a minimum number of groups will be returned for a
> query, and ii) in the Solr Cloud setting you will have constraints on how to
> partition the documents among the shards.
> I'm going to start working on supporting RankQuery in grouping. I'll start
> attaching a patch with a test that fails because grouping does not support
> the rank query and then I'll try to fix the problem, starting from the non
> distributed setting (GroupingSearch).
> My feeling is that since grouping is mostly performed by Lucene, RankQuery
> should be refactored and moved (or partially moved) there.
> Any feedback is welcome.
> [1] https://cwiki.apache.org/confluence/display/solr/RankQuery+API
> [2] https://cwiki.apache.org/confluence/display/solr/Result+Grouping
> [3]
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201507.mbox/%3ccahm-lpuvspest-sw63_8a6gt-wor6ds_t_nb2rope93e4+s...@mail.gmail.com%3E
> [4]
> https://cwiki.apache.org/confluence/display/solr/Collapse+and+Expand+Results