tomglk commented on a change in pull request #151:
URL: https://github.com/apache/solr/pull/151#discussion_r640664451



##########
File path: solr/core/src/java/org/apache/solr/handler/component/QueryComponent.java
##########
@@ -999,20 +1013,34 @@ protected void mergeIds(ResponseBuilder rb, ShardRequest sreq) {
 
           shardDoc.sortFieldValues = unmarshalledSortFieldValues;
 
-          queue.insertWithOverflow(shardDoc);
+          if(reRankQueue != null && docCounter++ <= reRankDocsSize) {
+              ShardDoc droppedShardDoc = reRankQueue.insertWithOverflow(shardDoc);
+              // FIXME: Only works if the original request does not sort by score

Review comment:
   The current solution only works if the original request does not sort by score.
   
   This is because the documents' scores are overwritten during reRanking.
   The reRankDocs parameter, which specifies the number of docs to reRank, is applied per shard, but it also has to be applied when the shard results are combined.
   
   Therefore, each shard response may contain documents that were reRanked on the shard but should not count as reRanked in the combined result.
   
   Example:
   reRankDocs = 2
   **shard1:** doc_1 (score 200, reRanked), doc_2 (score 100, reRanked, original score 40), doc_3 (score 30)
   **shard2:** doc_4 (score 300, reRanked), doc_5 (score 50, reRanked, original score 25), doc_6 (score 20)
   
   **expected result:**
   doc_4 (score 300, reRanked), doc_1 (score 200, reRanked), doc_2, _doc_3, doc_5_, doc_6
   
   **actual result:**
   doc_4 (score 300, reRanked), doc_1 (score 200, reRanked), doc_2, _doc_5, doc_3_, doc_6
   
   The problem is that we compare scores after reRanking (doc_2 and doc_5) with scores before reRanking (doc_3 and doc_6).
   At this point we have no access to the scores before reRanking, and I currently see no way to retrieve them.
   Depending on the reRanking algorithm used, these scores may differ greatly, which results in an incorrect ordering of the results at positions > reRankDocs.
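
To make the example concrete, here is a minimal standalone sketch that reproduces both orderings. It is not Solr code: `ReRankMergeSketch`, `Doc`, `naiveMerge` and `mergeWithOriginalScores` are hypothetical names, and the second method assumes that every shard also returned the original (pre-reRank) score, which is exactly what is missing in `mergeIds` today. The original scores of doc_1 and doc_4 are not part of the example above; the values 60 and 70 are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class ReRankMergeSketch {

    /** Hypothetical stand-in for a merged shard document; not Solr's ShardDoc. */
    static final class Doc {
        final String id;
        final double score;          // score as returned by the shard (reRanked where applicable)
        final double originalScore;  // ASSUMPTION: pre-reRank score, not available in mergeIds today

        Doc(String id, double score, double originalScore) {
            this.id = id;
            this.score = score;
            this.originalScore = originalScore;
        }
    }

    /** The documents from the example, with reRankDocs = 2 applied on each shard. */
    static List<Doc> exampleDocs() {
        return List.of(
            new Doc("doc_1", 200, 60),  // original score 60 is made up for illustration
            new Doc("doc_2", 100, 40),
            new Doc("doc_3", 30, 30),   // not reRanked: original score equals score
            new Doc("doc_4", 300, 70),  // original score 70 is made up for illustration
            new Doc("doc_5", 50, 25),
            new Doc("doc_6", 20, 20));  // not reRanked: original score equals score
    }

    static List<String> ids(List<Doc> docs) {
        return docs.stream().map(d -> d.id).collect(Collectors.toList());
    }

    /** Sorts by the returned score only, so reRanked and original scores get mixed. */
    static List<String> naiveMerge(List<Doc> docs) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingDouble((Doc d) -> d.score).reversed());
        return ids(sorted);
    }

    /**
     * Picks the global top reRankDocs by original score, orders those by their
     * reRanked score, and orders everything else by original score.
     */
    static List<String> mergeWithOriginalScores(List<Doc> docs, int reRankDocs) {
        List<Doc> byOriginal = new ArrayList<>(docs);
        byOriginal.sort(Comparator.comparingDouble((Doc d) -> d.originalScore).reversed());

        int cut = Math.max(0, Math.min(reRankDocs, byOriginal.size()));
        List<Doc> head = new ArrayList<>(byOriginal.subList(0, cut));
        head.sort(Comparator.comparingDouble((Doc d) -> d.score).reversed());

        List<Doc> merged = new ArrayList<>(head);
        merged.addAll(byOriginal.subList(cut, byOriginal.size()));
        return ids(merged);
    }

    public static void main(String[] args) {
        List<Doc> docs = exampleDocs();
        System.out.println("naive merge:          " + naiveMerge(docs));
        System.out.println("with original scores: " + mergeWithOriginalScores(docs, 2));
    }
}
```

With these inputs, `naiveMerge` yields the "actual result" ordering and `mergeWithOriginalScores(docs, 2)` yields the "expected result" ordering. The sketch only illustrates why the original score would be needed at merge time; it does not suggest how the shards should transport it.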




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


