[GitHub] [solr] cpoerschke commented on a change in pull request #8: SOLR-14607: LTR Query, the TimeAllowed parameter causes a timeout exception after no results

GitBox Thu, 11 Mar 2021 11:14:59 -0800


cpoerschke commented on a change in pull request #8:
URL: https://github.com/apache/solr/pull/8#discussion_r592622736




##########
File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/LTRRescorer.java
##########
@@ -181,14 +185,19 @@ public void scoreFeatures(IndexSearcher indexSearcher,
         readerContext = leaves.get(readerUpto);
         endDoc = readerContext.docBase + readerContext.reader().maxDoc();
       }
-      // We advanced to another segment
-      if (readerContext != null) {
-        docBase = readerContext.docBase;
-        scorer = modelWeight.scorer(readerContext);
+      try{
+        // We advanced to another segment
+        if (readerContext != null) {
+          docBase = readerContext.docBase;
+          scorer = modelWeight.scorer(readerContext);
+        }
+        scoreSingleHit(indexSearcher, topN, modelWeight, docBase, hitUpto, 
hit, docID, scoringQuery, scorer, reranked);
+        hitUpto++;
+      } catch (ExitableDirectoryReader.ExitingReaderException ex) {
+        break;

Review comment:
       An advantage of handling the exception here is that the code has an 
opportunity to do "as much as possible" reranking in the available time i.e. 
some reranking rather than no reranking. A disadvantage is that the calling 
code and end user don't know if the returned results were the "true" results or 
if they are "partial" because there was only enough time for part of the 
reranking work.

##########
File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/LTRScoringQuery.java
##########
@@ -249,6 +250,8 @@ private void createWeights(IndexSearcher searcher, boolean 
needsScores,
       try{
         Feature.FeatureWeight fw = f.createWeight(searcher, needsScores, req, 
originalQuery, efi);
         featureWeights.add(fw);
+      } catch (ExitableDirectoryReader.ExitingReaderException ex) {
+        throw ex;

Review comment:
       Differentiating `ExitableDirectoryReader.ExitingReaderException` from 
other `Exception` types here would give calling code an opportunity to handle 
the exception. I haven't yet explored how perhaps 
https://github.com/apache/solr/blob/main/solr/contrib/ltr/src/test/org/apache/solr/ltr/TestLTRScoringQuery.java
 could include testing for `ExitableDirectoryReader.ExitingReaderException` 
exceptions.

##########
File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/LTRRescorer.java
##########
@@ -124,10 +125,13 @@ public TopDocs rescore(IndexSearcher searcher, TopDocs 
firstPassTopDocs,
     final LTRScoringQuery.ModelWeight modelWeight = 
(LTRScoringQuery.ModelWeight) searcher
         .createWeight(searcher.rewrite(scoringQuery), ScoreMode.COMPLETE, 1);
 
-    scoreFeatures(searcher,topN, modelWeight, firstPassResults, leaves, 
reranked);
+    int hitUpto = scoreFeatures(searcher,topN, modelWeight, firstPassResults, 
leaves, reranked);

Review comment:
       Via this change here `scoreFeatures` handles the 
`ExitableDirectoryReader.ExitingReaderException` scenario but in the 
[artificial 
scenario](https://issues.apache.org/jira/browse/SOLR-14607?focusedCommentId=17299792&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17299792)
 using the `techproducts` example the `searcher.createWeight` was also 
encountering the exception.

##########
File path: solr/core/src/java/org/apache/solr/search/ReRankCollector.java
##########
@@ -150,6 +160,12 @@ public TopDocs topDocs(int start, int howMany) {
         rescoredDocs.scoreDocs = scoreDocs;
         return rescoredDocs;
       }
+    } catch (ExitableDirectoryReader.ExitingReaderException ex) {
+      if (mainDocs != null) {
+        throw new UnReRankedTopDocs(mainDocs);
+      } else {
+        throw ex;

Review comment:
       If we managed to get the top docs but then rescoring encountered the 
timeout exception then we could provide the top docs to the caller, for the 
caller to decide if un-reranked documents are better than no documents at all.

##########
File path: solr/core/src/java/org/apache/solr/search/ReRankCollector.java
##########
@@ -97,9 +105,11 @@ public ScoreMode scoreMode() {
   @SuppressWarnings({"unchecked"})
   public TopDocs topDocs(int start, int howMany) {
 
+    TopDocs mainDocs = null;
+
     try {
 
-      TopDocs mainDocs = mainCollector.topDocs(0,  Math.max(reRankDocs, 
length));
+      mainDocs = mainCollector.topDocs(0,  Math.max(reRankDocs, length));

Review comment:
       Here we get the top docs (and I image we could hit the timeout exception 
at this point) and then at line 119 we rescore the documents (and there the 
timeout exception can also happen).

##########
File path: solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java
##########
@@ -1599,7 +1599,13 @@ public ScoreMode scoreMode() {
       ScoreMode scoreModeUsed = buildAndRunCollectorChain(qr, query, 
collector, cmd, pf.postFilter).scoreMode();
 
       totalHits = topCollector.getTotalHits();
-      TopDocs topDocs = topCollector.topDocs(0, len);
+      TopDocs topDocs = null;
+      try {
+        topDocs = topCollector.topDocs(0, len);
+      } catch (ReRankCollector.UnReRankedTopDocs urrTopDocs) {
+        topDocs = urrTopDocs.topDocs;
+        qr.setPartialResults(true);
+      }

Review comment:
       `buildAndRunCollectorChain` already handles the timeout exception but 
afterwards `topCollector.topDocs` can also encounter the timeout exception. If 
un-reranked documents are available then we provide them here and indicate to 
the caller that the response is partial. The `timeAllowed` query parameter [1] 
and the `partialResults` response flag go together i.e. the client would 
already know to check about it. Also this would apply to both reranking via 
`ltr` and reranking via the `rerank` [2] parser code paths.
   
   [1] 
https://solr.apache.org/guide/8_8/query-re-ranking.html#rerank-query-parser
   [2] 
https://solr.apache.org/guide/8_8/common-query-parameters.html#timeallowed-parameter




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [solr] cpoerschke commented on a change in pull request #8: SOLR-14607: LTR Query, the TimeAllowed parameter causes a timeout exception after no results

Reply via email to