[ https://issues.apache.org/jira/browse/SOLR-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318161#comment-17318161 ]
Christine Poerschke commented on SOLR-14607:
--------------------------------------------

Returning to this, I wonder if it might be helpful to brainstorm some scenarios and possibilities, away from and complementary to the code/pull request discussions?

----

imaginary setup:
* our collection contains 1000 documents
* 500 documents match the query {{foobar:hello}}
* we want to rerank the best 100 of the 500 matching documents
* we want to return the best 10 of the reranked documents
* we want features to be returned for the documents

search parameters:
{code}
q=foobar:hello&rows=10&rq={!ltr model=myModel reRankDocs=100}&fl=id,foobar,[features],score
{code}

assumptions (just to help us think about different scenarios):
* matching the query takes 1ms per document i.e. 500ms for the 500 documents that match the query
* reranking the documents takes 10ms per document i.e. 1000ms for the best 100 of the 500 matching documents

scenario 1: no timeAllowed limit
* 500ms + 1000ms = 1500ms
* the search completes in one and a half seconds, returning full results and feature values

scenario 2a: timeAllowed=5ms
* the search logically hits the timeAllowed limit whilst matching the query, after matching 5 documents
* there is no time left for reranking
* 5 documents is less than rows=10 and also less than reRankDocs=100

scenario 2b: timeAllowed=80ms
* the search logically hits the timeAllowed limit whilst matching the query, after matching 80 documents
* there is no time left for reranking
* 80 documents is more than rows=10 but less than reRankDocs=100

scenario 2c: timeAllowed=123ms
* the search logically hits the timeAllowed limit whilst matching the query, after matching 123 documents
* there is no time left for reranking
* 123 documents is more than rows=10 and also more than reRankDocs=100

scenario 3a: timeAllowed=550ms
* the search spends 500ms matching the query
* 50ms are left for reranking and in that time the features for 5 documents could be computed
* 5 documents is less than rows=10 and also less than reRankDocs=100

scenario 3b: timeAllowed=750ms
* the search spends 500ms matching the query
* 250ms are left for reranking and in that time the features for 25 documents could be computed
* 25 documents is more than rows=10 but less than reRankDocs=100

----

brainstorming possibilities:

all scenarios:
* possibility 0: disallow use of {{timeAllowed}} with {{ltr}} re-ranking

scenario 2:
* possibility 1: don't compute feature values, don't do re-ranking, return the partial results with the existing {{partialResults}} flag set in the response; from the flag the caller understands that re-ranking did not happen.
* possibility 2: don't do re-ranking but do compute and return feature values. return the partial results with the existing {{partialResults}} flag set in the response; from the flag the caller understands that re-ranking did not happen, and via the presence of feature values the caller understands that the {{timeAllowed}} was not fully respected i.e. after the allowed time was used, additional time was still spent computing features.
* possibility 3: do compute and return feature values, do perform re-ranking, return the results with the existing {{partialResults}} flag set in the response; from the response the caller understands that the {{timeAllowed}} was not fully respected i.e. after the allowed time was used, additional time was still spent computing features and re-ranking.

scenario 3:
* possibility 4: don't compute remaining features, don't return any features, don't do re-ranking, return the results based on original scores and indicate via a new {{rerankingOmitted=true}} or similar flag in the response that re-ranking was not done.
* possibility 5: do compute remaining features, return all features, don't do re-ranking, return the results based on original scores and indicate via a new {{rerankingOmitted=true}} flag in the response that re-ranking was skipped.
via the presence of feature values and the presence of the flag the caller understands that the {{timeAllowed}} was not fully respected.
* possibility 6: do compute remaining features, return all features, do perform re-ranking. the caller understands from documentation that {{timeAllowed}} is not applied during re-ranking.

> LTR Query, timeAllowed parameter causes a timeout exception with no result
> --------------------------------------------------------------------------
>
>                 Key: SOLR-14607
>                 URL: https://issues.apache.org/jira/browse/SOLR-14607
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - LTR
>    Affects Versions: main (9.0)
>            Reporter: Dawn
>            Priority: Minor
>         Attachments: SOLR-14607-poc.patch
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When LTR is used with the timeAllowed parameter enabled, the LTR feature queries may trigger
> the 'ExitableFilterAtomicReader.CheckAndThrow' timeout checks.
> If a timeout occurs at this point, an ExitingReaderException is
> thrown, leading to a null result.
> Exception information:
> {code:java}
> The request took too long to iterate over terms. Timeout: timeoutAt:
> 50321611131050 (System.nanoTime(): 50321639573838),
> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@62eaeeaa
> {code}
>
> This exception could be caught within LTR, returning partial results rather than
> null.
> The exception occurs in two places:
> 1. 'LTRScoringQuery.CreateWeight' or 'LTRScoringQuery.createWeightsParallel'.
> This is the loading stage; ending directly on timeout is acceptable here.
> 2. 'ModelWeight.scorer'. This is the stage that evaluates each document; it can
> catch the exception and return the documents computed so far.
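
The timing arithmetic in the scenarios of the comment above can be reproduced with a small toy model. This is only a sketch under the comment's stated assumptions (1ms per matched document, 10ms per reranked document, matching finishes before reranking starts); the names {{simulate}} and {{Outcome}} are hypothetical and it does not reflect how Solr actually enforces {{timeAllowed}}.

```python
from dataclasses import dataclass
from typing import Optional

# Figures taken from the imaginary setup and assumptions in the comment.
MATCHING_DOCS = 500      # documents matching foobar:hello
RERANK_DOCS = 100        # reRankDocs=100
MATCH_MS_PER_DOC = 1     # assumed cost of matching one document
RERANK_MS_PER_DOC = 10   # assumed cost of reranking one document


@dataclass
class Outcome:
    matched: int    # documents matched before the limit was hit
    reranked: int   # documents whose features fit into the remaining time
    partial: bool   # True if the limit was hit in either phase


def simulate(time_allowed_ms: Optional[int]) -> Outcome:
    """Toy model: matching runs first, reranking gets whatever time is left."""
    if time_allowed_ms is None:
        # Scenario 1: no limit, both phases run to completion.
        return Outcome(MATCHING_DOCS, RERANK_DOCS, False)

    matched = min(MATCHING_DOCS, time_allowed_ms // MATCH_MS_PER_DOC)
    if matched < MATCHING_DOCS:
        # Scenario 2: limit hit while matching, no time left for reranking.
        return Outcome(matched, 0, True)

    # Scenario 3: matching finished, the remaining time goes to reranking.
    remaining_ms = time_allowed_ms - MATCHING_DOCS * MATCH_MS_PER_DOC
    reranked = min(RERANK_DOCS, remaining_ms // RERANK_MS_PER_DOC)
    return Outcome(matched, reranked, reranked < RERANK_DOCS)


if __name__ == "__main__":
    for limit in (None, 5, 80, 123, 550, 750):
        print(limit, simulate(limit))
```

Comparing {{matched}} and {{reranked}} against rows=10 and reRankDocs=100 gives the sub-cases (a/b/c) that the possibilities have to cover.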