[ 
https://issues.apache.org/jira/browse/SOLR-14607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318161#comment-17318161
 ] 

Christine Poerschke commented on SOLR-14607:
--------------------------------------------

Returning to this, I wonder if it might be helpful to "brainstorm" some 
scenarios and possibilities, away from and complementary to code/pull request 
discussions?

----

imaginary setup:

* our collection contains 1000 documents
* 500 documents match the query {{foobar:hello}}
* we want to rerank the best 100 of the 500 matching documents
* we want to return the best 10 of the 100 reranked documents
* we want features to be returned for the documents

search parameters:
{code}
q=foobar:hello&rows=10&rq={!ltr model=myModel reRankDocs=100}&fl=id,foobar,[features],score
{code}
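
To make it easier to try the scenarios below, here is a minimal sketch of how a client might assemble these parameters, optionally with {{timeAllowed}}. The host, port and collection name are hypothetical placeholders, and the helper name {{build_search_url}} is made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port and collection name are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def build_search_url(time_allowed_ms=None):
    """Build the example LTR search URL, optionally with a timeAllowed limit."""
    params = {
        "q": "foobar:hello",
        "rows": 10,
        "rq": "{!ltr model=myModel reRankDocs=100}",
        "fl": "id,foobar,[features],score",
    }
    if time_allowed_ms is not None:
        # timeAllowed is given in milliseconds.
        params["timeAllowed"] = time_allowed_ms
    # urlencode percent-encodes the braces, brackets and spaces.
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url())
    print(build_search_url(time_allowed_ms=750))
```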

assumptions (just to help us think about different scenarios):
* matching the query takes 1ms per document i.e. 500ms for the 500 documents 
that match the query
* reranking the documents takes 10ms per document i.e. 1000ms for the best 100 
of the 500 matching documents

scenario 1: no timeAllowed limit
* 500ms + 1000ms = 1500ms
* the search completes in one and a half seconds, returning full results and 
feature values

scenario 2a: timeAllowed=5ms
* the search logically hits the timeAllowed limit whilst matching the query, 
after matching 5 documents
* there is no time left for reranking
* 5 documents is less than the rows=10 and also less than reRankDocs=100

scenario 2b: timeAllowed=80ms
* the search logically hits the timeAllowed limit whilst matching the query, 
after matching 80 documents
* there is no time left for reranking
* 80 documents is more than the rows=10 but it is less than reRankDocs=100

scenario 2c: timeAllowed=123ms
* the search logically hits the timeAllowed limit whilst matching the query, 
after matching 123 documents
* there is no time left for reranking
* 123 documents is more than the rows=10 and also more than reRankDocs=100

scenario 3a: timeAllowed=550ms
* the search spends 500ms matching the query
* 50ms are left for reranking and in that time the features for 5 documents 
could be computed
* 5 documents is less than the rows=10 and it is also less than reRankDocs=100

scenario 3b: timeAllowed=750ms
* the search spends 500ms matching the query
* 250ms are left for reranking and in that time the features for 25 documents 
could be computed
* 25 documents is more than the rows=10 but it is less than reRankDocs=100
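
The arithmetic in the scenarios above can be reproduced with a small model. This is only a sketch of the simplified assumptions (1ms per matched document, 10ms per reranked document, strictly sequential phases, the limit hit exactly on a document boundary); it is not how Solr measures or enforces {{timeAllowed}}. The function and constant names are made up for illustration.

```python
# Simplified cost model taken from the assumptions in this comment.
MATCH_MS_PER_DOC = 1
RERANK_MS_PER_DOC = 10
NUM_MATCHING = 500
RERANK_DOCS = 100


def simulate(time_allowed_ms=None):
    """Return (docs_matched, docs_reranked, elapsed_ms) for a given limit."""
    match_cost = NUM_MATCHING * MATCH_MS_PER_DOC
    if time_allowed_ms is None:
        # Scenario 1: no limit, both phases run to completion.
        return NUM_MATCHING, RERANK_DOCS, match_cost + RERANK_DOCS * RERANK_MS_PER_DOC
    if time_allowed_ms < match_cost:
        # Scenario 2: the limit is hit while matching, nothing is reranked.
        matched = time_allowed_ms // MATCH_MS_PER_DOC
        return matched, 0, matched * MATCH_MS_PER_DOC
    # Scenario 3: matching completes, the remaining time goes to reranking.
    remaining = time_allowed_ms - match_cost
    reranked = min(RERANK_DOCS, remaining // RERANK_MS_PER_DOC)
    return NUM_MATCHING, reranked, match_cost + reranked * RERANK_MS_PER_DOC


if __name__ == "__main__":
    for limit in (None, 5, 80, 123, 550, 750):
        print(limit, simulate(limit))
```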

----

brainstorming possibilities:

all scenarios:
* possibility 0: disallow use of {{timeAllowed}} with {{ltr}} re-ranking

scenario 2:
* possibility 1: don't compute feature values, don't do re-ranking, return the 
partial results with the existing {{partialResults}} flag set in the response. 
From the flag the caller understands that re-ranking did not happen.
* possibility 2: don't do re-ranking but do compute and return feature values. 
Return the partial results with the existing {{partialResults}} flag set in the 
response. From the flag the caller understands that re-ranking did not happen, 
and from the presence of feature values the caller understands that 
{{timeAllowed}} was not fully respected, i.e. additional time was still spent 
computing features after the allowed time was used up.
* possibility 3: do compute and return feature values, do perform re-ranking, 
return the results with the existing {{partialResults}} flag set in the 
response. From the response the caller understands that {{timeAllowed}} was not 
fully respected, i.e. additional time was still spent computing features and 
re-ranking after the allowed time was used up.

scenario 3:
* possibility 4: don't compute remaining features, don't return any features, 
don't do re-ranking, return the results based on original scores and indicate 
via a new {{rerankingOmitted=true}} or similar flag in the response that 
re-ranking was not done.
* possibility 5: do compute remaining features, return all features, don't do 
re-ranking, return the results based on original scores and indicate via a new 
{{rerankingOmitted=true}} flag in the response that re-ranking was skipped. From 
the presence of both the feature values and the flag the caller 
understands that {{timeAllowed}} was not fully respected.
* possibility 6: do compute remaining features, return all features, do perform 
re-ranking. The caller understands from documentation that {{timeAllowed}} is 
not applied during re-ranking.
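
To compare the possibilities from the caller's side, here is a rough sketch of how a client might tell them apart from a parsed JSON response. It rests on several assumptions: {{rerankingOmitted}} is only the flag proposed above and does not exist in Solr today, its placement next to {{partialResults}} in the response header is a guess, and the function name is made up for illustration.

```python
def consistent_possibilities(response):
    """Return the possibility numbers (1-6) that a parsed response is consistent with."""
    header = response.get("responseHeader", {})
    docs = response.get("response", {}).get("docs", [])

    partial = bool(header.get("partialResults"))
    # Hypothetical flag proposed in this comment; its location is an assumption.
    omitted = bool(header.get("rerankingOmitted"))
    # The [features] transformer adds a "[features]" key to each document.
    has_features = any("[features]" in doc for doc in docs)

    if partial:
        # Scenario 2: possibilities 2 and 3 both return features, so the
        # flags alone cannot tell the caller whether re-ranking happened.
        return [2, 3] if has_features else [1]
    if omitted:
        # Scenario 3 with re-ranking skipped.
        return [5] if has_features else [4]
    # No flag set: possibility 6, or simply a search that never hit the limit.
    return [6] if has_features else []
```

The ambiguity between possibilities 2 and 3 in this sketch suggests that possibility 3 would need some additional signal if callers are to know that re-ranking did happen.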

> LTR Query, timeAllowed parameter causes a timeout exception with no result
> --------------------------------------------------------------------------
>
>                 Key: SOLR-14607
>                 URL: https://issues.apache.org/jira/browse/SOLR-14607
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - LTR
>    Affects Versions: main (9.0)
>            Reporter: Dawn
>            Priority: Minor
>         Attachments: SOLR-14607-poc.patch
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When LTR is used with the timeAllowed parameter enabled, the LTR feature queries may call 
> the 'ExitableFilterAtomicReader.CheckAndThrow' timeout checks.
> If a timeout occurs at this point, an ExitingReaderException is 
> thrown, leading to a null result.
> Exception information:
> {code:java}
>  The request took too long to iterate over terms. Timeout: timeoutAt: 
> 50321611131050 (System.nanoTime(): 50321639573838), 
> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@62eaeeaa
> {code}
>  
> The LTR code could catch this exception and return partial results rather than 
> null.
> This exception occurs in two places:
> 1. 'LTRScoringQuery.CreateWeight' or 'LTRScoringQuery.createWeightsParallel'. 
> This is the loading stage, so ending directly on a timeout is acceptable.
> 2. 'ModelWeight.scorer'. This stage evaluates each document, so it can 
> catch the exception and return the documents computed so far.
>  


