[ 
https://issues.apache.org/jira/browse/SOLR-17670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiaBaoGao updated SOLR-17670:
-----------------------------
    Security:     (was: Public)

> Fix unnecessary memory allocation caused by a large reRankDocs param
> --------------------------------------------------------------------
>
>                 Key: SOLR-17670
>                 URL: https://issues.apache.org/jira/browse/SOLR-17670
>             Project: Solr
>          Issue Type: Bug
>            Reporter: JiaBaoGao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> The reRank function has a reRankDocs parameter that specifies the number of 
> documents to re-rank. I've observed that increasing this parameter to test 
> its performance impact causes queries to become progressively slower. Even 
> when the parameter value exceeds the total number of documents in the index, 
> further increases continue to slow down the query, which is counterintuitive.
>  
> Therefore, I investigated the code:
>  
> For a query containing re-ranking, such as:
> {code:java}
> {
> "start": "0",
> "rows": 10,
> "fl": "ID,score",
> "q": "*:*",
> "rq": "{!rerank reRankQuery='{!func} 100' reRankDocs=1000000000 
> reRankWeight=2}"
> } {code}
>  
> The current execution logic is as follows:
> 1. Perform normal retrieval using the q parameter.
> 2. Re-score all documents retrieved in the q phase using the rq parameter.
>  
> During the retrieval in phase 1 (using q), a TopScoreDocCollector is created. 
> Underneath, this creates a PriorityQueue which contains an Object[]. The 
> length of this Object[] continuously increases with reRankDocs without any 
> limit. 
>  
> On my local test cluster with limited JVM memory, this can even trigger an 
> OOM, causing the Solr node to crash. I can also reproduce the OOM situation 
> using the SolrCloudTestCase unit test. 
>  
> I think limiting the length of the Object[] array using 
> searcher.getIndexReader().maxDoc() at ReRankCollector would resolve this 
> issue. This way, when reRankDocs exceeds maxDoc, memory allocation will not 
> continue to increase indefinitely. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to