JiaBaoGao created SOLR-17670:
--------------------------------

             Summary: Fix unnecessary memory allocation caused by a large 
reRankDocs param
                 Key: SOLR-17670
                 URL: https://issues.apache.org/jira/browse/SOLR-17670
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: JiaBaoGao


The reRank function has a reRankDocs parameter that specifies the number of 
documents to re-rank. I've observed that increasing this parameter to test its 
performance impact causes queries to become progressively slower. Even when the 
parameter value exceeds the total number of documents in the index, further 
increases continue to slow down the query, which is counterintuitive.
 
Therefore, I investigated the code:
 
For a query containing re-ranking, such as:
{code:java}
{
"start": "0",
"rows": 10,
"fl": "ID,score",
"q": "*:*",
"rq": "{!rerank reRankQuery='{!func} 100' reRankDocs=1000000000 reRankWeight=2}"
}
{code}
 
The current execution logic is as follows:
1. Perform normal retrieval using the q parameter.
2. Re-score all documents retrieved in the q phase using the rq parameter.
 
During phase 1 (retrieval using q), a TopScoreDocCollector is created. 
Underneath, it creates a PriorityQueue backed by an Object[]. The length of 
this Object[] grows with reRankDocs, with no upper bound. 
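
To give a rough sense of scale, here is a back-of-the-envelope sketch that 
estimates the size of such a backing array. It is not Lucene code: 
estimateHeapArrayBytes is a hypothetical helper, and the one spare slot, the 
16-byte array header, and the 4-byte (compressed) reference size are all 
assumptions about a typical 64-bit JVM. It counts only the array of references, 
not any objects stored in it.

```java
public class HeapArrayEstimate {
    // Assumed array header size on a typical 64-bit JVM (an assumption, not measured).
    static final long ASSUMED_ARRAY_HEADER_BYTES = 16L;

    /**
     * Hypothetical helper: rough size in bytes of an Object[] sized for
     * numHits entries plus one spare slot.
     * bytesPerReference is assumed to be 4 with compressed oops, 8 without.
     */
    static long estimateHeapArrayBytes(long numHits, int bytesPerReference) {
        return ASSUMED_ARRAY_HEADER_BYTES + (numHits + 1) * bytesPerReference;
    }

    public static void main(String[] args) {
        // A plain top-10 query versus the reRankDocs value from the example request.
        System.out.println("numHits=10         -> "
                + estimateHeapArrayBytes(10L, 4) + " bytes");
        System.out.println("numHits=1000000000 -> "
                + estimateHeapArrayBytes(1_000_000_000L, 4) + " bytes");
    }
}
```

Under these assumptions the array for reRankDocs=1000000000 alone would need 
roughly 4 GB, regardless of how many documents the index actually holds.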
 
On my local test cluster with limited JVM memory, this can even trigger an OOM, 
causing the Solr node to crash. I can also reproduce the OOM situation using 
the SolrCloudTestCase unit test. 
 
I think capping the length of the Object[] array at 
searcher.getIndexReader().maxDoc() in ReRankCollector would resolve this issue. 
This way, once reRankDocs exceeds maxDoc, further increases no longer allocate 
additional memory. 
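
A minimal sketch of the proposed clamp, written without Solr or Lucene 
dependencies so it can run on its own. boundedCollectorSize is a hypothetical 
helper, not the actual ReRankCollector code; the length parameter (start + 
rows) and the floor of 1 for an empty index are assumptions made for 
illustration, and maxDoc stands in for searcher.getIndexReader().maxDoc().

```java
public class ReRankSizing {
    /**
     * Hypothetical helper: how many hits the first-pass collector should hold.
     *
     * @param reRankDocs requested re-rank depth from the rq parameter
     * @param length     start + rows of the request (assumed input)
     * @param maxDoc     upper bound on documents in the index
     */
    static int boundedCollectorSize(int reRankDocs, int length, int maxDoc) {
        // The collector must hold enough hits for both re-ranking and paging.
        int requested = Math.max(reRankDocs, length);
        // No query can match more than maxDoc documents, so cap the size there.
        int capped = Math.min(requested, maxDoc);
        // Defensive floor (assumption): keep the size positive for an empty index.
        return Math.max(1, capped);
    }

    public static void main(String[] args) {
        // reRankDocs far larger than the index: the size is capped at maxDoc.
        System.out.println(boundedCollectorSize(1_000_000_000, 10, 5_000));
        // reRankDocs smaller than the index: the requested size is kept.
        System.out.println(boundedCollectorSize(100, 10, 5_000));
    }
}
```

With a cap like this, the allocation is bounded by the index size rather than 
by the user-supplied parameter.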



