[ https://issues.apache.org/jira/browse/SOLR-17670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17929380#comment-17929380 ]
ASF subversion and git services commented on SOLR-17670:
--------------------------------------------------------

Commit 6e2b61e529ad2c8d9068740dffb9cab8f4d9416e in solr's branch refs/heads/branch_9_8 from jiabao.gao
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=6e2b61e529a ]

SOLR-17670: Fix unnecessary memory allocation caused by a large reRankDocs param (#3181)

(cherry picked from commit 76c09a35dba42913a6bcb281b52b00f87564624a)

> Fix unnecessary memory allocation caused by a large reRankDocs param
> --------------------------------------------------------------------
>
>                 Key: SOLR-17670
>                 URL: https://issues.apache.org/jira/browse/SOLR-17670
>             Project: Solr
>          Issue Type: Bug
>            Reporter: JiaBaoGao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The reRank function has a reRankDocs parameter that specifies the number of
> documents to re-rank. I've observed that increasing this parameter to test
> its performance impact causes queries to become progressively slower. Even
> when the parameter value exceeds the total number of documents in the index,
> further increases continue to slow down the query, which is counterintuitive.
>
> Therefore, I investigated the code:
>
> For a query containing re-ranking, such as:
> {code:java}
> {
>   "start": "0",
>   "rows": 10,
>   "fl": "ID,score",
>   "q": "*:*",
>   "rq": "{!rerank reRankQuery='{!func} 100' reRankDocs=1000000000 reRankWeight=2}"
> } {code}
>
> The current execution logic is as follows:
> 1. Perform normal retrieval using the q parameter.
> 2. Re-score all documents retrieved in the q phase using the rq parameter.
>
> During the retrieval in phase 1 (using q), a TopScoreDocCollector is created.
> Underneath, this creates a PriorityQueue which contains an Object[]. The
> length of this Object[] continuously increases with reRankDocs without any
> limit.
>
> On my local test cluster with limited JVM memory, this can even trigger an
> OOM, causing the Solr node to crash. I can also reproduce the OOM situation
> using the SolrCloudTestCase unit test.
>
> I think limiting the length of the Object[] array using
> searcher.getIndexReader().maxDoc() at ReRankCollector would resolve this
> issue. This way, when reRankDocs exceeds maxDoc, memory allocation will not
> continue to increase indefinitely.
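
The capping idea proposed in the description can be illustrated with a minimal, dependency-free Java sketch. This is not the code from the commit: the class ReRankSizeSketch and the method effectiveQueueSize are hypothetical names, the maxDoc argument stands in for the value of searcher.getIndexReader().maxDoc(), and the floor of 1 for an empty index is an assumption made for the illustration.

```java
// Hypothetical illustration only; this is not Solr's ReRankCollector code.
final class ReRankSizeSketch {

    // reRankDocs: the value requested through the rq parameter.
    // maxDoc: stand-in for searcher.getIndexReader().maxDoc().
    static int effectiveQueueSize(int reRankDocs, int maxDoc) {
        // A collector can never hold more hits than the index has documents,
        // so sizing the queue beyond maxDoc only wastes memory.
        int capped = Math.min(reRankDocs, maxDoc);
        // Assumption: keep at least one slot so an empty index does not
        // produce a zero-sized queue.
        return Math.max(1, capped);
    }

    public static void main(String[] args) {
        int maxDoc = 50_000; // made-up index size for the example
        System.out.println(effectiveQueueSize(1_000_000_000, maxDoc));
        System.out.println(effectiveQueueSize(200, maxDoc));
    }
}
```

With a cap like this, the size of the backing array depends on the index size rather than on the request parameter, so raising reRankDocs past maxDoc would no longer change the allocation.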