I don't know this area well enough to give the green light on this, but at a
high level it seems reasonable. This question is probably better suited to
the dev mailing list than the users mailing list. Hopefully you'll get more
thoughts on your proposed patch there.

-Kevin

From: users@solr.apache.org At: 03/25/25 20:24:32 UTC-4:00
To: users@solr.apache.org
Cc:  chirayu.sama...@bloomreach.com
Subject: Performance issue: distributed grouping + dense vector search

Hi All,

When running vector search with grouping in a multi-shard setting, the
KnnFloatVectorQuery is executed (2+rows) times.

The culprit is this function call in
https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/grouping/distributed/command/TopGroupsFieldCommand.java#L202

if (needScores) {
  for (GroupDocs<?> group : topGroups.groups) {
    TopFieldCollector.populateScores(group.scoreDocs, searcher, query);
  }
}

Where
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java#L403
does

final Weight weight =
    searcher.createWeight(searcher.rewrite(query), ScoreMode.COMPLETE, 1);

Here, if the query is a KnnFloatVectorQuery, searcher.rewrite(query) will
execute the same vector search once for each group in topGroups.groups.

A simple fix could be to move searcher.rewrite(query) out of the
topGroups.groups loop. Thoughts?
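
To illustrate the idea, here is a minimal sketch. It uses hypothetical
stand-in classes (Query, ExpensiveKnnQuery, RewrittenQuery and this
populateScores are simplified placeholders, not the real Lucene or Solr
types), and it assumes that rewriting an already rewritten query is cheap,
which would need to be checked against Lucene:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RewriteOnceSketch {

    // Hypothetical stand-in for a Lucene Query; not the real API.
    interface Query {
        Query rewrite();
    }

    // Stand-in for an already rewritten query: rewriting it again is a no-op.
    // (Assumption: the real rewritten form behaves the same way.)
    static final class RewrittenQuery implements Query {
        @Override
        public Query rewrite() {
            return this;
        }
    }

    // Stand-in for a KNN query whose rewrite() runs the expensive vector search.
    static final class ExpensiveKnnQuery implements Query {
        final AtomicInteger searches = new AtomicInteger();

        @Override
        public Query rewrite() {
            searches.incrementAndGet(); // one simulated vector search per call
            return new RewrittenQuery();
        }
    }

    // Stand-in for populateScores: it always rewrites the query it is given.
    static void populateScores(String group, Query query) {
        query.rewrite();
    }

    // Current behavior: the original query is rewritten once per group.
    public static int searchesWithPerGroupRewrite(List<String> groups) {
        ExpensiveKnnQuery query = new ExpensiveKnnQuery();
        for (String group : groups) {
            populateScores(group, query);
        }
        return query.searches.get();
    }

    // Proposed behavior: rewrite once before the loop, reuse the result.
    public static int searchesWithHoistedRewrite(List<String> groups) {
        ExpensiveKnnQuery query = new ExpensiveKnnQuery();
        Query rewritten = query.rewrite();
        for (String group : groups) {
            populateScores(group, rewritten);
        }
        return query.searches.get();
    }

    public static void main(String[] args) {
        List<String> groups = List.of("a", "b", "c");
        System.out.println(searchesWithPerGroupRewrite(groups)); // prints 3
        System.out.println(searchesWithHoistedRewrite(groups));  // prints 1
    }
}
```

With three groups, the per-group version triggers three simulated searches
and the hoisted version triggers one, regardless of the number of groups.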

Best,

Yue

