[ https://issues.apache.org/jira/browse/SOLR-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17400988#comment-17400988 ]
Yael Mushinsky commented on SOLR-15586: --------------------------------------- Tried to remove the code colored in green with small patch: org.apache.solr.search.Grouping.CommandFunc: @Override @SuppressWarnings(\{"unchecked"}) protected void finish() throws IOException { if (secondPass != null) { result = secondPass.getTopGroups(0); {color:#00875a}*populateScoresIfNecessary();* {color} } I get the scores and everything seems working ... what is it's purpose? > After upgrading to Solr 8.7 from 5.2.1 seems that asking for 'score' in 'fl' > causes the query to take almost twice (or so) the time (regression?) > ------------------------------------------------------------------------------------------------------------------------------------------------- > > Key: SOLR-15586 > URL: https://issues.apache.org/jira/browse/SOLR-15586 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Affects Versions: 8.7 > Reporter: Yael Mushinsky > Priority: Major > > In performance tests conducted on the same system, comparing its performance > with Solr 5.2.1 verses 8.7, was found a degradation in the query execution > time. > The system uses a custom query parser, which constructs quite a large query > object. > When removing the ‘score’ parameter from the ‘fl’ list, the times were back > as in Solr 5. > Trying to trouble shoot using jstack (small bash script that take a jstack of > the Solr thread every few milliseconds) found that the Solr 8 thread “spends” > time in the stack: (was appearing in several jstack snippets) > "qtp2038105753-19" #19 prio=5 os_prio=0 cpu=1837.18ms elapsed=10334.56s > tid=0x00007f7da0973000 nid=0x6ddd runnable [0x00007f7d6d5f2000] > java.lang.Thread.State: RUNNABLE > at > sun.nio.ch.FileDispatcherImpl.pread0([java.base@13.0.2/Native|mailto:java.base@13.0.2/Native%3cmailto:java.base@13.0.2/Native] > Method) > at > sun.nio.ch.FileDispatcherImpl.pread([java.base@13.0.2/FileDispatcherImpl.java:54|mailto:java.base@13.0.2/FileDispatcherImpl.java:54%3cmailto:java.base@13.0.2/FileDispatcherImpl.java:54]) > at > sun.nio.ch.IOUtil.readIntoNativeBuffer([java.base@13.0.2/IOUtil.java:274|mailto:java.base@13.0.2/IOUtil.java:274%3cmailto:java.base@13.0.2/IOUtil.java:274]) > at > sun.nio.ch.IOUtil.read([java.base@13.0.2/IOUtil.java:245|mailto:java.base@13.0.2/IOUtil.java:245%3cmailto:java.base@13.0.2/IOUtil.java:245]) > at > sun.nio.ch.FileChannelImpl.readInternal([java.base@13.0.2/FileChannelImpl.java:811|mailto:java.base@13.0.2/FileChannelImpl.java:811%3cmailto:java.base@13.0.2/FileChannelImpl.java:811]) > at > sun.nio.ch.FileChannelImpl.read([java.base@13.0.2/FileChannelImpl.java:796|mailto:java.base@13.0.2/FileChannelImpl.java:796%3cmailto:java.base@13.0.2/FileChannelImpl.java:796]) > at > org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:170) > at > org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:315) > at > org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:254) > at > org.apache.lucene.util.fst.ReverseRandomAccessReader.readByte(ReverseRandomAccessReader.java:33) > at org.apache.lucene.util.fst.FST.readLabel(FST.java:596) > at org.apache.lucene.util.fst.FST.readArc(FST.java:1259) > at org.apache.lucene.util.fst.FST.readNextRealArc(FST.java:1247) > at > org.apache.lucene.util.fst.FST.readFirstRealTargetArc(FST.java:1091) > at org.apache.lucene.util.fst.FST.findTargetArc(FST.java:1396) > at > org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:485) > at > org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekExact(FilterLeafReader.java:184) > at > org.apache.lucene.index.TermStates.loadTermsEnum(TermStates.java:124) > at org.apache.lucene.index.TermStates.build(TermStates.java:109) > at > org.apache.lucene.search.PhraseQuery$1.getStats(PhraseQuery.java:447) > at org.apache.lucene.search.PhraseWeight.<init>(PhraseWeight.java:38) > at org.apache.lucene.search.PhraseQuery$1.<init>(PhraseQuery.java:429) > at > org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:429) > at > org.apache.lucene.search.BoostQuery.createWeight(BoostQuery.java:125) > at > org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:726) > at > org.apache.lucene.search.BooleanWeight.<init>(BooleanWeight.java:63) > at > org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:231) > at > org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:726) > at > org.apache.lucene.search.TopFieldCollector.populateScores(TopFieldCollector.java:563) > at > org.apache.solr.search.Grouping$Command.populateScoresIfNecessary(Grouping.java:593) > at > org.apache.solr.search.Grouping$CommandFunc.finish(Grouping.java:1007) > at org.apache.solr.search.Grouping.execute(Grouping.java:408) > at > org.apache.solr.handler.component.QueryComponent.doProcessGroupedSearch(QueryComponent.java:1486) > at > org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:395) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:360) > > This doesn’t appear in any jstack snippet taken of the Solr 5 thread, when > ‘score’ is also asked for. > Trying to look a bit into the code, I see that in solr 8 the calculation of > score “if needed” is performed as a separate phase, after the document > collection flow (might be that its like collecting the documents again in a > way? I/O etc. ) : > > org.apache.solr.search.Grouping.execute(): > if (!collectors.isEmpty()) { > Collector secondPhaseCollectors = > MultiCollector.wrap(collectors.toArray(new Collector[collectors.size()])); > if (collectors.size() > 0) { > if (cachedCollector != null) { > if (cachedCollector.isCached()) { > cachedCollector.replay(secondPhaseCollectors); > } else { > signalCacheWarning = true; > log.warn(String.format(Locale.ROOT, "The grouping cache is > active, but not used because it exceeded the max cache limit of %d percent", > maxDocsPercentageToCache)); > log.warn("Please increase cache size or disable group caching."); > searchWithTimeLimiter(luceneFilter, secondPhaseCollectors); > } > } else { > if (pf.postFilter != null) { > pf.postFilter.setLastDelegate(secondPhaseCollectors); > secondPhaseCollectors = pf.postFilter; > } > searchWithTimeLimiter(luceneFilter, secondPhaseCollectors); > //->seems to me this is the main document collection flow > } > if (secondPhaseCollectors instanceof DelegatingCollector) { > ((DelegatingCollector) secondPhaseCollectors).finish(); > } > } > } > > for (@SuppressWarnings(\{"rawtypes"})Command cmd : commands) { > cmd.finish(); > } > > org.apache.solr.search.Grouping.CommandFunc: > @Override > @SuppressWarnings(\{"unchecked"}) > protected void finish() throws IOException { > if (secondPass != null) { > result = secondPass.getTopGroups(0); > populateScoresIfNecessary(); > } > > > org.apache.solr.search.Grouping.Command<T>: > protected void populateScoresIfNecessary() throws IOException { > if (needScores) { > for (GroupDocs<?> groups : result.groups) { > TopFieldCollector.populateScores(groups.scoreDocs, searcher, query); > } > } > } > > > I know that asking for the score has a cost of performance, but in Solr 5 it > didn’t seem to be meaningful and in Solr 8 it seems to be very expansive. > (In means of code, it seems the code is Solr 5 is different, I didn’t get > into finding exactly how and where it calculates the score to add to the > document upon request, but in means of timing, it doesn’t seem to have such > an effect.) > Tried asking about this over the users mailing list, but didn't get an answer > ... > Can you please assist in understanding the change in performance? > Thank you. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org