[ https://issues.apache.org/jira/browse/SOLR-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley resolved SOLR-15246.
---------------------------------
    Assignee: David Smiley
  Resolution: Not A Bug

I'm closing as "not a bug" because there isn't one specific problem here. There are settings that helped a lot. Further things that could be done (honoring timeout, adding multi-threading, adding first-snippet termination) would be separate issues.

> A unified highlighting search under solr 8.8.0/8.8.1 can take over 20 mins to
> run and eventually times out.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-15246
>                 URL: https://issues.apache.org/jira/browse/SOLR-15246
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: highlighter
>    Affects Versions: 8.8, 8.8.1
>         Environment: I was running solr under windows
>            Reporter: Matthew Flowerday
>            Assignee: David Smiley
>            Priority: Minor
>
> With Solr 8.8.0, a new unified highlighting parameter, &hl.fragAlignRatio, was
> implemented; if not set, it defaults to 0.5. It attempts to improve the
> highlighting so that highlighted text does not appear right at the left edge.
> This works well, but if a search result has numerous occurrences of the word
> in question within the record, performance goes right down!
> 2021-02-27 06:45:03.151 INFO (qtp762476028-20) [ x:uleaf]
> o.a.s.c.S.Request [uleaf] webapp=/solr path=/select
> params={hl.snippets=2&q=test&hl=on&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&start=20&hl.fl=*&rows=10&_=1614405119134}
> hits=57008 status=0 QTime=1414320
> 2021-02-27 06:45:03.245 INFO (qtp762476028-20) [ x:uleaf]
> o.a.s.s.HttpSolrCall Unable to write response, client closed connection or we
> are shutting down => org.eclipse.jetty.io.EofException
>         at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
> org.eclipse.jetty.io.EofException: null
>         at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>         at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>         at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378)
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>
> When I set &hl.fragAlignRatio=0.25, results came back much quicker:
> 2021-02-27 14:59:57.189 INFO (qtp1291367132-24) [ x:holmes]
> o.a.s.c.S.Request [holmes] webapp=/solr path=/select
> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.25&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
> hits=136939 status=0 QTime=87024
> And with &hl.fragAlignRatio=0.1:
> 2021-02-27 15:18:45.542 INFO (qtp1291367132-19) [ x:holmes]
> o.a.s.c.S.Request [holmes] webapp=/solr path=/select
> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.1&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
> hits=136939 status=0 QTime=69033
> And with &hl.fragAlignRatio=0.0:
> 2021-02-27 15:20:38.194 INFO (qtp1291367132-24) [ x:holmes]
> o.a.s.c.S.Request [holmes] webapp=/solr path=/select
> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.0&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
> hits=136939 status=0 QTime=2841
> I left our setting at 0.0; this is presumably how it was in 7.7.1 (fully left
> aligned). I am not sure how many times a word has to occur in a record for
> performance to go right down, but if it occurs too many times it can have a
> BIG impact.
> It might be an idea to set the default value to, say, 0.25 instead of 0.5 so
> that people are not caught out.
> I also noticed that setting &timeAllowed=90000 did not break out of the query
> until it finished, perhaps because the query itself finished quickly and what
> took the time was the highlighting. It might be an idea to get &timeAllowed to
> also cover any highlighting so that the query does not run until the jetty
> timeout is hit. The machine ran one core at 100% for about 20 mins!
> I raised this at the request of a member of the user forum.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
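
For anyone applying the workaround from the report, here is a minimal Python sketch that builds a /select URL with hl.fragAlignRatio set explicitly rather than left at the 0.5 default. The parameter values and the core name "holmes" are copied from the logs in the issue; the host, the port, and the helper name build_highlight_url are assumptions made for illustration only. The sketch only constructs the URL and does not contact Solr.

```python
from urllib.parse import urlencode

# Assumption: a local Solr node on localhost:8983. "holmes" is the core
# name that appears in the logs quoted in the issue.
SOLR_SELECT_URL = "http://localhost:8983/solr/holmes/select"


def build_highlight_url(query, frag_align_ratio=0.0, time_allowed_ms=90000):
    """Return a /select URL that sets hl.fragAlignRatio explicitly.

    Hypothetical helper; parameter values mirror the reporter's requests.
    """
    params = {
        "q": query,
        "fl": "id,description,specification,score",
        "rows": 100,
        "hl": "on",
        "hl.method": "unified",
        "hl.fl": "*",
        "hl.snippets": 2,
        "hl.maxAnalyzedChars": 1000000,
        "hl.weightMatches": "false",
        # 0.0 was the fastest of the values tried in the report.
        "hl.fragAlignRatio": frag_align_ratio,
        # The reporter observed that this did not cut highlighting short.
        "timeAllowed": time_allowed_ms,
    }
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_highlight_url("test"))
```

Passing a different ratio, for example build_highlight_url("test", 0.25), reproduces the intermediate settings the reporter compared.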