[ https://issues.apache.org/jira/browse/SOLR-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301548#comment-17301548 ]
Matthew Flowerday commented on SOLR-15246:
------------------------------------------

Hi David,

As a test I ran the query with hl.fragAlignRatio=0.0 and hl.bs.type=LINE as a baseline. Our code fires queries in batches of 100 until a configured number of matches [500] is found (25247 ms for the first query).

2021-03-15 09:12:28.057 INFO (qtp762476028-21) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=*&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=25247
2021-03-15 09:12:35.705 INFO (qtp762476028-18) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4H/2y5NWUNST0YxOEwxNS1WMQ%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=5104
2021-03-15 09:12:38.725 INFO (qtp762476028-17) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4FgoC9NWUNST0YyMEw5Mi1MMTE%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=1865
2021-03-15 09:12:42.539 INFO (qtp762476028-16) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4DvKDFKT05FU0RCMThINTUtQVg5NA%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=3108
2021-03-15 09:12:47.590 INFO (qtp762476028-15) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4CoxzJOQVRJT05BTC1HRTE4MTQwODk%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=3935
2021-03-15 09:12:49.967 INFO (qtp762476028-14) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP3/AsjFKT05FU0RCMThMNjEtWkE0Nw%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=1046

I then changed to hl.fragAlignRatio=0.33 with hl.bs.type=LINE, which (as expected) produced longer query times (47064 ms for the first query).

2021-03-15 09:20:24.352 INFO (qtp762476028-21) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=*&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=47064
2021-03-15 09:20:29.931 INFO (qtp762476028-18) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4H/2y5NWUNST0YxOEwxNS1WMQ%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=4971
2021-03-15 09:20:32.587 INFO (qtp762476028-17) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4FgoC9NWUNST0YyMEw5Mi1MMTE%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=1849
2021-03-15 09:20:36.416 INFO (qtp762476028-15) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4DvKDFKT05FU0RCMThINTUtQVg5NA%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=2981
2021-03-15 09:20:40.978 INFO (qtp762476028-14) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4CoxzJOQVRJT05BTC1HRTE4MTQwODk%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=4048
2021-03-15 09:20:42.994 INFO (qtp762476028-19) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP3/AsjFKT05FU0RCMThMNjEtWkE0Nw%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=1521

I then set hl.bs.type=SEPARATOR and hl.bs.separator=. which generated the following (26587 ms for the first query).

2021-03-15 09:29:04.372 INFO (qtp762476028-20) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=*&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=26587
2021-03-15 09:29:16.009 INFO (qtp762476028-18) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4H/2y5NWUNST0YxOEwxNS1WMQ%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=7191
2021-03-15 09:29:19.104 INFO (qtp762476028-17) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4FgoC9NWUNST0YyMEw5Mi1MMTE%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=2381
2021-03-15 09:29:24.929 INFO (qtp762476028-15) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4DvKDFKT05FU0RCMThINTUtQVg5NA%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=4924
2021-03-15 09:29:28.664 INFO (qtp762476028-14) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4CoxzJOQVRJT05BTC1HRTE4MTQwODk%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=3393
2021-03-15 09:29:29.728 INFO (qtp762476028-19) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP3/AsjFKT05FU0RCMThMNjEtWkE0Nw%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2} hits=58780 status=0 QTime=581

This shows a marked improvement in the timings. I hope this is of help.

Regards
Matthew

Matthew Flowerday | Consultant | ULEAF Unisys | 01908 774830 | matthew.flower...@unisys.com
Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17 8LX

> A unified highlighting search under solr 8.8.0/8.8.1 can take over 20 mins to
> run and eventually times out.
> -----------------------------------------------------------------------------------------------------------
>
> Key: SOLR-15246
> URL: https://issues.apache.org/jira/browse/SOLR-15246
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: highlighter
> Affects Versions: 8.8, 8.8.1
> Environment: I was running solr under windows
> Reporter: Matthew Flowerday
> Priority: Minor
>
> With solr 8.8.0 a new unified highlighting parameter &hl.fragAlignRatio was
> implemented which if not set defaults to 0.5. This attempts to improve the
> highlighting so that highlighted text does not appear right at the left.
> This works well, but if you have a search result with numerous occurrences of
> the word in question within the record, performance goes right down!
> 2021-02-27 06:45:03.151 INFO (qtp762476028-20) [ x:uleaf] o.a.s.c.S.Request [uleaf] webapp=/solr path=/select params={hl.snippets=2&q=test&hl=on&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&start=20&hl.fl=*&rows=10&_=1614405119134} hits=57008 status=0 QTime=1414320
> 2021-02-27 06:45:03.245 INFO (qtp762476028-20) [ x:uleaf] o.a.s.s.HttpSolrCall Unable to write response, client closed connection or we are shutting down => org.eclipse.jetty.io.EofException
>     at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
> org.eclipse.jetty.io.EofException: null
>     at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>     at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>     at org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378) ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>
> When I set &hl.fragAlignRatio=0.25 results came back much quicker:
> 2021-02-27 14:59:57.189 INFO (qtp1291367132-24) [ x:holmes] o.a.s.c.S.Request [holmes] webapp=/solr path=/select params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.25&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690} hits=136939 status=0 QTime=87024
> And &hl.fragAlignRatio=0.1:
> 2021-02-27 15:18:45.542 INFO (qtp1291367132-19) [ x:holmes] o.a.s.c.S.Request [holmes] webapp=/solr path=/select params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.1&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690} hits=136939 status=0 QTime=69033
> And &hl.fragAlignRatio=0.0:
> 2021-02-27 15:20:38.194 INFO (qtp1291367132-24) [ x:holmes] o.a.s.c.S.Request [holmes] webapp=/solr path=/select params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.0&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690} hits=136939 status=0 QTime=2841
> I left our setting at 0.0; this is presumably how it was in 7.7.1 (fully left
> aligned). I am not sure how many times a word has to occur in a record for
> performance to go right down, but if it occurs too many times it can have a
> BIG impact.
> It might be an idea to set the default value to, say, 0.25 instead of 0.5 so
> that people are not caught out.
> I also noticed that setting &timeAllowed=90000 did not break out of the query
> until it finished, perhaps because the query itself finished quickly and what
> took the time was the highlighting. It might be an idea to get &timeAllowed to
> also cover any highlighting so that the query does not run until the jetty
> timeout is hit. The machine ran one core at 100% for about 20 mins!
> I raised this at the request of a member of the user forum.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
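
The comment above quotes only the first-query time for each of its three runs. A minimal sketch for tallying the remaining batches follows. It assumes each request log line ends with `QTime=<ms>` as in the quoted logs. The names `qtimes`, `summarize`, and `RUNS` are hypothetical, and the sample lines are abbreviated to the timestamp and trailing fields, with the QTime values copied from the comment.

```python
import re

# Matches the trailing field of a Solr request log line, e.g. "... status=0 QTime=25247".
QTIME_RE = re.compile(r"\bQTime=(\d+)\b")


def qtimes(log_text):
    """Return every QTime value (in ms) found in log_text, in order of appearance."""
    return [int(m.group(1)) for m in QTIME_RE.finditer(log_text)]


def summarize(runs):
    """Map each run label to (first-query ms, total ms over all logged batches)."""
    summary = {}
    for label, log_text in runs.items():
        values = qtimes(log_text)
        summary[label] = (values[0], sum(values)) if values else (0, 0)
    return summary


# Abbreviated log lines (params elided); QTime values are those in the comment.
RUNS = {
    "hl.bs.type=LINE, hl.fragAlignRatio=0.0": """
        09:12:28.057 hits=58780 status=0 QTime=25247
        09:12:35.705 hits=58780 status=0 QTime=5104
        09:12:38.725 hits=58780 status=0 QTime=1865
        09:12:42.539 hits=58780 status=0 QTime=3108
        09:12:47.590 hits=58780 status=0 QTime=3935
        09:12:49.967 hits=58780 status=0 QTime=1046
    """,
    "hl.bs.type=LINE, hl.fragAlignRatio=0.33": """
        09:20:24.352 hits=58780 status=0 QTime=47064
        09:20:29.931 hits=58780 status=0 QTime=4971
        09:20:32.587 hits=58780 status=0 QTime=1849
        09:20:36.416 hits=58780 status=0 QTime=2981
        09:20:40.978 hits=58780 status=0 QTime=4048
        09:20:42.994 hits=58780 status=0 QTime=1521
    """,
    "hl.bs.type=SEPARATOR, hl.bs.separator=.": """
        09:29:04.372 hits=58780 status=0 QTime=26587
        09:29:16.009 hits=58780 status=0 QTime=7191
        09:29:19.104 hits=58780 status=0 QTime=2381
        09:29:24.929 hits=58780 status=0 QTime=4924
        09:29:28.664 hits=58780 status=0 QTime=3393
        09:29:29.728 hits=58780 status=0 QTime=581
    """,
}

for label, (first_ms, total_ms) in summarize(RUNS).items():
    print(f"{label}: first={first_ms} ms, total={total_ms} ms")
```

With the figures from the comment, the six logged batches total 40305 ms for the baseline, 62434 ms for LINE with hl.fragAlignRatio=0.33, and 45057 ms for the SEPARATOR run. Each run is a single sample, so these totals are indicative only.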