[ 
https://issues.apache.org/jira/browse/SOLR-15246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301548#comment-17301548
 ] 

Matthew Flowerday commented on SOLR-15246:
------------------------------------------

Hi David,

As a baseline test I ran the query with hl.fragAlignRatio=0.0 and
hl.bs.type=LINE. Our code fires queries in batches of 100 until a configured
number of matches (500) is found. The first query took 25247 ms.

 

2021-03-15 09:12:28.057 INFO  (qtp762476028-21) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=*&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=25247

2021-03-15 09:12:35.705 INFO  (qtp762476028-18) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4H/2y5NWUNST0YxOEwxNS1WMQ%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=5104

2021-03-15 09:12:38.725 INFO  (qtp762476028-17) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4FgoC9NWUNST0YyMEw5Mi1MMTE%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=1865

2021-03-15 09:12:42.539 INFO  (qtp762476028-16) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4DvKDFKT05FU0RCMThINTUtQVg5NA%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=3108

2021-03-15 09:12:47.590 INFO  (qtp762476028-15) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4CoxzJOQVRJT05BTC1HRTE4MTQwODk%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=3935

2021-03-15 09:12:49.967 INFO  (qtp762476028-14) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP3/AsjFKT05FU0RCMThMNjEtWkE0Nw%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=1046
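
The batched cursor paging described above can be sketched roughly as follows. This is a minimal illustration, not our actual client code: the `fetch` callable, the function name `fetch_until`, and the rule that every returned document counts as a match are all assumptions made for the example. The request parameters are copied from the logged requests.

```python
from typing import Callable, Dict, List


def fetch_until(fetch: Callable[[Dict[str, str]], dict],
                wanted: int = 500, rows: int = 100) -> List[dict]:
    """Page through results with cursorMark until `wanted` docs are collected.

    `fetch` is a hypothetical callable that sends the parameters to
    /select and returns the parsed response as a dict.
    """
    params = {
        "q": "test",
        "hl": "true",
        "hl.fl": "*",
        "hl.snippets": "2",
        "hl.maxAnalyzedChars": "1000000",
        "fl": "id,description,specification,score",
        # cursorMark needs a sort that ends on the uniqueKey field
        "sort": "score desc,id asc",
        "rows": str(rows),
        "cursorMark": "*",  # "*" starts a new cursor
    }
    docs: List[dict] = []
    while len(docs) < wanted:
        rsp = fetch(dict(params))
        # Assumption: every returned document counts as a match.
        docs.extend(rsp["response"]["docs"])
        next_mark = rsp["nextCursorMark"]
        if next_mark == params["cursorMark"]:
            break  # cursor did not advance, so there are no more results
        params["cursorMark"] = next_mark
    return docs[:wanted]
```

Because highlighting runs for every document in each batch, the per-request QTime values in these logs include the highlighting cost of 100 documents at a time.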

 

I then changed to hl.fragAlignRatio=0.33 with hl.bs.type=LINE, which gave
longer query times, as expected. The first query took 47064 ms.

 

2021-03-15 09:20:24.352 INFO  (qtp762476028-21) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=*&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=47064

2021-03-15 09:20:29.931 INFO  (qtp762476028-18) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4H/2y5NWUNST0YxOEwxNS1WMQ%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=4971

2021-03-15 09:20:32.587 INFO  (qtp762476028-17) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4FgoC9NWUNST0YyMEw5Mi1MMTE%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=1849

2021-03-15 09:20:36.416 INFO  (qtp762476028-15) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4DvKDFKT05FU0RCMThINTUtQVg5NA%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=2981

2021-03-15 09:20:40.978 INFO  (qtp762476028-14) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4CoxzJOQVRJT05BTC1HRTE4MTQwODk%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=4048

2021-03-15 09:20:42.994 INFO  (qtp762476028-19) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP3/AsjFKT05FU0RCMThMNjEtWkE0Nw%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=1521

 

I then set hl.bs.type=SEPARATOR with hl.bs.separator set to a full stop (.).
The first query took 26587 ms.

 

2021-03-15 09:29:04.372 INFO  (qtp762476028-20) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=*&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=26587

2021-03-15 09:29:16.009 INFO  (qtp762476028-18) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4H/2y5NWUNST0YxOEwxNS1WMQ%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=7191

2021-03-15 09:29:19.104 INFO  (qtp762476028-17) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4FgoC9NWUNST0YyMEw5Mi1MMTE%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=2381

2021-03-15 09:29:24.929 INFO  (qtp762476028-15) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4DvKDFKT05FU0RCMThINTUtQVg5NA%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=4924

2021-03-15 09:29:28.664 INFO  (qtp762476028-14) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP4CoxzJOQVRJT05BTC1HRTE4MTQwODk%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=3393

2021-03-15 09:29:29.728 INFO  (qtp762476028-19) [   x:uleaf] o.a.s.c.S.Request 
[uleaf]  webapp=/solr path=/select 
params={hl.snippets=2&q=test&hl=true&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&cursorMark=AoIIP3/AsjFKT05FU0RCMThMNjEtWkE0Nw%3D%3D&sort=score+desc,id+asc&hl.fl=*&rows=100&wt=javabin&version=2}
 hits=58780 status=0 QTime=581
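
For reference, the three highlighting configurations tested above could also be passed as request parameters. The sketch below only builds the query strings. The dictionary names and the `query_string` helper are hypothetical, `hl.method=unified` is added as an assumption because the logged params do not show it, and the third configuration omits hl.fragAlignRatio because that run does not restate it.

```python
from urllib.parse import urlencode

# Parameters shared by every request (taken from the logged params,
# plus hl.method=unified as an assumption).
BASE = {
    "q": "test",
    "hl": "true",
    "hl.method": "unified",
    "hl.fl": "*",
    "hl.snippets": "2",
    "hl.maxAnalyzedChars": "1000000",
}

# The three highlighting configurations described in this comment.
CONFIGS = {
    "baseline": {"hl.fragAlignRatio": "0.0", "hl.bs.type": "LINE"},
    "aligned": {"hl.fragAlignRatio": "0.33", "hl.bs.type": "LINE"},
    "separator": {"hl.bs.type": "SEPARATOR", "hl.bs.separator": "."},
}


def query_string(name: str) -> str:
    """Return the URL-encoded parameters for one named configuration."""
    return urlencode({**BASE, **CONFIGS[name]})


for name in CONFIGS:
    print(name, query_string(name))
```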

 

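With hl.bs.type=SEPARATOR the first query took 26587 ms, compared with
47064 ms for LINE at hl.fragAlignRatio=0.33. That is a marked improvement and
close to the 25247 ms baseline.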
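
To compare the runs more systematically, the QTime values can be pulled out of the request log and totalled per run. The sketch below is only an illustration: the helper names are made up, and the numbers are the QTime values copied from the three log excerpts above.

```python
import re
from typing import Dict, List

QTIME = re.compile(r"QTime=(\d+)")


def qtimes(log_text: str) -> List[int]:
    """Extract every QTime value (in ms) from a chunk of Solr request log."""
    return [int(v) for v in QTIME.findall(log_text)]


# QTime values (ms) of the six requests in each run, in log order.
RUNS: Dict[str, List[int]] = {
    "LINE, fragAlignRatio=0.0": [25247, 5104, 1865, 3108, 3935, 1046],
    "LINE, fragAlignRatio=0.33": [47064, 4971, 1849, 2981, 4048, 1521],
    "SEPARATOR, separator='.'": [26587, 7191, 2381, 4924, 3393, 581],
}


def summarize(runs: Dict[str, List[int]]) -> Dict[str, Dict[str, int]]:
    """Return the first-query time and the total time for each run."""
    return {name: {"first": t[0], "total": sum(t)} for name, t in runs.items()}


for name, stats in summarize(RUNS).items():
    print(f"{name}: first={stats['first']} ms, total={stats['total']} ms")
# LINE, fragAlignRatio=0.0: first=25247 ms, total=40305 ms
# LINE, fragAlignRatio=0.33: first=47064 ms, total=62434 ms
# SEPARATOR, separator='.': first=26587 ms, total=45057 ms
```

The first query dominates each run, so both the first-query time and the total are worth looking at.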

 

Hope this is of help.

 

Regards

 

Matthew

 

Matthew Flowerday | Consultant | ULEAF

Unisys | 01908 774830| matthew.flower...@unisys.com 

Address Enigma | Wavendon Business Park | Wavendon | Milton Keynes | MK17 8LX

 

 

 


   

 



> A unified highlighting search under solr 8.8.0/8.8.1 can take over 20 mins to 
> run and eventually times out.
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-15246
>                 URL: https://issues.apache.org/jira/browse/SOLR-15246
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: highlighter
>    Affects Versions: 8.8, 8.8.1
>         Environment: I was running solr under windows
>            Reporter: Matthew Flowerday
>            Priority: Minor
>
> With Solr 8.8.0 a new unified highlighter parameter, &hl.fragAlignRatio, was
> implemented, which defaults to 0.5 if not set. It attempts to improve the
> highlighting so that the highlighted text does not appear at the far left of
> the snippet. This works well, but if a search result contains numerous
> occurrences of the search term within the record, performance drops sharply.
> 2021-02-27 06:45:03.151 INFO  (qtp762476028-20) [   x:uleaf] 
> o.a.s.c.S.Request [uleaf]  webapp=/solr path=/select 
> params={hl.snippets=2&q=test&hl=on&hl.maxAnalyzedChars=1000000&fl=id,description,specification,score&start=20&hl.fl=*&rows=10&_=1614405119134}
>  hits=57008 status=0 QTime=1414320
> 2021-02-27 06:45:03.245 INFO  (qtp762476028-20) [   x:uleaf] 
> o.a.s.s.HttpSolrCall Unable to write response, client closed connection or we 
> are shutting down => org.eclipse.jetty.io.EofException
>               at 
> org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
> org.eclipse.jetty.io.EofException: null
>               at 
> org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279) 
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>               at 
> org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422) 
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>               at 
> org.eclipse.jetty.io.WriteFlusher.completeWrite(WriteFlusher.java:378) 
> ~[jetty-io-9.4.34.v20201102.jar:9.4.34.v20201102]
>  
> When I set &hl.fragAlignRatio=0.25, results came back much quicker:
> 2021-02-27 14:59:57.189 INFO  (qtp1291367132-24) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.25&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>  hits=136939 status=0 QTime=87024
> And with &hl.fragAlignRatio=0.1:
> 2021-02-27 15:18:45.542 INFO  (qtp1291367132-19) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.1&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>  hits=136939 status=0 QTime=69033
> And with &hl.fragAlignRatio=0.0:
> 2021-02-27 15:20:38.194 INFO  (qtp1291367132-24) [   x:holmes] 
> o.a.s.c.S.Request [holmes]  webapp=/solr path=/select 
> params={hl.weightMatches=false&hl=on&fl=id,description,specification,score&start=1&hl.fragAlignRatio=0.0&rows=100&hl.snippets=2&q=test&hl.maxAnalyzedChars=1000000&hl.fl=*&hl.method=unified&timeAllowed=90000&_=1614430061690}
>  hits=136939 status=0 QTime=2841
> I left our setting at 0.0, which is presumably how it behaved in 7.7.1 (fully
> left aligned). I am not sure how many times a word has to occur in a record
> for performance to drop, but if it occurs too often the impact can be large.
> It might be an idea to set the default value to, say, 0.25 instead of 0.5 so
> that people are not caught out.
> I also noticed that setting &timeAllowed=90000 did not stop the query before
> it finished, perhaps because the search itself finished quickly and the time
> was spent in highlighting. It might be an idea to make &timeAllowed also
> cover highlighting so that the query does not run until the Jetty timeout is
> hit. The machine ran one core at 100% for about 20 minutes.
> I raised this at the request of a member of the user forum.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
