Hi
I have an existing application based on Solr 7, where highlighting and
fragment size worked great. I'm now upgrading to Solr 9.10 and have noticed
that some of my highlighted fragments are quite a bit longer than they
were before.
My setup is simple: solrconfig.xml uses the same fragmenter configuration
as the latest Solr documentation:
<fragmenter name="gap"
            default="true"
            class="solr.highlight.GapFragmenter">
  <lst name="defaults">
    <int name="hl.fragsize">100</int>
  </lst>
</fragmenter>
On the old production Solr 7 instance, searching for COVID gives the output
below. I join two highlight fragments with "..." in between.
Before (Solr 7):
... Some people with *COVID*-19 experience symptoms for several weeks or
months (Long *COVID*), while... The REACT-Long *COVID* (REACT-LC) programme
aims to characterise the genetic, biological, social and...
After (Solr 9.10):
... Some people with *COVID*-19 experience symptoms for several weeks or
months (Long *COVID*), while others have a short illness or no symptoms.
... The REACT-Long *COVID* (REACT-LC) programme aims to characterise the
genetic, biological, social and environmental signatures and pathways, and
their inter-relationships, that underpin progression to Long *COVID*, and
to understand the natural history and long-term sequelae post-SARS-CoV-2
infection. ...
In the 9.10 output the first fragment is almost the same length; I think
it has simply run on to the end of the sentence. The second fragment is
much longer and looks like it has been extended to take in the second
mention of COVID.
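To be clear about the "..." separators: they are added on the client side
when the fragments are joined, not by Solr. A minimal sketch of that joining
step, assuming a Python client (the function name join_fragments and the
sample strings are illustrative only, not the actual application code):

```python
def join_fragments(fragments, separator="..."):
    """Join highlight fragments into a single display snippet.

    Hypothetical helper: blank fragments are dropped, the rest are
    stripped and then wrapped and separated with the separator.
    """
    cleaned = [f.strip() for f in fragments if f and f.strip()]
    if not cleaned:
        return ""
    return f"{separator} " + f" {separator} ".join(cleaned) + f" {separator}"


# Two shortened sample fragments, as a highlighter might return them.
snippet = join_fragments([
    "Some people with *COVID*-19 experience symptoms",
    "The REACT-Long *COVID* (REACT-LC) programme",
])
print(snippet)
```

Since this step only concatenates whatever it is given, the difference in
length comes from the fragments Solr itself returns.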
Any ideas on how I can get back to the earlier shorter form?
Shaun