That's excellent. Really clear explanation. Just wondering whether to keep
on with the UnifiedHighlighter now.


On Mon, 8 Dec 2025 at 22:04, Chip Ryan via users <[email protected]>
wrote:

> Hi Shaun,
>
> This is a common issue when upgrading to Solr 9. The default highlighter
> changed from the "original" highlighter to the "UnifiedHighlighter". Your
> GapFragmenter configuration is being completely ignored because
> UnifiedHighlighter doesn't use it.
>
> The UnifiedHighlighter fragments based on sentence boundaries by default,
> which is why you're seeing fragments that extend to full stops and pull in
> multiple keyword occurrences.
>
> *Quick fix:* Add this to your query parameters or solrconfig.xml defaults:
>
> hl.method=original
>
> This tells Solr to use the original highlighter, which will respect your
> GapFragmenter settings again.
>
> *Alternative:* If you'd prefer to stick with UnifiedHighlighter (it's
> faster and handles multi-valued fields better), you can tune its behavior
> with:
>
> hl.method=unified
> hl.fragsize=100
> hl.bs.type=WORD
>
> Setting hl.bs.type=WORD breaks at word boundaries near your target size
> rather than sentence boundaries, which should give you results much closer
> to what you had before.
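>
> As a rough sketch of the solrconfig.xml route, assuming your searches go
> through the stock /select handler (an assumption on my part; adjust the
> name to match your config and keep any defaults you already have), the
> settings could be placed like this:
>
> <requestHandler name="/select" class="solr.SearchHandler">
>   <lst name="defaults">
>     <!-- assumed handler name; the three hl.* values are the ones above -->
>     <str name="hl.method">unified</str>
>     <int name="hl.fragsize">100</int>
>     <str name="hl.bs.type">WORD</str>
>   </lst>
> </requestHandler>
>
> Per-request parameters still override these defaults, so you can
> experiment with other hl.fragsize values from the query string first.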
>
> Hope this helps.
>
> *Opensolr.com*
> *Your Path to *AI Search
> <https://opensolr.com/faq/view/web-crawler/46/Opensolr-Web-Crawler-Site-Search-Solution>
> [email protected]
> https://opensolr.com
> VAT: RO-35410526
>
> On 8 Dec 2025, at 23:12, Shaun Campbell <[email protected]> wrote:
>
> Hi
>
> I have an existing application based on Solr 7 and highlighting and
> fragment size worked great. I'm just upgrading Solr to 9.10 and noticed
> some of my highlighted fragments can be quite a bit longer than they
> were before.
>
> I have a simple setup using the fragmenter the same as in the latest Solr
> documentation in my solrconfig.xml.
>
> <fragmenter name="gap"
>             default="true"
>             class="solr.highlight.GapFragmenter">
>   <lst name="defaults">
>     <int name="hl.fragsize">100</int>
>   </lst>
> </fragmenter>
>
>
> In the old production Solr 7 I get this output when I search for COVID,
> where I join two highlight fragments together with ... in between.
>
>
> Before (7)
>
> ... Some people with *COVID*-19 experience symptoms for several weeks or
> months (Long *COVID*), while... The REACT-Long *COVID* (REACT-LC) programme
> aims to characterise the genetic, biological, social and...
>
>
>
> After (9.10)
>
> ... Some people with *COVID*-19 experience symptoms for several weeks or
> months (Long *COVID*), while others have a short illness or no symptoms.
> ... The REACT-Long *COVID* (REACT-LC) programme aims to characterise the
> genetic, biological, social and environmental signatures and pathways, and
> their inter-relationships, that underpin progression to Long *COVID*, and
> to understand the natural history and long-term sequelae post-SARS-CoV-2
> infection. ...
>
>
> In the latter version the first fragment is almost the same length. I think
> it's gone on to the end of a sentence with a full stop. The second fragment
> is much longer and looks like it's taken in the second mention of COVID.
>
>
> Any ideas on how I can get back to the earlier shorter form?
>
>
> Shaun
>
>
>
