[
https://issues.apache.org/jira/browse/SOLR-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley updated SOLR-6692:
-------------------------------
Fix Version/s: 5.2 (was: 5.0)
> hl.maxAnalyzedChars should apply cumulatively on a multi-valued field
> ---------------------------------------------------------------------
>
> Key: SOLR-6692
> URL: https://issues.apache.org/jira/browse/SOLR-6692
> Project: Solr
> Issue Type: Improvement
> Components: highlighter
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: 5.2
>
> Attachments:
> SOLR-6692_hl_maxAnalyzedChars_cumulative_multiValued,_and_more.patch
>
>
> In DefaultSolrHighlighter, the hl.maxAnalyzedChars figure is used to
> constrain how much text is analyzed before the highlighter stops, in the
> interests of performance. For a multi-valued field, it effectively treats
> each value anew, no matter how much text was previously analyzed for other
> values for the same field for the current document. The PostingsHighlighter
> doesn't work this way -- hl.maxAnalyzedChars is effectively the total budget
> for a field for a document, no matter how many values there might be. It's
> not reset for each value. I think this makes more sense. When we loop over
> the values, we should subtract from hl.maxAnalyzedChars the length of the
> value just checked. The motivation here is consistency with
> PostingsHighlighter, and to allow for hl.maxAnalyzedChars to be pushed down
> to term vector uninversion, which wouldn't be possible for multi-valued
> fields based on the current way this parameter is used.
>
> Interestingly, I noticed Solr's use of FastVectorHighlighter doesn't honor
> hl.maxAnalyzedChars as the FVH doesn't have a knob for that. It does have
> hl.phraseLimit which is a limit that could be used for a similar purpose,
> albeit applied differently.
>
> Furthermore, DefaultSolrHighlighter.doHighlightingByHighlighter should exit
> early from its field value loop if it reaches hl.snippets, and if
> hl.preserveMulti=true.
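
To make the proposed accounting concrete, here is a minimal sketch of a
cumulative budget over the values of one multi-valued field. It is not the
attached patch and does not use any Solr API: the class name
CumulativeBudgetSketch and the method analyzedPortions are hypothetical, and
"analysis" is stood in for by simply truncating each value to the budget that
remains. The only behaviour taken from the description above is that the
length of each value just checked is subtracted from hl.maxAnalyzedChars
instead of the limit being reset per value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Solr or of the attached patch.
public class CumulativeBudgetSketch {

    /**
     * Returns the portion of each value that would be analyzed when
     * maxAnalyzedChars is one budget shared by all values of a field.
     */
    public static List<String> analyzedPortions(List<String> values, int maxAnalyzedChars) {
        List<String> portions = new ArrayList<>();
        int remaining = maxAnalyzedChars;
        for (String value : values) {
            if (remaining <= 0) {
                break; // budget exhausted: later values are not analyzed at all
            }
            int limit = Math.min(value.length(), remaining);
            portions.add(value.substring(0, limit));
            // Subtract the length of the value just checked, as proposed above.
            remaining -= value.length();
        }
        return portions;
    }

    public static void main(String[] args) {
        // Value lengths are 10, 11 and 7 characters; the shared budget is 15.
        List<String> values = Arrays.asList("alpha beta", "gamma delta", "epsilon");
        System.out.println(analyzedPortions(values, 15)); // prints [alpha beta, gamma]
    }
}
```

With a per-value limit of 15, all three values would be analyzed in full (28
characters in total). With the shared budget, analysis stops partway through
the second value and the third is never touched.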