Thanks Michael, that may well be the issue! I need to reorder the chain.
Thanks also for the suggestion on WordDelimiterGraphFilter, which I'll look
into as well.
On Wed, 6 Apr 2022 at 17:14, Michael Gibney wrote:
> I think the behavior you're seeing is a consequence of the fact that you're
> a
I think the behavior you're seeing is a consequence of the fact that you're
applying index-time stopword filtering *before* the tokens are further
manipulated by WordDelimiterGraphFilter. E.g.:
"the token-is-retained" => "the" "token-is" "retained" => "the" "token"
"is" "retained"
In the case abo
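To make the ordering effect concrete, here is a toy simulation in plain Python. It is only a sketch and not Solr's actual implementation: the stopword set, the whitespace tokenizer, and the split-on-non-alphanumerics rule are all assumptions standing in for StopFilter and WordDelimiterGraphFilter. It illustrates that a stopword hidden inside a compound token survives when the stop filter runs before the split.

```python
import re

# Assumed sample stopword list; the real list lives in the fieldType config.
STOPWORDS = {"the", "is", "a", "be"}


def whitespace_tokenize(text):
    """Stand-in for a whitespace tokenizer."""
    return text.split()


def stop_filter(tokens):
    """Drop tokens that exactly match a stopword (case-insensitive)."""
    return [t for t in tokens if t.lower() not in STOPWORDS]


def word_delimiter_split(tokens):
    """Crude stand-in for WordDelimiterGraphFilter: split on non-alphanumerics."""
    out = []
    for t in tokens:
        out.extend(p for p in re.split(r"[^A-Za-z0-9]+", t) if p)
    return out


def analyze(text, filters):
    """Run the tokenizer, then each filter in the given order."""
    tokens = whitespace_tokenize(text)
    for f in filters:
        tokens = f(tokens)
    return tokens


text = "the token-is-retained"

# Stop filter first: "token-is-retained" is not a stopword, so "is" slips through.
stop_first = analyze(text, [stop_filter, word_delimiter_split])

# Split first: "is" becomes its own token and the stop filter can remove it.
split_first = analyze(text, [word_delimiter_split, stop_filter])

print(stop_first)   # ['token', 'is', 'retained']
print(split_first)  # ['token', 'retained']
```

In this sketch, moving the stop filter after the splitting step is what removes "is"; the equivalent change in a real analysis chain would be reordering the filters in the fieldType.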
Hi Michael,
Here are the field and fieldType with a result snippet.
I've checked the stopword list, and words like "a" or "be" are in it. I've
also used the UI analysis to check that they indeed should be removed when
indexed and queried.
Many thanks,
Dan
*example results:*
"facets": {
Both `qf` and `relatedness` should be orthogonal to your question, iiuc.
Understanding that your question is mainly about which terms are included
(i.e. included at all -- never mind ranking), then the only thing that
should determine that is the field and fieldType config for the terms facet
"field