Is this known behavior or is it worth a JIRA ticket?

Searching against a text_general field in Solr 9.1, if my edismax query is
"foo bar" I should be able to get matches for "foo" without "bar" and vice
versa. However, if a multi-term synonym rule such as "foo bar,zzz" is
applied at query time, I can no longer get single-term matches against
"foo" or "bar". Both terms are now required, though in either order.
If we change the text_general analysis chain to apply synonyms at index
time instead of query time, this behavior goes away and single-term matches
are again possible.

To reproduce, use the _default configset with "foo bar,zzz" added to
synonyms.txt. Index these four docs:

{"id":"1", "title_txt":"foo"}
{"id":"2", "title_txt":"bar"}
{"id":"3", "title_txt":"foo bar"}
{"id":"4", "title_txt":"bar foo"}

Issue a query for "foo bar" (i.e.
defType=edismax&q.op=OR&qf=title_txt&q=foo bar)
Result: Only docs 3 and 4 come back

Issue a query for "bar foo"
Result: All four docs come back; the synonym rule is not invoked
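
For anyone who wants to script the reproduction, here is a minimal sketch
in Python using only the standard library. The host, port, and the
collection name "synonym_test" are assumptions of mine, not part of the
setup above; adjust them to your install. The helpers only build the
requests; matching_ids() is the one function that needs a running Solr.

```python
import json
import urllib.parse
import urllib.request

# Assumptions (not from the report above): Solr on localhost:8983 and a
# collection named "synonym_test" created from the _default configset,
# with "foo bar,zzz" already added to its synonyms.txt.
SOLR = "http://localhost:8983/solr/synonym_test"

DOCS = [
    {"id": "1", "title_txt": "foo"},
    {"id": "2", "title_txt": "bar"},
    {"id": "3", "title_txt": "foo bar"},
    {"id": "4", "title_txt": "bar foo"},
]


def update_request(docs):
    """Build (but do not send) a JSON update request that also commits."""
    return urllib.request.Request(
        SOLR + "/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def select_url(q):
    """Build the edismax query URL; debugQuery=true adds the parsed query."""
    params = {
        "defType": "edismax",
        "q.op": "OR",
        "qf": "title_txt",
        "q": q,
        "debugQuery": "true",
    }
    return SOLR + "/select?" + urllib.parse.urlencode(params)


def matching_ids(q):
    """Send the query to a running Solr and return the sorted matching ids."""
    with urllib.request.urlopen(select_url(q)) as resp:
        body = json.load(resp)
    return sorted(doc["id"] for doc in body["response"]["docs"])
```

With Solr running, urllib.request.urlopen(update_request(DOCS)) indexes
the four docs, and matching_ids("foo bar") versus matching_ids("bar foo")
should show the difference described above.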

Looking at the explain output for "foo bar" we see:

+((title_txt:zzz (+title_txt:foo +title_txt:bar)))


Looking at the explain output for "bar foo" we see:

+((title_txt:bar) (title_txt:foo))

So, the observed behavior makes sense according to the low-level query
structure. But -- is this how it's "supposed" to work?

Why not expand the "foo bar" query like this instead?

+((title_txt:zzz (title_txt:foo title_txt:bar)))
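
To make the difference between the two structures concrete, here is a toy
model of Boolean matching in Python. It is not Lucene code, only a sketch
of MUST/SHOULD semantics under the assumption that a Boolean query with
no MUST clauses needs at least one SHOULD clause to match. The outer
"+( ... )" wrapper is left out because it does not change which docs
match. All names in it are made up for illustration.

```python
# Toy model of Boolean matching, not Lucene itself. A Boolean query is a
# list of (occur, subquery) clauses where occur is "MUST" or "SHOULD".

def term(word):
    return ("term", word)


def boolean(*clauses):
    return ("bool", clauses)


def matches(query, tokens):
    kind, payload = query
    if kind == "term":
        return payload in tokens
    must = [q for occur, q in payload if occur == "MUST"]
    should = [q for occur, q in payload if occur == "SHOULD"]
    if not all(matches(q, tokens) for q in must):
        return False
    # With no MUST clauses, at least one SHOULD clause has to match.
    return bool(must) or any(matches(q, tokens) for q in should)


# title_txt:zzz (+title_txt:foo +title_txt:bar)
ACTUAL = boolean(
    ("SHOULD", term("zzz")),
    ("SHOULD", boolean(("MUST", term("foo")), ("MUST", term("bar")))),
)

# title_txt:zzz (title_txt:foo title_txt:bar)
PROPOSED = boolean(
    ("SHOULD", term("zzz")),
    ("SHOULD", boolean(("SHOULD", term("foo")), ("SHOULD", term("bar")))),
)

DOCS = {"1": "foo", "2": "bar", "3": "foo bar", "4": "bar foo"}


def hits(query):
    """Return the ids of the docs whose whitespace tokens match the query."""
    return [doc_id for doc_id, text in DOCS.items()
            if matches(query, set(text.split()))]


print(hits(ACTUAL))    # ['3', '4']
print(hits(PROPOSED))  # ['1', '2', '3', '4']
```

Under this model the current expansion matches only docs 3 and 4, while
the alternative expansion matches all four, which is the behavior I was
expecting from q.op=OR.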

Rudi
