Again, for the record and for anyone facing the same issue, what I
described above, i.e. switching from a query like this

parsedQuery =
 +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
 DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
 DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
 DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
 DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
 DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
 DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7

to a query like this

parsedQuery =
 "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
 Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
 wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
 edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
 edgefield:75007 edgefield:paris)~7)))"

is explained in this article by Doug Turnbull:
https://opensourceconnections.com/blog/2018/02/20/edismax-and-multiterm-synonyms-oddities/

Solr switches from a term-centric query (the first one above) to a
field-centric query (the second one) because of stop word discrepancies
between the qf fields:

"we’ve noted that the inclusion of a stopword by the user can shift the
query to being field-centric. This can happen because the term is a
stopword in one field, but not another. So if “the” is a stopword in title,
but not the description, then suddenly a surprising query structure can be
output by the user adding “the” to their search."
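
To make the difference concrete, here is a minimal Python sketch of the two
query shapes with mm=100%. It is a toy model, not Solr's actual
implementation: the fields title and description, their stop word lists, the
sample document and the helper names analyze and matches are all
hypothetical, chosen to mirror the "the" example in the quote above.

```python
# Toy model only: hypothetical fields, analyzers and document.
# This is NOT how Solr is implemented; it only mimics the two query shapes.

STOPWORDS = {
    "title": {"the"},       # "the" is a stop word in title
    "description": set(),   # description keeps every token
}


def analyze(field, text):
    """Lowercase, split on whitespace, drop the field's stop words."""
    return [t for t in text.lower().split() if t not in STOPWORDS[field]]


def matches(doc, query, fields):
    """Return (query_shape, matched) for a query where every term is required."""
    per_field = {f: analyze(f, query) for f in fields}
    indexed = {f: set(analyze(f, doc.get(f, ""))) for f in fields}
    counts = {len(tokens) for tokens in per_field.values()}

    if len(counts) == 1:
        # Term-centric: same token count in every field, so each term
        # position may be satisfied by ANY field.
        n = counts.pop()
        ok = all(
            any(per_field[f][i] in indexed[f] for f in fields)
            for i in range(n)
        )
        return "term-centric", ok

    # Field-centric: token counts differ, so ONE field must contain
    # all of its own query tokens.
    ok = any(
        bool(per_field[f]) and all(t in indexed[f] for t in per_field[f])
        for f in fields
    )
    return "field-centric", ok


doc = {"title": "musee maillol", "description": "rue de grenelle paris"}
fields = ["title", "description"]

print(matches(doc, "musee grenelle", fields))      # terms spread over both fields
print(matches(doc, "the musee grenelle", fields))  # "the" changes the token count
```

In this sketch the first query matches because "musee" is found in title and
"grenelle" in description. Adding "the" makes the two fields produce a
different number of tokens, the shape flips to field-centric, and the same
document no longer matches because no single field contains every term.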

On Tue, May 7, 2024 at 16:30, elisabeth benoit <elisaelisael...@gmail.com>
wrote:

>
> For the record, I solved this problem by removing stop words in my
> analyzer for wordfield.
>
> We often run into this problem when there are stop word discrepancies
> between fields.
>
> On Thu, Nov 16, 2023 at 09:28, elisabeth benoit <elisaelisael...@gmail.com>
> wrote:
>
>>
>> Thanks a lot for taking time to answer.
>>
>> I'll have to figure out a workaround. Decreasing mm is not an option for
>> me; maybe I can use a boost for this extra field.
>>
>> Best regards,
>> Elisabeth
>>
>> On Tue, Nov 14, 2023 at 12:05, Mikhail Khludnev <m...@apache.org>
>> wrote:
>>
>>> Ok. Right.
>>> (one two three four five six seven)~7 means match all of them, i.e. in fact
>>> +one +two +three +four +five +six +seven
>>> Here we can see that the way dismax handles fields with different analyzers
>>> is far from perfect.
>>> You can either decrease mm
>>>
>>> https://solr.apache.org/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Themm_MinimumShouldMatch_Parameter
>>> or experiment with mm.autoRelax=true
>>>
>>> https://solr.apache.org/guide/6_6/the-extended-dismax-query-parser.html#TheExtendedDisMaxQueryParser-Themm.autoRelaxParameter
>>>
>>>
>>> On Mon, Nov 13, 2023 at 10:33 PM elisabeth benoit <
>>> elisaelisael...@gmail.com>
>>> wrote:
>>>
>>> > Okay, thanks for the answer. The thing is,
>>> >
>>> > when there is no *wordfield* in the *qf* param, but only *edgefield1* and
>>> > *edgefield2*, I get this parsedQuery
>>> >
>>> > parsedQuery =
>>> >  +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>>> >  DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>>> >  DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>>> >  DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>>> >  DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>>> >  DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>>> >  DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>>> >
>>> > and SolR does return documents
>>> >
>>> > but when I instead have *wordfield* and *edgefield* in *qf*, I get this
>>> > parsedQuery
>>> >
>>> > parsedQuery =
>>> >  "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>>> >  Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>>> >  wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
>>> >  edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
>>> >  edgefield:75007 edgefield:paris)~7)))"
>>> >
>>> > and SolR does not return any documents.
>>> >
>>> > That is what makes me think there is something wrong with the second
>>> > parsedQuery.
>>> >
>>> > Best regards,
>>> > Elisabeth
>>> >
>>> >
>>> >
>>> > On Mon, Nov 13, 2023 at 20:15, Mikhail Khludnev <m...@apache.org> wrote:
>>> >
>>> > > >
>>> > > >  the first case listed in my mail
>>> > > > parsedQuery =
>>> > > >  "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>>> > > >  Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>>> > > >  wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
>>> > > >  edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
>>> > > >  edgefield:75007 edgefield:paris)~7)))"
>>> > >
>>> > >
>>> > > > The OR is different: either all words must match wordfield OR all words
>>> > > > must match edgefield, but no mix between the two fields is allowed.
>>> > >
>>> > >
>>> > > It doesn't work this way. These two queries differ only in
>>> > > scoring/results ordering, i.e.
>>> > > this query matches docs: {wordfield:musee, edgefield:musee} as well as
>>> > > {wordfield:musee, edgefield:maillol}, {wordfield:musee},
>>> > > {edgefield:maillol}.
>>> > > This explanation might be useful:
>>> > > https://lucidworks.com/post/solr-boolean-operators/
>>> > > Note: DisMax works like OR/| but takes max instead of sum as a score.
>>> > >
>>> > > On Mon, Nov 13, 2023 at 7:21 PM elisabeth benoit <
>>> > > elisaelisael...@gmail.com>
>>> > > wrote:
>>> > >
>>> > > > Hello,
>>> > > >
>>> > > > Thanks for your answer.
>>> > > >
>>> > > > I mean that in the second case listed in my mail, the query is
>>> > > > parsedQuery =
>>> > > >  +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>>> > > >  DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>>> > > >  DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>>> > > >  DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>>> > > >  DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>>> > > >  DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>>> > > >  DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>>> > > >
>>> > > > and so the way I read it is "musee" can match edgefield1 OR edgefield2,
>>> > > > "maillol" can match edgefield1 OR edgefield2, and so on, so Solr can
>>> > > > return a doc where some query words match with edgefield1 and some
>>> > > > other query words with edgefield2.
>>> > > >
>>> > > > But in the first case listed in my mail
>>> > > >
>>> > > > parsedQuery =
>>> > > >  "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>>> > > >  Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>>> > > >  wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
>>> > > >  edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
>>> > > >  edgefield:75007 edgefield:paris)~7)))"
>>> > > >
>>> > > > The OR is different: either all words must match wordfield OR all words
>>> > > > must match edgefield, but no mix between the two fields is allowed.
>>> > > >
>>> > > > So I cannot search both fields at the same time.
>>> > > >
>>> > > > I hope this is clear!
>>> > > >
>>> > > > I would like to search both fields in the same query.
>>> > > >
>>> > > > Best regards,
>>> > > > Elisabeth
>>> > > >
>>> > > > On Mon, Nov 13, 2023 at 17:02, Mikhail Khludnev <m...@apache.org> wrote:
>>> > > >
>>> > > > > Hello Elisabeth.
>>> > > > > DisMax analyses user input across the given qf fields. If the number
>>> > > > > of resulting tokens differs between fields, it can't apply the default
>>> > > > > logic (per-word sum over per-field maximums) and flips to max over
>>> > > > > sums. The good news is that the difference between the two approaches
>>> > > > > is only scoring.
>>> > > > > WDYM exactly by absence of "matching words to be in two different
>>> > > > > fields"?
>>> > > > >
>>> > > > > On Mon, Nov 13, 2023 at 5:01 PM elisabeth benoit <
>>> > > > > elisaelisael...@gmail.com>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > Hello,
>>> > > > > >
>>> > > > > > I am using solr 7.3.1 with ExtendedDismaxQParser.
>>> > > > > >
>>> > > > > > I have an edge n-grams field and a normal text field. When I mix those
>>> > > > > > two in the same query, i.e. *qf=edgefield wordfield*, and use the
>>> > > > > > option *debugQuery=on*, I see that the parsedQuery is different, i.e.
>>> > > > > > all words should match the same field.
>>> > > > > >
>>> > > > > > i.e. parsedQuery =
>>> > > > > >
>>> > > > > > "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>>> > > > > > Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>>> > > > > > wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
>>> > > > > > edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
>>> > > > > > edgefield:75007 edgefield:paris)~7)))"
>>> > > > > >
>>> > > > > > When instead I use two edgefields with *qf=edgefield1 edgefield2*
>>> > > > > >
>>> > > > > > parsedQuery =
>>> > > > > > +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>>> > > > > >
>>> > > > > > In the second case, edismax allows matching words to be in two
>>> > > > > > different fields, but not in the first case.
>>> > > > > >
>>> > > > > > Is there a way to have the same behaviour, i.e. case two, in all cases?
>>> > > > > >
>>> > > > > > best regards,
>>> > > > > > Elisabeth
>>> > > > > >
>>> > > > >
>>> > > > >
>>> > > > > --
>>> > > > > Sincerely yours
>>> > > > > Mikhail Khludnev
>>> > > > >
>>> > > >
>>> > >
>>> > >
>>> > > --
>>> > > Sincerely yours
>>> > > Mikhail Khludnev
>>> > >
>>> >
>>>
>>>
>>> --
>>> Sincerely yours
>>> Mikhail Khludnev
>>>
>>
