And again, for the record and for anyone facing the same issue: what I described above, i.e. switching from a query like this

parsedQuery =
+(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7

to a query like this

parsedQuery =
"+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
edgefield:75007 edgefield:paris)~7)))"

is explained in this article by Doug Turnbull:
https://opensourceconnections.com/blog/2018/02/20/edismax-and-multiterm-synonyms-oddities/

Solr switches from a term-centric query (the first one) to a field-centric query (the second one) because of stopword discrepancies between the qf fields:

"we’ve noted that the inclusion of a stopword by the user can shift the query to being field-centric. This can happen because the term is a stopword in one field, but not another. So if “the” is a stopword in title, but not the description, then suddenly a surprising query structure can be output by the user adding “the” to their search."

On Tue, May 7, 2024 at 16:30, elisabeth benoit <elisaelisael...@gmail.com> wrote:

> For the record, I solved this problem by removing stop words in my
> analyzer for wordfield.
>
> We often get this problem when there are stopword discrepancies between
> fields.
>
> On Thu, Nov 16, 2023 at 09:28, elisabeth benoit <elisaelisael...@gmail.com> wrote:
>
>> Thanks a lot for taking time to answer.
>>
>> I'll have to figure out a workaround; decreasing mm is not an option for
>> me. Maybe I will use a boost for this extra field.
>>
>> Best regards,
>> Elisabeth
>>
>> On Tue, Nov 14, 2023 at 12:05, Mikhail Khludnev <m...@apache.org> wrote:
>>
>>> Ok. Right
>>> (one two three four five six seven)~7 means match all of them, i.e. in fact
>>> +one +two +three +four +five +six +seven
>>> Here we can see that how dismax handles fields with different analyzers
>>> is far from perfect.
>>> You can either decrease mm
>>> https://solr.apache.org/guide/6_6/the-dismax-query-parser.html#TheDisMaxQueryParser-Themm_MinimumShouldMatch_Parameter
>>> or experiment with mm.autoRelax=true
>>> https://solr.apache.org/guide/6_6/the-extended-dismax-query-parser.html#TheExtendedDisMaxQueryParser-Themm.autoRelaxParameter
>>>
>>> On Mon, Nov 13, 2023 at 10:33 PM elisabeth benoit <elisaelisael...@gmail.com> wrote:
>>>
>>> > Okay, thanks for the answer. The thing is,
>>> >
>>> > when there is no *wordfield* in the *qf* param, but only *edgefield1* and
>>> > *edgefield2*, I get this parsedQuery
>>> >
>>> > parsedQuery =
>>> > +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>>> > DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>>> > DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>>> > DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>>> > DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>>> > DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>>> > DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>>> >
>>> > and Solr does return documents,
>>> >
>>> > but when I have instead *wordfield* and *edgefield* in *qf*, I get this
>>> > parsedQuery
>>> >
>>> > parsedQuery =
>>> > "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>>> > Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>>> > wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
>>> > edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
>>> > edgefield:75007 edgefield:paris)~7)))"
>>> >
>>> > and Solr does not return any documents.
>>> >
>>> > That is what makes me think there is something wrong with the second
>>> > parsedQuery.
>>> >
>>> > Best regards,
>>> > Elisabeth
>>> >
>>> > On Mon, Nov 13, 2023 at 20:15, Mikhail Khludnev <m...@apache.org> wrote:
>>> >
>>> > > > the first case listed in my mail
>>> > > > parsedQuery =
>>> > > > "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>>> > > > Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>>> > > > wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
>>> > > > edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
>>> > > > edgefield:75007 edgefield:paris)~7)))"
>>> > > >
>>> > > > The OR is different, it is all words must match wordfield OR all words
>>> > > > must match edgefield, but no mix between the two fields is allowed.
>>> > >
>>> > > It doesn't work this way. These two queries differ only in
>>> > > scoring/results ordering, i.e.
>>> > > this query matches docs: {wordfield:musee, edgefield:musee} as well as
>>> > > {wordfield:musee, edgefield:maillol}, {wordfield:musee},
>>> > > {edgefield:maillol}.
>>> > > This explanation might be useful:
>>> > > https://lucidworks.com/post/solr-boolean-operators/
>>> > > Note: DisMax works like OR/| but takes max instead of sum as a score.
>>> > >
>>> > > On Mon, Nov 13, 2023 at 7:21 PM elisabeth benoit <elisaelisael...@gmail.com> wrote:
>>> > >
>>> > > > Hello,
>>> > > >
>>> > > > Thanks for your answer.
>>> > > >
>>> > > > I mean that in the second case listed in my mail, the query is
>>> > > > parsedQuery =
>>> > > > +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>>> > > > DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>>> > > > DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>>> > > > DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>>> > > > DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>>> > > > DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>>> > > > DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>>> > > >
>>> > > > and so the way I read it is "musee" can match edgefield1 OR edgefield2,
>>> > > > "maillol" can match edgefield1 OR edgefield2, and so on, so Solr can
>>> > > > return a doc where some query words match with edgefield1 and some
>>> > > > other query words with edgefield2.
>>> > > >
>>> > > > But in the first case listed in my mail
>>> > > >
>>> > > > parsedQuery =
>>> > > > "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>>> > > > Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>>> > > > wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
>>> > > > edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
>>> > > > edgefield:75007 edgefield:paris)~7)))"
>>> > > >
>>> > > > The OR is different, it is all words must match wordfield OR all words
>>> > > > must match edgefield, but no mix between the two fields is allowed.
>>> > > >
>>> > > > So I cannot search both fields at the same time.
>>> > > >
>>> > > > I hope this is clear!
>>> > > >
>>> > > > I would like to search both fields in the same query.
>>> > > >
>>> > > > Best regards,
>>> > > > Elisabeth
>>> > > >
>>> > > > On Mon, Nov 13, 2023 at 17:02, Mikhail Khludnev <m...@apache.org> wrote:
>>> > > >
>>> > > > > Hello Elisabeth.
>>> > > > > DisMax analyses user input across the given qf fields. If the number
>>> > > > > of resulting tokens is different, it can't apply the default logic
>>> > > > > (per-word sum over per-field maximums) and flips to max over sums.
>>> > > > > The good news is that the difference between the two approaches is
>>> > > > > only scoring.
>>> > > > > WDYM exactly by absence of "matching words to be in two different
>>> > > > > fields"?
>>> > > > >
>>> > > > > On Mon, Nov 13, 2023 at 5:01 PM elisabeth benoit <elisaelisael...@gmail.com> wrote:
>>> > > > >
>>> > > > > > Hello,
>>> > > > > >
>>> > > > > > I am using Solr 7.3.1 with ExtendedDismaxQParser.
>>> > > > > >
>>> > > > > > I have an edge n-gram field and a normal text field. When I mix
>>> > > > > > those two in the same query, i.e. *qf=edgefield wordfield*, and use
>>> > > > > > option *debugQuery=on*, I see that the parsedQuery is different,
>>> > > > > > i.e. all words should match the same field.
>>> > > > > >
>>> > > > > > i.e. parsedQuery =
>>> > > > > >
>>> > > > > > "+DisjunctionMaxQuery((((wordfield:musee wordfield:maillol wordfield:61
>>> > > > > > Synonym(wordfield:r wordfield:ru wordfield:rue) wordfield:grenelle
>>> > > > > > wordfield:75007 wordfield:paris)~7)^1.1 | ((edgefield:musee
>>> > > > > > edgefield:maillol edgefield:61 edgefield:r edgefield:grenelle
>>> > > > > > edgefield:75007 edgefield:paris)~7)))"
>>> > > > > >
>>> > > > > > When instead I use two edgefields with *qf=edgefield1 edgefield2*
>>> > > > > >
>>> > > > > > parsedQuery =
>>> > > > > > +(DisjunctionMaxQuery(((edgefield1:musee)^1.1 | edgefield2:musee))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:maillol)^1.1 | edgefield2:maillol))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:61)^1.1 | edgefield2:61))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:r)^1.1 | edgefield2:r))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:grenelle)^1.1 | edgefield2:grenelle))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:75007)^1.1 | edgefield2:75007))
>>> > > > > > DisjunctionMaxQuery(((edgefield1:paris)^1.1 | edgefield2:paris)))~7
>>> > > > > >
>>> > > > > > In the second case, edismax allows matching words to be in two
>>> > > > > > different fields, but not in the first case.
>>> > > > > >
>>> > > > > > Is there a way to have the same behaviour, i.e. case two, in all
>>> > > > > > cases?
>>> > > > > >
>>> > > > > > Best regards,
>>> > > > > > Elisabeth
>>> > > > >
>>> > > > > --
>>> > > > > Sincerely yours
>>> > > > > Mikhail Khludnev
>>> > >
>>> > > --
>>> > > Sincerely yours
>>> > > Mikhail Khludnev
>>>
>>> --
>>> Sincerely yours
>>> Mikhail Khludnev
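
For readers of this thread, here is a minimal Python sketch of why the two parsedQuery shapes can return different documents once mm requires all 7 clauses (the ~7). It is not Solr code and says nothing about scoring: the functions term_centric_match and field_centric_match, the field names field_a and field_b, and the sample document are all hypothetical, and the Synonym(r ru rue) clause is ignored for simplicity.

```python
from typing import Dict, List, Set

# Hypothetical model: a document maps each field name to the set of tokens
# indexed in that field. Illustration only, not how Solr stores or matches data.
Doc = Dict[str, Set[str]]


def term_centric_match(doc: Doc, terms: List[str], fields: List[str]) -> bool:
    """Every query term must be found in at least one of the fields.

    Models +(dismax(f1:t1 | f2:t1) dismax(f1:t2 | f2:t2) ...)~N, N == len(terms).
    """
    return all(
        any(term in doc.get(field, set()) for field in fields)
        for term in terms
    )


def field_centric_match(doc: Doc, terms: List[str], fields: List[str]) -> bool:
    """At least one single field must contain every query term.

    Models +dismax((f1:t1 f1:t2 ...)~N | (f2:t1 f2:t2 ...)~N), N == len(terms).
    """
    return any(
        all(term in doc.get(field, set()) for term in terms)
        for field in fields
    )


terms = ["musee", "maillol", "61", "r", "grenelle", "75007", "paris"]
fields = ["field_a", "field_b"]

# Made-up document: name tokens in one field, address tokens in the other.
doc: Doc = {
    "field_a": {"musee", "maillol", "paris"},
    "field_b": {"61", "r", "grenelle", "75007", "paris"},
}

print(term_centric_match(doc, terms, fields))   # True: each term is in some field
print(field_centric_match(doc, terms, fields))  # False: no field holds all 7 terms
```

In this toy model the two shapes agree only when a single field contains every query term, which would be consistent with the field-centric query returning no documents while the term-centric one does.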