This is where a good regression test suite across a wide variety of representative queries comes in really handy: it shows you the impact of a change not just on one query, but across all your queries. See https://quepid.com/ or https://github.com/SeaseLtd/rated-ranking-evaluator for some tooling that can help!
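
To make that a bit more concrete, here is a minimal sketch in Python of the kind of check I mean. It is not how Quepid or RRE work internally; it only compares the returned document ids for a set of queries under two settings (for example sow=false vs. sow=true). The base URL, the qf list and the helper names (build_url, fetch_ids, overlap, changed_queries) are assumptions for illustration, borrowed from the techproducts example further down the thread.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the techproducts example core on a local Solr.
SOLR_SELECT = "http://localhost:8983/solr/techproducts/select"


def build_url(query, extra_params):
    """Build an eDisMax select URL for one query plus the settings under test."""
    params = {
        "q": query,
        "defType": "edismax",
        "qf": "name manu cat",  # assumed field list, as in the thread
        "q.op": "AND",
        "fl": "id",
        "rows": 10,
    }
    params.update(extra_params)
    return SOLR_SELECT + "?" + urlencode(params)


def fetch_ids(query, extra_params):
    """Run the query against Solr and return the ids of the returned docs."""
    with urlopen(build_url(query, extra_params)) as resp:
        body = json.load(resp)
    return [doc["id"] for doc in body["response"]["docs"]]


def overlap(before_ids, after_ids):
    """Jaccard overlap of two result lists: 1.0 is identical, 0.0 is disjoint."""
    before, after = set(before_ids), set(after_ids)
    if not before and not after:
        return 1.0
    return len(before & after) / len(before | after)


def changed_queries(before, after, threshold=1.0):
    """Return the queries whose result overlap fell below the threshold.

    `before` and `after` map a query string to a list of document ids.
    """
    return sorted(
        q for q in before
        if overlap(before[q], after.get(q, [])) < threshold
    )
```

You would call fetch_ids for each representative query once with {"sow": "false"} and once with {"sow": "true"}, hand both maps to changed_queries, and then look by hand at whatever comes back. Note that this only compares which documents are returned, not their order.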

> On Apr 29, 2021, at 7:42 AM, Ere Maijala <ere.maij...@helsinki.fi> wrote:
>
> Thanks for sharing your experience. We've been running sow=false until now
> and got away with it as most searches hit a catch-all field. That won't be
> the case in the long run anymore, so now I'm contemplating switching back to
> sow=true. It's just a bit of a scary change to make at this point since it
> might affect something I've failed to consider.
>
> --Ere
>
> Atita Arora wrote on 29.4.2021 at 13:32:
>> I struggled with something very similar today, which turned our search
>> model upside down, as in Solr 8+ edismax uses sow=false by default and it
>> was true in Solr 6 (where we upgraded from),
>> but after adding this to my handler things were back to normal.
>> On Thu, Apr 29, 2021 at 11:45 AM Ere Maijala <ere.maij...@helsinki.fi>
>> wrote:
>>> Ah, yes, that seems to be the case. Thanks for the pointer! Also the
>>> discussion is enlightening. It looks like there hasn't been much
>>> happening in that bug so I'll need to consider any other options.
>>>
>>> --Ere
>>>
>>> Jan Høydahl wrote on 29.4.2021 at 11:48:
>>>> I think you are hitting this bug
>>>> https://issues.apache.org/jira/browse/SOLR-12779
>>>>
>>>> Jan
>>>>
>>>>> On 29 Apr 2021, at 08:51, Ere Maijala <ere.maij...@helsinki.fi> wrote:
>>>>>
>>>>> Hello Markus,
>>>>>
>>>>> Thanks for the reply. I'm not sure I understand. The docs state the
>>>>> following:
>>>>>
>>>>> "The default value of mm is 0% (all clauses optional), unless q.op is
>>>>> specified as "AND", in which case mm defaults to 100% (all clauses
>>>>> required)."
>>>>> (https://solr.apache.org/guide/8_8/the-dismax-query-parser.html#TheDisMaxQueryParser-Themm_MinimumShouldMatch_Parameter)
>>>>>
>>>>> And obviously it has an effect. You can also replace q.op=AND with
>>>>> mm=100%25 in my examples with the same results.
>>>>> The multi-word synonym
>>>>> makes the query explained by debugQuery=true seem wrong to me in that it
>>>>> requires all terms to match in the same field, whereas normally the match
>>>>> can be found in any of the fields listed in qf. For example, this is the
>>>>> query from my first example:
>>>>>
>>>>> +(+DisjunctionMaxQuery((name:corsair | manu:corsair | cat:corsair))
>>>>> +DisjunctionMaxQuery((name:microsystems | manu:microsystems |
>>>>> cat:microsystems)) +DisjunctionMaxQuery((name:memory | manu:memory |
>>>>> cat:memory)))
>>>>>
>>>>> Using the synonym instead of `corsair microsystems` produces this:
>>>>>
>>>>> +(+((+name:corsair +name:microsystems +name:memory) | (+manu:corsair
>>>>> +manu:microsystems +manu:memory) | (+cat:cmi +cat:memory)))
>>>>>
>>>>> We don't use stopwords. mm.autoRelax does not make a difference here.
>>>>>
>>>>> Best,
>>>>> Ere
>>>>>
>>>>> Markus Jelsma wrote on 28.4.2021 at 16:20:
>>>>>> Hello Ere,
>>>>>> The q.op parameter is not a dismax parameter. Instead, I think you are
>>>>>> being bitten by the mm parameter [1], which by default is 100%, meaning
>>>>>> all terms must match. Multi-word synonym handling and mm are not a very
>>>>>> intuitive match, and can lead to crazy problems. Also beware of mm and
>>>>>> stopword handling and check out mm.autoRelax [2]. But it is best not to
>>>>>> use stopwords at all.
>>>>>> Check it out,
>>>>>> Markus
>>>>>> [1] https://solr.apache.org/guide/6_6/the-dismax-query-parser.html
>>>>>> [2] https://solr.apache.org/guide/6_6/the-extended-dismax-query-parser.html
>>>>>> On Wed, 28 Apr 2021 at 15:02, Ere Maijala <ere.maij...@helsinki.fi> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Here's one that I can't wrap my head around. The main question is: why
>>>>>>> are the search terms treated differently in eDisMax if the query expands
>>>>>>> to a multi-word synonym, and there are different field types and
>>>>>>> q.op=AND?
>>>>>>>
>>>>>>> This gets complicated quickly, so I tried to reproduce the results with
>>>>>>> the techproducts example:
>>>>>>>
>>>>>>> 1. Start with vanilla Solr 8.8.2
>>>>>>>
>>>>>>> 2. echo "cor => Corsair" >> server/solr/configsets/sample_techproducts_configs/conf/synonyms.txt
>>>>>>>
>>>>>>> 3. echo "cmi => Corsair Microsystems" >> server/solr/configsets/sample_techproducts_configs/conf/synonyms.txt
>>>>>>>
>>>>>>> 4. bin/solr start -e techproducts
>>>>>>>
>>>>>>> Now, a basic query that works fine produces 2 results:
>>>>>>>
>>>>>>> http://localhost:8983/solr/techproducts/select?q=corsair+microsystems+memory&debugQuery=true&defType=edismax&qf=name+manu+cat&q.op=AND
>>>>>>>
>>>>>>> But if I use the synonym, I don't get any results:
>>>>>>>
>>>>>>> http://localhost:8983/solr/techproducts/select?q=cmi+memory&debugQuery=true&defType=edismax&qf=name+manu+cat&q.op=AND
>>>>>>>
>>>>>>> If I leave the cat field out, however, I get 2 results:
>>>>>>>
>>>>>>> http://localhost:8983/solr/techproducts/select?q=cmi+memory&debugQuery=true&defType=edismax&qf=name+manu&q.op=AND
>>>>>>>
>>>>>>> Also, if I leave q.op out and add AND between the terms, I get 2 results
>>>>>>> even with the cat field:
>>>>>>>
>>>>>>> http://localhost:8983/solr/techproducts/select?q=cmi+AND+memory&debugQuery=true&defType=edismax&qf=name+manu+cat
>>>>>>>
>>>>>>> The single-word synonym works just fine:
>>>>>>>
>>>>>>> http://localhost:8983/solr/techproducts/select?q=cor+memory&debugQuery=true&defType=edismax&qf=name+manu+cat&q.op=AND
>>>>>>>
>>>>>>> Can anyone shed some light on what's happening here?
>>>>>>>
>>>>>>> Additional notes:
>>>>>>>
>>>>>>> 1. This is a simplified example, and the real-world case is much more
>>>>>>> complicated. It has our custom class create the synonyms for compound
>>>>>>> words in Finnish, and the queries come from users.
>>>>>>>
>>>>>>> 2. As far as I can see, mm doesn't affect the results in any meaningful
>>>>>>> way, but I just might be doing something wrong.
>>>>>>>
>>>>>>> 3. I included the debugQuery parameter so that it's easy to see how
>>>>>>> different the queries become.
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Ere
>>>>>>>
>>>>>>> --
>>>>>>> Ere Maijala
>>>>>>> Kansalliskirjasto / The National Library of Finland

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>