This is where a good regression test suite across a wide variety of
representative queries is really helpful: it shows the impact of a change
not just on one query, but across all your queries. See
https://quepid.com/ or https://github.com/SeaseLtd/rated-ranking-evaluator for
some tooling that can help!
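
As a rough illustration of that kind of check (a minimal sketch, not how Quepid or RRE work internally): run each representative query under the current and the candidate parameters, then compare the returned document ids. The helper names, the Jaccard overlap metric, the `rows`/`fl` values and the base URL are all assumptions made for this example; `fetch_ids_http` assumes the standard Solr JSON response layout (`response.docs[].id`).

```python
import json
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL, taken from the techproducts example in this thread.
DEFAULT_BASE = "http://localhost:8983/solr/techproducts"


def select_url(base: str, q: str, extra: Dict[str, str]) -> str:
    """Build a /select URL for an edismax query (illustrative defaults)."""
    params = {"q": q, "defType": "edismax", "fl": "id",
              "rows": "10", "wt": "json", **extra}
    return f"{base}/select?{urlencode(params)}"


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Overlap of two id sets: 1.0 means identical, 0.0 means disjoint."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # both empty: nothing changed
    return len(sa & sb) / len(sa | sb)


def fetch_ids_http(url: str) -> List[str]:
    """Fetch document ids from a live Solr instance (needs a running server)."""
    with urlopen(url) as resp:
        body = json.load(resp)
    return [doc["id"] for doc in body["response"]["docs"]]


def compare_configs(queries: Iterable[str],
                    fetch_ids: Callable[[str], List[str]],
                    baseline: Dict[str, str],
                    candidate: Dict[str, str],
                    base: str = DEFAULT_BASE) -> Dict[str, float]:
    """Return, per query, the overlap between baseline and candidate results."""
    report = {}
    for q in queries:
        before = fetch_ids(select_url(base, q, baseline))
        after = fetch_ids(select_url(base, q, candidate))
        report[q] = jaccard(before, after)
    return report
```

With a running Solr, something like `compare_configs(my_queries, fetch_ids_http, {"sow": "false"}, {"sow": "true"})` would give a per-query overlap score; queries scoring well below 1.0 are the ones to inspect by hand before switching.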



> On Apr 29, 2021, at 7:42 AM, Ere Maijala <ere.maij...@helsinki.fi> wrote:
> 
> Thanks for sharing your experience. We've been running sow=false until now 
> and got away with it as most searches hit a catch-all field. That won't be 
> the case in the long run anymore, so now I'm contemplating switching back to 
> sow=true. It's just a bit of a scary change to make at this point since it might 
> affect something I've failed to consider.
> 
> --Ere
> 
> Atita Arora wrote on 29.4.2021 at 13.32:
>> I struggled with something very similar today, which turned our search
>> model upside down: in Solr 8+ edismax uses sow=false by default, whereas it
>> was true in Solr 6 (where we upgraded from).
>> After adding this to my handler, things were back to normal.
>> On Thu, Apr 29, 2021 at 11:45 AM Ere Maijala <ere.maij...@helsinki.fi>
>> wrote:
>>> Ah, yes, that seems to be the case. Thanks for the pointer! Also the
>>> discussion is enlightening. It looks like there hasn't been much
>>> happening in that bug so I'll need to consider any other options.
>>> 
>>> --Ere
>>> 
>>> Jan Høydahl wrote on 29.4.2021 at 11.48:
>>>> I think you are hitting this bug:
>>>> https://issues.apache.org/jira/browse/SOLR-12779
>>>> 
>>>> Jan
>>>> 
>>>>> On 29 Apr 2021, at 08:51, Ere Maijala <ere.maij...@helsinki.fi> wrote:
>>>>> 
>>>>> Hello Markus,
>>>>> 
>>>>> Thanks for the reply. I'm not sure I understand. The docs state the
>>>>> following:
>>>>> 
>>>>> "The default value of mm is 0% (all clauses optional), unless q.op is
>>>>> specified as "AND", in which case mm defaults to 100% (all clauses
>>>>> required)."
>>>>> (https://solr.apache.org/guide/8_8/the-dismax-query-parser.html#TheDisMaxQueryParser-Themm_MinimumShouldMatch_Parameter)
>>>>> 
>>>>> And obviously it has an effect. You can also replace q.op=AND with
>>>>> mm=100%25 in my examples with the same results. The multi-word synonym
>>>>> makes the query explained by debugQuery=true seem wrong to me in that it
>>>>> requires all terms to match in the same field, whereas normally the match
>>>>> can be found in any of the fields listed in qf. For example, this is the
>>>>> query from my first example:
>>>>> 
>>>>> +(+DisjunctionMaxQuery((name:corsair | manu:corsair | cat:corsair))
>>>>> +DisjunctionMaxQuery((name:microsystems | manu:microsystems |
>>>>> cat:microsystems)) +DisjunctionMaxQuery((name:memory | manu:memory |
>>>>> cat:memory)))
>>>>> 
>>>>> Using the synonym instead of `corsair microsystems` produces this:
>>>>> 
>>>>> +(+((+name:corsair +name:microsystems +name:memory) | (+manu:corsair
>>>>> +manu:microsystems +manu:memory) | (+cat:cmi +cat:memory)))
>>>>> 
>>>>> We don't use stopwords. mm.autoRelax does not make a difference here.
>>>>> 
>>>>> Best,
>>>>> Ere
>>>>> 
>>>>> Markus Jelsma wrote on 28.4.2021 at 16.20:
>>>>>> Hello Ere,
>>>>>> The q.op parameter is not a dismax parameter. Instead, I think you are
>>>>>> being bitten by the mm parameter [1], which by default is 100%, meaning
>>>>>> all terms must match. Multi-word synonym handling and mm are not a very
>>>>>> intuitive match, and can lead to crazy problems. Also beware of mm and
>>>>>> stopword handling, and check out mm.autoRelax [2]. But it is best not to
>>>>>> use stopwords at all.
>>>>>> Check it out,
>>>>>> Markus
>>>>>> [1] https://solr.apache.org/guide/6_6/the-dismax-query-parser.html
>>>>>> [2] https://solr.apache.org/guide/6_6/the-extended-dismax-query-parser.html
>>>>>> On Wed, 28 Apr 2021 at 15:02, Ere Maijala <ere.maij...@helsinki.fi> wrote:
>>>>>>> Hi,
>>>>>>> 
>>>>>>> Here's one that I can't wrap my head around. The main question is: why
>>>>>>> are the search terms treated differently in eDisMax if the query expands
>>>>>>> to a multi-word synonym, and there are different field types and q.op=AND?
>>>>>>> 
>>>>>>> This gets complicated quickly, so I tried to reproduce the results with
>>>>>>> the techproducts example:
>>>>>>> 
>>>>>>> 1. Start with vanilla Solr 8.8.2
>>>>>>> 
>>>>>>> 2. echo "cor => Corsair" >>
>>>>>>> server/solr/configsets/sample_techproducts_configs/conf/synonyms.txt
>>>>>>> 
>>>>>>> 3. echo "cmi => Corsair Microsystems" >>
>>>>>>> server/solr/configsets/sample_techproducts_configs/conf/synonyms.txt
>>>>>>> 
>>>>>>> 4. bin/solr start -e techproducts
>>>>>>> 
>>>>>>> 
>>>>>>> Now, a basic query that works fine produces 2 results:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> http://localhost:8983/solr/techproducts/select?q=corsair+microsystems+memory&debugQuery=true&defType=edismax&qf=name+manu+cat&q.op=AND
>>>>>>> 
>>>>>>> But if I use the synonym, I don't get any results:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> http://localhost:8983/solr/techproducts/select?q=cmi+memory&debugQuery=true&defType=edismax&qf=name+manu+cat&q.op=AND
>>>>>>> 
>>>>>>> If I leave cat field out, however, I get 2 results:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> http://localhost:8983/solr/techproducts/select?q=cmi+memory&debugQuery=true&defType=edismax&qf=name+manu&q.op=AND
>>>>>>> 
>>>>>>> Also, if I leave q.op out and add AND between the terms, I get 2 results
>>>>>>> even with the cat field:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> http://localhost:8983/solr/techproducts/select?q=cmi+AND+memory&debugQuery=true&defType=edismax&qf=name+manu+cat
>>>>>>> 
>>>>>>> The single-word synonym works just fine:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> http://localhost:8983/solr/techproducts/select?q=cor+memory&debugQuery=true&defType=edismax&qf=name+manu+cat&q.op=AND
>>>>>>> 
>>>>>>> 
>>>>>>> Can anyone shed some light on what's happening here?
>>>>>>> 
>>>>>>> Additional notes:
>>>>>>> 
>>>>>>> 1. This is a simplified example, and the real-world case is much more
>>>>>>> complicated. It has our custom class create the synonyms for compound
>>>>>>> words in Finnish, and the queries come from users.
>>>>>>> 
>>>>>>> 2. As far as I can see mm doesn't affect the results in any meaningful
>>>>>>> way, but I just might be doing something wrong.
>>>>>>> 
>>>>>>> 3. I included the debugQuery parameter so that it's easy to see how
>>>>>>> different the queries become.
>>>>>>> 
>>>>>>> Best Regards,
>>>>>>> Ere
>>>>>>> 
>>>>>>> --
>>>>>>> Ere Maijala
>>>>>>> Kansalliskirjasto / The National Library of Finland
>>>>>>> 
>>>>> 
>>>>> --
>>>>> Ere Maijala
>>>>> Kansalliskirjasto / The National Library of Finland
>>>> 
>>> 
>>> --
>>> Ere Maijala
>>> Kansalliskirjasto / The National Library of Finland
>>> 
> 
> -- 
> Ere Maijala
> Kansalliskirjasto / The National Library of Finland

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | 
My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
    
This e-mail and all contents, including attachments, is considered to be 
Company Confidential unless explicitly stated otherwise, regardless of whether 
attachments are marked as such.

Reply via email to