OK, so from what I’m looking at, you have a proximity search, so the terms have
to be within the distance value of each other (in your example, 2), which
obviously won’t work since there are three terms. A fuzzy search is based on a
single term/token, so you need to add ~2 to each term if that’s what you want.
There’s really good documentation about the difference, and why it’s not
working as you expected, here:

https://examples.javacodegeeks.com/apache-solr-fuzzy-search-example/
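
To make the difference concrete, here is a minimal Python sketch that builds both query strings from the same input. The helper names (fuzzy_per_term, proximity) are made up for illustration, and splitting on hyphens and whitespace is only a rough stand-in for what the Standard Tokenizer does, so treat it as a sketch rather than the exact analysis chain.

```python
import re


def _tokens(text: str) -> list[str]:
    # Rough approximation of the tokenizer: split on hyphens and whitespace.
    return [t for t in re.split(r"[-\s]+", text) if t]


def fuzzy_per_term(text: str, distance: int = 2) -> str:
    """Fuzzy search: append ~distance to each single term."""
    if not 0 <= distance <= 2:
        # Solr/Lucene fuzzy queries support an edit distance of at most 2.
        raise ValueError("fuzzy edit distance must be 0, 1 or 2")
    return " ".join(f"{t}~{distance}" for t in _tokens(text))


def proximity(text: str, slop: int) -> str:
    """Proximity search: one quoted phrase followed by ~slop."""
    return '"' + " ".join(_tokens(text)) + f'"~{slop}'


print(fuzzy_per_term("term-with-hyphens"))  # term~2 with~2 hyphens~2
print(proximity("term-with-hyphens", 2))    # "term with hyphens"~2
```

Whether all of the fuzzy terms have to match depends on your q.op / mm settings, so you may want to check those as well.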

Also, try making use of phrase query fields and boosting them.
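
As a rough illustration of that last point, assuming "phrase query fields" means the eDisMax pf parameter, the request parameters could be put together like this. The field names (title, body) and boost values are placeholders, not taken from your schema, and edismax_params is just a hypothetical helper.

```python
from urllib.parse import urlencode


def edismax_params(user_query: str, qf: dict, pf: dict, ps: int = 0) -> dict:
    """Build eDisMax request parameters with boosted phrase fields.

    qf: fields to search, mapped to their boosts.
    pf: fields in which documents get an extra boost when the query terms
        appear close together as a phrase.
    ps: phrase slop applied to the pf phrase queries.
    """
    def boosted(fields: dict) -> str:
        return " ".join(f"{name}^{boost}" for name, boost in fields.items())

    return {
        "defType": "edismax",
        "q": user_query,
        "qf": boosted(qf),
        "pf": boosted(pf),
        "ps": str(ps),
    }


# Hypothetical field names and boosts, for illustration only.
params = edismax_params(
    "term~2 with~2 hyphens~2",
    qf={"title": 2, "body": 1},
    pf={"title": 10, "body": 5},
    ps=2,
)
print(urlencode(params))
```

The idea is that the fuzzy terms do the matching via qf, while pf pushes documents with the terms close together towards the top.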



> On Aug 23, 2022, at 11:18 AM, Morten Ernebjerg 
> <morten.ernebj...@data4life.care> wrote:
> 
> (replying on behalf of my colleague Julian, who wrote this question and is
> unable to reply for technical reasons)
> Hi David,
> 
> Thanks for the reply! I think your question may point to something we
> overlooked. We are actually using Solr 8.11 and we want to use fuzzy search
> (https://solr.apache.org/guide/8_11/the-standard-query-parser.html#fuzzy-searches),
> i.e. find words that differ from the query by one or a few characters. Our
> understanding was that to get matches that differ by max two chars from
> (using separate line to avoid adding confusing quotation marks)
> 
> term-with-hyphens
> 
> we should send the following query (without any quotation marks):
> 
> term-with-hyphens~2
> 
> Our thinking was that the hyphenated term is one word so there is no need
> to quote it. We had a quick try quoting the hyphenated term in the query as
> you suggested and it looks like it works (i.e. returns matches). Since the
> standard tokenizer splits on hyphens, I'm wondering whether the unquoted
> query somehow gets converted to the *proximity search* query
> 
> "term with hyphens"~2
> 
> which then fails (though it looks like it should still match
> term-with-hyphens). Would be great to understand what is happening.
> 
> Best,
> 
> Morten
> 
> 
> 
>> On Tue, 23 Aug 2022 at 16:30, David Hastings <hastings.recurs...@gmail.com>
>> wrote:
>> 
>> I’m not certain of your tokenizer, of course, but shouldn’t it be the
>> following?
>> 
>> "terms-with-hyphens"~1
>> 
>> Just a syntax thing that may not have translated over email, but I’m curious.
>> 
>> On Tue, Aug 23, 2022 at 10:12 AM Julian Hugo <julian.h...@data4life.care>
>> wrote:
>> 
>>> Hello,
>>> 
>>> I am getting peculiar results when querying for a term containing hyphens
>>> and adding fuzzy search
>>> <https://solr.apache.org/guide/6_6/the-standard-query-parser.html#TheStandardQueryParser-FuzzySearches>.
>>> 
>>> I have indexed two items (1) "term-with-hyphens" and (2) "term with
>>> hyphens". When I query ("q") for "term-with-hyphens" or "term with
>>> hyphens", both items are returned as expected. The same is the case for
>>> escaped hyphens "term\-with\-hyphens".
>>> 
>>> The problem: When I add the fuzzy search parameter (i.e.,
>>> "term-with-hyphens~1" or "term\-with\-hyphens~1"), I get zero results
>>> back.
>>> 
>>> I struggle to understand the results, or how to solve this problem. My
>>> intuition tells me that adding a fuzzy search parameter should surely
>>> increase the size of the set of results. I am happy for any help on this!
>>> 
>>> Our current setup is using the "Extended DisMax Query Parser"
>>> <https://solr.apache.org/guide/6_6/the-extended-dismax-query-parser.html>;
>>> however, we observe the same behaviour using the "Standard Query Parser"
>>> <https://solr.apache.org/guide/6_6/the-standard-query-parser.html>. We are
>>> using the "Standard Tokenizer"
>>> <https://solr.apache.org/guide/6_6/tokenizers.html#Tokenizers-StandardTokenizer>,
>>> which splits at hyphens. Does this relate to this problem?
>>> 
>>> Thank you!
>>> 
>>> --
>>> 
>>> *Julian Hugo*
>>> 
>>> Working Student
>>> Backend Development
>>> 
>>> (he/his)
>>> 
>>> 
>>> julian.h...@data4life.care
>>> 
>>> 
>>> D4L data4life gGmbH
>>> Charlottenstraße 109
>>> 14467 Potsdam, Germany
>>> 
>>> www.data4life.care
>>> 
>>> 
>>> Amtsgericht Potsdam, HRB 30667
>>> 
>>> Managing Director: Christian-Cornelius Weiß
>>> 
>>> 
>>> We are Data4Life. We've been certified by the German Federal Office for
>>> Information Security (BSI) in accordance with ISO 27001 on the basis of
>>> "IT-Grundschutz".
>>> 
>>> 
>>> Diversity is the driving force behind our work towards a society where
>>> digital health improves quality of life for everyone.
>>> Data4Life warmly welcomes applicants from the LGBTQI+ community, people
>>> with a migration background, People of Color, and individuals with
>>> disabilities or chronic illnesses to the team.
>>> 
>>> 
>>> Climate neutral since 2019 <https://wtca.lfca.earth/e/data4life>
>>> 
>> 
> 
> 
> -- 
> 
> *Morten Ernebjerg, Ph.D.*
> 
> Senior Developer
> 
> 
> morten.ernebj...@data4life.care
> 
