Re: unexpected wildcard results

Andy C Fri, 18 Feb 2022 14:02:46 -0800

I think the issue is the double quotes around your query string. Try
searching for text:witch* instead.


It appears that when the term is surrounded by double quotes, the * is
treated as literal text rather than as a syntax character (wildcard). You can
verify this by running the query in the Solr Admin UI with the "debugQuery"
box checked.
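
If you would rather compare the two forms outside the Admin UI, here is a
rough Python sketch that builds both request URLs with debugQuery switched
on. The host, port, and core name (`mycore`) are placeholders of mine, not
anything taken from this thread:

```python
from urllib.parse import urlencode

# Placeholder endpoint: adjust host, port, and core name for your setup.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def debug_url(query: str) -> str:
    """Return a select URL for `query` with debugQuery switched on."""
    params = {"q": query, "debugQuery": "true"}
    return f"{SOLR_SELECT}?{urlencode(params)}"


# The quoted form from the original post and the suggested unquoted form.
quoted_url = debug_url('text:"witch*"')
unquoted_url = debug_url("text:witch*")

print(quoted_url)
print(unquoted_url)
```

Fetching each URL and comparing the parsedquery entries in the debug
section of the response should show how the two forms are interpreted.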

As a result, the query actually searches only for 'witch', because the *
character is discarded by your field type's analysis and is never indexed.

The Porter stemmer indexes both 'witch' and 'witches' as 'witch', but
'witchcraft' as 'witchcraft'. So a search for 'witch' matches documents
whose original text contained 'witch' or 'witches', but not 'witchcraft'.
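
To make the difference concrete, here is a toy Python illustration. It is
not how Solr is implemented, and no real stemmer is run: the indexed terms
are hard-coded from the description above, and the document ids are made up.

```python
# Toy inverted index holding the stemmed terms described above.
# Terms are hard-coded; document ids are invented for illustration.
index = {
    "witch": {"doc_witch", "doc_witches"},  # 'witch' and 'witches' both stem to 'witch'
    "witchcraft": {"doc_witchcraft"},       # 'witchcraft' is left unchanged
}


def term_match(term):
    """Exact term lookup, roughly what the quoted query ends up doing."""
    return set(index.get(term, set()))


def prefix_match(prefix):
    """Wildcard-style lookup: union over every indexed term with the prefix."""
    hits = set()
    for term, docs in index.items():
        if term.startswith(prefix):
            hits |= docs
    return hits


print(sorted(term_match("witch")))    # exact term only
print(sorted(prefix_match("witch")))  # every term starting with 'witch'
```

The exact lookup only ever sees the single term 'witch', while the prefix
lookup also reaches 'witchcraft', which is the behaviour you were expecting
from the wildcard.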

- Andy -

On Fri, Feb 18, 2022 at 4:48 PM Joel Bernstein <joels...@gmail.com> wrote:

> This is a great tool for understanding how analyzers are handling specific
> terms: https://solr.apache.org/guide/8_8/analysis-screen.html
>
> You'll be able to see how witchcraft was added to the index.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, Feb 18, 2022 at 3:47 PM Matthew Roth <mgrot...@gmail.com> wrote:
>
> > Hi List,
> >
> > We are noting unexpected wildcard results. For example, the following query
> >
> >  text:"witch*"
> >
> > will match witch and witches, but not witchcraft. We would expect
> > witchcraft to be matched as well. I suspect the issue may lie with the
> > field definition.
> >
> >  <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
> >    <analyzer type="index">
> >      <tokenizer class="solr.StandardTokenizerFactory"/>
> >      <filter class="solr.StopFilterFactory"
> >              ignoreCase="true"
> >              words="lang/stopwords_en.txt"/>
> >      <filter class="solr.LowerCaseFilterFactory"/>
> >      <filter class="solr.EnglishPossessiveFilterFactory"/>
> >      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
> >      <filter class="solr.PorterStemFilterFactory"/>
> >    </analyzer>
> >    <analyzer type="query">
> >      <tokenizer class="solr.StandardTokenizerFactory"/>
> >      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >              ignoreCase="true" expand="true"/>
> >      <filter class="solr.StopFilterFactory"
> >              ignoreCase="true"
> >              words="lang/stopwords_en.txt"/>
> >      <filter class="solr.LowerCaseFilterFactory"/>
> >      <filter class="solr.EnglishPossessiveFilterFactory"/>
> >      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
> >      <filter class="solr.PorterStemFilterFactory"/>
> >    </analyzer>
> >  </fieldType>
> >
> >
> > Can anyone offer any insight?
> >
> > Best,
> > Matt
> >
>
