This is a great tool for understanding how analyzers are handling specific
terms: https://solr.apache.org/guide/8_8/analysis-screen.html

It will show you exactly which tokens were added to the index for witchcraft.
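
If you would rather script the check than click through the UI, the same analysis is exposed over HTTP by Solr's field analysis handler (`/analysis/field`). The following is only a rough sketch: the base URL `http://localhost:8983/solr` and the collection name `mycollection` are placeholders I made up, and the helper names are hypothetical, so adjust them to your setup.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_field_analysis_url(base_url, collection, field_type, field_value):
    """Build a URL for Solr's /analysis/field handler.

    base_url and collection are placeholders for your own deployment.
    """
    params = {
        "analysis.fieldtype": field_type,    # e.g. the text_en type from the schema
        "analysis.fieldvalue": field_value,  # text to run through the index analyzer
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{collection}/analysis/field?{urlencode(params)}"


def fetch_field_analysis(url):
    """Fetch and decode the JSON analysis response (needs a running Solr)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `build_field_analysis_url("http://localhost:8983/solr", "mycollection", "text_en", "witchcraft")` and passing the result to `fetch_field_analysis` should return the token stream after each analyzer stage, which is the same information the Analysis screen displays.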

Joel Bernstein
http://joelsolr.blogspot.com/


On Fri, Feb 18, 2022 at 3:47 PM Matthew Roth <mgrot...@gmail.com> wrote:

> Hi List,
>
> We are seeing unexpected wildcard results. For example, the following query
>
>  text:"witch*"
>
> matches witch and witches, but not witchcraft. We would expect
> witchcraft to match as well. I suspect the issue may lie
> with the field definition.
>
>     <fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory"
>                 ignoreCase="true"
>                 words="lang/stopwords_en.txt"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPossessiveFilterFactory"/>
>         <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>         <filter class="solr.PorterStemFilterFactory"/>
>       </analyzer>
>       <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>                 ignoreCase="true" expand="true"/>
>         <filter class="solr.StopFilterFactory"
>                 ignoreCase="true"
>                 words="lang/stopwords_en.txt"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>         <filter class="solr.EnglishPossessiveFilterFactory"/>
>         <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>         <filter class="solr.PorterStemFilterFactory"/>
>       </analyzer>
>     </fieldType>
>
>
> Can anyone offer any insight?
>
> Best,
> Matt
>
