According to the javadoc https://lucene.apache.org/core/8_9_0/analyzers-phonetic/org/apache/lucene/analysis/phonetic/BeiderMorseFilterFactory.html
BeiderMorseFilterFactory is supposed to be used after the StandardTokenizer.

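Most likely GermanNormalizationFilterFactory and GermanLightStemFilterFactory
shouldn't be used together with BeiderMorseFilterFactory. Once the stemmer has cut a word down to its stem, the phonetic filter only sees the stem, so the pronunciation of the full word can no longer be matched.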
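
A minimal sketch of what such a chain could look like, assuming the normalization and stemming filters are simply dropped and the phonetic filter keeps the parameters from the original message. The field type name text_de_phonetic is hypothetical, and like the rest of this reply the configuration is untested:

```xml
<!-- Hypothetical field type: phonetic matching only, no stemming -->
<fieldType name="text_de_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Same parameters as in the original schema, applied right after the tokenizer -->
    <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC" ruleType="APPROX" concat="true" languageSet="auto"/>
  </analyzer>
</fieldType>
```

Any field that switches to a changed analyzer has to be reindexed before the new chain affects search results.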

On the other hand, if you just want to match German words spelled according to different conventions (ß <-> ss), GermanNormalizationFilterFactory should be enough. You don't need BeiderMorseFilterFactory.
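
A sketch of a normalization-only chain, assuming no phonetic matching is needed. The field type name text_de_norm is hypothetical. As far as the GermanNormalizationFilter javadoc describes it, the filter also folds ä/ö/ü and ae/oe/ue to the plain vowels, so "mueller" and "müller" should both come out as "muller", but this has not been verified against a running Solr instance:

```xml
<!-- Hypothetical field type: spelling normalization only -->
<fieldType name="text_de_norm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase first, since the normalization rules are written for lowercase characters -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Per the javadoc: ß becomes ss; ä, ö, ü become a, o, u; ae, oe become a, o; ue becomes u unless it follows a vowel or q -->
    <filter class="solr.GermanNormalizationFilterFactory"/>
  </analyzer>
</fieldType>
```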

P.S. I'm not a German speaker and I haven't actually tested the above claims. I'm just speculating.


On 6/28/21 7:25 AM, Christian Havel wrote:
Hi,

I am using Solr 8.8.1 and want to use the phonetic search option. To do
so, I modified my schema.xml file and rebuilt the index.

<!-- German -->
<dynamicField name="*_txt_de" type="text_de" indexed="true" stored="true"/>
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_de.txt" format="snowball"/>
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <filter class="solr.GermanLightStemFilterFactory"/>
    <filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC" ruleType="APPROX" concat="true" languageSet="auto"/>
    <!-- less aggressive: <filter class="solr.GermanMinimalStemFilterFactory"/> -->
    <!-- more aggressive: <filter class="solr.SnowballPorterFilterFactory" language="German2"/> -->
  </analyzer>
</fieldType>

I hoped that searching for "mueller" would also find contacts with "müller",
but it seems to have no effect.
Do you have any idea what could be missing?

Thanks,
Christian

