Re: How to configure PhoneticSearch ?

Thomas Corthals Mon, 19 Jul 2021 08:01:40 -0700

Hi Christian

German is not my native language, but I believe Mayr vs. Meier is better
suited for Phonetic Matching. You could make a SpellChecker lenient enough
to catch it, but that's probably not the best choice.
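
As a rough sketch of the phonetic route: this assumes a separate field type
(the name text_de_phonetic is made up here) and the ColognePhonetic encoder
of solr.PhoneticFilterFactory, which is aimed at German pronunciation. Under
the Cologne rules, Mayr and Meier should reduce to the same code and so
match each other. Treat it as a starting point, not a tested config:

```xml
<!-- Hypothetical field type; the name and the filter order are assumptions -->
<fieldType name="text_de_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- inject="false" replaces each token with its phonetic code;
         inject="true" (the default) would keep the original token as well -->
    <filter class="solr.PhoneticFilterFactory" encoder="ColognePhonetic" inject="false"/>
  </analyzer>
</fieldType>
```

Keeping this in its own field (for example via copyField) would let you rank
exact matches above phonetic ones.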


There's a specific GermanNormalizationFilter too. It's used for the text_de
fieldType in the techproducts example.

https://lucene.apache.org/core/8_9_0/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilterFactory.html
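
For the accent and ß examples, here is a sketch of how the chain might look.
It assumes solr.ASCIIFoldingFilterFactory is added after the German
normalization, and the field type name text_de_folded is hypothetical.
GermanNormalizationFilter should cover Strasse vs. Straße, and the folding
filter should cover Moét vs. Moet and Cuvée vs. Cuvee:

```xml
<!-- Hypothetical field type; adapt the name and stopword file to your schema -->
<fieldType name="text_de_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="lang/stopwords_de.txt" format="snowball"/>
    <!-- German-specific: ß becomes ss, umlauts and most ae/oe/ue spellings are reduced to a/o/u -->
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <!-- Folds remaining accented characters such as é to their ASCII equivalents -->
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

Kudne vs. Kunde is a transposition rather than a spelling variant, so that
one stays a job for the SpellChecker.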

Thomas

On Mon, 19 Jul 2021 at 16:24, Christian Havel <christian.ha...@gmail.com> wrote:

> Hi everyone,
>
> thank you all very much for your help !!!!
> From the customer I have some more examples.
> Do I interpret your feedback correctly, that all the examples should be
> resolvable by using the SpellChecker or the ASCII folding filter?
>
> Mayr vs. Meier,
> Moét vs. Moet,
> Cuvée vs. Cuvee,
> Strasse vs. Straße,
> Kudne vs. Kunde
>
> Sorry for my stupid questions.
> Christian
>
> On Mon, 19 Jul 2021 at 10:14, Thomas Corthals <tho...@klascement.net> wrote:
>
> > If you need support for "typewriter umlauts" as well, look into Unicode
> > normalization.
> >
> > https://solr.apache.org/guide/8_9/filter-descriptions.html#icu-folding-filter
> >
> > Thomas
> >
> > On Sun, 18 Jul 2021 at 19:04, Walter Underwood <wun...@wunderwood.org> wrote:
> >
> > > For the André/Andre case, the ASCII folding filter will do the job.
> > >
> > > https://solr.apache.org/guide/8_9/filter-descriptions.html#ascii-folding-filter
> > >
> > > It does not do a conversion for “typewriter umlauts”, so you might want a
> > > character replacement filter for those. That would convert ä to ae, ö to
> > > oe, and ü to ue.
> > >
> > > wunder
> > > Walter Underwood
> > > wun...@wunderwood.org
> > > http://observer.wunderwood.org/  (my blog)
> > >
> > > > On Jul 18, 2021, at 9:14 AM, Jörn Franke <jornfra...@gmail.com> wrote:
> > > >
> > > > Hi Christian,
> > > >
> > > > the examples you gave are not the target use case of phonetic matching.
> > > > What you want is the SpellChecker
> > > > https://solr.apache.org/guide/8_4/spell-checking.html.
> > > >
> > > > While phonetic matching may partially serve you, it is meant more for
> > > > queries that should return results that SOUND like what you have typed.
> > > > So it would not match Testkudne, which sounds completely different from
> > > > Testkunde.
> > > >
> > > > Best regards
> > > >
> > > > On Sun, Jul 18, 2021 at 2:43 PM Christian Havel <christian.ha...@gmail.com> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I found how to add PhoneticSearch to my field definition.
> > > >> Well, that is ok. But how can I configure this one?
> > > > >> For example, if I search for "Testkudne", a document with the value
> > > > >> "Testkunde" should be found, and if I search for "Andre", "André"
> > > > >> should be found, too.
> > > >> The following is my definition that is used for index and query:
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> * <dynamicField name="*_txt_de" type="text_de"  indexed="true"
> > > >> stored="true"/>    <fieldType name="text_de" class="solr.TextField"
> > > >> positionIncrementGap="100">      <analyzer>*
> > > >> <tokenizer class="solr.StandardTokenizerFactory"/>
> > > >>        <filter class="solr.LowerCaseFilterFactory"/>
> > > >>        <filter class="solr.StopFilterFactory" ignoreCase="true"
> > > >> words="lang/stopwords_de.txt" format="snowball" />
> > > >>        <filter class="solr.GermanNormalizationFilterFactory"/>
> > > >>
> > > >> *<filter class="solr.BeiderMorseFilterFactory" nameType="GENERIC"
> > > >> ruleType="APPROX" concat="true" languageSet="auto" />*
> > > >>
> > > >> Christian
> > > >>
> > >
> > >
> >
>
