Hi Jan,

As far as I know, ICUFoldingFilter and ASCIIFoldingFilter do not respect
the keyword attribute (keyword=true), at least not when I last checked.
If you add KeywordRepeatFilter to the chain and modify those two
TokenFilters to skip tokens marked as keywords, the problem should be
solved.
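
To illustrate the idea, here is a rough plain-Java simulation of the
token flow. It does not use any Lucene classes: FoldingSketch, Token and
all method names are made up for this sketch, and the NFD-based fold() is
only a crude stand-in for real ICU folding (it will not fold letters
such as "ø"). The duplicate-removal step is my assumption of what you
would add afterwards, in the spirit of RemoveDuplicatesTokenFilter.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation only; none of these types are Lucene classes.
final class FoldingSketch {

    /** Stand-in for a token: term text, position increment, keyword flag. */
    static final class Token {
        final String term;
        final int posInc;
        final boolean keyword;

        Token(String term, int posInc, boolean keyword) {
            this.term = term;
            this.posInc = posInc;
            this.keyword = keyword;
        }

        @Override
        public String toString() {
            return term + "(+" + posInc + ")";
        }
    }

    /** Mimics KeywordRepeatFilter: keyword copy first, then a plain copy at the same position. */
    static List<Token> keywordRepeat(List<String> terms) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            out.add(new Token(term, 1, true));
            out.add(new Token(term, 0, false));
        }
        return out;
    }

    /** Crude accent folding: decompose, then strip combining marks. */
    static String fold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    /** The proposed change: fold only tokens that are NOT flagged as keyword. */
    static List<Token> foldUnlessKeyword(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(t.keyword ? t : new Token(fold(t.term), t.posInc, false));
        }
        return out;
    }

    /** Drops a stacked token whose term was already emitted at the same position. */
    static List<Token> removeDuplicates(List<Token> in) {
        List<Token> out = new ArrayList<>();
        Set<String> seenAtPosition = new HashSet<>();
        for (Token t : in) {
            if (t.posInc > 0) {
                seenAtPosition.clear(); // a new position starts here
            }
            if (seenAtPosition.add(t.term)) {
                out.add(t);
            }
        }
        return out;
    }

    static List<Token> analyze(List<String> terms) {
        return removeDuplicates(foldUnlessKeyword(keywordRepeat(terms)));
    }

    public static void main(String[] args) {
        // "Gen\u00e9ve" is "Genéve", written with an escape to avoid source-encoding issues.
        System.out.println(analyze(Arrays.asList("Gen\u00e9ve", "Oslo")));
    }
}
```

With this chain the original "Genéve" stays in the stream and the folded
"Geneve" is stacked on the same position, while a word that folding does
not change ("Oslo") is emitted only once.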

Regards,
Markus

2021-08-25 16:32 GMT+02:00, André Widhani <a...@stibodx.com.invalid>:
> Not with ICUFoldingFilter, but with the MappingCharFilter.
>
> There you can supply a mapping file and skip baseletter mappings for the
> users' native language, because in their own language, they know the correct
> spelling ... most of the time ... sometimes.
>
> This does not really help with multiple languages, though, and you lose the
> convenience of ICUFoldingFilter.
>
> André
> ________________________________
> From: Jan Høydahl <jan....@cominvent.com>
> Sent: Wednesday, 25 August 2021 15:43
> To: users@solr.apache.org <users@solr.apache.org>
> Subject: ICUFoldingFilter with preserveOriginal option?
>
> Hi,
>
> I'm looking at using ICUFoldingFilter for a customer, to fold e.g. Genéve to
> Geneve and thus get better recall.
> However, for some common Norwegian words, the folding makes them clash with
> super-common words so it becomes impossible to find exactly what you want.
> I imagined if ICUFoldingFilter had a preserveOriginal=true option, then it
> could leave the original word in the index on the same position, and an
> exact match for "Genéve" would score better than the normalized one. But
> this filter does not support this.
>
> Has anyone found a workaround for this, apart from duplicating all content
> in different fields with different analysis and search across them with
> different weights?
>
> Jan
>
