Hi Jan,

As far as I know, ICUFoldingFilter and ASCIIFoldingFilter do not respect the keyword=true attribute, at least they did not when I last checked. If you put KeywordRepeatFilter in front of them and modify those TokenFilters to respect the keyword attribute, that should solve the problem.
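To make the idea concrete, here is a minimal plain-Java model of the token flow. It is only a sketch and deliberately does not use the Lucene API: the Token class, the method names and the NFD-based fold() are hypothetical stand-ins (fold() only strips combining marks, which is far less than what ICUFoldingFilter does). In Lucene itself, one possible route might be to subclass ConditionalTokenFilter and skip the folding when KeywordAttribute.isKeyword() is true, but I have not tested that.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class KeywordAwareFoldingSketch {

    /** Simplified token: term text, position, and a flag modelling the keyword attribute. */
    static final class Token {
        final String term;
        final int position;
        final boolean keyword;

        Token(String term, int position, boolean keyword) {
            this.term = term;
            this.position = position;
            this.keyword = keyword;
        }

        @Override
        public String toString() {
            return term + "@" + position;
        }
    }

    /** Models KeywordRepeatFilter: each token is emitted twice at the same
     *  position, first flagged as keyword, then as a normal token. */
    static List<Token> keywordRepeat(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(new Token(t.term, t.position, true));
            out.add(new Token(t.term, t.position, false));
        }
        return out;
    }

    /** Stand-in for folding (assumption): decompose and drop combining marks. */
    static String fold(String term) {
        return Normalizer.normalize(term, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    /** The modification discussed above: keyword tokens pass through unfolded. */
    static List<Token> foldUnlessKeyword(List<Token> in) {
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            out.add(t.keyword ? t : new Token(fold(t.term), t.position, false));
        }
        return out;
    }

    /** Drops a token if the same term was already seen at the same position,
     *  so words that folding does not change are not indexed twice. */
    static List<Token> removeDuplicates(List<Token> in) {
        Set<String> seen = new HashSet<>();
        List<Token> out = new ArrayList<>();
        for (Token t : in) {
            if (seen.add(t.toString())) {
                out.add(t);
            }
        }
        return out;
    }

    /** Whitespace "tokenizer" followed by the three steps above. */
    static List<String> analyze(String text) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        for (String word : text.trim().split("\\s+")) {
            tokens.add(new Token(word, ++position, false));
        }
        List<String> out = new ArrayList<>();
        for (Token t : removeDuplicates(foldUnlessKeyword(keywordRepeat(tokens)))) {
            out.add(t.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // expected: [Genéve@1, Geneve@1, Oslo@2]
        System.out.println(analyze("Gen\u00e9ve Oslo"));
    }
}
```

The original and the folded form end up on the same position, so an exact match on "Genéve" hits both terms while "Geneve" hits only the folded one. In a real analysis chain a duplicate-removal step such as RemoveDuplicatesTokenFilter would play the role of removeDuplicates() here.
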
Regards,
Markus

2021-08-25 16:32 GMT+02:00, André Widhani <a...@stibodx.com.invalid>:
> Not with ICUFoldingFilter, but with the MappingCharFilter.
>
> There you can supply a mapping file and skip base-letter mappings for the
> users' native language, because in their own language, they know the correct
> spelling ... most of the time ... sometimes.
>
> This does really help with multiple languages and you lose the convenience
> of ICUFoldingFilter.
>
> André
> ________________________________
> From: Jan Høydahl <jan....@cominvent.com>
> Sent: Wednesday, 25 August 2021 15:43
> To: users@solr.apache.org <users@solr.apache.org>
> Subject: ICUFoldingFilter with preserveOriginal option?
>
> Hi,
>
> I'm looking at using ICUFoldingFilter for a customer, to fold e.g. Genéve to
> Geneve and thus get better recall.
> However, for some common Norwegian words, the folding makes them clash with
> super-common words, so it becomes impossible to find exactly what you want.
> I imagined that if ICUFoldingFilter had a preserveOriginal=true option, it
> could leave the original word in the index at the same position, and an
> exact match for "Genéve" would score better than the normalized one. But
> this filter does not support this.
>
> Has anyone found a workaround for this, other than duplicating all content
> in different fields with different analysis and searching across them with
> different weights?
>
> Jan