Hi all,

Lari Strand pointed out on Bug 36947 that Koha doesn’t take diacritics into account when it sorts facet names. He talks about Elasticsearch there, although it appears Zebra has the same issue.

The first proposed solution was to strip out the diacritics using Unicode::Normalize. This worked pretty well, but on Mattermost paxed pointed out that this doesn’t work well for Finnish, where “⟨å⟩, ⟨ä⟩ and ⟨ö⟩ are regarded as distinct letters and collated after ⟨z⟩” (as per Wikipedia’s “Finnish orthography” entry).

So I looked at some locale-based Perl core options like “use locale” and “Unicode::Collate::Locale”, and I really, really like “Unicode::Collate::Locale”. It leverages the Linux locale files to perfectly sort the text. (The only gotcha is that it’s based off the system locale. There are ways we could use the UI-chosen language instead, but I figure that’s a future development.)

Anyway, I just want to get more eyes on this code, because it’s super interesting. The patch is very small and easy to understand. I just want to get more opinions about what we should be doing with it.

Cheers!

David Cook
Senior Software Engineer
Prosentient Systems
Suite 7.03
6a Glen St
Milsons Point NSW 2061
Australia

Office: 02 9212 0899
Online: 02 8005 0595
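
P.S. For anyone who wants to see the difference without applying the patch, here is a rough sketch of the two approaches side by side. This is not the code from the bug. The facet labels are made up, strip_diacritics is a hypothetical helper name, and I have hard-coded the 'fi' locale purely for illustration rather than picking it up from the system.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Normalize qw(NFD);
use Unicode::Collate::Locale;

binmode STDOUT, ':encoding(UTF-8)';

# Made-up facet labels, only for illustration.
my @facets = ( 'Öberg', 'Zetterberg', 'Aalto', 'Äijälä', 'Olsson' );

# Approach 1: decompose each label, drop the combining marks,
# and compare what is left (hypothetical helper, not from the patch).
sub strip_diacritics {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/\p{Mn}//g;    # remove non-spacing marks
    return $decomposed;
}

my @stripped = sort {
    lc( strip_diacritics($a) ) cmp lc( strip_diacritics($b) )
} @facets;

# Approach 2: locale-aware collation. 'fi' is an assumption for this demo.
my $collator = Unicode::Collate::Locale->new( locale => 'fi' );
my @finnish  = $collator->sort(@facets);

print join( ', ', @stripped ), "\n";
# Aalto, Äijälä, Öberg, Olsson, Zetterberg
print join( ', ', @finnish ), "\n";
# Aalto, Olsson, Zetterberg, Äijälä, Öberg
```

With the diacritics stripped, ä and ö sort as if they were a and o, which is the behaviour paxed objected to. With the Finnish collator they land after z.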
_______________________________________________
Koha-devel mailing list
Koha-devel@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : https://www.koha-community.org/
git : https://git.koha-community.org/
bugs : https://bugs.koha-community.org/