Hi Mathieu:
I think many of us assume certain things happen in Zebra when they actually happen in Koha, before the query ever reaches Zebra ;).

As for stemming, in theory the language obtained via C4::Templates::getlanguage($cgi, 'intranet'); should filter down into the Snowball stemming. If it isn't working in French, it might be because the right locale isn't being passed to Snowball correctly. That's very possible, as I think we're using arbitrary language codes rather than standard locales in some cases. It looks like there is a fallback to English in C4::Templates::getlanguage() as well. If it's not working for French, it probably just needs a tweak!

Yeah, I first heard about Snowball when reading through the Zebra docs, and I was pleasantly surprised when I saw that Lingua::Stem::Snowball existed as a Perl interface for the C program.

David Cook
Systems Librarian
Prosentient Systems
72/330 Wattle St, Ultimo, NSW 2007

From: koha-devel-boun...@lists.koha-community.org On Behalf Of Mathieu Saby
Sent: Wednesday, 27 August 2014 7:30 PM
To: koha-devel@lists.koha-community.org
Subject: Re: [Koha-devel] Stemming and zebra

Hi

I had always thought stemming was done by Zebra, and only in English!
In fact the algorithm for the French language is here: http://snowball.tartarus.org/algorithms/french/stemmer.html
(Lingua::Stem::Snowball is a Perl interface to the C version of the Snowball stemmers)

Mathieu Saby

On 27/08/2014 10:22, David Cook wrote:

Hi Francois:

I wrote an email earlier on my tablet, but I'm not 100% sure it got sent. In any case, I'm writing again now!

You'll want to look at C4::Search::_build_stemmed_operand(). Zebra doesn't actually do any stemming itself. If you read through the Zebra docs (if you're masochistic), you'll notice that they say explicitly that Zebra doesn't do any stemming, but that you can do stemming (using a stemmer like Snowball) while building a query. That's exactly what we do in Koha.
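
One possible reading of the "skills" and "fishxsdfe" hits, sketched in Python rather than Perl so that it runs without Koha installed: if the query builder stems each word and then appends a right-truncation marker to the stem, the search term effectively becomes a prefix search. Everything named here is an assumption made for illustration (the suffix list, the function names, and the "?" marker). The toy stemmer is not Snowball, and this is not the code in C4::Search::_build_stemmed_operand().

```python
# Hypothetical stand-in for a real stemmer such as Snowball: it only strips a
# few English suffixes so that the example stays self-contained.
SUFFIXES = ("ing", "ed", "er", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_stemmed_operand(operand: str) -> str:
    """Stem each word and append '?' as an (assumed) right-truncation marker."""
    return " ".join(toy_stem(word) + "?" for word in operand.split())


def truncation_match(term: str, indexed_word: str) -> bool:
    """A term ending in '?' matches any word starting with the stem;
    a term without '?' must match the word exactly."""
    prefix = term.rstrip("?")
    if term.endswith("?"):
        return indexed_word.lower().startswith(prefix)
    return indexed_word.lower() == prefix


if __name__ == "__main__":
    indexed = ["ski", "skiing", "skills",
               "fish", "fished", "fishing", "fisher", "fishxsdfe"]
    for query in ("skiing", "fishing"):
        term = build_stemmed_operand(query)
        hits = [word for word in indexed if truncation_match(term, word)]
        print(query, "->", term, "matches", hits)
```

Under these assumptions the truncation step, not the stemmer, is what lets "skills" and "fishxsdfe" through: the stemmer only reduces "skiing" to "ski" and "fishing" to "fish", and the truncated term then matches anything beginning with that stem.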
The Perl module that does the stemming is Lingua::Stem::Snowball. However, you might notice that your query's operands aren't always stemmed properly. I haven't looked in a while, but I think it's because we don't build our queries very well at all (when not using QueryParser).

If you want to understand why you're getting "skills" and "fishxsdfe" in your results, I would suggest running some tests (using Data::Dumper and warn) so that you can see your query as it is built.

I have a lot of work I want to do on C4::Search::buildQuery, but I just don't have the time :/.

Unfortunately, at the moment, there is no stemming when using the QueryParser. Fortunately, though, using Lingua::Stem::Snowball with QueryParser would be really, really easy. I think that I've written a note on how to do that somewhere on Bugzilla, or maybe on Trello.

I hope that helps! Feel free to send me an email or shout at me on IRC if you want any clarification. I know I probably didn't make it any clearer, but hopefully this might help you on your path to understanding.

David Cook
Systems Librarian
Prosentient Systems
72/330 Wattle St, Ultimo, NSW 2007

From: koha-devel-boun...@lists.koha-community.org On Behalf Of Francois Charbonnier
Sent: Wednesday, 27 August 2014 2:09 AM
To: koha-devel@lists.koha-community.org
Subject: [Koha-devel] Stemming and zebra

Hello,

I have tested the QueryStemming system preference on Koha 3.14 (my local installation) and I'm wondering: does Zebra just right-truncate the words, or is there an algorithm to find the stems?

I use ICU and I have enabled "QueryWeightFields". I don't have automatic truncation or fuzzy search on.

I use these words for my tests:

- ski, skiing, skills
- fish, fished, fishing, fisher, fishxsdfe

Each time, with QueryStemming on, "skills" and "fishxsdfe" come out in the search results. Is this what I should expect? "Skills", maybe, but "fishxsdfe"?

Do you know how it works, or have a good example that would help me understand?

Thanks!

--
François Charbonnier, Professional Librarian / Product Manager
Tel.: (888) 604-2627
francois.charbonn...@inlibro.com
inLibro | pour esprit libre | www.inLibro.com

_______________________________________________
Koha-devel mailing list
Koha-devel@lists.koha-community.org
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/