Hi all:
I've opened a bug for preventing normalization during the matching/import process: http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=15541

I was trying to use a URL field as a matchpoint, and it was going badly.

1) By default, punctuation is stripped, leading/trailing spaces are trimmed, and runs of more than one space are condensed down to one space. This turns a URL into a string without any spaces or punctuation.

2) So, I had to add "Identifier-other:u" to biblio-zebra-indexdefs.xsl, but I couldn't access it until I tried "id-other,st-urx" as my match point. The st-urx is necessary to make it use the ":u" register.

3) I also added some code to C4/Matcher.pm so that a match point normalizer of "None" would disable the normalization from #1.

4) I also plan to add a flag to C4::Search::SimpleSearch to disable the s/:/=/g normalization, since that also destroys the URL in the query and makes it fail to match.

I've only tested this so far with CHR, but it works well. I'll probably look at ICU tomorrow.

I'm sure there are other cases than just URLs where we will want to skip the default normalizing when doing matching, or normalize in a way that accords with the way Zebra normalizes the data in records. For instance, Zebra will replace punctuation with a space for "phrase" indexes rather than just stripping it out and leaving nothing behind.

David Cook
Systems Librarian
Prosentient Systems
72/330 Wattle St, Ultimo, NSW 2007
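
P.S. To make the problem in #1 and #4 concrete, here is a rough sketch in Python. It is not the actual Perl in C4/Matcher.pm or C4::Search. The function names, the "none" switch, and the exact punctuation set are my own assumptions for illustration. It only approximates the behaviour described above: default stripping, the Zebra-style "replace punctuation with a space" for phrase indexes, and the colon-to-equals rewrite.

```python
import re
import string

# Assumption: "punctuation" here means ASCII punctuation; the real rules may differ.
_PUNCT_RE = re.compile("[" + re.escape(string.punctuation) + "]")


def default_normalize(value):
    """Strip punctuation, trim the ends, collapse runs of whitespace (as in #1)."""
    stripped = _PUNCT_RE.sub("", value)
    return re.sub(r"\s+", " ", stripped).strip()


def phrase_normalize(value):
    """Replace punctuation with a space instead of deleting it (phrase-index style)."""
    spaced = _PUNCT_RE.sub(" ", value)
    return re.sub(r"\s+", " ", spaced).strip()


def normalize(value, norm="default"):
    """Hypothetical dispatcher: norm="none" leaves the value untouched (as in #3)."""
    if norm == "none":
        return value
    if norm == "phrase":
        return phrase_normalize(value)
    return default_normalize(value)


def colon_to_equals(query):
    """Equivalent of s/:/=/g applied to a whole query string (as in #4)."""
    return query.replace(":", "=")


if __name__ == "__main__":
    url = "http://example.org/record?id=42"
    print(normalize(url))             # punctuation gone, URL no longer matchable
    print(normalize(url, "phrase"))   # punctuation replaced by spaces
    print(normalize(url, "none"))     # URL preserved as-is
    print(colon_to_equals("id-other,st-urx:" + url))  # colons inside the URL rewritten too
```

The point of the "none" branch is simply that the match value reaches the search layer byte-for-byte, which is what a ":u" register lookup needs.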
