On Mon, Feb 1, 2016 at 5:54 PM, David Cook <dc...@prosentient.com.au> wrote:
> Hi Nicole,
>
> I keep meaning to look over and revise the search documentation, but I
> always seem preoccupied with other work.
>
> At a glance, I'm not sure whether the list at
> http://manual.koha-community.org/3.24/en/kohasearchindexes.html is
> complete. To be honest, while I think it's a valuable list, I think it
> would be more valuable for end users to have a list of CCL qualifiers
> (and their corresponding registers). While an index may exist in Zebra,
> it's the CCL qualifier that the user needs to know in order to access it,
> and sometimes the qualifier is different from the index name.
>
> There are 3 vital files for Zebra indexing and Koha searching:
> bib1.att
> biblio-zebra-indexdefs.xsl
> ccl.properties
>
> bib1.att defines which indexes may exist.
> biblio-zebra-indexdefs.xsl decides what MARC data goes into which indexes.
> ccl.properties provides a query language for accessing those indexes
> through search queries.
>
> Paul asked about the suffixes :n, :p, :w, :u, and :s. These are called
> "registers". :n is numeric, :p is phrase, :w is word, :u is URL, and :s is
> sorting.
>
> Different types of CCL qualifiers allow us to access different types of
> registers. "st-numeric" provides access to the :n register. "st-phrase"
> and "phr" access :p. "st-word", "st-word-list", and "wrdl" access :w.
> "st-urx" accesses :u. Generally we don't need to access :s when searching,
> as that's a behind-the-scenes thing for Koha to worry about.
>
> Different registers have different normalization rules.
>
> If we look at biblio-zebra-indexdefs.xsl, we can see that MARC 245 is
> indexed into Title:w and Title:p. That means "Harry Potter and the
> Philosopher's Stone" would be indexed something like so:
>
> <title:w>Harry</title:w>
> <title:w>Potter</title:w>
> <title:w>Philosopher's</title:w>
> <title:w>Stone</title:w>
> <title:p>Harry Potter and the Philosopher's Stone</title:p>
>
> So if we did a search like "title,wrdl=Harry", we'd get a hit for that
> MARC record. If we did a search like 'title,phr="Harry Potter and the
> Philosopher's Stone"', we'd get a hit for that MARC record.
>
> I'll draw your attention now to 952$u. It's indexed as uri:u (although it
> would also show up in the Any:w and Any:p keyword indexes). In order to
> access uri:u, we'd need to search for
> 'uri,st-urx="http://koha-community.org"'. The "st-urx" maps to the ":u",
> and we see "uri" in ccl.properties, which maps to "uri" in bib1.att.
>
> If we tried to do a search for 'uri,wrdl="http://koha-community.org"', it
> would fail, because nothing is indexed in the "uri:w" index:register combo.
>
> I have to run to an appointment, but hopefully that helps a bit.
>
> One day, I'd like to write a program which parses ccl.properties to
> provide a list of qualifiers and cross-references it with
> biblio-zebra-indexdefs.xsl to see which registers are available for which
> qualifier/index pair. The register system is a bit complicated but it can
> be useful. I've recently started doing more with the ":u" register...
>

David,

This isn't *quite* what you describe, but it does set up at least a couple
of the linkages that you'd need:

https://github.com/bywatersolutions/koha-script-zebra-config-report

I think that Marcus Enger wrote something similar, but I don't know if it
fits the current code base, and I don't remember the URL.

--Barton

_______________________________________________
Koha mailing list  http://koha-community.org
Koha@lists.katipo.co.nz
https://lists.katipo.co.nz/mailman/listinfo/koha
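
For anyone who wants to play with the qualifier/register idea from David's message, here is a minimal Python sketch. It is only an illustration: the qualifier-to-register table is copied from the message, while INDEXED_REGISTERS is a hand-written, hypothetical stand-in for what biblio-zebra-indexdefs.xsl would tell you (it covers just the Title and uri examples). The function names are made up for this sketch; nothing here reads the real Koha config files.

```python
# Register suffix reached by each CCL qualifier, as listed in the message.
QUALIFIER_TO_REGISTER = {
    "st-numeric": "n",
    "st-phrase": "p",
    "phr": "p",
    "st-word": "w",
    "st-word-list": "w",
    "wrdl": "w",
    "st-urx": "u",
}

# Hypothetical excerpt: which registers each index is populated in.
# Only the two examples from the thread are included (MARC 245 and 952$u).
INDEXED_REGISTERS = {
    "title": {"w", "p"},
    "uri": {"u"},
}


def can_match(index, qualifier):
    """True if the index holds data in the register the qualifier targets."""
    register = QUALIFIER_TO_REGISTER.get(qualifier)
    if register is None:
        return False  # qualifier not covered by this sketch
    return register in INDEXED_REGISTERS.get(index.lower(), set())


def available_qualifiers(index):
    """Sorted list of qualifiers that can reach a populated register."""
    registers = INDEXED_REGISTERS.get(index.lower(), set())
    return sorted(q for q, r in QUALIFIER_TO_REGISTER.items() if r in registers)


if __name__ == "__main__":
    for index, qualifier in [("title", "wrdl"), ("uri", "st-urx"), ("uri", "wrdl")]:
        print(f"{index},{qualifier}: {can_match(index, qualifier)}")
```

With these tables, "uri,wrdl" is reported as unusable because nothing is indexed in uri:w, which matches the failing search described in the message. A real tool would have to build both tables from ccl.properties and biblio-zebra-indexdefs.xsl rather than hard-coding them.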