After some further research, it definitely looks like we have duplicate control numbers (001). This is a data entry mistake: it looks like the cataloger copied the biblios for similar entries. I have gone back and altered the control numbers to be unique, but rebuild_zebra.pl -b -r is still not adding the new entries. Any idea what else we might need to do?
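In case it helps anyone checking for the same problem, here is a minimal sketch of how duplicate 001 values could be spotted. It assumes you have already pulled (biblionumber, 001) pairs out of the records by some means, for example from the exported records with pymarc. The function name and the sample values are made up for illustration and are not part of Koha.

```python
from collections import defaultdict


def find_duplicate_control_numbers(records):
    """Return {control_number: [biblionumbers]} for every 001 shared by 2+ records.

    `records` is an iterable of (biblionumber, control_number) pairs.
    Records whose 001 is missing or blank are skipped.
    """
    by_control_number = defaultdict(list)
    for biblionumber, control_number in records:
        value = (control_number or "").strip()
        if not value:
            continue
        by_control_number[value].append(biblionumber)
    # Keep only the control numbers that occur on more than one record.
    return {
        value: biblionumbers
        for value, biblionumbers in by_control_number.items()
        if len(biblionumbers) > 1
    }


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real catalogue.
    sample = [
        (101, "ctrl-A"),
        (102, "ctrl-A"),
        (103, "ctrl-B"),
        (104, None),
    ]
    for value, biblionumbers in find_duplicate_control_numbers(sample).items():
        print(f"001 {value!r} is shared by biblionumbers {biblionumbers}")
```

Any 001 value reported here is a candidate for the behaviour Ian describes in the quoted reply, where only one of the records sharing that value ends up indexed.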
-Doug-

On 1 September 2012 15:32, Ian Bays <ian.b...@ptfs-europe.com> wrote:

> Hi.
> The 3.8 upgrade offers the dom indexing by default and if you have taken
> that option (as seen in $KOHA_CONF) the xsl used instead of record.abs
> (~/koha-dev/etc/zebradb/marc_defs/marc21/biblios/biblio-zebra-indexdefs.xsl)
> uses a construct (z:id) for the 001 which uses that (if it exists) as the
> zebra unique id. This means if you have more than one bib record with the
> same 001 (as you get if you duplicate a bib for instance) it will only
> index the last one and it won't complain at all about it.
> Not sure if it's a hangover from using the xml used by authorities which
> stores the auth_id in the 001 or UNIMARC which might use 001 as the bib
> number. Either way I bet if you remove the 001 or make it unique then it
> will index OK.
> The better solution is to fix the xsl to probably not use the z:id for
> biblios or maybe get it to use the 999$c, but the zebra config scares me.
> It took ages to find the cause so I hope this helps someone.
> Ian
>
> On 01/09/2012 18:11, Doug Kingston wrote:
>
>> On 1 September 2012 09:46, Jared Camins-Esakov
>> <jcam...@cpbibliography.com> wrote:
>>
>>> Doug,
>>>
>>>> So environment variables are not the issue. We are carefully managing
>>>> those.
>>>
>>> Make sure when you are using cron jobs that you set the environment
>>> variables IN YOUR CRONTAB. Setting environment variables elsewhere is a
>>> recipe for confusion and misery down the road. However, this is -- as you
>>> say -- not the problem.
>>>
>>>> I have tried using the new tool checkNonIndexedBiblios.pl (from patch
>>>> 6566) and it indeed finds a few recent biblios that are not indexed.
>>>> Using the -z option to mark them for indexing followed by a manual run
>>>> of rebuild_zebra -b -v -z did not get the biblios indexed. I cranked up
>>>> the debugging on zebraidx (by modifying rebuild_zebra.pl and using
>>>> -v -v) and did not see any obvious errors in the output that would
>>>> suggest why indexing was failing.
>>>
>>> Did you change your bibliographic frameworks? It could be a matter of the
>>> biblionumber not being stored properly. The other thing to do is to
>>> confirm that the non-indexed biblios are *actually* getting added to the
>>> zebraqueue by the 6566 script. It's kind of a long shot, but it could be
>>> an issue with the zebraqueue table getting corrupted. I've seen this
>>> happen when the zebraqueue table got too large, and disk space was low.
>>
>> So I think this is working as expected. Disk space is ample on the system
>> in question, and the catalogue is small by most standards (about 2500
>> biblios). I ran rebuild_zebra.pl with the -k flag so it left the exported
>> records and here's the tree I got.
>>
>> library:/tmp# ls -altR p6tjtKrrK3/
>> p6tjtKrrK3/:
>> total 0
>> drwxrwxrwt 6 root root 1040 Sep 1 17:50 ..
>> drwx------ 5 koha koha 100 Sep 1 06:36 .
>> drwxr-xr-x 2 koha koha 60 Sep 1 06:36 upd_biblio
>> drwxr-xr-x 2 koha koha 60 Sep 1 06:36 del_biblio
>> drwxr-xr-x 2 koha koha 40 Sep 1 06:36 biblio
>>
>> p6tjtKrrK3/upd_biblio:
>> total 16
>> -rw-r--r-- 1 koha koha 12670 Sep 1 06:36 exported_records
>> drwxr-xr-x 2 koha koha 60 Sep 1 06:36 .
>> drwx------ 5 koha koha 100 Sep 1 06:36 ..
>>
>> p6tjtKrrK3/del_biblio:
>> total 0
>> drwx------ 5 koha koha 100 Sep 1 06:36 ..
>> drwxr-xr-x 2 koha koha 60 Sep 1 06:36 .
>> -rw-r--r-- 1 koha koha 0 Sep 1 06:36 exported_records
>>
>> p6tjtKrrK3/biblio:
>> total 0
>> drwx------ 5 koha koha 100 Sep 1 06:36 ..
>> drwxr-xr-x 2 koha koha 40 Sep 1 06:36 .
>>
>> Using marcprint.py, a small Python program built around the pymarc
>> package, I decoded this file and found 13 MARC records, as expected.
>> Example:
>>
>> =LDR 00871nam a22002417a 4500
>> =001 201112071555.ls
>> =003 UkLoVW
>> =005 20111209110116.0
>> =008 111207t1982\\\\enkg\\\\r\\\\\001\0\eng\d
>> =040 \\$aUkLoVW$cUkLoVW
>> =099 \\$aQS 40
>> =100 1\$aSheffield, Ken$92330
>> =245 \0$aTen country dances :$bmainly from Thompson, Wright & Wilson.
>> =260 \\$aOxford :$b[The Author],$c1982.
>> =300 \\$a12 p. :$bmusic ;$c30 cm.
>> =490 1\$aFrom two barns ;$vv. 1
>> =650 \\$9117$aCountry dances
>> =650 \\$9127$aDance music
>> =830 \5$aFrom two barns$92331
>> =942 \\$2VWML$cBK$hQS 40$n0$6QS_00040
>> =999 \\$c14879$d14879
>> =952 \\$w2011-12-07$p10914$r2011-12-07$40$00$6QS_00040$915083$bVWML$10$oQS 40$d2011-12-07$70$cBOX$2VWML$yBK$aVWML
>> =952 \\$w2011-12-07$p11121$r2011-12-07$40$00$6QS_00040$915084$bVWML$10$oQS 40$d2011-12-07$71$cBOX$2VWML$yBK$aVWML
>>
>> I have attached an ascii printout of all 13 records in case someone wants
>> to look for a pattern in these records.
>>
>> The problem is either in the format/contents of those records, or in
>> zebraidx/zebrasrv or their config files. My suspicion is with the latter
>> since we have already had to fix one problem there for bug 6566.
>>
>> -Doug-
>>
>>> Regards,
>>> Jared
>>>
>>> --
>>> Jared Camins-Esakov
>>> Bibliographer, C & P Bibliography Services, LLC
>>> (phone) +1 (917) 727-3445
>>> (e-mail) jcam...@cpbibliography.com
>>> (web) http://www.cpbibliography.com/
>>>
>>> _______________________________________________
>>> Koha mailing list http://koha-community.org
>>> Koha@lists.katipo.co.nz
>>> http://lists.katipo.co.nz/mailman/listinfo/koha
>
> --
> Ian Bays
> Director of Projects, PTFS Europe Limited
> Content Management and Library Solutions
> +44 (0) 800 756 6803 (phone)
> +44 (0) 7774 995297 (mobile)
> +44 (0) 800 756 6384 (fax)
> skype: ian.bays
> email: ian.b...@ptfs-europe.com