Thanks for your ideas. I will try your suggestions. John
From: Timothy Prettyman [mailto:timo...@umich.edu]
Sent: Friday, May 30, 2014 11:39 AM
To: perl4lib
Subject: Re: sending marc records into a script that uses MARC::Batch

I think you have to check for warnings as you read each record, so try moving your error handling code right after the $batch->next() call.

But Robin's suggestion is good advice, and is probably a more robust way to handle the crud that can show up in a file of MARC records.

-Tim

On Fri, May 30, 2014 at 5:20 AM, Stefano Bargioni <bargi...@pusc.it> wrote:

If I'm not wrong, $batch->strict_off(); will keep your loop from printing warnings and from stopping the processing of records.

HTH.
Stefano

On 29 May 2014, at 23:13, John E Guillory wrote:

Thanks Timothy for your help. When processing about 5 million records I would expect some crazy records. The new script (incorporating Timothy’s suggestions) exited prematurely on record 85,877 with: “Warnings detected: Entirely empty subfield found in tag 260”. I know 260 is publication stuff but it’s not “required”. I’m deliberately printing warnings, but again the script exited prematurely.

Thanks for your assistance.
John

From: Timothy Prettyman [mailto:timo...@umich.edu]
Sent: Thursday, May 29, 2014 11:23 AM
To: John E Guillory
Cc: perl4lib@perl.org
Subject: Re: sending marc records into a script that uses MARC::Batch

For your first question, instead of:

    $batch = MARC::Batch->new('USMARC',<STDIN>);

use:

    $batch = MARC::Batch->new('USMARC',STDIN);

For your second, the error is likely caused when a field you're using as_string() on doesn't exist in the record.
So, you could do something like the following:

    $field = $record->field('008');
    $field or do {                           # check for existence of field
        print "no 008 field for record\n";   # no field
        next;                                # skip this record (or whatever)
    };
    $field_008 = $field->as_string();

Hope this helps
-Tim

Timothy Prettyman
LIT/Library Systems
University of Michigan

On Thu, May 29, 2014 at 12:08 PM, John E Guillory <jo...@lsu.edu> wrote:

Hello,

Two questions please:

1. I’ve written a script that opens a MARC file for reading using this syntax:

    $file = $ARGV[0];
    $batch = MARC::Batch->new('USMARC',$file);

It then loops through the records using this syntax:

    while ( $record = $batch->next()) {
        …..check position 6, 7 of leader and position 23 of 008 and make some changes
    }

This works great. However, instead of accessing the file this way, I want to pipe the output of a previously run MARC dump command directly into this script. I understand that this can be done using this syntax: while ($line = <STDIN>) { … }, but I don’t understand how to use that STDIN with MARC::Batch->new('USMARC',$file);

This does not work:

    $batch = MARC::Batch->new('USMARC',<STDIN>);

2. My current script successfully reads and processes a MARC file of over 5 gigs, but exits entirely on record 160,585 with the error from MARC::Batch, “Can't call method "as_string" on an undefined value at ./marc_batch.pl”. The documentation for MARC::Batch says that to tell it to continue processing even when errors are encountered, one should use strict_off(), then print/report warnings at the bottom of the script. I don’t think my particular error is being handled by the strict_off() setting. Does anybody know what causes the “Can’t call method as_string” error, and how to fix it?

Full script below; it’s pretty short, thanks to MARC::Batch. Thanks for any insights!
    use MARC::Batch;

    $file = $ARGV[0];
    chomp($file);
    $batch = MARC::Batch->new('USMARC',$file);
    $batch->strict_off();   # otherwise script exits when encounters errors
    open(OUT,'>new_marc');

    while ( $record = $batch->next()) {
        $leader = $record->leader();
        $leader_pos_6 = substr($leader,6,1);
        $leader_pos_7 = substr($leader,7,1);
        $field = $record->field('008');
        $field_008 = $field->as_string();
        $field_008_position_23 = substr($field_008,23,1);

        if ( ($leader_pos_6 eq "a") && ($leader_pos_7 eq "m") && ($field_008_position_23 eq "o") || ($field_008_position_23 eq "s") ) {
            $control_num = $record->field('001');
            $control_num = $control_num->as_string();
            print "008 position 23: $field_008_position_23 \n";
            print "OLD leader: $leader \n";
            $old_leader = $leader;
            substr($leader,6,1) = 'm';
            print "NEW leader: $leader \n";
            print OUT $record->as_usmarc();
            print "$control_num|$old_leader|$leader|$field_008\n";
        }
        else {
            # not a match so just print this one unchanged…
            print OUT $record->as_usmarc();
        }
    }

    # handles errors:
    if (@warnings = $batch->warnings()) {
        print "\n Warnings detected: \n", @warnings;
    }
    close(OUT);
    close(LOG);

John Guillory
Louisiana Library Network
225.578.3758
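
A note on the match test in the full script: in Perl, && binds more tightly than ||, so the condition as written is true for any record whose 008 position 23 is "s", whatever the leader says. The sketch below assumes the intent was "leader 6 is a, leader 7 is m, and 008/23 is o or s", and also treats a missing 008 as a non-match, in line with Tim's advice about undefined fields. The helper names is_candidate and make_008 are hypothetical and not part of MARC::Batch; the sample leader and 008 strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MARC::Batch): decide whether a record
# matches, given its leader string and the contents of its 008 field.
# $f008 may be undef when the record has no 008 field at all.
sub is_candidate {
    my ($leader, $f008) = @_;

    # A missing or too-short leader or 008 cannot be tested, so treat
    # the record as a non-match instead of dying on it.
    return 0 unless defined $leader && length($leader) > 7;
    return 0 unless defined $f008   && length($f008)   > 23;

    my $pos_6  = substr($leader, 6, 1);   # leader position 6
    my $pos_7  = substr($leader, 7, 1);   # leader position 7
    my $pos_23 = substr($f008, 23, 1);    # 008 position 23

    # The inner parentheses are the point: without them the test parses
    # as (a && m && o) || s.
    return ( $pos_6 eq 'a' && $pos_7 eq 'm'
             && ( $pos_23 eq 'o' || $pos_23 eq 's' ) ) ? 1 : 0;
}

# Build a made-up 40-character 008 value with $form at position 23.
sub make_008 {
    my ($form) = @_;
    return ( ' ' x 23 ) . $form . ( ' ' x 16 );
}

print is_candidate('00000nam a2200000 a 4500', make_008('o')), "\n";  # prints 1
print is_candidate('00000ncm a2200000 a 4500', make_008('s')), "\n";  # prints 0
print is_candidate('00000nam a2200000 a 4500', undef),         "\n";  # prints 0
```

In the loop this could be called as is_candidate($record->leader(), $field ? $field->as_string() : undef), where $field is the result of $record->field('008'), so that a record without an 008 is written out unchanged instead of ending the run.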