Hi All,

We get a lot of updates from our client libraries to keep our union catalog up to date. There isn't a lot of consistency between the libraries regarding which ILS they use (they're all small rural libraries), or which version of a given ILS. The files of MARC records we get are often... strange: missing or extra end-of-field markers, "bad" 008 fields, repetitions of non-repeatable fields, etc.
Rather than adding a bunch of special cases to our conversion routines, I thought I'd write a pre-processor to eliminate as many problems as possible first. The easiest way to do that is to read each record and then create a brand new record using its data - letting MARC::Record create it "properly" rather than trying to fix the original record.

This works extremely well (kudos to Andy and Ed for making my life so much easier!), except that for some reason, I'm getting an extra x0D x0E after each new record I print to stdout. I'm stumped. Can anyone tell me what I'm missing here?

    #!/usr/bin/perl
    use strict;
    use warnings;
    use MARC::Batch;
    use MARC::Record;
    use MARC::Field;

    my $cnt   = 0;
    my $batch = MARC::Batch->new( 'USMARC', @ARGV );
    $batch->strict_off();
    $batch->warnings_off();

    while ( my $oldmarc = $batch->next ) {
        last if $cnt > 5;    # Just for testing....
        $cnt++;
        next unless $oldmarc->title();    # if this is a garbage record, skip it.

        # Start a fresh record, carrying over only the old leader.
        my $newmarc = MARC::Record->new();
        $newmarc->leader( $oldmarc->leader() );

        my @newfields = ();
        foreach my $oldfield ( $oldmarc->fields() ) {
            my $tag = $oldfield->tag();
            my $newfield;
            if ( $oldfield->is_control_field() ) {
                $newfield = MARC::Field->new( $tag, $oldfield->data() );
            }
            else {
                # Default a missing indicator to a blank. Test with length
                # rather than ||, so a legitimate '0' indicator survives.
                my $ind1 = $oldfield->indicator(1);
                my $ind2 = $oldfield->indicator(2);
                $ind1 = ' ' unless defined $ind1 && length $ind1;
                $ind2 = ' ' unless defined $ind2 && length $ind2;

                # subfields() returns [code, data] pairs; flatten them.
                my @newsubfields = map { @$_ } $oldfield->subfields();
                $newfield = MARC::Field->new( $tag, $ind1, $ind2, @newsubfields );
            }
            push @newfields, $newfield if $newfield;
        }

        $newmarc->insert_fields_ordered(@newfields);
        print $newmarc->as_usmarc();
    }

Of course, if I just run this routine twice (once on the original file, and once on the output), it eliminates those "extra" empty non-records.
But I'd really like to figure out why they are getting in there in the first place (that is, why the extra x0D x0E gets written). Thanks, and sorry for the long post, -David
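
P.S. In case it helps anyone reproduce this: one way to see exactly which bytes sit between records is to split the raw output on the record terminator and hex-dump anything too short to be a record. The following is only a minimal sketch. It assumes standard MARC framing (records end with 0x1D and start with a 24-byte leader), the name describe_chunks is made up for illustration and is not part of MARC::Record, and it shows where the stray bytes are, not why they appear.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of MARC::Record): split raw MARC data after
# each record terminator (0x1D) and describe every chunk found.
sub describe_chunks {
    my ($raw) = @_;
    my @report;

    # The lookbehind keeps the terminator attached to the chunk it ends.
    for my $chunk ( split /(?<=\x1D)/, $raw ) {
        my $len   = length $chunk;
        my $short = $len < 24 ? 1 : 0;    # too short to hold a 24-byte leader
        my $hex   = $short
            ? join( ' ', map { sprintf '%02X', ord($_) } split //, $chunk )
            : '';
        push @report, { length => $len, short => $short, hex => $hex };
    }
    return @report;
}

# When given a file name, print one line per chunk.
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $raw = do { local $/; <$fh> };
    close $fh;

    my $n = 0;
    for my $c ( describe_chunks($raw) ) {
        $n++;
        printf "chunk %d: %d bytes%s\n", $n, $c->{length},
            $c->{short} ? " (too short for a leader): $c->{hex}" : '';
    }
}
```

Running it against both the original file and the pre-processor's output should show whether the short chunks are already present in the input or only appear after as_usmarc().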