On Tue, Dec 07, 2004 at 12:53:44PM -0600, John Hammer wrote:
> Attached are the two files. The Marc file seems to be using a Windows font
> (1251?). As for the program, the same changes occur if I just read the Marc
> file and write it back out with no changes. The Perl I am using is 5.8.3

Ok, I've confirmed that simply reading this record in and writing it out
will yield a different file. The unix diff program confirms this, but
does not isolate the difference, since MARC records are not multiline
documents.

Using diff with hexdump provides some more concrete data. First hexdump
the original file and the processed file like so:

    % hexdump -C original.dat > original.dump
    % hexdump -C processed.dat > processed.dump

Then compare these two files with diff:

    % diff original.dump processed.dump

You should see this:

    148,149c148,149
    < 00000930 73 20 1e 1d 0a 0a |s ....|
    < 00000936
    ---
    > 00000930 73 20 1e 1d |s ..|
    > 00000934

What this shows is that the original file has two trailing 0a bytes at
the end of the record, and that the processed file does not. This makes
sense because MARC::Record was adjusted back in v1.24 (Apr 2003) to
remove certain illegal characters between records that some library
systems place there. See line 58 in MARC::File::USMARC in the latest
version of the MARC-Record distribution if you are curious :-)

So unless you are unable to reproduce this I think this mystery is
solved.

//Ed
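
P.S. If all you want to know is what sits at the very end of a file, you
can skip the full dump and look at just the last few bytes. A rough
sketch, assuming a POSIX shell with tail, od, tr and sed available.
tail_hex is a name made up here for illustration, not part of the
MARC-Record distribution or any other tool, and the sample bytes only
imitate the end of the record shown in the diff:

```shell
# Print the last N bytes of a file as space-separated hex (default N=4).
# tail_hex is a hypothetical helper name, not part of any MARC tool.
tail_hex() {
    # od prints the bytes as hex; tr and sed collapse the padding od adds.
    tail -c "${2:-4}" "$1" | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
    echo
}

# Stand-in for the end of a record: "s ", field terminator 1e,
# record terminator 1d, then two stray newlines (0a 0a).
sample=$(mktemp)
printf 's \036\035\n\n' > "$sample"
tail_hex "$sample" 4    # prints: 1e 1d 0a 0a
rm -f "$sample"
```

Going by the dumps above, running it on original.dat should end in
0a 0a, while processed.dat should end in 1e 1d.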