On Tue, Dec 07, 2004 at 12:53:44PM -0600, John Hammer wrote:
> Attached are the two files. The Marc file seems to be using a Windows font 
> (1251?). As for the program, the same changes occur if I just read the Marc 
> file and write it back out with no changes. The Perl I am using is 5.8.3

Ok, I've confirmed that simply reading this record in and writing it out
will yield a different file. The Unix diff program confirms that the files
differ, but cannot isolate the difference, since diff compares line by
line and a MARC record is not a multiline document.

Using diff with hexdump provides some more concrete data. First hexdump the
original file and the processed file like so:

    % hexdump -C original.dat > original.dump
    % hexdump -C processed.dat > processed.dump

Then compare these two files with diff:

    % diff original.dump processed.dump

You should see this:

    148,149c148,149
    < 00000930  73 20 1e 1d 0a 0a                                 |s ....|
    < 00000936
    ---
    > 00000930  73 20 1e 1d                                       |s ..|
    > 00000934
What this shows is that the original file has two trailing 0a bytes
(line feeds) after the record terminator (1d), and that the processed
file does not. This makes sense because MARC::Record was adjusted back
in v1.24 (Apr 2003) to remove certain illegal characters between records
that some library systems place there. See line 58 in MARC::File::USMARC
in the latest version of the MARC-Record distribution if you are curious :-)
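
If you'd rather not diff two full dumps, a quicker check is to look only
at the last few bytes of each file. Here is a minimal sketch; tail_hex is
just a name I made up for illustration, and it assumes a POSIX-ish shell
with tail, od, tr and sed available:

```shell
# tail_hex FILE N: print the last N bytes of FILE as space-separated hex.
# (hypothetical helper, not part of MARC-Record or any standard tool)
tail_hex() {
    tail -c "$2" "$1" |      # keep only the last N bytes
        od -An -tx1 |        # hex dump, one byte per column, no offsets
        tr -s ' \n' ' ' |    # collapse od's padding and line breaks
        sed 's/^ //; s/ $//' # trim the leading/trailing space
}
```

Going by the dumps above, `tail_hex original.dat 6` should print
`73 20 1e 1d 0a 0a`, and `tail_hex processed.dat 4` should print
`73 20 1e 1d`.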

So unless you are unable to reproduce this, I think this mystery is solved.

//Ed
