That's different from what I get. What I get is:

1c1
< 00000000 30 32 33 35 36 63 61 6d 20 20 32 32 30 30 34 38 |02356cam 220048|
---
> 00000000 30 32 33 36 34 63 61 6d 20 20 32 32 30 30 34 38 |02364cam 220048|
21,30c21,30
<differences in the directory not shown>
105,149c105,149
< 00000680 20 1f 61 42 69 73 e5 61 f2 74 e5 69 2c 20 4d 75 | .aBis_________, Mu|
< 00000690 f2 68 61 6d 6d 61 64 2e 1f 74 43 6f 6e 76 65 72 |___ammad..tConver|
< ... not shown>
< 00000930 73 20 1e 1d 0a 0a |s ....|
< 00000936
---
> 00000680 20 1f 61 42 69 73 ef bf bd 61 ef bf bd 74 ef bf | .aBis___a___t___
> 00000690 bd 69 2c 20 4d 75 ef bf bd 68 61 6d 6d 61 64 2e |i, Mu___hammad.|
< ... not shown>
> 00000930 69 61 20 47 61 6c 65 27 73 20 1e 1d |ia Gale's ..|
> 0000093c
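One possible reading of the dump above, offered as a sketch rather than a diagnosis: the three-byte run ef bf bd is the UTF-8 encoding of U+FFFD (REPLACEMENT CHARACTER), which decoders commonly substitute for bytes that are not valid UTF-8. The Python below is not the MARC::Record code path. It only assumes that something in the read/write cycle decoded the data as UTF-8 with replacement and re-encoded it, and it checks whether that assumption reproduces the bytes shown at offset 00000680.

```python
# Bytes copied from the original dump, starting at "Bis" in row 00000680.
original = bytes.fromhex(
    "42 69 73 e5 61 f2 74 e5 69 2c 20 4d 75 f2 68 61 6d 6d 61 64 2e"
)

# Assumption (not confirmed by the thread): the data was decoded as UTF-8
# with invalid bytes replaced by U+FFFD, then encoded back to UTF-8.
roundtrip = original.decode("utf-8", errors="replace").encode("utf-8")

# Print the result in the same style as the hexdump rows.
print(" ".join(f"{b:02x}" for b in roundtrip))

# Each replaced byte turns into three bytes, so the data grows by 2 per byte.
replaced = roundtrip.count(b"\xef\xbf\xbd")
growth = len(roundtrip) - len(original)
print(replaced, growth)
```

Under that assumption the four high bytes (e5, f2, e5, f2) each become ef bf bd, which matches the processed rows, and the resulting growth of 8 bytes is consistent with the leader changing from 02356 to 02364, provided these are the only affected bytes in the record.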
How would deleting the illegal characters cause changes to the characters in lines 680 and 690 above?

John

On Wed, 8 Dec 2004 10:23:38 -0600 Ed Summers <[EMAIL PROTECTED]> wrote:
> On Tue, Dec 07, 2004 at 12:53:44PM -0600, John Hammer wrote:
> > Attached are the two files. The Marc file seems to be using a Windows font
> > (1251?). As for the program, the same changes occur if I just read the Marc
> > file and write it back out with no changes. The Perl I am using is 5.8.3
>
> Ok, I've confirmed that simply reading this record in and writing it out
> will yield a different file. The unix diff program confirms this, but
> does not isolate the difference, since MARC records are not multiline
> documents.
>
> Using diff with hexdump provides some more concrete data. First hexdump the
> original file and the processed file like so:
>
> % hexdump -C original.dat > original.dump
> % hexdump -C processed.dat > processed.dump
>
> Then compare these two files with diff:
>
> % diff original.dump processed.dump
>
> You should see this:
>
> 148,149c148,149
> < 00000930 73 20 1e 1d 0a 0a |s ....|
> < 00000936
> ---
> > 00000930 73 20 1e 1d |s ..|
> > 00000934
>
> What this shows is that the original file has two trailing 0a bytes at
> the end of the record, and that the processed file does not. This makes
> sense because MARC::Record was adjusted back in v1.24 (Apr 2003) to
> remove certain illegal characters between records that some library
> systems place there. See line 58 in MARC::File::USMARC in the latest
> version of the MARC-Record distribution if you are curious :-)
>
> So unless you are unable to reproduce this I think this mystery is solved.
>
> //Ed
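The hexdump-then-diff procedure quoted above can also be approximated in a few lines of Python, which may help when hexdump is not available. This is a minimal sketch: `hex_rows` and `differing_rows` are hypothetical helper names, and the sample data is only the record tail shown in the quoted diff, not the full files.

```python
def hex_rows(data, width=16):
    """Map each row offset to its bytes in hex (hexdump -C minus the ASCII column)."""
    return {
        off: " ".join(f"{b:02x}" for b in data[off:off + width])
        for off in range(0, len(data), width)
    }


def differing_rows(a, b):
    """Return (offset, row_in_a, row_in_b) for every row that is not identical."""
    rows_a, rows_b = hex_rows(a), hex_rows(b)
    offsets = sorted(set(rows_a) | set(rows_b))
    return [
        (off, rows_a.get(off, ""), rows_b.get(off, ""))
        for off in offsets
        if rows_a.get(off) != rows_b.get(off)
    ]


# Stand-in data: only the last bytes of each file, taken from the quoted diff.
original_tail = bytes.fromhex("73 20 1e 1d 0a 0a")
processed_tail = bytes.fromhex("73 20 1e 1d")

for off, old, new in differing_rows(original_tail, processed_tail):
    print(f"< {off:08x} {old}")
    print(f"> {off:08x} {new}")

# 0x1d is the MARC record terminator; bytes after it lie between records.
trailing = original_tail[original_tail.rfind(b"\x1d") + 1:]
print(trailing)
```

Like diff on hexdump output, this compares rows at fixed offsets, so one inserted or lengthened byte sequence shifts every later row and makes all of them show up as different. That is why a change near offset 00000680 is reported as one block running to the end of the file.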