On Mar 27, 2013, at 2:20 PM, Eric Lease Morgan <emor...@nd.edu> wrote:

> A number of people have alluded to the problem of double encoding, and I'm 
> beginning to think this is true. 

When it calls as_usmarc, I think MARC::Batch tries to honor the value set in 
position #9 of the leader. In other words, if that position is blank, then it 
tries to output records as MARC-8, and when that position has a value of "a", 
it tries to encode the data as UTF-8.
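
To make the leader check concrete, here is a minimal sketch in Python (not the Perl used in my processing script, and not MARC::Batch's actual code). The function names and the sample leader are made up for illustration; the only facts relied on are that a MARC 21 leader is 24 characters long and that position 09 is blank for MARC-8 and "a" for Unicode.

```python
def declared_encoding(leader: str) -> str:
    """Return the character coding scheme declared in leader position 09."""
    if len(leader) != 24:
        raise ValueError("a MARC 21 leader is exactly 24 characters long")
    flag = leader[9]  # positions are counted from 0, so this is position 09
    if flag == "a":
        return "UTF-8"
    if flag == " ":
        return "MARC-8"
    raise ValueError(f"unexpected value in leader position 09: {flag!r}")


def mark_as_utf8(leader: str) -> str:
    """Return a copy of the leader with position 09 set to 'a'."""
    return leader[:9] + "a" + leader[10:]


# Hypothetical leader with a blank in position 09 (declares MARC-8).
leader = "00714cam  2200205 a 4500"
fixed = mark_as_utf8(leader)

print(declared_encoding(leader))
print(declared_encoding(fixed))
```

Note that changing the flag only changes what the record claims about itself; it does not convert the data in the fields.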

If I employ binmode( OUTFILE, ":utf8"), and the output is already UTF-8, then 
double encoding happens. 
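
A small sketch of what I believe is going on, again in Python rather than Perl and with a made-up sample string. It assumes the output layer treats each byte of the already-encoded data as a separate character and encodes it a second time:

```python
text = "caf\u00e9"  # "café", with one non-ASCII character

# First (correct) encoding: the é becomes the two bytes C3 A9.
utf8_bytes = text.encode("utf-8")

# The mistake: the bytes are taken to be one character each
# (as if they were Latin-1) and are then encoded as UTF-8 again.
double_encoded = utf8_bytes.decode("latin-1").encode("utf-8")

print(utf8_bytes)
print(double_encoded)

# A UTF-8 reader now sees two characters where there should be one.
print(double_encoded.decode("utf-8"))
```

The single accented character ends up as four bytes instead of two, which is the sort of garbage that shows up in the display.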

To test this theory, I fixed a number of records in my batch. Specifically, I 
inserted the letter "a" in position #9 of the leader. I then ran my processing 
file WITHOUT calling binmode, and my output was correct. For example, 
look at all the glorious characters in the following URL:

  http://www.catholicresearch.net/vufind/Record/undmarc_001906501

--
Eric Lease Morgan
Hesburgh Libraries
University of Notre Dame

574/631-8604


