Opening & writing to UTF-8 files; copyright symbol again

2015-11-13 Thread Highsmith, Anne L
This is related to my previous post (9/17/2015) about deleting 035 fields after RDA-ification. Jon Gorman solved that one for me by pointing out that I probably had a problem with my perl libraries. But now, instead of creating the record from the database and writing it back to the database, I

Re: Opening & writing to UTF-8 files; copyright symbol again

2015-11-13 Thread Jon Gorman
I'll ask about the easiest solution first ;). Are you sure the file 4788022.bib is in Unicode and not MARC-8? If it is in Unicode, is the leader 09 byte set to a? I'm a bit rusty on the as_usmarc() call as well, you might want to check the docs to make sure that doesn't do something like convert it to
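
The leader check Jon mentions can be sketched in plain Perl. In MARC 21, leader position 09 is the character coding scheme: 'a' means UCS/Unicode and a blank means MARC-8. The leader string below is a made-up example, not the leader of 4788022.bib, and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical 24-character MARC leader, for illustration only.
my $leader = '00714cam a2200205 a 4500';

# Leader positions are counted from 0, so position 09 is the tenth character.
my $coding = substr($leader, 9, 1);

my $encoding = $coding eq 'a' ? 'UTF-8'
             : $coding eq ' ' ? 'MARC-8'
             :                  'unknown';

print "Leader/09 is '$coding': record claims to be $encoding\n";
```

The leader only says what the record claims to be; the bytes in the file can still disagree with it.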

Re: Opening & writing to UTF-8 files; copyright symbol again

2015-11-13 Thread Jon Gorman
Ack, sorry, various copying and pasting apparently caused Google Mail to have issues. As I was saying, before I must have hit some keystroke that I'm sure makes sense in whatever editor I was just using: Instead of having: open(OUTPUT, ">$outfile"); ... (whole bunch o code) print OUTPUT

Opening & writing to UTF-8 files; copyright symbol again -- solution

2015-11-13 Thread Highsmith, Anne L
I should probably say, "apparent solution" 'cause character set issues never seem to end. However, combining Jon Gorman's recommendation with some Googling, I get: my $outfile='4788022.edited.bib'; open (my $output_marc, '>', $outfile) or die "Couldn't open file $!" ; binmode($output_marc, ':utf
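
A minimal sketch of the pattern that snippet starts to show: a three-argument open with a lexical filehandle, followed by binmode. The layer name is cut off in the archive at ':utf', so the `:encoding(UTF-8)` layer used here is an assumption and may not be what the post used. The scratch file and the sample string are also hypothetical; the sketch reads the file back as raw bytes to show what was written.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; the post writes to '4788022.edited.bib' instead.
my ($tmp_fh, $outfile) = tempfile(UNLINK => 1);
close($tmp_fh);

# A character string containing U+00A9 (the copyright sign).
my $text = "\x{A9} 2015";

open(my $out, '>', $outfile) or die "Couldn't open file $outfile: $!";
binmode($out, ':encoding(UTF-8)');   # assumed layer; the snippet is truncated here
print {$out} $text;
close($out) or die "Couldn't close file $outfile: $!";

# Read the file back as raw bytes to see what actually landed on disk.
open(my $in, '<:raw', $outfile) or die "Couldn't reopen $outfile: $!";
my $bytes = do { local $/; <$in> };
close($in);

my $hex = join ' ', map { sprintf '%02X', ord } split //, $bytes;
print "$hex\n";   # prints: C2 A9 20 32 30 31 35
```

With the encoding layer in place, the copyright sign is written as the two bytes C2 A9 rather than the single byte A9.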

RE: Opening & writing to UTF-8 files; copyright symbol again -- solution

2015-11-13 Thread Shelley Doljack
Hey, that’s my post! Anyways, I haven’t really looked into what your problem is, but when you said that the copyright character is getting transformed to A9 even though it is supposedly stored as C2 A9 in the database, it made me think of how there can be two UTF-8 representations for the same c
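
For reference, the two byte sequences mentioned in that snippet can be reproduced with the core Encode module. This is only a sketch of how U+00A9 is serialized under two encodings, not a diagnosis of the problem in the thread; the helper `hex_bytes` and the variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Helper (hypothetical name): render a byte string as space-separated hex.
sub hex_bytes { return join ' ', map { sprintf '%02X', ord } split //, shift }

my $copyright = "\x{A9}";   # the character U+00A9

my $utf8_bytes   = encode('UTF-8',      $copyright);
my $latin1_bytes = encode('ISO-8859-1', $copyright);

print 'UTF-8:   ', hex_bytes($utf8_bytes),   "\n";   # C2 A9
print 'Latin-1: ', hex_bytes($latin1_bytes), "\n";   # A9

# A lone A9 byte is not well-formed UTF-8, so a strict decode croaks.
my $lone_byte = "\xA9";
my $lone_ok   = eval { decode('UTF-8', $lone_byte, Encode::FB_CROAK); 1 } ? 1 : 0;
print 'Lone A9 decodes as UTF-8: ', ($lone_ok ? 'yes' : 'no'), "\n";   # no
```

A single A9 byte in the output therefore suggests the character was written in a single-byte encoding, or without an encoding layer, rather than as UTF-8.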