Hi,

Jan Eden wrote on 18.03.2005:
>Hi,
>
>I have a bunch of files in the iso-8859-1 text encoding which I want
>to save (in an edited form) as UTF-8.
>
>I use the following line:
>
>use open IN => ':encoding(iso-8859-1)', OUT => ':utf8';
>
>and it does not work.
>
>This is strange, as I use this pragma all the time, and it always
>worked.
>
>When I open the original files, they are Latin-1 encoded. When I
>comment the line above, the output files are also Latin-1 encoded.

This is driving me nuts. I have now built a minimal example:

___
#!/usr/bin/perl -w

use strict;
use HTML::Entities;
use open IN => ':encoding(iso-8859-1)', OUT => ':utf8';

# This file is ISO-8859-1 encoded!
my $filename = "input.htm";

open PAGE, $filename or die "Cannot open $filename";
my $content = join '', <PAGE>;
close PAGE or die "Cannot close $filename";
return unless $content;

$content = decode_entities($content);

my $newfile = "test2.html";
open FILE, "> $newfile" or die "Cannot open $newfile";
print $content;
print FILE $content;
close FILE or die "Cannot close $newfile";
___

I call the script like this:

./test.pl > test.html

But only test.html contains valid UTF-8 text; test2.html has garbled non-ASCII characters.

This is really confusing, especially since I was somehow able to create several hundred correctly formatted files with my script earlier. I have no idea what changed, or how I was able to output UTF-8 files earlier.

In the end, I will use a DBI method to store the content of my input files in a database (which still works, I just tested), but I am curious why

print $content;

would do something other than

print FILE $content;

Can someone shed some light on this?

Thanks,

Jan
-- 
I'd never join any club that would have the likes of me as a member. - Groucho Marx