On Fri, 2008-08-15 at 13:32 -0700, Siegfried Heintze (Aditi) wrote:
> I did a little google searching and found 
> http://perl-xml.sourceforge.net/faq/#encoding_conversion. This does not look 
> like it would work for multi-byte Chinese characters.
> 
> Does anyone have some perl (or emacs lisp) code that will convert the UTF-8 
> Chinese text in my XML file to entities that look like this: &#23433;?
> 
> Thanks!
> siegfried


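#!/usr/bin/perl

use strict;
use warnings;

# Convert every non-ASCII character in the named files to a decimal
# numeric character reference (e.g. &#23433;) and print to STDOUT.
for my $file ( @ARGV ){
  # Decode the input as UTF-8 so ord() sees characters, not bytes.
  open my $fh, '<:encoding(UTF-8)', $file or die "cannot open file $file: $!";
  while( <$fh> ){
    # U+0080..U+10FFFF covers everything beyond 7-bit ASCII.
    s/([\x{80}-\x{10ffff}])/'&#'.ord($1).';'/ge;
    print;
  }
  close $fh;
}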

__END__
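
To see what the substitution does without a file, here is a minimal sketch
that wraps the same idea in a function. The name to_numeric_entities is
made up for illustration, and the character class is written as "anything
outside 7-bit ASCII", which is an assumption about what you want encoded.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the script above: replace every
# character outside 7-bit ASCII with a decimal numeric character reference.
sub to_numeric_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7f])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# "\x{5b89}" is U+5B89, the character from the question.
print to_numeric_entities("<name>\x{5b89}</name>"), "\n";
# prints: <name>&#23433;</name>
```

The string must already be decoded to characters (as the ':utf8' layer on
the filehandle does); on raw UTF-8 bytes, ord() would return one number per
byte instead of one per character.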


-- 
Just my 0.00000002 million dollars worth,
  Shawn

"Where there's duct tape, there's hope."

"Perl is the duct tape of the Internet."
        Hassan Schroeder, Sun's first webmaster

