>>>>> "Chas" == Chas Owens <[EMAIL PROTECTED]> writes:

Chas> #replace anything not in lower ASCII, Damn Americans
Chas> for (my $i = 0; $i < length($file); $i++) {
Chas>         my $char = ord(substr($file, $i, 1));

Chas>         if ($char > 128) {
Chas>                 print "replacing ", chr($char), " with &#$char;\n";
Chas>                 substr($file, $i, 1) = "&#$char;";
Chas>         }
Chas> }

The problem is not the characters in [\200-\377]; the problem
is that XML defaults to UTF-8 unless the document declares another
encoding, such as ISO-8859-1.  So, rather than do all this messy
processing, just use the proper encoding line:

  <?xml version='1.0' encoding='ISO-8859-1'?>

Then any XML parser that supports ISO-8859-1 (the spec only requires
UTF-8 and UTF-16, but nearly all parsers handle Latin-1 too) will read
non-American characters just fine.  No editing required.
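
If the file is generated without a declaration, a minimal sketch of
adding one might look like the following.  The helper name
with_latin1_declaration is made up for illustration, and it assumes
the data really is ISO-8859-1 bytes; it does no transcoding at all.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): prepend an
# ISO-8859-1 XML declaration to a document that lacks one.
# Assumes $body already holds Latin-1 encoded bytes.
sub with_latin1_declaration {
    my ($body) = @_;

    # An existing declaration is left untouched rather than duplicated.
    return $body if $body =~ /\A<\?xml\b/;

    return "<?xml version='1.0' encoding='ISO-8859-1'?>\n" . $body;
}

# \xE9 is e-acute in ISO-8859-1; the byte passes through unchanged.
my $file = "<name>Jos\xE9</name>\n";
my $xml  = with_latin1_declaration($file);
```

The high-bit bytes stay exactly as they were; only the declaration
changes how a parser interprets them.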

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[EMAIL PROTECTED]> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
