>>>>> "Chas" == Chas Owens <[EMAIL PROTECTED]> writes:
Chas> #replace anything not in lower ASCII, Damn Americans
Chas> for (my $i = 0; $i < length($file); $i++) {
Chas>     my $char = ord(substr($file, $i, 1));
Chas>     if ($char > 127) {
Chas>         print "replacing ", chr($char), " with &#$char;\n";
Chas>         substr($file, $i, 1) = "&#$char;";
Chas>     }
Chas> }
The problem is not the characters in [\200-\377]; the problem
is that XML defaults to UTF-8 unless the document declares another
encoding, such as ISO-8859-1. So, rather than do all this messy
processing, just use the proper encoding line:
<?xml version='1.0' encoding='ISO-8859-1'?>
Then any XML parser that supports ISO-8859-1 (nearly all do) will
handle non-American characters just fine, provided the file really
is ISO-8859-1. No editing required.
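
If you'd rather have Perl add that line for you, here is a minimal
sketch. It assumes $file holds raw ISO-8859-1 bytes; the helper names
is_utf8_bytes and declare_latin1 are made up for illustration and do
not come from any module. It checks whether the bytes would pass as
UTF-8 and, if not, prepends the declaration when none is present.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if the byte string is well-formed UTF-8.
sub is_utf8_bytes {
    my ($bytes) = @_;    # work on a copy; decode() may modify its input
    return eval { decode('UTF-8', $bytes, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Hypothetical helper: prepend an ISO-8859-1 declaration unless the
# document already starts with an XML declaration of its own.
sub declare_latin1 {
    my ($xml) = @_;
    return $xml if $xml =~ /\A<\?xml\s/;
    return "<?xml version='1.0' encoding='ISO-8859-1'?>\n" . $xml;
}

# \xE9 is e-acute in ISO-8859-1; on its own it is not valid UTF-8.
my $file = "<name>Ren\xE9e</name>\n";

$file = declare_latin1($file) unless is_utf8_bytes($file);
print $file;
```

Note that a document which already carries a declaration is left
untouched, so a wrong existing encoding attribute would still need
fixing by hand.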
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[EMAIL PROTECTED]> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!