Parag Kalra:
I am curious to know more about UTF and to understand related issues
that may creep into my algorithm. Could someone please shed some light
on it? Can I use the following:
use Encode;
while(<$sample_file_fh>){
# Encoding into utf data
$utf_data = encode("utf8", $_);
I don't think the line above is right.
What you read from <$sample_file_fh> may be in some other encoding, for
example iso-8859-1, gb2312 or UTF-8.
You first want to decode it into Perl's internal string format, which
consists of a utf8 flag plus the data. After decoding, the utf8 flag is
normally on and the data is held as utf8. You do that with the decode()
function from the Encode module:
# assuming the data was originally gb2312-encoded
my $internal_utf8 = decode("gb2312", $_);
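To make the decode step concrete, here is a minimal sketch. The sample bytes are my own illustration (the gb2312 encoding of the two characters U+4E2D and U+6587), not data from your file, and the variable names are made up for the example:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Four raw bytes: the gb2312 encoding of U+4E2D followed by U+6587.
# (Illustrative sample data, not taken from the original file.)
my $octets = "\xD6\xD0\xCE\xC4";

# decode() turns the encoded bytes into a Perl character string.
my $chars = decode("gb2312", $octets);

# length() counts bytes before decoding and characters after it.
printf "bytes in: %d, characters out: %d\n", length($octets), length($chars);

# Show the code points that came out of the decode step.
printf "code points: %s\n",
    join(" ", map { sprintf "U+%04X", ord } split //, $chars);
```

The point is that length() and friends work on characters only once the input has been decoded.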
After that, you can convert $internal_utf8 to whatever encoding you
want with the encode() function, also from the Encode module:
my $output = encode("utf8",$internal_utf8); # output with UTF-8 encoding
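Putting the two steps together for a loop like yours, a rough sketch could look like the following. It assumes the input really is gb2312; recode_lines() is a hypothetical helper name, and the in-memory filehandle only stands in for your real file so the example is self-contained:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: read byte lines from $fh, decode them from the
# assumed source encoding $from, and re-encode them as $to.
sub recode_lines {
    my ($fh, $from, $to) = @_;
    my $out = '';
    while (my $line = <$fh>) {
        my $chars = decode($from, $line);   # bytes -> characters
        $out .= encode($to, $chars);        # characters -> bytes
    }
    return $out;
}

# Sample input: two gb2312-encoded lines held in memory instead of a file.
my $raw = "\xD6\xD0\n\xCE\xC4\n";
open my $sample_file_fh, '<', \$raw
    or die "cannot open in-memory file: $!";

# "UTF-8" is Encode's strict UTF-8; "utf8" is Perl's more lenient variant.
my $data_string = recode_lines($sample_file_fh, "gb2312", "UTF-8");
close $sample_file_fh;

printf "UTF-8 output is %d bytes long\n", length($data_string);
```

If the file is in some other encoding, only the $from argument needs to change; the decode-then-encode pattern stays the same.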
HTH.
$data_string = $data_string.$utf_data;
}